Convergence of a particle approximation for the quasi-stationary distribution of a diffusion process: Uniform estimates in a compact soft case

ESAIM: PS 26 (2022) 1–25 ESAIM: Probability and Statistics
https://doi.org/10.1051/ps/2021017 www.esaim-ps.org
Lucas Journel¹ and Pierre Monmarché²,*
Abstract. We establish the convergences (with respect to the simulation time $t$; the number of
particles $N$; the timestep $\gamma$) of a Moran/Fleming-Viot type particle scheme toward the quasi-stationary
distribution of a diffusion on the $d$-dimensional torus, killed at a smooth rate. In these conditions,
quantitative bounds are obtained that, for each parameter ($t \to \infty$, $N \to \infty$ or $\gamma \to 0$), are independent
from the two others.
Mathematics Subject Classification. 65C35, 65C40, 60J22.
Received November 8, 2019. Accepted December 17, 2021.
1. Introduction
1.1. The problem
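Start from the diffusion on the $d$-dimensional periodic flat torus $\mathbb{T}^d$
$$dZ_t = b(Z_t)\,dt + dB_t \tag{1.1}$$
with $b \in \mathcal{C}^1(\mathbb{T}^d)$, where $(B_t)_{t \geq 0}$ is a $d$-dimensional Brownian motion. Add a killing rate $\lambda \in \mathcal{C}(\mathbb{T}^d)$ and, given a
standard exponential random variable $E$ independent from $(Z_t)_{t \geq 0}$, define the death time
$$T = \inf\left\{t \geq 0,\ E \leq \int_0^t \lambda(Z_s)\,ds\right\}. \tag{1.2}$$
Then a probability measure $\nu$ on $\mathbb{T}^d$ is said to be a quasi-stationary distribution (QSD) associated to the SDE
(1.1) and the rate $\lambda$ if
$$\mathrm{Law}(Z_0) = \nu \quad\Longrightarrow\quad \forall t \geq 0,\ \mathrm{Law}(Z_t \mid T > t) = \nu.$$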
Keywords and phrases: Quasi-stationary distribution, interacting particle system, Wasserstein distance, couplings, propagation
of chaos.
¹ LJLL, Sorbonne Université.
² LJLL and LCT, Sorbonne Université.
* Corresponding author: pierre.monmarche@sorbonne-universite.fr
© The authors. Published by EDP Sciences, SMAI 2022
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
In our case, there exists a unique QSD $\nu$ and, whatever the initial distribution $\eta_0$ of $Z_0$,
$$\mathrm{Law}(Z_t \mid T > t) \underset{t \to \infty}{\longrightarrow} \nu$$
(see e.g. ([10], Thm. 2.1) or Corollary 2.15 below).
The present work is dedicated to the proof of convergence of an algorithm designed to approximate ν.
This is classically done through a system of $N$ interacting particles whose empirical measure converges to
$\mathrm{Law}(Z_t \mid T > t)$ as $N \to \infty$, where killed particles are resurrected in a suitable way in order to keep constant
the size of the system (while a naive Monte Carlo simulation would see the sample shrink along time). This
question has already been addressed by many authors in various contexts, see the discussion in Section 1.4
below. Before introducing the algorithm, stating our results and comparing them with previous works, for now,
let us simply highlight the main specificities of the present work.
The first novelty is that we take into account the time-discretization of the continuous-time diffusion. That
way, we establish error bounds between the theoretical target QSD and the empirical measure indeed obtained
with an actual implementation of the algorithm. There are three sources of errors: first, the continuous-time
SDE (1.1) has to be discretized with some time-step parameter $\gamma > 0$. Second, as will be detailed below, a
non-linearity in the theoretical algorithm has to be approximated by a system of $N$ particles. This leads to the
definition of an ergodic Markov chain whose invariant measure is close, in some sense, for large $N$, to the QSD
of the time-discretization of the diffusion. But then this Markov chain is only run for a finite simulation time
$t = m\gamma$, $m \in \mathbb{N}$. A third error term then comes from the fact that stationarity is not fully achieved. We will
obtain quantitative error bounds in $\gamma$, $N$ and $t$.
A second specificity is that the bound obtained for each parameter will be uniform in the other two. For
instance, the only other work in which the long-time convergence of the chain is proven to be, under some
(restrictive) conditions, uniform in N, is [13] in a finite state space. Besides, our work is quite close in spirit to
this work of Cloez and Thai. The question of the dependency or uniformity of the estimates in other previous
works will be further discussed in Section 1.4.
Finally, although it was not the primary motivation of the present work, it seems that the particular definition
of the system of interacting particles considered here, in particular the rebirth mechanism, was not considered
in previous works (where, basically, killed particles are resurrected at the position of one of the other particles).
Our variant is initially motivated by the property stated in Proposition 1.1 below, which was pointed out to
the second author by Bertrand Cloez. Yet, this variant has the unintended advantage of being both discrete in time
and non-failable, in the sense that it is well-defined for all times, even though all particles die simultaneously
from time to time (see also [37] on this question).
Note that we restrict the study to a compact state space. Moreover, we only consider soft killing at some
continuous rate, and no hard killing, which would correspond to the case where $T$ is the escape time from some
sub-domain (see e.g. [4,21]). Finally, as will be seen below, as far as the long-time behaviour of the process
is concerned we will work in a perturbative regime, namely we will assume that the variations of $\lambda$ are small
with respect to the mixing time of the diffusion (1.1) (while $\|\lambda\|_\infty$ itself is not required to be small). These
very restrictive conditions, which rule out many cases of practical interest, have to be considered in light of our
very strong results (Thms. 1.2 and 1.3 below and all the corollaries of Sect. 2.5, gathered in Fig. 1). In fact,
although already interesting by itself, this restricted framework can be thought of as a toy model motivated in
particular by the case that arises in the parallel replica algorithm [29]. In that case, $T$ is the escape time for
(1.1) from a bounded metastable domain, so that the lifespan of the process is expected to be larger than its
mixing time (and to depend little on the initial condition, provided it is far enough from the boundary). Hence,
the compact and perturbative assumptions are consistent with this objective. The restriction to a smooth killing
rate, however, is made to avoid additional difficulties in the hard case where, even in the metastable case, the
probability to leave the domain is high (and exhibits high variations) when the process is close to its boundary.
The initial motivation of the present study was to test the general strategy of the proof (via coupling arguments)
in a first simple case, with the goal of extending it later on to the metastable hard case by combining it with
some Lyapunov arguments to control the variations of the killing rate near the boundary. This is postponed for
future work.
This work is organized as follows. The algorithm and main results are presented in Section 1.3, and the
relation with previous works is discussed in Section 1.4. Section 2 contains the proofs, and more precisely: a
general coupling argument, which is the central tool for all our results, is presented in Section 2.1; the basic
bounds in terms of $t \to +\infty$, $N \to +\infty$ and $\gamma \to 0$ are then stated and proven respectively in Sections 2.2, 2.3
and 2.4; finally, these basic results are combined in Section 2.5, concluding the proofs of the main theorems and
yielding a number of corollaries.
1.2. Notations and conventions
We respectively denote $\mathcal{P}(F)$ and $\mathcal{B}(F)$ the set of probability measures and of Borel sets of a Polish space $F$.
Functions on $\mathbb{T}^d$ are sometimes identified to $[0,1]^d$-periodic functions, and similar non-ambiguous identifications
are performed, for instance if $x \in \mathbb{T}^d$ and $G$ is a $d$-dimensional standard Gaussian random variable, $x + G$ has to
be understood in $\mathbb{T}^d$, etc. A Markov kernel $Q$ on $F$ is indiscriminately understood as, first, a function from $F$ to
$\mathcal{P}(F)$, in which case we denote $Q : x \mapsto Q(x, \cdot)$ (where $Q(x, \cdot)$ denotes the probability $A \in \mathcal{B}(F) \mapsto Q(x, A) \in [0,1]$);
second, a Markov operator on bounded measurable functions on $F$, in which case we denote $Q : f \mapsto Qf$
(where $Qf(x) = \int f(w)\, Q(x, dw)$); third, by duality, a function on $\mathcal{P}(F)$, in which case we denote $Q : \mu \mapsto \mu Q$ (so
that $\mu(Qf) = (\mu Q)f$). In particular, $Q(x, \cdot) = \delta_x Q$ for $x \in F$. If $\mu \in \mathcal{P}(F)$ and $k \in \mathbb{N}$, we denote $\mu^{\otimes k} \in \mathcal{P}(F^k)$
the law of a $k$-uplet of independent random variables with law $\mu$. Similarly, if $Q$ is a Markov kernel on $F$, we
denote $Q^{\otimes k}$ the kernel on $F^k$ such that $Q^{\otimes k}(x, \cdot) = Q(x_1, \cdot) \otimes \cdots \otimes Q(x_k, \cdot)$ for all $x = (x_1, \dots, x_k) \in F^k$. We
denote $\mathcal{E}(1)$ the exponential law with parameter 1, $\mathcal{U}(I)$ the uniform law on a set $I$ and $\mathcal{N}(m, \Sigma)$ the Gaussian
law with mean $m$ and variance matrix $\Sigma$. We use bold letters for random variables in $\mathbb{T}^{dN}$ and decompose them
in $d$-dimensional coordinates, like $\mathbf{X} = (X_1, \dots, X_N)$ with $X_i \in \mathbb{T}^d$, or $\mathbf{X}_1 = (X_{1,1}, \dots, X_{N,1})$.
1.3. The algorithm and main result
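Starting from the diffusion (1.1) killed at time $T$ given by (1.2), we introduce two successive approximations.
The first is time discretization. For a given time step $\gamma > 0$ and a sequence $(G_k)_{k \in \mathbb{N}}$ of independent random
variables with law $\mathcal{N}(0, I_d)$, we consider the Markov chain on $\mathbb{T}^d$ given by $\tilde Z_0 = Z_0$ and
$$\forall k \in \mathbb{N},\quad \tilde Z_{k+1} = \tilde Z_k + \gamma b(\tilde Z_k) + \sqrt{\gamma}\, G_k \tag{1.3}$$
and, given $E \sim \mathcal{E}(1)$ independent from $(G_k)_{k \in \mathbb{N}}$ and $Z_0$,
$$\tilde T = \inf\left\{t = n\gamma,\ n \in \mathbb{N},\ E \leq \gamma \sum_{k=1}^{n} \lambda(\tilde Z_k)\right\}.$$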
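From classical results for Euler schemes of diffusions (see e.g. [33]), it is quite clear that, for any $A \in \mathcal{B}(\mathbb{T}^d)$ and
all $t \geq 0$,
$$\mathbb{P}\left(\tilde Z_{\lfloor t/\gamma \rfloor} \in A,\ \tilde T > t\right) \underset{\gamma \to 0}{\longrightarrow} \mathbb{P}\left(Z_t \in A,\ T > t\right),$$
from which, for all $t \geq 0$,
$$\mathrm{Law}\left(\tilde Z_{\lfloor t/\gamma \rfloor} \mid \tilde T > t\right) \underset{\gamma \to 0}{\longrightarrow} \mathrm{Law}\left(Z_t \mid T > t\right)$$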
(we will prove this, see Cor. 2.11 below). Note that, from the memoryless property of the exponential law, given
a sequence $(U_k)_{k \in \mathbb{N}}$ of independent variables uniformly distributed over $[0,1]$ and independent from $(G_k)_{k \in \mathbb{N}}$
and $Z_0$, then $((\tilde Z_n)_{n \in \mathbb{N}}, \tilde T)$ has the same joint distribution as $((\tilde Z_n)_{n \in \mathbb{N}}, \hat T)$ with
$$\hat T = \inf\left\{t = n\gamma,\ n \in \mathbb{N},\ U_n \leq p(\tilde Z_n)\right\}$$
where $p(z) = 1 - \exp(-\gamma\lambda(z))$ is the probability that, arriving at state $z$, the chain is killed.
A naive Monte Carlo sampler for the QSD would be to simulate $N$ independent copies of the chain (1.3) killed
with probability $z \mapsto p(z)$ and to consider after a large number of iterations the distribution of the copies that
have survived. However, after a long time, most copies (possibly all) would have died and the estimator would
be very bad. To tackle this issue, we have to introduce a rebirth mechanism to reincorporate dead particles in
the system.
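To illustrate why the naive sampler degenerates, here is a minimal Python sketch of it in dimension $d = 1$. The drift `drift` and the killing rate `kill_rate` are hypothetical choices made only for this illustration (any smooth periodic functions would do); they are not taken from the paper.

```python
import numpy as np

def drift(x):
    # Hypothetical smooth periodic drift b on the 1-d torus [0, 1).
    return 0.5 * np.sin(2.0 * np.pi * x)

def kill_rate(x):
    # Hypothetical smooth periodic killing rate lambda (here between 0.5 and 1.5).
    return 1.0 + 0.5 * np.cos(2.0 * np.pi * x)

def naive_survivors(n_copies, n_steps, gamma, rng):
    """Run independent killed Euler chains (1.3); return positions of the survivors."""
    x = rng.random(n_copies)                 # initial positions, uniform on the torus
    alive = np.ones(n_copies, dtype=bool)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_copies)
        x = (x + gamma * drift(x) + np.sqrt(gamma) * noise) % 1.0  # Euler step on the torus
        p = 1.0 - np.exp(-gamma * kill_rate(x))                    # killing probability p(z)
        alive &= rng.random(n_copies) > p                          # a copy dies when U <= p
    return x[alive]

rng = np.random.default_rng(0)
for n_steps in (10, 100, 1000):
    # The number of surviving copies shrinks as the simulation time grows.
    print(n_steps, naive_survivors(1000, n_steps, 0.01, rng).size)
```

Since each copy survives a step with probability at most $\exp(-\gamma \min\lambda)$, the expected sample size decays exponentially in the simulation time, which is what the rebirth mechanism is meant to avoid.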
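Denote $K : \mathbb{T}^d \to \mathcal{P}(\mathbb{T}^d)$ the Markov kernel associated with the transition (1.3), i.e.
$$Kf(x) = (2\pi)^{-d/2} \int_{\mathbb{R}^d} f\left(x + \gamma b(x) + \sqrt{\gamma}\, y\right) e^{-\frac{1}{2}|y|^2}\, dy.$$
For $\mu \in \mathcal{P}(\mathbb{T}^d)$, let $Q_\mu$ be the Markov kernel such that, for all $x \in \mathbb{T}^d$, $Q_\mu(x, \cdot)$ is the law of the random
variable $X$ defined as follows. Let $(X_k, U_k)_{k \in \mathbb{N}}$ be a sequence of independent random variables such that, for
all $k \in \mathbb{N}$, $X_k$ and $U_k$ are independent, $U_k \sim \mathcal{U}([0,1])$ and $X_0 \sim K(x, \cdot)$ while, for $k \geq 1$, $X_k \sim \mu K$. Let
$H = \inf\{k \in \mathbb{N},\ U_k \geq p(X_k)\}$, and set $X = X_H$. Since $\lambda$ is bounded, $p$ is uniformly bounded away from 1 and
thus $H$ is almost surely finite, so that $Q_\mu$ is well-defined.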
In other words, a random variable $X \sim Q_\mu(x, \cdot)$ may be constructed through the following algorithm (in
which new means: independent from all the variables previously drawn).
1. Draw $X_0 \sim \mathcal{N}(x + \gamma b(x), \gamma I_d)$ and a new $U_0 \sim \mathcal{U}([0,1])$.
2. If $U_0 \geq p(X_0)$, set $X = X_0$ in $\mathbb{T}^d$ (in that case, we say the particle has moved from $x$ to $X_0$ without dying).
3. If $U_0 < p(X_0)$ then set $i = 1$ and, while $X$ is not defined, do:
(a) Draw a new $X_i'$ distributed according to $\mu$, a new $X_i \sim \mathcal{N}(X_i' + \gamma b(X_i'), \gamma I_d)$ and a new $U_i \sim \mathcal{U}([0,1])$.
(b) If $U_i \geq p(X_i)$, set $X = X_i$ in $\mathbb{T}^d$ (in that case, we say the particle has died, resurrected at $X_i'$, moved
to $X_i$ and survived).
(c) If $U_i < p(X_i)$, set $i \leftarrow i + 1$ (in that case, we say the particle has died, resurrected at $X_i'$, moved to $X_i$
and died again) and go back to step (a).
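The steps above can be sketched in Python as follows, in dimension $d = 1$. This is only an illustration under stated assumptions: the measure $\mu$ is represented by a finite array of points `mu_points` (i.e. an empirical measure), and `drift`, `kill_rate` are hypothetical smooth periodic functions chosen for the example.

```python
import numpy as np

def drift(x):
    # Hypothetical smooth periodic drift b on the 1-d torus [0, 1).
    return 0.5 * np.sin(2.0 * np.pi * x)

def kill_rate(x):
    # Hypothetical smooth periodic killing rate lambda.
    return 1.0 + 0.5 * np.cos(2.0 * np.pi * x)

def euler_step(x, gamma, rng):
    # One step of the Euler scheme (1.3), taken modulo 1 to stay on the torus.
    return (x + gamma * drift(x) + np.sqrt(gamma) * rng.standard_normal()) % 1.0

def kill_prob(x, gamma):
    # p(z) = 1 - exp(-gamma * lambda(z)).
    return 1.0 - np.exp(-gamma * kill_rate(x))

def sample_Q(x, mu_points, gamma, rng):
    """Draw one sample from Q_mu(x, .), mu being the empirical measure of mu_points."""
    # Steps 1-2: move from x and keep the new position if the particle survives.
    x_new = euler_step(x, gamma, rng)
    while rng.random() < kill_prob(x_new, gamma):
        # Step 3: resurrect at a point drawn from mu, move once, test survival again.
        x_res = mu_points[rng.integers(len(mu_points))]
        x_new = euler_step(x_res, gamma, rng)
    return x_new

rng = np.random.default_rng(0)
mu_points = np.array([0.1, 0.5, 0.9])
print(sample_Q(0.3, mu_points, 0.01, rng))
```

The loop terminates almost surely because `kill_prob` is bounded away from 1, which mirrors the argument given for the finiteness of $H$.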
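From this, we define a chain $(Y_k)_{k \in \mathbb{N}}$ as follows. Set $Y_0 = Z_0$ and suppose that $Y_k$ has been defined for some
$k \in \mathbb{N}$. Let $\eta_k = \mathrm{Law}(Y_k)$, and draw a new $Y_{k+1} \sim Q_{\eta_k}(Y_k, \cdot)$. This somewhat intricate definition is motivated
by the following result (whose proof is postponed to Sect. 2):
Proposition 1.1. For all $n \in \mathbb{N}$,
$$\eta_n = \mathrm{Law}\left(\tilde Z_n \mid \tilde T > n\gamma\right).$$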
In particular, as $n \to \infty$, the law $\eta_n$ of $Y_n$ converges toward the QSD of $\tilde Z$. Unfortunately, it is impossible to
sample $(Y_k)_{k \in \mathbb{N}}$ in practice since this would require sampling according to $\eta_k$ for any $k \in \mathbb{N}$. This is a classical
case of a time-inhomogeneous Markov chain which is interacting with its own law or, similarly, of a measure-valued
sequence $(\eta_k)_{k \in \mathbb{N}}$ with a non-linear evolution. Such processes arise in many applications, see e.g. [15,16]
and references within. Motivated by the Law of Large Numbers, we are led to a second approximation, which
is to use mean-field interacting particles. For a fixed $N \in \mathbb{N}$ and for $x = (x_i)_{i \in \llbracket 1, N \rrbracket} \in \mathbb{T}^{dN}$, we denote
$$\pi(x) := \frac{1}{N} \sum_{i=1}^{N} \delta_{x_i} \in \mathcal{P}(\mathbb{T}^d)$$
the associated empirical distribution. Then we define the Markov operator $R$ on $\mathbb{T}^{dN}$ as
$$R(x, \cdot) = Q_{\pi(x)}(x_1, \cdot) \otimes \cdots \otimes Q_{\pi(x)}(x_N, \cdot).$$
In other words, a random variable $\mathbf{Y} \sim R(x, \cdot)$ is such that the $Y_i$'s are independent with $Y_i \sim Q_{\pi(x)}(x_i, \cdot)$. In
order to specify the parameters involved, we will sometimes write $R_{N,\gamma}$ for $R$.
Let us informally describe the transitions of such a Markov chain $(\mathbf{X}_k)_{k \in \mathbb{N}}$: the $i$th particle follows the
transition given by (1.3) independently from the other particles until it dies. If it dies at a step $k \in \mathbb{N}$, then it
is resurrected on a particle $X_{J,k-1}$ with $J$ uniformly distributed over $\llbracket 1, N \rrbracket$ (in particular and contrary
to most previous works on similar algorithms, $J = i$ is not excluded, although it does not change much since its
probability vanishes as $N \to \infty$) and immediately performs a step of (1.3); if it dies again after this unique step,
it is resurrected again and performs a new step, and so on until it is not killed after a resurrection and an Euler
scheme step. Then this is the new value $X_{i,k}$ from which the particle follows again the transitions (1.3) until its
next death, etc.
Note that there is no problem of simultaneous death since at step $k$ the particles are resurrected on positions
at step $k-1$, which are well-defined even if all particles die at once at step $k$.
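A minimal vectorised Python sketch of one transition of $R_{N,\gamma}$, in dimension $d = 1$, may help fix ideas. The functions `drift` and `kill_rate` are hypothetical examples, and the helper names (`particle_step`, `run_particles`) are introduced here for illustration only.

```python
import numpy as np

def drift(x):
    # Hypothetical smooth periodic drift b on the 1-d torus [0, 1).
    return 0.5 * np.sin(2.0 * np.pi * x)

def kill_rate(x):
    # Hypothetical smooth periodic killing rate lambda.
    return 1.0 + 0.5 * np.cos(2.0 * np.pi * x)

def euler_step(x, gamma, rng):
    # One Euler step (1.3) for an array of positions, taken modulo 1.
    return (x + gamma * drift(x) + np.sqrt(gamma) * rng.standard_normal(x.shape)) % 1.0

def kill_prob(x, gamma):
    return 1.0 - np.exp(-gamma * kill_rate(x))

def particle_step(x, gamma, rng):
    """One transition of R_{N,gamma}; x holds the N positions at step k-1."""
    n = x.size
    new = euler_step(x, gamma, rng)                   # each particle moves from its own position
    dead = rng.random(n) < kill_prob(new, gamma)
    while dead.any():
        idx = np.flatnonzero(dead)
        # Resurrect on positions of step k-1 (J uniform, J = i allowed), then move once.
        parents = x[rng.integers(n, size=idx.size)]
        new[idx] = euler_step(parents, gamma, rng)
        dead[idx] = rng.random(idx.size) < kill_prob(new[idx], gamma)
    return new

def run_particles(n_particles, n_steps, gamma, rng):
    """Iterate the chain; the returned array represents the empirical measure pi(X_k)."""
    x = rng.random(n_particles)
    for _ in range(n_steps):
        x = particle_step(x, gamma, rng)
    return x

rng = np.random.default_rng(0)
sample = run_particles(500, 200, 0.01, rng)
print(sample.size, np.histogram(sample, bins=5, range=(0.0, 1.0))[0])
```

Because resurrections only read the array `x` of the previous step, the update stays well-defined even when every particle dies in the same step, and the sample size remains equal to $N$ throughout.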
It is easily seen that $R$ admits a unique invariant measure toward which the law of the associated Markov
chain converges exponentially fast (in the total variation sense for instance), but a naive argument yields a
convergence rate that heavily depends on $N$ (and possibly $\gamma$). Similarly, classical studies can be conducted for
the limits $N \to \infty$ and $\gamma \to 0$ but again with estimates that are typically exponentially bad with respect to
the total simulation time (see the references in Sect. 1.4 or Props. 2.5 and 2.8). In the following we will focus
on a somewhat perturbative regime under which we will establish estimates for each of these limits that are
uniform with respect to the other parameters. Even for the continuous-time process (corresponding to $\gamma = 0$,
see Sect. 2.4 for the definition), such uniform results are new (see Cors. 2.10 and 2.12).
Recall that the $W_1$ Wasserstein distance between $\mu, \nu \in \mathcal{P}(\mathbb{T}^d)$ is defined by
$$W_1(\mu, \nu) = \inf\left\{\mathbb{E}(|X - Y|) : X \sim \mu,\ Y \sim \nu\right\}.$$
More generally, for $\rho$ a distance on some Polish space $F$, denote $W_\rho$ the corresponding Wasserstein distance on
$\mathcal{P}(F)$, defined by
$$W_\rho(\mu, \nu) = \inf\left\{\mathbb{E}(\rho(X, Y)) : X \sim \mu,\ Y \sim \nu\right\}. \tag{1.4}$$
If $X \sim \mu$ and $Y \sim \nu$, we call $(X, Y)$ a coupling of $\mu$ and $\nu$. If $(X, Y)$ is a coupling for which the infimum in
(1.4) is attained, we say that it is an optimal coupling. From ([39], Cor. 5.22), such an optimal coupling always
exists.
Our first main result is a long-time convergence rate uniform in N:
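Theorem 1.2. There exist $c_1, c_2, \gamma_0 > 0$ and a distance $\rho$ on $\mathbb{T}^d$ equivalent to the Euclidean distance, that
depend only on the drift $b$ and the dimension $d$, such that, if $\lambda$ is Lipschitz with a constant $L_\lambda$ and
$$\kappa := c_1 - c_2 L_\lambda e^{\gamma \|\lambda\|_\infty}, \tag{1.5}$$
then the following holds: for all $\gamma \in (0, \gamma_0]$, $N \in \mathbb{N}$ and all $\mu, \nu \in \mathcal{P}(\mathbb{T}^{dN})$, considering the distance
$\rho_N(x, y) = \sum_{i=1}^{N} \rho(x_i, y_i)$ for $x, y \in \mathbb{T}^{dN}$,
$$W_{\rho_N}(\mu R_{N,\gamma}, \nu R_{N,\gamma}) \leq (1 - \gamma\kappa)\, W_{\rho_N}(\mu, \nu).$$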
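As a consequence, there exists $C > 0$ that depends only on $b$ and $d$ such that for all $\gamma \in (0, \gamma_0]$, $m, N \in \mathbb{N}$ and
all $\mu, \nu \in \mathcal{P}(\mathbb{T}^{dN})$,
$$W_1\left(\mu R_{N,\gamma}^m, \nu R_{N,\gamma}^m\right) \leq C N (1 - \gamma\kappa)^m.$$
This means that, with respect to the metric $\rho_N$, $R_{N,\gamma}$ has a Wasserstein curvature of $\gamma\kappa$ in the sense of [28].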
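As a hedged sketch of how the second bound can be read off from the contraction (the actual proof is the one of Section 2.2), assume that $0 \leq 1 - \gamma\kappa$, that $W_1$ on $\mathbb{T}^{dN}$ is taken with respect to a distance dominated by $\sum_i |x_i - y_i|$, and use the equivalence $\beta |x - y| \leq \rho(x, y) \leq |x - y|$ stated in Section 2.1:

```latex
\begin{aligned}
W_1\left(\mu R_{N,\gamma}^m, \nu R_{N,\gamma}^m\right)
  &\leq \beta^{-1}\, W_{\rho_N}\left(\mu R_{N,\gamma}^m, \nu R_{N,\gamma}^m\right)
  && \text{since } \beta |x_i - y_i| \leq \rho(x_i, y_i), \\
  &\leq \beta^{-1} (1 - \gamma\kappa)^m\, W_{\rho_N}(\mu, \nu)
  && \text{contraction applied } m \text{ times}, \\
  &\leq \beta^{-1} (1 - \gamma\kappa)^m\, N\, \frac{\sqrt{d}}{2}
  && \text{since } \rho_N \leq N \operatorname{diam}(\mathbb{T}^d).
\end{aligned}
```

Under these assumptions one may take $C = \sqrt{d}/(2\beta)$. When $\kappa > 0$, the elementary inequality $(1 - \gamma\kappa)^m \leq e^{-\kappa m \gamma}$ shows that the bound decays like $e^{-\kappa t}$ in the simulation time $t = m\gamma$.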
Theorem 1.2 is proven in Section 2.2. From this first result, similar bounds can be obtained for large Nand
small γ(see Sects. 2.3 and 2.4). Combining all these results eventually yields a quantitative bound on the error
made in practice by approximating νby the empirical distribution of the particular particle system:
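Theorem 1.3. Under the conditions of Theorem 1.2, suppose that $\kappa$ given by (1.5) is positive. There exists
$C > 0$ such that for all $N \in \mathbb{N}$, $\gamma \in (0, \gamma_0]$, $t \geq 0$ and $\mu_0 \in \mathcal{P}(\mathbb{T}^{dN})$, if $(\mathbf{X}_k)_{k \in \mathbb{N}}$ is a Markov chain with initial
distribution $\mu_0$ and transition kernel $R_{N,\gamma}$,
$$\mathbb{E}\left(W_1\left(\pi(\mathbf{X}_{\lfloor t/\gamma \rfloor}), \nu\right)\right) \leq C\left(\sqrt{\gamma} + \alpha(N) + e^{-\kappa t}\right),$$
where
$$\alpha(N) = \begin{cases} N^{-1/2} & \text{if } d = 1, \\ N^{-1/2} \ln(1 + N) & \text{if } d = 2, \\ N^{-1/d} & \text{if } d > 2. \end{cases}$$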
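To get a feeling for how the three error terms of Theorem 1.3 compare, the following Python sketch evaluates the rate $\alpha(N)$ and the bracket $\sqrt{\gamma} + \alpha(N) + e^{-\kappa t}$. The constant $C$ is left out, and the value of `kappa` passed to `error_shape` is a placeholder since the theorem does not provide a numerical value here; the function names are introduced for illustration only.

```python
import math

def alpha(n_particles, dim):
    """Rate alpha(N) from Theorem 1.3 as a function of the dimension d."""
    if dim == 1:
        return n_particles ** -0.5
    if dim == 2:
        return n_particles ** -0.5 * math.log(1 + n_particles)
    return n_particles ** (-1.0 / dim)

def error_shape(gamma, n_particles, t, kappa, dim):
    """sqrt(gamma) + alpha(N) + exp(-kappa * t), i.e. the bound without the constant C."""
    return math.sqrt(gamma) + alpha(n_particles, dim) + math.exp(-kappa * t)

# Placeholder parameters: kappa = 1.0 is an assumption made for the example only.
for dim in (1, 2, 3):
    print(dim, error_shape(gamma=1e-4, n_particles=10_000, t=10.0, kappa=1.0, dim=dim))
```

Such a helper makes visible that, in dimension larger than 2, the particle term $N^{-1/d}$ is typically the slowest of the three to decrease.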
All the constants in Theorems 1.2 and 1.3 (and all other results stated in this work) are explicit. More
precisely, $c_1$ and $\gamma_0$ come from Corollary 2.2 of [32] (see Prop. 2.1) where an explicit value is given, and all the
other constants involved in our results can be tracked by following the explicit computations.
In Theorem 1.3, the speeds of the different convergences (exponential in the simulation time, with the square-
root of the timestep and with $\alpha$ of the number of particles) are optimal since they are optimal for non-interacting
diffusions (i.e. the case $\lambda = 0$), see in particular [26] for the large $N$ asymptotic.
Other intermediary results will be established in the rest of the paper that are interesting by themselves:
propagation of chaos (i.e. $N \to \infty$) and continuous-time limit at a fixed time (even without the condition
$\kappa > 0$) respectively in Propositions 2.5 and 2.8. From that, results for the continuous-time process ($\gamma = 0$), the
equilibria ($t = \infty$) or the non-linear process ($N = \infty$), or when two parameters among three are sent to their
limits, are then simple corollaries, see Section 2.5. All these results are summarised in Figure 1 at the end of
this work.
Note that $\exp(-\gamma\lambda(x))$ is the probability that the chain is not killed when it arrives at state $x$. The time
step $\gamma$ should be chosen in such a way that this probability is relatively large, say at least one half. In that case,
$\exp(\gamma\|\lambda\|_\infty)$ is typically close to 1. In other words, the positivity of $\kappa$ given by (1.5) is mostly a condition about
$L_\lambda$ being small enough.
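As a worked illustration of this remark, assuming $\|\lambda\|_\infty > 0$ and reading (1.5) as $\kappa = c_1 - c_2 L_\lambda e^{\gamma\|\lambda\|_\infty}$, the "at least one half" rule translates into an explicit constraint on $\gamma$ and a sufficient condition on $L_\lambda$:

```latex
e^{-\gamma \|\lambda\|_\infty} \geq \tfrac{1}{2}
\iff \gamma \leq \frac{\ln 2}{\|\lambda\|_\infty},
\qquad\text{in which case}\qquad
\kappa = c_1 - c_2 L_\lambda e^{\gamma \|\lambda\|_\infty} \geq c_1 - 2 c_2 L_\lambda .
```

So, for such a time step, $\kappa > 0$ holds as soon as $L_\lambda < c_1 / (2 c_2)$, whatever the size of $\|\lambda\|_\infty$.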
This perturbation condition is different from the one considered in [34], where $\|\lambda\|_\infty$ rather than $L_\lambda$ is
supposed to be small (while our main arguments are a direct adaptation of the coupling arguments of [34]).
This difference comes from the fact that, in the present study, we work with the $W_1$ distance rather than the total
variation one (which is a Wasserstein distance but associated to the discrete metric $d(x, y) = \mathbb{1}_{x \neq y}$). Indeed, in
our coupling arguments, we need to control $|\lambda(x) - \lambda(y)|$, the difference between the death rates of two processes
at different locations, which is bounded here by $L_\lambda |x - y|$ and in [34] by $2\|\lambda\|_\infty \mathbb{1}_{x \neq y}$. In fact our argument
for the long-time convergence may easily be adapted to the total variation distance framework, following [34].
Nevertheless this would be more troublesome in the study of the limit $N \to \infty$. Then, one needs to couple $\eta_k$
(that admits a density with respect to the Lebesgue measure) with $\pi(\mathbf{X}_k)$ (which is a sum of Dirac masses),
so that the total variation distance is not adapted. This may be solved by considering $W_1$-to-total-variation
regularization results for (Euler schemes of) diffusions, that can be established by coupling arguments again.
Nevertheless, in order to focus on the other difficulties of the problem and for the sake of clarity, we decided to
stick to the $W_1$ distance in all the different results of this work. Similar Wasserstein coupling arguments have
been used in [13] on a similar problem (see next section) and in [41] for a different kind of mean-field interacting
particle system (also with a similar perturbative condition corresponding to the fact that $\sigma$ in ([41], Prop. 3.1) has
to be positive, i.e. the interaction should be small with respect to the independent mixing).
Notice that, among all possible discretization schemes, we only considered the explicit Euler-Maruyama
one. This choice was made for simplicity, but the proofs could be extended to other usual schemes. The main
ingredient required is a Wasserstein curvature of order γfor a modified W1distance (see Prop. 2.1, based on ([32],
Cor. 2.2)). Similarly, we only considered the case of an elliptic diffusion process with a constant diffusion matrix
for simplicity (since we use ([32], Cor. 2.2) which covers this case), although a similar Wasserstein contraction
certainly holds in a much more general framework (even hypoelliptic non-elliptic, as in continuous-time settings
[22]). As stated in the introduction, the present paper does not aim at the broadest generality, and by avoiding
technicalities we want to highlight the main issue (i.e. the question of the uniformity of bounds in the various
parameters).
1.4. Related works
The use of particle systems with death and re-birth to approximate the QSD of a Markov process was
introduced in [7], for two-dimensional Brownian motions killed at the boundary of a box. This work refers
to the system as a Fleming-Viot process. However, in the lecture notes of Dawson [14], a (continuous-time)
system of $N$ particles that move independently according to some Markov dynamics and interact through a
sampling-replacement mechanism is called a Moran particle process, while the term Fleming-Viot process refers
to a measure-valued (continuous-time) process that can be obtained as the limit of the Moran particle system
as the number of particles goes to infinity. Besides, with these definitions, the empirical measure of a Moran
particle process is nothing but a Fleming-Viot process in the particular case where the initial condition is the
sum of $N$ Dirac measures. Both the initial works of Moran [35] and Fleming and Viot [25] are motivated by
population genetics models.
The seminal work [7] is a numerical study so that, although a continuous-time continuous density-valued
process is targeted, what is really implemented is in fact a discrete-time particle system. Since then, the use
of similar processes in numerical schemes (for killed processes or more general non-linear problems such as
non-linear filtering, rare events analysis and so on [15,16]) has been widely studied. Although the term Moran
particle system is used in [18] and a few other works, most studies concerned with quasi-stationary distributions
refer to Fleming-Viot particle systems, see e.g. [2,9,11,13,21,23,27,30,31] (for continuous-time processes,
which thus corresponds to the process introduced in Section 2.4, i.e. the limit $\gamma \to 0$).
Different frameworks have been considered: finite space, discrete infinite space, compact and non-compact
continuous space; continuous and discrete time; hard or soft killing, or more general non-linear evolutions. There
are also some variants on the precise definition of the rebirth mechanism (as mentioned in the introduction, our
specific scheme, where killed particles at step $k$ are resurrected on a position of a particle at step $k-1$ and then
perform an Euler step conditioned not to be killed again, seems to be new). Disregarding these differences, let
us discuss the kind of results established in the existing literature.
A first set of works, starting shortly after the initial numerical study of [7], is concerned with finite-time
propagation of chaos, either for the marginal laws or at the level of a trajectory in the Skorohod topology
[2,8,18,23,27,40], possibly with a precise CLT [9,21]. Similarly, propagation of chaos and/or CLT as $N \to +\infty$
at stationarity (i.e. for the invariant measure of the Fleming-Viot particle system) are established e.g. in [2,30].
Uniform in time propagation of chaos is established in [17,19,38]. Contrary to our results, this uniformity in
time is not obtained with a long-time convergence of the particle system at a speed independent from $N$, but
rather from the long-time convergence of the limit ($N = +\infty$) non-linear process. This long-time convergence for
the non-linear process (or, equivalently according to Proposition 1.1, for the process conditioned to survival)
has recently been studied in general settings, in particular in a series of works by Champagnat, Villemonais
and coauthors, see e.g. [3,10,12,20] and references within. The idea to combine a finite-time convergence
as $N \to +\infty$ with a long-time convergence of the limit process to obtain a uniform in time convergence with
$N$ traces back at least to [36]. Remark that, combining uniform in time propagation of chaos estimates with
long-time convergence of the limit process, it is possible to obtain results in the spirit of Theorem 1.3, i.e. results that
give an error bound between the empirical measure of the chain simulated in practice and the target QSD,
even if no long-time convergence of the particle system is available.
Contrary to the present paper, most of these previous works do not consider a perturbative framework where
a condition similar to the positivity of $\kappa$ given by (1.5) would be considered. Such a condition is considered in
[13], which together with the present paper is the only one that establishes a long-time convergence rate uniform
in $N$ for the particle system. Remark that geometric ergodicity for the particle system, stated for instance in
[37], is usually easy to obtain (for a fixed $N$) from classical tools on Markov chains. Obtaining a rate that
is uniform in $N$ is much harder (hence the perturbative framework). Besides, for another class of mean-field
particle systems, the McKean-Vlasov diffusions (for which interaction is induced by an interaction potential
force in the drift of the diffusion), it is well-known that there are cases (in non-convex confining potential with
convex interaction at low temperature, for instance) where the non-linear limit system has several
equilibria and the convergence rate of the particle system (which has a unique invariant measure for all $N$) goes
to $+\infty$ with $N$. In fact we do not expect this to happen for the Fleming-Viot particle system in our context
(compact space, elliptic diffusion, smooth killing) where the QSD is unique and the long-time convergence of the
limit process and the uniform in time propagation of chaos have been established in non-perturbative cases (for
a discussion on other cases where the QSD may not be unique, we refer to [1,11]). This may indicate that the
uniform in $N$ long-time convergence could hold in much more general cases, far from the perturbative regime
around the non-interacting case. In that case, our perturbative condition (and the one of [13]) would just come
from the particular non-optimal proof. However, the long-time convergence of the limit process does not imply
directly the uniform convergence for the interacting system, so this is still an open question, and our results and
those of [13] are the only ones of their kind. At least, we can say that we have no reason to think that our condition
$\kappa > 0$, with the explicit expression of $\kappa$, is sharp in any way.
The differences of our work with [13] are the following. The latter is concerned with a continuous-time
Markov chain on a finite state space rather than a diffusion on the torus. Moreover, it requires a strong Doeblin
condition: the parameter $\lambda$ in [13] is required to be positive, which implies that, for any pair of states $i, i'$, there
is a probability to go either from $i$ to $i'$, or from $i'$ to $i$, or from both $i$ and $i'$ to some third state $j$. Related to
this, in [23], although uniform in $N$ long-time convergence is not stated or proven, a coupling argument similar
to ours or to [13] is used (in ([23], Prop. 3.1)) to obtain uniform in time propagation of chaos estimates. This
work is concerned with countable state space under an even more restrictive Doeblin condition than [13] (there
exists at least one state $i$ for which the transition rate from $j$ to $i$ is uniformly bounded for all other $j$), and the
uniform in time result requires a perturbative condition ($\alpha > C$ in [23] corresponding to $\lambda > 0$ in [13] and $\kappa > 0$
for us). We remark that, in a countable discrete state space, our arguments can be easily adapted to obtain a
uniform long-time convergence under a condition of positive Wasserstein curvature for some distance (similarly
to Prop. 2.1), which is much less restrictive than the Doeblin conditions of [13,23]. Of course in that case we
still need $\lambda$ to be Lipschitz (with respect to the distance for which the Wasserstein contraction holds) with a
Lipschitz constant sufficiently small (with respect to the curvature).
As far as the time discretization error is concerned, we are not aware of results similar to ours in previous
works but we refer the reader to [24] for weak error studies à la Talay-Tubaro for some non-linear evolutions,
and references within for more details.
Finally, let us mention another related set of works, [4-6], based on self-interacting processes. Indeed, the
reason we introduced a system of $N$ interacting particles was to approximate some non-linearity in the evolution
of a measure-valued process. Yet, actually, when it comes to the approximation of the QSD, we are not really
interested in the non-linear evolution, but only in its long-time limit. A classical idea in the field of stochastic
algorithms in that case is to construct a chain similar to $(Y_k)_{k \in \mathbb{N}}$ except that, in the rebirth mechanism, the
unknown law $\eta_n$ is replaced by the occupation measure of the past trajectory (which, by ergodicity, is expected
to converge to equilibrium), i.e. there is only one particle and, when it dies at time $t > 0$, it is resurrected at
its position at time $s$ uniformly distributed over $[0, t]$. Although the algorithms are quite similar (and can be
combined), their theoretical studies rely on quite distinct arguments.
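For comparison, the self-interacting idea can be sketched in a few lines of Python. This is a simplified discrete-time illustration in dimension $d = 1$, written for this discussion and not the algorithm of the cited works: a single Euler chain is run and, whenever it is killed, it restarts from a position drawn uniformly among its own past positions. The functions `drift` and `kill_rate` are hypothetical examples.

```python
import numpy as np

def drift(x):
    # Hypothetical smooth periodic drift b on the 1-d torus [0, 1).
    return 0.5 * np.sin(2.0 * np.pi * x)

def kill_rate(x):
    # Hypothetical smooth periodic killing rate lambda.
    return 1.0 + 0.5 * np.cos(2.0 * np.pi * x)

def self_interacting_chain(n_steps, gamma, rng):
    """Single particle resurrected on its own past trajectory; returns the whole path."""
    path = np.empty(n_steps + 1)
    path[0] = rng.random()
    for k in range(n_steps):
        x = (path[k] + gamma * drift(path[k])
             + np.sqrt(gamma) * rng.standard_normal()) % 1.0
        if rng.random() < 1.0 - np.exp(-gamma * kill_rate(x)):
            # Killed: restart from a past position chosen uniformly among steps 0..k.
            x = path[rng.integers(k + 1)]
        path[k + 1] = x
    return path

rng = np.random.default_rng(0)
path = self_interacting_chain(5000, 0.01, rng)
# The occupation measure of the path plays the role of the empirical measure of the particles.
print(np.histogram(path, bins=5, range=(0.0, 1.0))[0])
```

Here the memory of the trajectory replaces the population of $N$ particles, which is why the two approaches call for different theoretical tools.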
2. Proofs
Let us first establish the preliminary result stated in the introduction:
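Proof of Proposition 1.1. For $n \in \mathbb{N}$, denote
$$\eta_n = \mathrm{Law}(Y_n), \qquad \nu_n = \mathrm{Law}\left(\tilde Z_n \mid \tilde T > n\gamma\right).$$
Since $\nu_0 = \eta_0$, suppose by induction that $\nu_n = \eta_n$ for some $n \in \mathbb{N}$. Keeping the notations introduced in the
definition of the kernel $Q_\mu$, consider the events $B_k = \{U_k \geq p(X_k)\}$. Then, for all bounded measurable $f$,
$$\begin{aligned}
Q_\mu f(x) &= \mathbb{E}(f(X)) \\
&= \mathbb{E}\Big(f(X) \sum_{k \in \mathbb{N}} \mathbb{1}_{B_k \cap \left(\bigcap_{j=0}^{k-1} B_j^c\right)}\Big) \\
&= \mathbb{E}\left(f(X_0) \mathbb{1}_{U_0 \geq p(X_0)}\right) + \sum_{k \geq 1} \mathbb{E}\left(f(X_k) \mathbb{1}_{U_k \geq p(X_k)}\right) \prod_{j=0}^{k-1} \mathbb{P}\left(B_j^c\right) \\
&= K[f(1-p)](x) + \sum_{k \geq 1} \mu K[f(1-p)]\, (\mu K p)^{k-1}\, Kp(x).
\end{aligned}$$
In particular, integrating with respect to $\mu$, we obtain
$$\mu Q_\mu f = \mu K[f(1-p)] \sum_{k \in \mathbb{N}} (\mu K p)^k = \frac{\mu K[f(1-p)]}{\mu K[1-p]},$$
where the last equality is the geometric series $\sum_{k} q^k = 1/(1-q)$ with $q = \mu K p < 1$.
Applied with $\mu = \eta_n$, this reads
$$\eta_{n+1} f = \mathbb{E}(f(Y_{n+1})) = \mathbb{E}\left(\mathbb{E}(f(Y_{n+1}) \mid Y_n)\right) = \eta_n Q_{\eta_n} f = \frac{\eta_n K[f(1-p)]}{\eta_n K[1-p]}.$$
On the other hand,
$$\begin{aligned}
\mathbb{E}\left(f(\tilde Z_{n+1}) \mathbb{1}_{\tilde T > (n+1)\gamma}\right)
&= \mathbb{E}\left(f(\tilde Z_{n+1}) \mathbb{1}_{\tilde T > n\gamma} \mathbb{1}_{U_{n+1} > p(\tilde Z_{n+1})}\right) \\
&= \mathbb{E}\left(f(\tilde Z_{n+1}) \left(1 - p(\tilde Z_{n+1})\right) \mathbb{1}_{\tilde T > n\gamma}\right) \\
&= \mathbb{P}\left(\tilde T > n\gamma\right) \nu_n K[f(1-p)],
\end{aligned}$$
from which
$$\nu_{n+1} f = \frac{\mathbb{E}\left(f(\tilde Z_{n+1}) \mathbb{1}_{\tilde T > (n+1)\gamma}\right)}{\mathbb{P}\left(\tilde T > (n+1)\gamma\right)}
= \frac{\mathbb{P}\left(\tilde T > n\gamma\right) \nu_n K[f(1-p)]}{\mathbb{P}\left(\tilde T > n\gamma\right) \nu_n K[1-p]}
= \frac{\nu_n K[f(1-p)]}{\nu_n K[1-p]},$$
which, together with the induction hypothesis $\nu_n = \eta_n$, concludes.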
2.1. The basic coupling
The long-time estimates needed to prove convergence toward equilibrium and uniform-in-time estimates in $N$ and $\gamma$ are based on the fact that, as long as particles do not die, they follow the chain (1.3) which, like its continuous-time counterpart (1.1), has some mixing properties. In order to quantify the latter, we start by stating ([32], Cor. 2.2) in a form suited to our context.
Proposition 2.1. There exist $c_1,a,\gamma_0>0$ (all depending only on the drift $b$ of (1.1) and on the dimension $d$) such that, denoting $\rho(x,y)=(1-\exp(-a|x-y|))/a$ for $x,y\in\mathbb T^d$, $\rho$ is a metric on $\mathbb T^d$ with
$$\forall\gamma\in(0,\gamma_0],\ \forall\mu,\nu\in\mathcal P(\mathbb T^d),\qquad W_\rho(\mu K,\nu K)\leq(1-c_1\gamma)\,W_\rho(\mu,\nu).$$
Proof. This is ([32], Cor. 2.2) applied to a diffusion with smooth drift on the torus, in which case the distance on $\mathbb T^d$ for which the contraction holds is $\rho$.
In the rest of the paper, $\rho$ is the metric and $c_1,a,\gamma_0$ are the constants given by Proposition 2.1. Remark that $\rho$ is equivalent to the Euclidean metric, with
$$\beta|x-y|\leq\rho(x,y)\leq|x-y|\qquad\text{for }\beta=\frac{2\left(1-e^{-a\sqrt d/2}\right)}{a\sqrt d},$$
where we used that the diameter of $\mathbb T^d$ is $\sqrt d/2$ and that $r\mapsto(1-\exp(-ar))/a$ is a concave function with derivative 1 at zero. In particular, $W_1$ and $W_\rho$ are equivalent.
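The two-sided comparison between $\rho$ and the Euclidean distance is easy to test numerically. The following sketch assumes that the torus is $\mathbb R^d/\mathbb Z^d$ (unit period in each coordinate, which is consistent with a diameter of $\sqrt d/2$); the function names and the value of $a$ used in any check are hypothetical, since the actual constant $a$ comes from [32].

```python
import math
import random

def torus_distance(x, y):
    """Euclidean distance on the flat torus [0,1)^d (period 1 per coordinate)."""
    total = 0.0
    for xi, yi in zip(x, y):
        delta = abs(xi - yi) % 1.0
        delta = min(delta, 1.0 - delta)  # shortest way around the circle
        total += delta * delta
    return math.sqrt(total)

def rho(x, y, a):
    """Concave modification rho(x, y) = (1 - exp(-a |x - y|)) / a."""
    return (1.0 - math.exp(-a * torus_distance(x, y))) / a

def beta(a, d):
    """Equivalence constant 2 (1 - exp(-a sqrt(d)/2)) / (a sqrt(d))."""
    diameter = math.sqrt(d) / 2.0
    return (1.0 - math.exp(-a * diameter)) / (a * diameter)
```

For any pair of points one should observe `beta(a, d) * torus_distance(x, y) <= rho(x, y, a) <= torus_distance(x, y)`, with equality on the left only at the diameter.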
Now, in this particle system, the contraction property of the chain (1.3) may be counterbalanced by the death/resurrection mechanism through which particles interact. Indeed, considering two systems of $N$ interacting particles, for $i\in[\![1,N]\!]$ the previous result means that we can couple the $i$-th particles of both systems so that they get closer to one another (on average), as long as they do not die. But then, one of the two particles can die and be resurrected far from the other, or, even if they die simultaneously, they may be resurrected far apart from one another. That being said, first, the closer they get, the easier it is to couple them so that they die simultaneously, and second, when they die simultaneously, keeping the particles close to one another amounts to choosing a suitable coupling of the laws from which the particles are resurrected. This is quantified in the following proposition.
In all the rest of the paper, we suppose that $\lambda$ is $L_\lambda$-Lipschitz (but not necessarily that $\kappa$ given by (1.5) is positive).
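Proposition 2.2. Let $\mu_0,\mu_1,\mu'_0,\mu'_1\in\mathcal P(\mathbb T^d)$ and let $(X_0,X'_0)$ (resp. $(X_1,X'_1)$) be a coupling of $\mu_0K$ and $\mu'_0K$ (resp. $\mu_1K$ and $\mu'_1K$). Then
$$W_\rho\left(\mu_0Q_{\mu_1},\mu'_0Q_{\mu'_1}\right)\leq h\left(\mathbb E(\rho(X_0,X'_0))+\frac{q_0}{1-q_1}\,\mathbb E(\rho(X_1,X'_1))\right)$$
where
$$h=1-\min p+(a\beta)^{-1}\gamma L_\lambda$$
and, considering $U\sim\mathcal U([0,1])$ independent from $(X_0,X'_0,X_1,X'_1)$,
$$q_i=\mathbb P\left(U<p(X_i)\wedge p(X'_i)\right),\qquad i=0,1.$$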
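Proof. Let $(X_k,X'_k,U_k)_{k\in\mathbb N}$ be a sequence of independent triplets of random variables such that, for all $k\in\mathbb N$, $U_k\sim\mathcal U([0,1])$ is independent from $(X_k,X'_k)$, where the pairs $(X_k,X'_k)$ are as defined in the proposition for $k=0$ and $1$ and, for $k>1$, have the same distribution as $(X_1,X'_1)$. Set $H=\inf\{n\in\mathbb N,\ U_n\geq p(X_n)\}$ and $H'=\inf\{n\in\mathbb N,\ U_n\geq p(X'_n)\}$, namely the indices of the first proposals that survive. Then, by considering the law of $(X_k,U_k)_{k\in\mathbb N}$ alone, it is clear that $X_H\sim\mu_0Q_{\mu_1}$ and, similarly, $X'_{H'}\sim\mu'_0Q_{\mu'_1}$, so that
$$W_\rho\left(\mu_0Q_{\mu_1},\mu'_0Q_{\mu'_1}\right)\leq\mathbb E\left(\rho(X_H,X'_{H'})\right).$$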
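Different cases are distinguished depending on the values of $H$ and $H'$. In the simplest case, none of the particles dies:
$$\begin{aligned}
\mathbb E\left(\rho(X_H,X'_{H'})\mathbb 1_{H=H'=0}\right) &= \mathbb E\left(\rho(X_0,X'_0)\mathbb 1_{U_0\geq p(X_0)\vee p(X'_0)}\right)\\
&\leq \mathbb E\left(\rho(X_0,X'_0)\mathbb 1_{U_0\geq\min p}\right)\\
&\leq (1-\min p)\,\mathbb E(\rho(X_0,X'_0)),
\end{aligned}$$
where we used the independence between $U_0$ and $(X_0,X'_0)$. In the second case, only one particle dies: using that $\|\rho\|_\infty\leq 1/a$,
$$\begin{aligned}
\mathbb E\left(\rho(X_H,X'_{H'})\mathbb 1_{H\wedge H'=0<H\vee H'}\right) &\leq a^{-1}\,\mathbb P\left(U_0\in[p(X_0)\wedge p(X'_0),\,p(X_0)\vee p(X'_0)]\right)\\
&= a^{-1}\,\mathbb E\left(|p(X_0)-p(X'_0)|\right)\\
&\leq a^{-1}\gamma L_\lambda\,\mathbb E\left(|X_0-X'_0|\right)\\
&\leq (a\beta)^{-1}\gamma L_\lambda\,\mathbb E(\rho(X_0,X'_0)).
\end{aligned}$$
In the third case, both particles die $k\geq 1$ times:
$$\begin{aligned}
\mathbb E\left(\rho(X_H,X'_{H'})\mathbb 1_{H=H'=k}\right) &= \mathbb E\Big(\rho(X_k,X'_k)\mathbb 1_{U_k\geq p(X_k)\vee p(X'_k)}\prod_{j=0}^{k-1}\mathbb 1_{U_j<p(X_j)\wedge p(X'_j)}\Big)\\
&\leq q_0q_1^{k-1}\,\mathbb E\left(\rho(X_k,X'_k)\mathbb 1_{U_k\geq\min p}\right)\\
&\leq q_0q_1^{k-1}(1-\min p)\,\mathbb E(\rho(X_1,X'_1)).
\end{aligned}$$
Finally, combining the computations of the last two cases, the fourth one reads, for $k\geq 1$,
$$\begin{aligned}
\mathbb E\left(\rho(X_H,X'_{H'})\mathbb 1_{H\wedge H'=k<H\vee H'}\right) &\leq a^{-1}q_0q_1^{k-1}\,\mathbb P\left(U_k\in[p(X_k)\wedge p(X'_k),\,p(X_k)\vee p(X'_k)]\right)\\
&\leq (a\beta)^{-1}q_0q_1^{k-1}\gamma L_\lambda\,\mathbb E(\rho(X_1,X'_1)).
\end{aligned}$$
Summing these four cases over $k\geq 1$, with $\sum_{k\geq1}q_0q_1^{k-1}=q_0/(1-q_1)$, concludes.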
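The key device in this proof is that both particles use the same uniform variable $U_k$ to decide whether they die. A minimal sketch of the resulting case split at one step, for fixed killing probabilities $p(X_k)$ and $p(X'_k)$, is given below; the function names are hypothetical and only serve to make the probabilities $1-p\vee p'$, $|p-p'|$ and $p\wedge p'$ explicit.

```python
def coupled_case_probabilities(p, p_prime):
    """Probabilities of the three outcomes when one uniform U decides both deaths."""
    low, high = min(p, p_prime), max(p, p_prime)
    return {
        "both_survive": 1.0 - high,  # U >= max(p, p')
        "one_dies": high - low,      # min(p, p') <= U < max(p, p')
        "both_die": low,             # U < min(p, p')
    }

def classify(u, p, p_prime):
    """Outcome for a given value u of the shared uniform variable."""
    dies, dies_prime = u < p, u < p_prime
    if dies and dies_prime:
        return "both_die"
    if dies or dies_prime:
        return "one_dies"
    return "both_survive"
```

With this coupling, the probability that exactly one particle dies is $|p(X_k)-p(X'_k)|$, which is what makes the Lipschitz bound on $\lambda$ usable in the second and fourth cases.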
2.2. Long-time convergence
For $N\in\mathbb N$, denote by $\rho_N$ the metric on $\mathbb T^{dN}$ given by
$$\rho_N(x,y)=\sum_{i=1}^N\rho(x_i,y_i).$$
The following result is similar to the results of [13,34,41] and based on the same coupling argument.
Proposition 2.3. There exists $c_2>0$ (that depends only on the drift $b$ of (1.1) and on the dimension $d$) such that for all $\gamma\in(0,\gamma_0]$, $N\in\mathbb N$, and all $\mu,\nu\in\mathcal P(\mathbb T^{dN})$,
$$W_{\rho_N}(\mu R_{N,\gamma},\nu R_{N,\gamma})\leq(1-\gamma\kappa)\,W_{\rho_N}(\mu,\nu),$$
with $\kappa$ given by (1.5).
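Proof. It is in fact sufficient to prove this for $\mu=\delta_x$ and $\nu=\delta_y$ for any $x,y\in\mathbb T^{dN}$. Indeed, assuming the result proven for Dirac masses, in the general case, considering $(\mathbf X_0,\mathbf Y_0)$ an optimal coupling of $\mu$ and $\nu$ and $(\mathbf X_1,\mathbf Y_1)$ an optimal coupling of $R(\mathbf X_0,\cdot)$ and $R(\mathbf Y_0,\cdot)$, then $\mathbf X_1\sim\mu R$ and $\mathbf Y_1\sim\nu R$, so that
$$\begin{aligned}
W_{\rho_N}(\mu R,\nu R) &\leq \mathbb E\left(\rho_N(\mathbf X_1,\mathbf Y_1)\right)\\
&= \mathbb E\left(\mathbb E\left(\rho_N(\mathbf X_1,\mathbf Y_1)\mid(\mathbf X_0,\mathbf Y_0)\right)\right)\\
&= \mathbb E\left(W_{\rho_N}\left(\delta_{\mathbf X_0}R,\delta_{\mathbf Y_0}R\right)\right)\\
&\leq (1-\gamma\kappa)\,\mathbb E\left(\rho_N(\mathbf X_0,\mathbf Y_0)\right)\\
&= (1-\gamma\kappa)\,W_{\rho_N}(\mu,\nu).
\end{aligned}$$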
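Hence, in the following, we fix $x,y\in\mathbb T^{dN}$. Let $(X_i,Y_i)_{i\in[\![1,N]\!]}$ be independent pairs of random variables in $\mathbb T^d$ where, for all $i\in[\![1,N]\!]$, $(X_i,Y_i)$ is an optimal coupling of $Q_{\pi(x)}(x_i,\cdot)$ and $Q_{\pi(y)}(y_i,\cdot)$. Then $(\mathbf X,\mathbf Y)$ is a coupling of $R(x,\cdot)$ and $R(y,\cdot)$, so that
$$W_{\rho_N}(\delta_xR,\delta_yR)\leq\mathbb E\left(\rho_N(\mathbf X,\mathbf Y)\right)=\sum_{i=1}^N\mathbb E(\rho(X_i,Y_i))=\sum_{i=1}^NW_\rho\left(Q_{\pi(x)}(x_i,\cdot),Q_{\pi(y)}(y_i,\cdot)\right).$$
We want to apply Proposition 2.2 with $\mu_1=\pi(x)$, $\mu_0=\delta_{x_i}$, $\mu'_1=\pi(y)$ and $\mu'_0=\delta_{y_i}$. To do so, for all $i\in[\![1,N]\!]$, we consider $(\tilde X_i,\tilde Y_i)$ an optimal coupling of $K(x_i,\cdot)$ and $K(y_i,\cdot)$. From Proposition 2.1,
$$\mathbb E\left(\rho(\tilde X_i,\tilde Y_i)\right)\leq(1-c_1\gamma)\,\rho(x_i,y_i).\qquad(2.1)$$
Moreover, if $J\sim\mathcal U([\![1,N]\!])$ is independent from the $(\tilde X_i,\tilde Y_i)$'s, we remark that $(\tilde X_J,\tilde Y_J)$ is a coupling of $\pi(x)K$ and $\pi(y)K$. Proposition 2.2 applied with these couplings reads, for all $i\in[\![1,N]\!]$,
$$W_\rho\left(Q_{\pi(x)}(x_i,\cdot),Q_{\pi(y)}(y_i,\cdot)\right)\leq h\left(\mathbb E\left(\rho(\tilde X_i,\tilde Y_i)\right)+\frac{q_i}{1-q}\,\mathbb E\left(\rho(\tilde X_J,\tilde Y_J)\right)\right)\qquad(2.2)$$
where, if $U\sim\mathcal U([0,1])$ is independent from the previous variables,
$$q_i:=\mathbb P\left(U<p(\tilde X_i)\wedge p(\tilde Y_i)\right)$$
and, conditioning on the value of $J$,
$$q:=\mathbb P\left(U<p(\tilde X_J)\wedge p(\tilde Y_J)\right)=\frac1N\sum_{i=1}^Nq_i.$$
Summing (2.2) over $i\in[\![1,N]\!]$ and applying (2.1) yields
$$W_{\rho_N}(\delta_xR,\delta_yR)\leq h\left((1-c_1\gamma)\sum_{i=1}^N\rho(x_i,y_i)+\frac{Nq}{1-q}\,\mathbb E\left(\rho(\tilde X_J,\tilde Y_J)\right)\right).$$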
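Applying Proposition 2.1 again,
$$\mathbb E\left(\rho(\tilde X_J,\tilde Y_J)\right)=\frac1N\sum_{i=1}^N\mathbb E\left(\rho(\tilde X_i,\tilde Y_i)\right)\leq\frac1N(1-c_1\gamma)\sum_{i=1}^N\rho(x_i,y_i),$$
and the previous inequality becomes
$$W_{\rho_N}(\delta_xR,\delta_yR)\leq\frac{h(1-c_1\gamma)}{1-q}\,\rho_N(x,y).$$
Bounding $1-q\geq1-\max p\geq\exp(-\gamma\|\lambda\|_\infty)$ and $\max p-\min p\leq\frac{\sqrt d}{2}\gamma L_\lambda$ yields
$$\begin{aligned}
\frac{h(1-c_1\gamma)}{1-q} &\leq (1-c_1\gamma)\,\frac{1-\max p+\max p-\min p+(a\beta)^{-1}\gamma L_\lambda}{1-\max p}\\
&\leq 1-c_1\gamma+\gamma L_\lambda e^{\gamma\|\lambda\|_\infty}\left((a\beta)^{-1}+\sqrt d\right),
\end{aligned}$$
which concludes.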
As a direct consequence, Proposition 2.3 gives
$$\forall m\in\mathbb N,\qquad W_{\rho_N}(\mu R^m,\nu R^m)\leq e^{-\kappa m\gamma}\,W_{\rho_N}(\mu,\nu),$$
with $\kappa$ that depends neither on $N$ nor on $\gamma$. Provided $\kappa>0$, and since $\mathcal P(\mathbb T^{dN})$ is complete for $W_1$ (hence for $W_{\rho_N}$), the Banach fixed-point theorem then implies that $R$ admits a unique invariant measure, toward which it converges at rate $\gamma\kappa$.
In the rest of the paper, κis given by (1.5) (but is not necessarily assumed positive).
Proof of Theorem 1.2.The first part of the theorem has already been proven in Proposition 2.3. The last
statement then follows from the first part, the equivalence between the Euclidean distance and ρNand the fact
WρN(ν, µ)6kρNk=Nd/2 for all ν, µ P(TdN ).
2.3. Propagation of chaos
Recall that $\eta_k$ is the law at time $k$ of the non-homogeneous Markov chain $(Y_k)_{k\in\mathbb N}$ on $\mathbb T^d$ introduced in Section 1.3 with transition kernels $Q_{\eta_k}$ and initial condition $\eta_0$, and that $R=R_{N,\gamma}$ is the transition kernel of the Markov chain $(\mathbf X_k)_{k\in\mathbb N}$ on $\mathbb T^{dN}$.
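For concreteness, one transition of the particle chain can be sketched as follows on the one-dimensional torus $[0,1)$. This is an illustration under stated assumptions rather than a definitive implementation: we take $K$ to be one Euler step of (1.1), $p=1-e^{-\gamma\lambda}$, and $R(x,\cdot)=\bigotimes_iQ_{\pi(x)}(x_i,\cdot)$ as used in the proof of Proposition 2.3, so that a killed proposal is redrawn from $\pi(x)K$ until it survives. The function names are hypothetical.

```python
import math
import random

def euler_step(x, gamma, drift, rng):
    """One Euler step of dZ = b(Z) dt + dB on the torus [0, 1)."""
    return (x + gamma * drift(x) + math.sqrt(gamma) * rng.gauss(0.0, 1.0)) % 1.0

def particle_step(xs, gamma, drift, rate, rng):
    """One transition of the N-particle chain from the configuration xs."""
    new_positions = []
    for x in xs:
        proposal = euler_step(x, gamma, drift, rng)
        # The proposal is killed with probability p = 1 - exp(-gamma * lambda).
        while rng.random() < 1.0 - math.exp(-gamma * rate(proposal)):
            # Redraw from pi(xs) K: pick a particle of the OLD configuration
            # uniformly at random and move it by one Euler step.
            proposal = euler_step(rng.choice(xs), gamma, drift, rng)
        new_positions.append(proposal)
    return new_positions
```

Since the killing rate is bounded, the inner loop terminates almost surely. Iterating `particle_step` and looking at the empirical measure of the output is the numerical counterpart of $\pi(\mathbf X_m)$.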
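Lemma 2.4. There exists $C_1>0$ such that for all $N\in\mathbb N$, $\gamma\in(0,\gamma_0]$, $\eta\in\mathcal P(\mathbb T^d)$ and $\mu\in\mathcal P(\mathbb T^{dN})$,
$$W_{\rho_N}\left(\mu R,\mu Q_\eta^{\otimes N}\right)\leq\gamma NC_1\int_{\mathbb T^{dN}}W_\rho(\pi(x),\eta)\,\mu(\mathrm dx).$$
Proof. Similarly to the proof of Proposition 2.3, we start with the case $\mu=\delta_x$ for some $x\in\mathbb T^{dN}$. Let $(X_i,Y_i)_{i\in[\![1,N]\!]}$ be $N$ independent pairs of random variables such that for all $i\in[\![1,N]\!]$, $(X_i,Y_i)$ is an optimal coupling of $Q_{\pi(x)}(x_i,\cdot)$ and $Q_\eta(x_i,\cdot)$. Then $(\mathbf X,\mathbf Y)$ is a coupling of $R_N(x,\cdot)$ and $Q_\eta^{\otimes N}(x,\cdot)$, so that
$$W_{\rho_N}\left(\delta_xR_N,\delta_xQ_\eta^{\otimes N}\right)\leq\mathbb E\left(\rho_N(\mathbf X,\mathbf Y)\right)=\sum_{i=1}^N\mathbb E(\rho(X_i,Y_i))=\sum_{i=1}^NW_\rho\left(\delta_{x_i}Q_{\pi(x)},\delta_{x_i}Q_\eta\right).$$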
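From Proposition 2.2 (bounding $q_0\leq\max p\leq\gamma\|\lambda\|_\infty$ and $1-q_1\geq1-\max p\geq\exp(-\gamma_0\|\lambda\|_\infty)$),
$$W_\rho\left(\delta_{x_i}Q_{\pi(x)},\delta_{x_i}Q_\eta\right)\leq\gamma\|\lambda\|_\infty\left(1+(a\beta)^{-1}\gamma_0L_\lambda\right)e^{\gamma_0\|\lambda\|_\infty}\,W_\rho(\pi(x),\eta):=\gamma C_1W_\rho(\pi(x),\eta).$$
Now in the general case where $\mu$ is not a Dirac mass, considering $Z_0\sim\mu$, and $(Z_1,Z_2)$ an optimal coupling of $R(Z_0,\cdot)$ and $Q_\eta^{\otimes N}(Z_0,\cdot)$, and conditioning with respect to $Z_0$,
$$W_{\rho_N}\left(\mu R_N,\mu Q_\eta^{\otimes N}\right)\leq\mathbb E\left(\rho_N(Z_1,Z_2)\right)\leq\gamma NC_1\,\mathbb E\left(W_\rho(\pi(Z_0),\eta)\right).$$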
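Proposition 2.5. There exist $C_2,C_3>0$ such that for all $N\in\mathbb N$, $\gamma\in(0,\gamma_0]$, $m\in\mathbb N$ and $\eta_0\in\mathcal P(\mathbb T^d)$, first,
$$W_{\rho_N}\left(\eta_0^{\otimes N}R^m,\eta_m^{\otimes N}\right)\leq C_2N\alpha(N)\,\gamma\sum_{s=1}^m(1-\gamma\kappa)^{s-1},\qquad(2.3)$$
and second, if $(\mathbf X_k)_{k\in\mathbb N}$ is a Markov chain with initial distribution $\eta_0^{\otimes N}$ and transition kernel $R$, then
$$\mathbb E\left(W_\rho(\pi(\mathbf X_m),\eta_m)\right)\leq C_3\alpha(N)\Big(1+\gamma\sum_{s=1}^m(1-\gamma\kappa)^{s-1}\Big).\qquad(2.4)$$
Remark that when $\kappa>0$, $\gamma\sum_{s=1}^m(1-\gamma\kappa)^{s-1}=\kappa^{-1}\left(1-(1-\gamma\kappa)^m\right)\leq\kappa^{-1}$, so that (2.3) and (2.4) yield uniform-in-time estimates. On the contrary, when $\kappa<0$, the estimates are exponentially bad in $t=m\gamma$.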
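The dichotomy between $\kappa>0$ and $\kappa<0$ can be seen directly on the time factor $\gamma\sum_{s=1}^m(1-\gamma\kappa)^{s-1}$. The short sketch below evaluates it; the function name and the numerical values of $\gamma$ and $\kappa$ used in any check are hypothetical.

```python
def time_factor(gamma, kappa, m):
    """Evaluate gamma * sum_{s=1}^m (1 - gamma * kappa)^(s - 1)."""
    return gamma * sum((1.0 - gamma * kappa) ** (s - 1) for s in range(1, m + 1))
```

For $\kappa>0$ (and $\gamma\kappa\leq1$) the factor stays below $1/\kappa$ whatever the number of steps $m$, whereas for $\kappa<0$ it grows geometrically in $m$, hence exponentially in $t=m\gamma$.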
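Proof. We start with the proof of (2.3), for $m\geq1$ (the case $m=0$ being trivial). From the triangle inequality, Proposition 2.3 and Lemma 2.4,
$$\begin{aligned}
r_m:=W_{\rho_N}\left(\eta_0^{\otimes N}R^m,\eta_m^{\otimes N}\right) &\leq W_{\rho_N}\left(\eta_0^{\otimes N}R^m,\eta_{m-1}^{\otimes N}R\right)+W_{\rho_N}\left(\eta_{m-1}^{\otimes N}R,\eta_{m-1}^{\otimes N}Q_{\eta_{m-1}}^{\otimes N}\right)\\
&\leq (1-\kappa\gamma)\,r_{m-1}+\gamma NC_1\int_{\mathbb T^{dN}}W_\rho(\pi(x),\eta_{m-1})\,\eta_{m-1}^{\otimes N}(\mathrm dx).
\end{aligned}$$
Since $W_\rho\leq W_1$, estimating the last term is a classical question, namely to bound the expected Wasserstein distance between the empirical measure of a sample of $N$ independent and identically distributed random variables and their common law. From ([26], Thm. 1) (and since on the torus the moments of probability measures are uniformly bounded), there exists some $C_0>0$ independent from $\eta_0$, $m$, $N$ and $\gamma$ such that
$$\int_{\mathbb T^{dN}}W_1(\pi(x),\eta_{m-1})\,\eta_{m-1}^{\otimes N}(\mathrm dx)\leq C_0\,\alpha(N).$$
Since $r_0=0$, a direct induction concludes the proof of (2.3).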
To prove (2.4), let $(\mathbf X,\mathbf Y)$ be an optimal coupling of $\eta_0^{\otimes N}R^m$ and $\eta_m^{\otimes N}$. Considering $J\sim\mathcal U([\![1,N]\!])$ independent from $(\mathbf X,\mathbf Y)$ then, conditionally on $(\mathbf X,\mathbf Y)$, $(X_J,Y_J)$ is a coupling of $\pi(\mathbf X)$ and $\pi(\mathbf Y)$, so that
$$W_\rho(\pi(\mathbf X),\pi(\mathbf Y))\leq\mathbb E\left(\rho(X_J,Y_J)\mid(\mathbf X,\mathbf Y)\right)=\frac1N\rho_N(\mathbf X,\mathbf Y).$$
Taking the expectation in
$$W_\rho(\pi(\mathbf X),\eta_m)\leq W_\rho(\pi(\mathbf X),\pi(\mathbf Y))+W_\rho(\pi(\mathbf Y),\eta_m),$$
we conclude with (2.3) and ([26], Thm. 1) again.
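Corollary 2.6. With the notations of Proposition 2.5, for all $k\in[\![1,N]\!]$,
$$W_{\rho_k}\left(\mathrm{Law}(X_{1,m},\dots,X_{k,m}),\eta_m^{\otimes k}\right)\leq C_2k\alpha(N)\,\gamma\sum_{s=1}^m(1-\gamma\kappa)^{s-1}.$$
Proof. Let $(\mathbf X,\mathbf Y)$ be an optimal coupling of $\eta_0^{\otimes N}R^m$ and $\eta_m^{\otimes N}$, and let $\sigma$ be uniformly distributed over the set of permutations of $N$ elements, independent from $(\mathbf X,\mathbf Y)$. Since the laws of $\mathbf X$ and $\mathbf Y$ are exchangeable, $\mathbf X_\sigma=(X_{\sigma(1)},\dots,X_{\sigma(N)})$ has the same law as $\mathbf X$; in particular $(X_{\sigma(1)},\dots,X_{\sigma(k)})$ has the same law as $(X_1,\dots,X_k)$. The same goes for $\mathbf Y_\sigma$, and
$$\mathbb E\Big(\sum_{i=1}^k\rho(X_{\sigma(i)},Y_{\sigma(i)})\Big)=k\,\mathbb E\left(\rho(X_{\sigma(1)},Y_{\sigma(1)})\right)=\frac kN\,\mathbb E\left(\rho_N(\mathbf X,\mathbf Y)\right)=\frac kN\,W_{\rho_N}\left(\eta_0^{\otimes N}R^m,\eta_m^{\otimes N}\right),$$
and Proposition 2.5 concludes.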
Corollary 2.6 means that, for any fixed $k\in\mathbb N$, as $N$ goes to infinity, the $k$-marginals of the system of particles converge toward the law of $k$ independent non-linear chains, which is the so-called propagation of chaos phenomenon.
2.4. Discrete to continuous time
We start by defining $(Y_t)_{t\geq0}$ and $(\mathbf X_t)_{t\geq0}$, the continuous-time analogues of the chains $(Y_k)_{k\in\mathbb N}$ on $\mathbb T^d$ and $(\mathbf X_k)_{k\in\mathbb N}$ on $\mathbb T^{dN}$ defined in Section 1.3. We start with the non-linear process. For $t\geq0$, let
$$\eta_t=\mathrm{Law}(Z_t\mid T>t)$$
where $Z$ solves (1.1) with initial distribution $\eta_0$ and $T$ is given by (1.2). We define $(Y_t)_{t\geq0}$ as follows. Set $Y_0=Z_0\sim\eta_0$, $T_0=0$ and suppose that $T_n$ and $(Y_t)_{t\in[0,T_n]}$ have been defined for some $n\in\mathbb N$. Let $(B_t)_{t\geq0}$ be a new Brownian motion on $\mathbb T^d$ and $E\sim\mathcal E(1)$, independent of each other. Let $\tilde Y$ be the solution of
$$\mathrm d\tilde Y_t=b(\tilde Y_t)\,\mathrm dt+\mathrm dB_t$$
for $t\geq T_n$ with $\tilde Y_{T_n}=Y_{T_n}$ and let
$$T_{n+1}=\inf\left\{t\geq T_n,\ E\leq\int_{T_n}^t\lambda(\tilde Y_s)\,\mathrm ds\right\}.$$
For $t\in(T_n,T_{n+1})$, set $Y_t=\tilde Y_t$. Finally, draw a new $Y_{T_{n+1}}$ according to $\eta_{T_{n+1}}$. By induction, $T_n$ and $(Y_t)_{t\in[0,T_n]}$ are then defined for all $n\in\mathbb N$. Since $\lambda$ is bounded, $T_n$ almost surely goes to infinity when $n\to\infty$, so that $(Y_t)_{t\geq0}$ is defined for all $t\geq0$. Similarly to Proposition 1.1, it can be established that $\mathrm{Law}(Y_t)=\eta_t$ for all $t\geq0$.
Now, as in Section 1.3, from the non-linear process $(Y_t)_{t\geq0}$, the interacting particles $(\mathbf X_t)_{t\geq0}$ are obtained by replacing $\eta_t$ by the empirical distribution of the system when particles die and are resurrected.
More precisely, let $(E_{i,k},B_{i,k},J_{i,k})_{i\in[\![1,N]\!],k\in\mathbb N}$ be a family of independent triplets of independent random variables where, for all $i\in[\![1,N]\!]$ and $k\in\mathbb N$, $E_{i,k}\sim\mathcal E(1)$, $J_{i,k}\sim\mathcal U([\![1,N]\!])$ (except if $k=0$, in which case $J_{i,k}=i$ almost surely) and $B_{i,k}=(B_{i,k,t})_{t\geq0}$ is a $d$-dimensional Brownian motion. From these variables, we simultaneously define by induction the process and its death times $(T_{i,k})_{i\in[\![1,N]\!],k\in\mathbb N}$ as follows. First, set $\mathbf X_0=x$ and $T_{i,0}=0$ for all $i\in[\![1,N]\!]$. For all $i\in[\![1,N]\!]$, set $\hat X_{i,0,0}=x_i$ and for $k\geq1$, set
$$\hat X_{i,k,T_{i,k}}=\lim_{t\to T_{i,k},\ t<T_{i,k}}X_{J_{i,k},t}.\qquad(2.5)$$
For all $k\in\mathbb N$, for $t\geq T_{i,k}$, let $\hat X_{i,k}$ solve
$$\mathrm d\hat X_{i,k,t}=b\left(\hat X_{i,k,t}\right)\mathrm dt+\mathrm dB_{i,k,t},$$
set
$$T_{i,k+1}=T_{i,k}+\inf\left\{t\geq0,\ E_{i,k}\leq\int_0^t\lambda\left(\hat X_{i,k,s}\right)\mathrm ds\right\}$$
and for all $t\in[T_{i,k},T_{i,k+1})$, set $X_{i,t}=\hat X_{i,k,t}$.
Then $\mathbf X_t=(X_{1,t},\dots,X_{N,t})$ is well defined for all $t\geq0$. Indeed, it is well defined for all $t<S_1:=\min\{T_{i,1},\ i\in[\![1,N]\!]\}$, the first death time of some particle, and is equal on this interval to $(\hat X_{1,0,t},\dots,\hat X_{N,0,t})$, which is continuous on $[0,S_1]$. Hence, the limits involved in (2.5) are well defined for $k=1$ and all $i\in[\![1,N]\!]$ such that $T_{i,1}=S_1$. Then the algorithm above similarly defines the process up to the second time some particle dies, etc.
Remark that most of the time (2.5) simply reads $\hat X_{i,k,T_{i,k}}=X_{J_{i,k},T_{i,k}}$ (at its $k$-th death time, the $i$-th particle is resurrected at the current position of the $J_{i,k}$-th particle). Indeed, the only case when this is not true is when the $J_{i,k}$-th particle dies at time $T_{i,k}$. Since the probability that two or more particles die simultaneously is zero, this almost surely only occurs if $J_{i,k}=i$, i.e. if the particle is resurrected at its own position.
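Denote by $(P_t)_{t\geq0}$ the Markov semigroup associated with $(\mathbf X_t)_{t\geq0}$, i.e. for all $t\geq0$, $P_t$ is the Markov kernel given by
$$P_tf(x)=\mathbb E\left(f(\mathbf X_t)\mid\mathbf X_0=x\right).$$
We sometimes write $P_t=P_{N,t}$ to specify the number of particles.
Lemma 2.7. There exists $C_4>0$ such that for all $N\in\mathbb N$, $\gamma\in(0,\gamma_0]$ and $\mu\in\mathcal P(\mathbb T^{dN})$,
$$W_{\rho_N}(\mu R_{N,\gamma},\mu P_{N,\gamma})\leq NC_4\gamma^{3/2}.$$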
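Proof. As in the proof of Lemma 2.4, it is sufficient to treat the case $\mu=\delta_x$ with a fixed $x\in\mathbb T^{dN}$. Let $(\mathbf X_t)_{t\geq0}$ be defined as above from random variables $(E_{i,k},B_{i,k},J_{i,k})_{i\in[\![1,N]\!],k\in\mathbb N}$. In particular, $\mathbf X_\gamma\sim\delta_xP_\gamma$. To define $\mathbf X_1\sim\delta_xR$, for all $i\in[\![1,N]\!]$ and $k\in\mathbb N$, consider $(\tilde X_{i,k,t})_{t\geq0}$ the solution to $\tilde X_{i,k,0}=x_{J_{i,k}}$ and
$$\mathrm d\tilde X_{i,k,t}=b\left(\tilde X_{i,k,0}\right)\mathrm dt+\mathrm dB_{i,k,t}.$$
Denoting
$$H_i=\inf\left\{k\in\mathbb N,\ E_{i,k}\geq\gamma\lambda\left(\tilde X_{i,k,\gamma}\right)\right\},$$
set $\mathbf X_1:=(\tilde X_{1,H_1,\gamma},\dots,\tilde X_{N,H_N,\gamma})$.
Then $(\mathbf X_1,\mathbf X_\gamma)$ is a coupling of $R(x,\cdot)$ and $P_\gamma(x,\cdot)$, so that
$$W_{\rho_N}(R(x,\cdot),P_\gamma(x,\cdot))\leq\mathbb E\left(\rho_N(\mathbf X_1,\mathbf X_\gamma)\right)=\sum_{i=1}^N\mathbb E\left(\rho(X_{i,1},X_{i,\gamma})\right).$$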
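We now distinguish four cases, considering the events
$$\begin{aligned}
B_{i,1} &= \{H_i=0\text{ and }T_{i,1}>\gamma\}\\
B_{i,2} &= \{H_i=1\text{ and }T_{i,1}\leq\gamma<T_{i,2}\wedge T_{J_{i,0},1}\}\\
B_{i,3} &= \{H_i=1\text{ and }T_{i,1}>\gamma\}\cup\{H_i=0\text{ and }T_{i,1}\leq\gamma\}\\
B_{i,4} &= \{H_i\geq2\}\cup\{T_{i,2}\leq\gamma\}\cup\{T_{i,1}\vee T_{J_{i,0},1}\leq\gamma\},
\end{aligned}$$
that is, respectively: neither of the two $i$-th particles dies; both $i$-th particles die exactly once; one particle dies but not the other; at least two deaths are involved for one of the two particles. For all $i\in[\![1,N]\!]$, $\Omega=\bigcup_{j=1}^4B_{i,j}$, so that
$$\mathbb E\left(\rho(X_{i,1},X_{i,\gamma})\right)\leq\mathbb E\left(\rho(X_{i,1},X_{i,\gamma})\left(\mathbb 1_{B_{i,1}}+\mathbb 1_{B_{i,2}}+\mathbb 1_{B_{i,3}}+\mathbb 1_{B_{i,4}}\right)\right).$$
Conclusion follows by gathering the four cases.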
Case 1. It reduces to the classical case of diffusions, since
E|Xi,1Xi,γ |
1
Bi,1=E|˜
Xi,0 ˆ
Xi,0 |
1
Bi,16E|˜
Xi,0 ˆ
Xi,0 |.
Then
|˜
Xi,0,t ˆ
Xi,0,t|=Zt
0b(xi)bˆ
Xi,0,s
ds
6k∇bkZt
0|˜
Xi,0,s ˆ
Xi,0,s|+|xi˜
Xi,0,s|ds
By the Gronwall Lemma, for all t>0, almost surely,
sup
s[0,t]|˜
Xi,0,t ˆ
Xi,0,t|6k∇bketk∇bkZt
0|xi˜
Xi,0,s|ds . (2.6)
Since ˜
Xi,0,s is a Gaussian variable with mean xi+sb(xi) and variance s,
E|xi˜
Xi,0,s|6sb(xi) + E|xi+sb(xi)˜
Xi,0,s|6kbks+s . (2.7)
As a consequence, for γ6γ0,
E|˜
Xi,0 ˆ
Xi,0 |6k∇bkeγ0k∇bkZγ
0
E|xi˜
Xi,0,s|ds63/2.(2.8)
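The order $\gamma^{3/2}$ of this one-step error can be observed numerically. The sketch below is an illustration only: it works in dimension one on the lift of the torus to $\mathbb R$, uses a hypothetical drift $b(x)=\sin(2\pi x)$, and replaces the exact diffusion by a fine-grid Euler path driven by the same Brownian increments as the single coarse Euler step. All function names are assumptions made for the example.

```python
import math
import random

def euler_vs_fine(gamma, drift, x0, n_sub, rng):
    """|one Euler step - fine-grid path| at time gamma, same Brownian increments."""
    dt = gamma / n_sub
    fine = x0
    brownian = 0.0
    for _ in range(n_sub):
        dB = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        fine += drift(fine) * dt + dB  # proxy for the diffusion (1.1)
        brownian += dB                 # accumulates B_gamma
    coarse = x0 + gamma * drift(x0) + brownian  # frozen-drift step
    return abs(coarse - fine)

def mean_local_error(gamma, n_samples=2000, n_sub=50, seed=0):
    """Monte Carlo estimate of the expected one-step coupling error."""
    rng = random.Random(seed)
    drift = lambda x: math.sin(2.0 * math.pi * x)  # hypothetical smooth periodic drift
    total = sum(euler_vs_fine(gamma, drift, 0.1, n_sub, rng) for _ in range(n_samples))
    return total / n_samples
```

Dividing $\gamma$ by 4 should divide the estimated error by a factor of roughly $4^{3/2}=8$, up to Monte Carlo noise and lower-order terms.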
Case 2. We bound
E|Xi,1Xi,γ |
1
Bi,26E|˜
Xi,1 xJi,0|+|ˆ
Xi,1 xJi,0|
1
Bi,2.
Similarly to (2.7),
E|˜
Xi,1 xJi,0|
1
Bi,26E|˜
Xi,1 xJi,0|
1
Ei,06γkλk63/2,
where we used the independence of Ei,0from Ji,1and ( ˜
Xi,1,t)t>0. Denote (X0
i)t>0the solution of
dX0
i,t =bX0
i,tdt+dBJi,0,0,t for t<Ti,0
dBi,1,t for t>Ti,0.
with X0
i,0=xJi,0. Under the event Bi,2,ˆ
Xi,1 =X0
i,γ . Moreover, Ji,0,BJi,0,0and Bi,1are independent from
Ti,0and thus, by the strong Markov property, (X0
i,t)t>0is independent from Ti,0and conditionally to Ji,0it has
the same distribution as ˆ
XJi,0,0,t (namely it is a diffusion solving (1.1) with initial condition xJi,0). Hence,
E|ˆ
Xi,1 xJi,0|
1
Bi,26E|X0
i,γ yJi,0|
1
Ei,06γkλk6c0γ3/2.
Case 3. We bound
Eρ(Xi,1, Xi,γ)
1
Bi,361
aP(Bi,3)
61
aPZγ
0
λ(ˆ
Xi,0,s)dsγλ(˜
Xi,0 )6Ei,06Zγ
0
λ(ˆ
Xi,0,s)dsγλ(˜
Xi,0 )
=1
aE
exp Zγ
0
λ(ˆ
Xi,0,s)dsexp γλ(˜
Xi,0 )
61
aEZγ
0
λ(ˆ
Xi,0,s)dsγλ(˜
Xi,0 ).
Now,
Zγ
0
λ(ˆ
Xi,0,s)dsγλ(˜
Xi,0 )
6LλZγ
0|ˆ
Xi,0,s xi|ds+γ|xi˜
Xi,0 |.
Using (2.6) together with (2.7) yields
EρXi,1, Xi,γ
1
Bi,36c3γ3/2.
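Case 4. We bound
$$\begin{aligned}
\mathbb E\left(\rho(X_{i,1},X_{i,\gamma})\mathbb 1_{B_{i,4}}\right) &\leq \frac1a\,\mathbb P(B_{i,4})\\
&\leq \mathbb P\left(E_{i,0}\vee E_{i,1}\leq\gamma\|\lambda\|_\infty\right)+\mathbb P\left(E_{i,0}\vee E_{J_{i,0},0}\leq\gamma\|\lambda\|_\infty\right)\\
&\leq 2\left(1-e^{-\gamma\|\lambda\|_\infty}\right)^2\\
&\leq 2\gamma^2\|\lambda\|_\infty^2.
\end{aligned}$$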
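Proposition 2.8. There exists $C_5>0$ such that for all $N\in\mathbb N$, $m\in\mathbb N$, $\gamma\in(0,\gamma_0]$ and $\mu\in\mathcal P(\mathbb T^{dN})$,
$$W_{\rho_N}\left(\mu R^m_{N,\gamma},\mu P_{N,m\gamma}\right)\leq\sqrt\gamma\,NC_5\,\gamma\sum_{s=1}^m(1-\gamma\kappa)^{s-1}.$$
As for Proposition 2.5, when $\kappa>0$, $\gamma\sum_{s=1}^m(1-\gamma\kappa)^{s-1}\leq\kappa^{-1}$, so that this bound is uniform in time. On the contrary, when $\kappa<0$, the estimate is exponentially bad in $t=m\gamma$.
Proof. The proof is similar to that of Proposition 2.5. Denoting $\mu_m=\mu R^m$ and $\nu_m=\mu P_{m\gamma}$, from the triangle inequality, Proposition 2.3 and Lemma 2.7,
$$\begin{aligned}
r_m:=W_{\rho_N}(\mu_m,\nu_m) &\leq W_{\rho_N}(\mu_m,\nu_{m-1}R)+W_{\rho_N}(\nu_{m-1}R,\nu_{m-1}P_\gamma)\\
&\leq (1-\gamma\kappa)\,r_{m-1}+NC_4\gamma^{3/2},
\end{aligned}$$
and an induction concludes.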
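The induction can be made explicit: iterating $r_m\leq(1-\gamma\kappa)r_{m-1}+c$ from $r_0=0$ with $c=NC_4\gamma^{3/2}$ gives $r_m\leq c\sum_{s=1}^m(1-\gamma\kappa)^{s-1}$, which is the stated bound with $C_5=C_4$. A minimal sketch comparing the iteration (with equality at each step) with the closed form is given below; the function names and numerical values are hypothetical.

```python
def iterate_error(m, gamma, kappa, local_error):
    """Iterate r_k = (1 - gamma * kappa) * r_{k-1} + local_error from r_0 = 0."""
    r = 0.0
    for _ in range(m):
        r = (1.0 - gamma * kappa) * r + local_error
    return r

def closed_form(m, gamma, kappa, local_error):
    """local_error * sum_{s=1}^m (1 - gamma * kappa)^(s - 1)."""
    return local_error * sum((1.0 - gamma * kappa) ** (s - 1) for s in range(1, m + 1))
```

With `local_error = N * C4 * gamma ** 1.5` and $\kappa>0$, both quantities stay below `N * C4 * gamma ** 0.5 / kappa` for every `m`, which is the uniform-in-time error of order $\sqrt\gamma$.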
2.5. Conclusion
In this section we use the notations of the previous ones; in particular $\kappa$ is given by (1.5) and the constants $C_2$, $C_3$ and $C_5$ are those of Propositions 2.5 and 2.8. We can now gather all the previous results.
Letting either $\gamma$ vanish or $N$ go to infinity in Proposition 2.3, we obtain long-time convergence for, respectively, the non-homogeneous self-interacting Markov chain $(Y_k)_{k\in\mathbb N}$ introduced in Section 1.3 and the continuous-time Markov chain $(\mathbf X_t)_{t\geq0}$ defined in Section 2.4.
Corollary 2.9. Let $(\eta_n)_{n\in\mathbb N}$ be as defined in Section 1.3, and $(\tilde\eta_n)_{n\in\mathbb N}$ be similarly defined but with a different initial distribution $\tilde\eta_0\in\mathcal P(\mathbb T^d)$. For all $m\in\mathbb N$ and all $\gamma\in(0,\gamma_0]$,
$$W_\rho(\eta_m,\tilde\eta_m)\leq(1-\gamma\kappa)^m\,W_\rho(\eta_0,\tilde\eta_0).$$
Corollary 2.10. For all $N\in\mathbb N$, $t\geq0$ and $\mu,\nu\in\mathcal P(\mathbb T^{dN})$,
$$W_{\rho_N}(\mu P_{N,t},\nu P_{N,t})\leq e^{-\kappa t}\,W_{\rho_N}(\mu,\nu).$$
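Proof of Corollary 2.9. The proof is based on the simple equality: for all $N\in\mathbb N$ and $\mu,\nu\in\mathcal P(\mathbb T^d)$,
$$W_{\rho_N}\left(\mu^{\otimes N},\nu^{\otimes N}\right)=N\,W_\rho(\mu,\nu).\qquad(2.9)$$
Indeed, by considering $N$ independent optimal couplings $(X_i,Y_i)_{i\in[\![1,N]\!]}$ of $\mu$ and $\nu$,
$$W_{\rho_N}\left(\mu^{\otimes N},\nu^{\otimes N}\right)\leq\mathbb E\left(\rho_N(\mathbf X,\mathbf Y)\right)=\sum_{i=1}^N\mathbb E(\rho(X_i,Y_i))=N\,W_\rho(\mu,\nu).$$
Conversely, if $(\mathbf X,\mathbf Y)$ is an optimal coupling of $\mu^{\otimes N}$ and $\nu^{\otimes N}$, then
$$W_\rho(\mu,\nu)\leq\mathbb E(\rho(X_1,Y_1))=\frac1N\,\mathbb E\left(\rho_N(\mathbf X,\mathbf Y)\right)=\frac1N\,W_{\rho_N}\left(\mu^{\otimes N},\nu^{\otimes N}\right).$$
By the triangle inequality,
$$\begin{aligned}
W_{\rho_N}\left(\eta_m^{\otimes N},\tilde\eta_m^{\otimes N}\right) &\leq W_{\rho_N}\left(\eta_m^{\otimes N},\eta_0^{\otimes N}R^m\right)+W_{\rho_N}\left(\eta_0^{\otimes N}R^m,\tilde\eta_0^{\otimes N}R^m\right)+W_{\rho_N}\left(\tilde\eta_0^{\otimes N}R^m,\tilde\eta_m^{\otimes N}\right)\\
&\leq (1-\gamma\kappa)^m\,W_{\rho_N}\left(\eta_0^{\otimes N},\tilde\eta_0^{\otimes N}\right)+2C_2N\alpha(N)\,\gamma\sum_{s=1}^m(1-\gamma\kappa)^{s-1},
\end{aligned}$$
where we applied Propositions 2.3 and 2.5. Using the equality (2.9), dividing by $N$ and letting $N$ go to infinity concludes the proof of Corollary 2.9.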
Remark that the beginning of the proof also applies for $\mu,\nu\in\mathcal P(\mathbb T^{dN})$ that are exchangeable (i.e. invariant by any permutation of the $d$-dimensional coordinates), in which case, denoting by $\mu^{(1)}$ and $\nu^{(1)}$ their $d$-dimensional marginals, we get that
$$W_\rho\left(\mu^{(1)},\nu^{(1)}\right)\leq\frac1N\,W_{\rho_N}(\mu,\nu).$$
Proof of Corollary 2.10. Similarly to the previous proof, Corollary 2.10 is a direct consequence of Propositions 2.3 and 2.8, letting $m$ go to infinity at fixed $t$ and $N$ in
$$W_{\rho_N}(\mu P_{N,t},\nu P_{N,t})\leq W_{\rho_N}\left(\mu P_{N,t},\mu R^m_{N,t/m}\right)+W_{\rho_N}\left(\mu R^m_{N,t/m},\nu R^m_{N,t/m}\right)+W_{\rho_N}\left(\nu R^m_{N,t/m},\nu P_{N,t}\right).$$
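We now turn to the continuous-time limit of the non-linear chain $(Y_k)_{k\in\mathbb N}$.
Corollary 2.11. There exists $C_6>0$ such that for all $\eta_0\in\mathcal P(\mathbb T^d)$ and all $\gamma\in(0,\gamma_0]$, if $(\eta_n)_{n\in\mathbb N}$ is as defined in Section 1.3, and $(\eta_t)_{t\geq0}$ is as defined in Section 2.4 (with the same initial distribution $\eta_0$), then
$$W_1\left(\eta_1,\eta_\gamma\right)\leq C_6\gamma^{3/2},$$
where $\eta_1$ is the law of the chain after one step and $\eta_\gamma$ the law of the continuous-time process at time $\gamma$, and for all $m\geq1$,
$$W_\rho\left(\eta_m,\eta_{m\gamma}\right)\leq\sqrt\gamma\,C_6\,\gamma\sum_{s=1}^m(1-\gamma\kappa)^{s-1}.$$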
Proof. For the first inequality, we could follow the proof of Lemma 2.7, but, using the notations of the
introduction, we will rather use the fact that
η1=Law ˜
Z1|˜
T > γ, ηγ=Law (Zγ|T > γ),
where the gaussian variable G0in (1.3) is equal to Bγ/γwhere (Bt)t>0is the Brownian motion involved in
(1.1), and Tand ˜
Tare defined with the same E E(1). Recall the estimate (2.8) for the error from an Euler
scheme to its initial diffusion. Then we bound
E|˜
Z1Zγ| | T > γ, ˜
T > γ6PT > γ, ˜
T > γ1
E|˜
Z1Zγ|
61eγ0kλk13/2,
which concludes the first part of the corollary.
For the second part, denoting rm=Wρηm, η , we bound
rm6Wρηm, ηm1Qηm1+Wρηm1Qηm1, η
6(1 γκ)rm1+C6γ3/2,
where we used the first part of the corollary and Corollary 2.9. An induction concludes.
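We can now prove propagation of chaos results for the continuous-time process:
Corollary 2.12. For all $N\in\mathbb N$, $k\in[\![1,N]\!]$ and all $t\geq0$, if $(\mathbf X_t)_{t\geq0}$ is a Markov process with initial distribution $\eta_0^{\otimes N}$ associated to the semigroup $(P_{N,t})_{t\geq0}$ then, first,
$$W_{\rho_k}\left(\mathrm{Law}(X_{1,t},\dots,X_{k,t}),\eta_t^{\otimes k}\right)\leq C_2k\alpha(N)\int_0^te^{-\kappa s}\,\mathrm ds,$$
and second,
$$\mathbb E\left(W_\rho(\pi(\mathbf X_t),\eta_t)\right)\leq C_3\alpha(N)\left(1+\int_0^te^{-\kappa s}\,\mathrm ds\right).$$
Proof. As shown in the proof of Proposition 2.5, if $(\mathbf X,\mathbf Y)$ is an optimal coupling of $\mu$ and $\nu$,
$$\mathbb E\left(W_\rho(\pi(\mathbf X),\pi(\mathbf Y))\right)\leq\frac1N\,W_{\rho_N}(\mu,\nu).$$
Thus, considering a time step $\gamma=t/m$, $m\in\mathbb N$, we decompose
$$W_\rho(\pi(\mathbf X_t),\eta_t)\leq W_\rho(\pi(\mathbf X_t),\pi(\mathbf X_m))+W_\rho(\pi(\mathbf X_m),\eta_m)+W_\rho(\eta_m,\eta_t),$$
where $\mathbf X_m$ and $\eta_m$ refer to the discrete-time chains after $m$ steps, take the expectation, apply Propositions 2.5 and 2.8 and Corollary 2.11 and let $m$ go to infinity. This proves the second point, and the proof of the first one is similar, with Corollary 2.6.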
Up to now, we have sent either $N$ or $\gamma$ to its limit. When $\kappa>0$, if we let $t=m\gamma$ go to infinity at fixed $N$ and $\gamma$, we recover results on the equilibria of the processes. Indeed, note that Corollary 2.9 together with the Banach fixed-point theorem implies that $n\mapsto\eta_n$ admits a limit which is independent from $\eta_0$. Together with Proposition 1.1, this is the unique QSD of the Markov chain (1.3). Denote it $\nu_\gamma$. Similarly, Proposition 2.3 implies that $R_{N,\gamma}$ admits a unique invariant measure. Denote it $\mu_{\infty,N,\gamma}$, and $\mu^{(k)}_{\infty,N,\gamma}$ its first $kd$-dimensional marginal for $k\in[\![1,N]\!]$ (i.e. the law of $(X_1,\dots,X_k)$ if $\mathbf X\sim\mu_{\infty,N,\gamma}$). Third, Corollary 2.10 implies that $(P_{N,t})_{t\geq0}$ admits a unique invariant measure $\mu_{\infty,N}$.
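Corollary 2.13. If $\kappa>0$, then for all $N\in\mathbb N$ and $\gamma\in(0,\gamma_0]$,
$$W_{\rho_N}\left(\mu_{\infty,N,\gamma},\mu_{\infty,N}\right)\leq\sqrt\gamma\,N\kappa^{-1}C_5.$$
Corollary 2.14. If $\kappa>0$, then for all $N\in\mathbb N$, $k\in[\![1,N]\!]$ and $\gamma\in(0,\gamma_0]$, first,
$$W_{\rho_k}\left(\mu^{(k)}_{\infty,N,\gamma},\nu_\gamma^{\otimes k}\right)\leq\kappa^{-1}C_2k\alpha(N),$$
and second,
$$\mathbb E_{\mu_{\infty,N,\gamma}}\left(W_\rho(\pi(\mathbf X),\nu_\gamma)\right)\leq\kappa^{-1}C_3\alpha(N).$$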
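Proofs of Corollaries 2.13 and 2.14. Considering any $\eta_0\in\mathcal P(\mathbb T^d)$ and $m\in\mathbb N$,
$$W_{\rho_N}\left(\mu_{\infty,N,\gamma},\mu_{\infty,N}\right)\leq W_{\rho_N}\left(\mu_{\infty,N,\gamma},\eta_0^{\otimes N}R^m\right)+W_{\rho_N}\left(\eta_0^{\otimes N}R^m,\eta_0^{\otimes N}P_{\gamma m}\right)+W_{\rho_N}\left(\eta_0^{\otimes N}P_{\gamma m},\mu_{\infty,N}\right).$$
Apply Proposition 2.3 with $\mu=\mu_{\infty,N,\gamma}$ and $\nu=\eta_0^{\otimes N}$, Corollary 2.10 with the same $\nu$ and with $\mu=\mu_{\infty,N}$, and Proposition 2.8. Letting $m$ go to infinity concludes the proof of Corollary 2.13. The proof of Corollary 2.14 is similar (based on Prop. 2.5 and Cor. 2.6, like Cor. 2.12).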
Next, we can send two parameters to their limit. Sending $N$ to infinity and $\gamma$ to zero, we get the long-time convergence of the non-linear process $(Y_t)_{t\geq0}$ introduced in Section 2.4 (or, equivalently, of the process $Z$ solving (1.1) conditioned not to be dead):
Corollary 2.15. Let $(\eta_t)_{t\geq0}$ be as defined in Section 2.4, and $(\hat\eta_t)_{t\geq0}$ be similarly defined but with a different initial distribution $\hat\eta_0\in\mathcal P(\mathbb T^d)$. For all $t\geq0$,
$$W_\rho(\eta_t,\hat\eta_t)\leq e^{-\kappa t}\,W_\rho(\eta_0,\hat\eta_0).$$
Proof. Thanks to Corollary 2.11, let $\gamma=t/m$ vanish in Corollary 2.9.
In particular, if $\hat\eta_0$ is the QSD $\nu$, by definition, $\hat\eta_t=\nu$ for all $t\geq0$, so that Corollary 2.15 yields the uniqueness of the QSD and the exponential convergence of $\mathrm{Law}(Z_t\mid T>t)$ toward $\nu$ (which is a result in the spirit of [3,10,12,20]).
Now, at a fixed $\gamma>0$, letting $t$ and $N$ go to infinity, we obtain an error bound between the QSD $\nu$ of the continuous process (1.1) and the QSD $\nu_\gamma$ of the discrete scheme.
Corollary 2.16. If $\kappa>0$, then for all $\gamma\in(0,\gamma_0]$,
$$W_\rho(\nu_\gamma,\nu)\leq\sqrt\gamma\,\kappa^{-1}C_6.$$
Proof. Thanks to Corollaries 2.9 and 2.15 (applied with one of the initial conditions being the equilibrium), let $m$ go to infinity in Corollary 2.11.
Finally, letting $\gamma$ vanish and $t$ go to infinity at a fixed $N\in\mathbb N$, we obtain a propagation of chaos result at stationarity (as established first in [2], and more recently with a CLT in [30] in the case of a finite state space) for the continuous-time system of interacting particles $(\mathbf X_t)_{t\geq0}$ introduced in Section 2.4.
Corollary 2.17. If $\kappa>0$ and if $\mathbf X$ is a random variable with law $\mu_{\infty,N}$, then for all $N\in\mathbb N$ and $k\in[\![1,N]\!]$, first,
$$W_{\rho_k}\left(\mathrm{Law}(X_1,\dots,X_k),\nu^{\otimes k}\right)\leq\kappa^{-1}C_2k\alpha(N),$$
and second,
$$\mathbb E\left(W_\rho(\pi(\mathbf X),\nu)\right)\leq C_3\alpha(N)\left(1+\kappa^{-1}\right).$$
Proof. The proof is similar to that of Corollary 2.12, letting $t$ go to infinity in Corollary 2.12 thanks to Corollaries 2.10 and 2.15.
Remark that our results at stationarity (Cors. 2.13, 2.14, 2.16 and 2.17) all require the perturbative condition $\kappa>0$. Yet, propagation of chaos at stationarity for the continuous-time process follows from the works [2,17,19,30,38] in a much broader (non-perturbative) framework (and, although it does not seem to have been studied yet, the situation should be similar for error bounds in $\gamma$ rather than $N$). As discussed in Section 1.4, error bounds on $N$ (and possibly $\gamma$) that are uniform in time can be obtained thanks to the long-time convergence of the limit ($N=+\infty$) non-linear process. In our case, when $\kappa>0$, this long-time convergence follows from the (uniform in $N$) long-time convergence of the particle system (whether it is possible to obtain the latter from the former is unclear), but it holds in more general cases (see [3,10,12,20] and references within) and, in those cases, results similar to Corollaries 2.13, 2.14, 2.16 and 2.17 should hold. The question of establishing propagation of chaos or discretization error bounds at stationarity in a more general case (i.e. without the condition $\kappa>0$) is out of the scope of the present work.
Figure 1. Summary of the different results. The label of each arrow indicates the corollary or proposition where the corresponding quantitative convergence is stated. Vertical, horizontal and diagonal arrows correspond respectively to $t$, $\gamma$ and $N$ going to their limit.
All our results are summarised in Figure 1.
Finally, we detail the proof of our main result.
Proof of Theorem 1.3.For η0 P(Td), let (X,Y) be an optimal coupling of µ0Rbt/γcand ηN
0Rbt/γc. As in
the proof of Proposition 2.5,
E(Wρ(π(X), π(Y))) 61
NWρNµRbt/γc, ηN
0Rbt/γc6aeκ(tγ0),
where we used Proposition 2.3 and the fact that ρN(x, y)6N a for all x, y TdN . Then, by the triangular
inequality,
Wρ(π(Y), ν)6Wρπ(Y), ηbt/γc+Wρηbt/γc, νγ+Wρ(νγ, ν).
Taking the expectation, applying Proposition 2.5 and Corollaries 2.9 (applied with ˜η0=νγ) and 2.16, the
boundedness of ρand the equivalence of Wρand W1concludes.
Acknowledgements. Pierre Monmarché thanks Bertrand Cloez for indicating the result stated in Proposition 1.1, and more generally for fruitful discussions. He acknowledges partial support by the projects EFI ANR-17-CE40-0030 and METANOLIN of the French National Research Agency.
References
[1] A. Asselah, P.A. Ferrari, P. Groisman and M. Jonckheere, Fleming-Viot selects the minimal quasi-stationary distribution: the Galton-Watson case. Ann. l'Institut Henri Poincaré, Probabilités Stat. 52 (2016) 647–668.
[2] A. Asselah, P.A. Ferrari and P. Groisman, Quasistationary distributions and Fleming-Viot processes in finite spaces. J. Appl.
Probab. 48 (2011) 322–332.
[3] V. Bansaye, B. Cloez and P. Gabriel, Ergodic behavior of non-conservative semigroups via generalized Doeblin’s conditions.
Acta Applicandae Mathematicae.
[4] M. Benaïm, N. Champagnat and D. Villemonais, Stochastic approximation of quasi-stationary distributions for diffusion processes in a bounded domain. Preprint arXiv:1904.08620 (2019).
[5] M. Benaïm and B. Cloez, A stochastic approximation approach to quasi-stationary distributions on finite spaces. Electr. Commun. Probab. 20 (2015) 13 pp.
[6] M. Benaïm, B. Cloez and F. Panloup, Stochastic approximation of quasi-stationary distributions on compact spaces and applications. Ann. Appl. Probab. 28 (2018) 2370–2416.
[7] K. Burdzy, R. Hołyst and P. March, A Fleming-Viot particle representation of the Dirichlet Laplacian. Commun. Math. Phys. 214 (2000) 679–703.
[8] K. Burdzy, R. Hołyst, D. Ingerman and P. March, Configurational transition in a Fleming-Viot-type model and probabilistic interpretation of Laplacian eigenfunctions. J. Phys. A Math. General 29 (1996) 2633–2642.
[9] F. Cérou, B. Delyon, A. Guyader and M. Rousset, A Central Limit Theorem for Fleming-Viot Particle Systems with Soft Killing. Preprint arXiv:1611.00515 (2016).
[10] N. Champagnat, K. Coulibaly-Pasquier and D. Villemonais, Criteria for exponential convergence to quasi-stationary
distributions and applications to multi-dimensional diffusions. Preprint arXiv:1603.07909 (2016).
[11] N. Champagnat and D. Villemonais, Convergence of the Fleming-Viot process toward the minimal quasi-stationary distribution,
ALEA, Lat. Am. J. Probab. Math. Stat. 18 (2021) 1–15.
[12] N. Champagnat and D. Villemonais, Practical criteria for r-positive recurrence of unbounded semigroups, Electron. Commun.
Probab. 25 (2020) 6.
[13] B. Cloez and M.-N. Thai, Quantitative results for the Fleming-Viot particle system and quasi-stationary distributions in
discrete space. Stochastic Process. Appl. 126 (2016) 680–702.
[14] D. Dawson, Measure-valued Markov processes (1993).
[15] P. Del Moral, Feynman-Kac Formulae. Genealogical and Interacting Particle Systems with Applications. Springer-Verlag, New
York (2004).
[16] P. Del Moral, Mean Field Simulation for Monte Carlo Integration. Chapman and Hall/CRC, New York (2013).
[17] P. Del Moral and A. Guionnet, On the stability of interacting processes with applications to filtering and genetic algorithms. Annales de l'I.H.P. Probabilités et statistiques 37 (2001) 155–194.
[18] P. Del Moral and L. Miclo, A Moran particle system approximation of Feynman-Kac formulae. Stoch. Process. Appl. 86 (2000) 193–216.
[19] P. Del Moral and L. Miclo, Particle approximations of Lyapunov exponents connected to Schrödinger operators and Feynman-Kac semigroups. ESAIM: PS 7 (2003) 171–208.
[20] P. Del Moral and D. Villemonais, Exponential mixing properties for time inhomogeneous diffusion processes with killing.
Bernoulli 24 (2018) 1010–1032.
[21] B. Delyon, F. Cérou, A. Guyader and M. Rousset, A Central Limit Theorem for Fleming-Viot Particle Systems with Hard Killing. To appear in Ann. l'IHP (Probability and Statistics) (2017) arXiv:1709.06771.
[22] A. Eberle, A. Guillin and R. Zimmer, Couplings and quantitative contraction rates for Langevin dynamics. Preprints
arXiv:1703.01617 (2017).
[23] P. Ferrari and N. Maric, Quasi stationary distributions and Fleming-Viot processes in countable spaces. Electr. J. Probab. 12
(2007) 684–702.
[24] G. Ferré and G. Stoltz, Error estimates on ergodic properties of discretized Feynman-Kac semigroups. Numer. Math. 143 (2019) 261–313.
[25] W.H. Fleming and M. Viot, Some measure-valued Markov processes in population genetics theory. Indiana Univ. Math. J. 28
(1979) 817–843.
[26] N. Fournier and A. Guillin, On the rate of convergence in Wasserstein distance of the empirical measure. Probab. Theory
Related Fields 162 (2015) 707–738.
[27] I. Grigorescu and M. Kang, Hydrodynamic limit for a Fleming-Viot type system. Stochastic Process. Appl. 110 (2004) 111–143.
[28] A. Joulin and Y. Ollivier, Curvature, concentration and error estimates for Markov chain Monte Carlo. Ann. Probab. 38 (2010)
2418–2442.
[29] C. Le Bris, T. Lelièvre, M. Luskin and D. Perez, A mathematical formalization of the parallel replica dynamics. Monte Carlo Methods Appl. 18 (2012) 119–146.
[30] T. Lelièvre, L. Pillaud-Vivien and J. Reygner, Central limit theorem for stationary Fleming-Viot particle systems in finite spaces. ALEA Lat. Am. J. Probab. Math. Stat. 15 (2018) 1163–1182.
[31] J.-U. Löbus, A stationary Fleming-Viot type Brownian particle system. Math. Z. 263 (2009) 541–581.
[32] M.B. Majka, A. Mijatović and L. Szpruch, Non-asymptotic bounds for sampling algorithms without log-concavity. Preprints arXiv:1808.07105 (2018).
[33] G. Milstein and M.V. Tretyakov, Stochastic Numerics for Mathematical Physics (2004).
[34] P. Monmarch´e, Elementary coupling approach for non-linear perturbation of Markov processes with mean-field jump mechanims
and related problems. Preprints arXiv:1809.10953 (2018).
[35] P.A.P. Moran, Random processes in genetics. Math. Proc. Camb. Philos. Soc. 54 (1958) 6071.
[36] F.M. Norman, Ergodicity of diffusion and temporal uniformity of diffusion approximation. J. Appl. Prob. 14 (1977) 399–404.
[37] W. Oçafrain and D. Villemonais, Convergence of a non-failable mean-field particle system. Stoch. Anal. Appl. 35 (2017)
587–603.
[38] M. Rousset, On the control of an interacting particle estimation of Schrödinger ground states. SIAM J. Math. Anal. 38 (2006)
824–844.
[39] C. Villani, Optimal transport. Old and new. Vol. 338 of Grundlehren der Mathematischen Wissenschaften [Fundamental
Principles of Mathematical Sciences]. Springer-Verlag, Berlin (2009).
[40] D. Villemonais, General approximation method for the distribution of Markov processes conditioned not to be killed. ESAIM:
PS 18 (2014) 441–467.
[41] D. Villemonais, Lower bound for the coarse Ricci curvature of continuous-time pure jump processes. Preprint arXiv:1705.06642
(2017).