MONTE CARLO VERSUS MULTILEVEL MONTE CARLO IN WEAK
ERROR SIMULATIONS OF SPDE APPROXIMATIONS
ANNIKA LANG AND ANDREAS PETERSSON
Abstract. The simulation of the expectation of a stochastic quantity $E[Y]$ by Monte Carlo
methods is known to be computationally expensive, especially if the stochastic quantity or its
approximation $Y_n$ is expensive to simulate, e.g., the solution of a stochastic partial differential
equation. If the convergence of $Y_n$ to $Y$ in terms of the error $|E[Y - Y_n]|$ is to be simulated,
this will typically be done by a Monte Carlo method, i.e., $|E[Y] - E_N[Y_n]|$ is computed. In
this article upper and lower bounds for the additional error caused by this are determined
and compared to those of $|E_N[Y - Y_n]|$, which are found to be smaller. Furthermore, the
corresponding results for multilevel Monte Carlo estimators, for which the additional sampling
error converges with the same rate as $|E[Y - Y_n]|$, are presented. Simulations of a stochastic
heat equation driven by multiplicative Wiener noise and a geometric Brownian motion are
performed which confirm the theoretical results and show the consequences of the presented
theory for weak error simulations.
1. Introduction
Weak error analysis for approximations of solutions of stochastic partial differential equa-
tions (SPDEs for short) is one of the topics that is currently under investigation within the
community of numerical analysis of SPDEs. The goal of weak error analysis is to quantify
how well we can approximate a quantity of interest that depends on the solution of an SPDE.
While weak convergence rates for equations driven by additive noise are already available
(see, e.g., [9, 1, 16, 6] and references therein), convergence rates for fully discrete approxi-
mations of SPDEs driven by multiplicative noise are still under consideration. First results
for semi-discrete approximations in space or time are available (cf., e.g., [8, 2, 7, 12]) that
suggest that one can, as in the case of additive noise, expect a weak convergence rate of twice
the order of the strong one, i.e., of mean square convergence. Nevertheless, the simulation of
weak error rates has caused problems so far and results are rarely available. First attempts
can be found in [17, 18].
There are several factors that cause problems in the simulation of weak error rates of SPDE
approximations but one of the main reasons is the computational complexity of simulating
the solution on a fine grid. To give estimates on quantities of interest, which include the
approximation of an expectation, the computational complexity is multiplied by the number
of samples in Monte Carlo type methods that are necessary to obtain a reasonable result.
1991 Mathematics Subject Classification. 65C05, 60H15, 41A25, 65C30, 65N30.
Key words and phrases. (multilevel) Monte Carlo methods, variance reduction techniques, error simulation,
stochastic partial differential equations, weak convergence, upper and lower error bounds.
Acknowledgement. The authors wish to express many thanks to Stig Larsson, Christoph Schwab, and two
anonymous referees for fruitful discussions and helpful comments. The work was supported in part by the
Swedish Research Council under Reg. No. 621-2014-3995 as well as the Knut and Alice Wallenberg foundation.
The simulations were performed on resources at Chalmers Centre for Computational Science and Engineering
(C3SE) provided by the Swedish National Infrastructure for Computing (SNIC) .
arXiv:1512.05317v2 [math.PR] 27 Sep 2016
2 A. LANG AND A. PETERSSON
We observed in practice that the simulated weak errors in our model problem are very small,
which in turn requires a far from realistic number of samples in Monte Carlo simulations to
get acceptable results.
Motivated by this model problem, we look into the properties of the used estimators.
While we are interested in simulating the error $|E[Y - Y_n]|$ for a sequence of approximations
$(Y_n)_{n \in \mathbb{N}_0}$ converging to the real-valued random variable $Y$, the quantity $E[Y_n]$ is analytically
not available. The standard approach is to approximate the expectation by a Monte Carlo
estimator $E_N[Y_n]$. We show in this manuscript that the additional error when using the
estimator $|E[Y] - E_N[Y_n]|$ instead of the original error is for small errors essentially bounded
from above and below by $N^{-1/2}$ in mean square, where $N$ denotes the number of Monte
Carlo samples. Furthermore, we consider the estimator $|E_N[Y - Y_n]|$ instead and show that
the bounds improve to $N^{-1/2} \mathrm{Var}[Y - Y_n]^{1/2}$, i.e., the Monte Carlo rate $N^{-1/2}$ is multiplied by
the standard deviation of the error, which can be seen as the strong error in the context of SPDE
approximations. Finally, we substitute the Monte Carlo estimators by the corresponding
multilevel Monte Carlo estimators and show that the additional error decreases to $|E[Y - Y_n]|$,
i.e., to the error that we are interested in. We confirm the theoretical results in simulations
of weak errors of the stochastic heat equation driven by multiplicative noise and a geometric
Brownian motion. With the new estimators, we are to the best of our knowledge the first
to be able to show weak convergence rates for an SPDE driven by multiplicative noise in
simulations.
This manuscript is organized as follows. In Section 2 we recall Monte Carlo (MC for short)
and multilevel Monte Carlo (MLMC) estimators for real-valued random variables. Upper and
lower bounds for the approximation of |E[Y−Yn]|by different Monte Carlo type estimators
are shown. A short review on SPDEs and their approximation in space, time, and noise is
given in Section 3 and available convergence results are recalled. Finally, simulation results
of strong and weak errors using the estimators introduced in Section 2 for the stochastic
heat equation driven by multiplicative noise and a geometric Brownian motion are shown in
Section 4. These confirm the theoretical results of Section 2.
2. Monte Carlo versus multilevel Monte Carlo in error analysis
In this section we consider upper and lower bounds for the sampling errors that arise when
performing weak error simulations in practice. It turns out that it is not surprising that it
has not been possible so far to numerically implement weak error analysis for approximations
of SPDEs driven by multiplicative noise. Nevertheless, this section is not SPDE specific but
formulated more generally for real-valued, square integrable random variables. Quantities of
interest in SPDE applications are examples of the framework considered in this section.
Let $(\Omega, \mathcal{A}, P)$ be a probability space and let us for $p \ge 1$ denote by $L^p(\Omega; \mathbb{R})$ the space of
all real-valued random variables $Y$ such that $\|Y\|^p_{L^p(\Omega;\mathbb{R})} = E[|Y|^p] < +\infty$. We recall that the
Monte Carlo estimator $E_N$ of a real-valued random variable $Y : \Omega \to \mathbb{R}$ is given by
\[ E_N[Y] = N^{-1} \sum_{i=1}^{N} Y^{(i)}, \tag{1} \]
where $(Y^{(i)})_{i=1}^{N}$ is a sequence of independent, identically distributed random variables that
have the same law as $Y$. Furthermore, the multilevel Monte Carlo estimator $E^L$ of a sequence
SPDE WEAK ERROR SIMULATION BY MC VS. MLMC 3
of random variables $(Y_\ell)_{\ell \in \mathbb{N}_0}$ is defined by
\[ E^L[Y_L] = E_{N_0}[Y_0] + \sum_{\ell=1}^{L} E_{N_\ell}[Y_\ell - Y_{\ell-1}] \tag{2} \]
for $L \in \mathbb{N}$, where $(N_\ell)_{\ell=0}^{L}$ consists of level specific numbers of samples in the Monte Carlo
estimators. For more details on multilevel Monte Carlo methods the reader is referred to the
large literature starting with [11, 10].
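The two estimators (1) and (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `mc_estimate` and `mlmc_estimate`, the sampler interface, and the toy hierarchy $Y_\ell = (1 + 2^{-\ell}) Z$ with $Z \sim N(1,1)$ are hypothetical and not taken from the paper.

```python
import numpy as np

def mc_estimate(sampler, n_samples, rng):
    """Monte Carlo estimator (1): mean of n_samples i.i.d. draws."""
    return float(np.mean(sampler(n_samples, rng)))

def mlmc_estimate(level_samplers, n_per_level, rng):
    """Multilevel Monte Carlo estimator (2).

    level_samplers[0] draws samples of Y_0; level_samplers[l] for l >= 1
    draws samples of the coupled difference Y_l - Y_{l-1}. Independent
    draws are used on every level.
    """
    return sum(mc_estimate(s, n, rng) for s, n in zip(level_samplers, n_per_level))

# Hypothetical toy hierarchy: Y_l = (1 + 2^-l) Z with Z ~ N(1, 1),
# so that Y_l - Y_{l-1} = -2^-l Z and E[Y_L] = 1 + 2^-L.
def make_sampler(scale):
    return lambda n, rng: scale * rng.normal(1.0, 1.0, size=n)

L = 4
samplers = [make_sampler(2.0)] + [make_sampler(-2.0 ** -l) for l in range(1, L + 1)]
n_per_level = [4000, 2000, 1000, 500, 250]   # level specific sample sizes N_l
rng = np.random.default_rng(0)
estimate = mlmc_estimate(samplers, n_per_level, rng)
exact = 1.0 + 2.0 ** -L
```

The decreasing sample sizes reflect the usual idea that the level differences have small variance and therefore need fewer samples than the coarsest level.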
For later estimates we present the following well-known property of a Monte Carlo estima-
tor, which is a specific form of the law of large numbers and can for example be found in [5,
Lemma 4.1].
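Lemma 2.1. For $N \in \mathbb{N}$ and for $Y \in L^2(\Omega; \mathbb{R})$ it holds that
\[ \|E[Y] - E_N[Y]\|_{L^2(\Omega;\mathbb{R})} = \frac{1}{\sqrt{N}} \mathrm{Var}[Y]^{1/2} \le \frac{1}{\sqrt{N}} \|Y\|_{L^2(\Omega;\mathbb{R})}. \]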
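The identity in Lemma 2.1 can be checked empirically. The following sketch assumes an exponentially distributed $Y$ with rate one (so $E[Y] = \mathrm{Var}[Y] = 1$ and $\|Y\|_{L^2(\Omega;\mathbb{R})} = \sqrt{2}$); the distribution and all variable names are illustrative choices, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, repetitions = 50, 20000

# Each row is one realization of the N samples entering E_N[Y], Y ~ Exp(1).
samples = rng.exponential(1.0, size=(repetitions, N))
mc_means = samples.mean(axis=1)

# Empirical root mean square error of E_N[Y] around E[Y] = 1.
rms_error = float(np.sqrt(np.mean((1.0 - mc_means) ** 2)))

predicted = 1.0 / np.sqrt(N)             # N^{-1/2} Var[Y]^{1/2}
l2_norm_bound = np.sqrt(2.0 / N)         # N^{-1/2} ||Y||_{L^2}
```

Up to the statistical error of the outer repetitions, `rms_error` should agree with `predicted`, which in turn lies below `l2_norm_bound`.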
From now on let us consider a square integrable random variable $Y : \Omega \to \mathbb{R}$, i.e.,
$Y \in L^2(\Omega; \mathbb{R})$, and a sequence of approximations $(Y_n)_{n \in \mathbb{N}_0}$ of $Y$. We assume that it is known
that $(Y_n)_{n \in \mathbb{N}_0}$ converges to $Y$ in the sense that
\[ \lim_{n \to \infty} |E[Y - Y_n]| = 0. \]
In order to estimate convergence rates, one is interested in the simulation of $|E[Y - Y_n]|$,
which usually cannot be done exactly but has to be approximated. If one were interested in
estimating $E[Y - Y_n]$, the method of common random numbers would tell us that when $Y$ and
$Y_n$ are positively correlated, it is better to use an estimator of the form $E_N[Y - Y_n]$ rather
than $E[Y] - E_N[Y_n]$, since the latter has higher variance and both are unbiased. Now, when
estimating $|E[Y - Y_n]|$, the estimators become $|E[Y] - E_N[Y_n]|$ and $|E_N[Y - Y_n]|$ instead,
neither of which is in general unbiased. In the following propositions, we therefore show upper
and lower bounds on the sampling errors
\[ \bigl\| |E[Y - Y_n]| - |E[Y] - E_N[Y_n]| \bigr\|_{L^2(\Omega;\mathbb{R})} \]
and
\[ \bigl\| |E[Y - Y_n]| - |E_N[Y - Y_n]| \bigr\|_{L^2(\Omega;\mathbb{R})}, \]
in mean square sense.
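Proposition 2.2. The sampling error of approximating $|E[Y - Y_n]|$ by $|E[Y] - E_N[Y_n]|$ is
bounded from below by
\[ \bigl\| |E[Y - Y_n]| - |E[Y] - E_N[Y_n]| \bigr\|_{L^2(\Omega;\mathbb{R})} \ge -|E[Y - Y_n]| + \bigl( |E[Y - Y_n]|^2 + N^{-1} \mathrm{Var}[Y_n] \bigr)^{1/2} \]
and from above by
\[ \bigl\| |E[Y - Y_n]| - |E[Y] - E_N[Y_n]| \bigr\|_{L^2(\Omega;\mathbb{R})} \le N^{-1/2} (\mathrm{Var}[Y_n])^{1/2}. \]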
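Proof. To prove the proposition let us first observe that
\[ \begin{aligned}
E[(E[Y] - E_N[Y_n])^2]
&= E[(E[Y - Y_n] + (E[Y_n] - E_N[Y_n]))^2] \\
&= (E[Y - Y_n])^2 + E[(E[Y_n] - E_N[Y_n])^2] + 2 E[Y - Y_n] \, E[E[Y_n] - E_N[Y_n]],
\end{aligned} \]
which implies with Lemma 2.1 and since $E_N[Y_n]$ is an unbiased estimator of $E[Y_n]$ that
\[ E[(E[Y] - E_N[Y_n])^2] = |E[Y - Y_n]|^2 + N^{-1} \mathrm{Var}[Y_n]. \]
Using this observation we obtain for the squared sampling error that
\[ \begin{aligned}
\bigl\| |E[Y - Y_n]| - |E[Y] - E_N[Y_n]| \bigr\|^2_{L^2(\Omega;\mathbb{R})}
&= E\bigl[ |E[Y - Y_n]|^2 + |E[Y] - E_N[Y_n]|^2 - 2 |E[Y - Y_n]| \cdot |E[Y] - E_N[Y_n]| \bigr] \\
&= 2 |E[Y - Y_n]|^2 + N^{-1} \mathrm{Var}[Y_n] - 2 |E[Y - Y_n]| \, E[|E[Y] - E_N[Y_n]|] \\
&= N^{-1} \mathrm{Var}[Y_n] - 2 |E[Y - Y_n]| \, E\bigl[ |E[Y] - E_N[Y_n]| - |E[Y - Y_n]| \bigr].
\end{aligned} \]
To find the lower bound, we observe that
\[ \begin{aligned}
E\bigl[ |E[Y] - E_N[Y_n]| - |E[Y - Y_n]| \bigr]
&\le E\bigl[ \bigl| |E[Y] - E_N[Y_n]| - |E[Y - Y_n]| \bigr| \bigr] \\
&\le \bigl( E\bigl[ ( |E[Y - Y_n]| - |E[Y] - E_N[Y_n]| )^2 \bigr] \bigr)^{1/2}
\end{aligned} \]
by the properties of the expectation and Hölder's inequality. Setting
\[ E_{n,N} = \bigl\| |E[Y - Y_n]| - |E[Y] - E_N[Y_n]| \bigr\|_{L^2(\Omega;\mathbb{R})}, \]
which is a non-negative quantity, we therefore obtain the inequality
\[ E_{n,N}^2 + 2 |E[Y - Y_n]| \, E_{n,N} - N^{-1} \mathrm{Var}[Y_n] \ge 0. \]
This is solved using the non-negativity of $E_{n,N}$ by
\[ E_{n,N} \ge -|E[Y - Y_n]| + \bigl( |E[Y - Y_n]|^2 + N^{-1} \mathrm{Var}[Y_n] \bigr)^{1/2}, \]
which finishes the proof of the lower bound.
For the upper bound we apply the reverse triangle inequality to obtain that
\[ \bigl\| |E[Y - Y_n]| - |E[Y] - E_N[Y_n]| \bigr\|^2_{L^2(\Omega;\mathbb{R})} \le \|E[Y_n] - E_N[Y_n]\|^2_{L^2(\Omega;\mathbb{R})} = N^{-1} \mathrm{Var}[Y_n], \]
where the last step follows from Lemma 2.1.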
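The step from the quadratic inequality to the lower bound can be written out by completing the square. The shorthand $a = |E[Y - Y_n]|$, $v = N^{-1} \mathrm{Var}[Y_n]$ and $e = E_{n,N}$ is introduced here only for this worked step and is not notation of the paper.

```latex
\begin{aligned}
e^2 + 2 a e - v \ge 0
  &\iff (e + a)^2 \ge a^2 + v \\
  &\implies e + a \ge (a^2 + v)^{1/2}
     \quad \text{since } e + a \ge 0 \\
  &\iff e \ge -a + (a^2 + v)^{1/2}
     = \frac{v}{a + (a^2 + v)^{1/2}} .
\end{aligned}
```

In particular, since $(a^2 + v)^{1/2} \ge v^{1/2}$, the lower bound is at least $v^{1/2} - a$, which is close to the upper bound $v^{1/2}$ whenever $a$ is small compared to $v^{1/2}$.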
Having shown that for $|E[Y - Y_n]| \ll N^{-1/2}$ the sampling error is essentially bounded from
below and above by $N^{-1/2}$ in terms of the number of Monte Carlo samples when simulating
$|E[Y] - E_N[Y_n]|$, we continue with the sampling error for $|E_N[Y - Y_n]|$. It turns out that
this decays for a fixed number of Monte Carlo samples with the square root of the variance
of $Y - Y_n$.
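Proposition 2.3. The sampling error of approximating $|E[Y - Y_n]|$ by $|E_N[Y - Y_n]|$ is bounded
from below by
\[ \bigl\| |E[Y - Y_n]| - |E_N[Y - Y_n]| \bigr\|_{L^2(\Omega;\mathbb{R})} \ge -|E[Y - Y_n]| + \bigl( |E[Y - Y_n]|^2 + N^{-1} \mathrm{Var}[Y - Y_n] \bigr)^{1/2} \]
and from above by
\[ \bigl\| |E[Y - Y_n]| - |E_N[Y - Y_n]| \bigr\|_{L^2(\Omega;\mathbb{R})} \le N^{-1/2} (\mathrm{Var}[Y - Y_n])^{1/2}. \]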
Proof. The proof of the lower bound is performed in the same way as that of Proposition 2.2.
The only difference is that we have to simplify $E[(E_N[Y - Y_n])^2]$ instead of $E[(E[Y] - E_N[Y_n])^2]$,
which we do in what follows. Therefore, let us observe that due to the properties of the
variance and the unbiasedness of the Monte Carlo estimator
\[ \begin{aligned}
\mathrm{Var}[E_N[Y - Y_n]] &= E[(E_N[Y - Y_n])^2] - (E[E_N[Y - Y_n]])^2 \\
&= E[(E_N[Y - Y_n])^2] - (E[Y - Y_n])^2.
\end{aligned} \]
The independence of the random variables in the Monte Carlo estimator and the Bienaymé
formula imply that
\[ \mathrm{Var}[E_N[Y - Y_n]] = N^{-1} \mathrm{Var}[Y - Y_n], \]
which overall leads to
\[ E[(E_N[Y - Y_n])^2] = N^{-1} \mathrm{Var}[Y - Y_n] + |E[Y - Y_n]|^2. \]
The proof of the lower bound is finished by applying this formula in the proof of Proposition 2.2
and calculating accordingly.
For the upper bound we observe that
\[ \begin{aligned}
\bigl\| |E[Y - Y_n]| - |E_N[Y - Y_n]| \bigr\|^2_{L^2(\Omega;\mathbb{R})}
&\le \|E[Y - Y_n] - E_N[Y - Y_n]\|^2_{L^2(\Omega;\mathbb{R})} \\
&= N^{-1} \mathrm{Var}[Y - Y_n]
\end{aligned} \]
again by the reverse triangle inequality and Lemma 2.1, which finishes the proof.
We first remark that the upper bounds in Proposition 2.2 and Proposition 2.3 were already
obtained in the context of weak errors for SPDE approximations in [19, Proposition 5.4].
From the upper bound in Proposition 2.2 we learn that the sampling error will not be
worse than the Monte Carlo error. At the same time, under the assumption that the quantity
of interest is very small, i.e., $|E[Y - Y_n]| \ll N^{-1/2}$, we also see that the lower bound implies that
we are not able to do better. Therefore, the sampling error is essentially bounded from
below and above by the Monte Carlo error. This is not surprising but proves how heavily
the simulation relies on the number of Monte Carlo samples for small errors $|E[Y - Y_n]|$.
For relatively cheap computations of samples of $Y_n$ for arbitrarily large $n \in \mathbb{N}$, this is no
problem. Nevertheless, in our context, where $Y$ and $Y_n$ are functionals of the solution to an
SPDE and its approximation, respectively, the computation is very expensive and the errors
$|E[Y - Y_n]|$ are usually very small compared to their variance. Therefore, it is of no surprise
that weak error simulations for SPDEs are still missing in the literature or that many people
have failed to simulate them by Monte Carlo methods. One example of a failure with the
standard estimator is shown in Section 4.
Looking into Proposition 2.3, we see that we obtain similar upper and lower bounds but
instead of $(\mathrm{Var}[Y_n])^{1/2}$, the Monte Carlo error $N^{-1/2}$ is multiplied by $(\mathrm{Var}[Y - Y_n])^{1/2}$, which
is usually smaller than $\mathrm{Var}[Y_n]^{1/2}$ and also decreases in $n$ if $(Y_n)_{n \in \mathbb{N}_0}$ is a sequence of
approximations that converges in $L^2(\Omega; \mathbb{R})$ to $Y$. Therefore we expect faster convergence for the
estimator in Proposition 2.3 than in Proposition 2.2. This is tested and confirmed in
Section 4 for a stochastic heat equation driven by multiplicative noise and a geometric Brownian
motion.
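The difference between the two singlelevel estimators can be illustrated on a toy problem where everything is known in closed form. The sketch below assumes $Y = Z^2$ with $Z \sim N(0,1)$ and the hypothetical approximation $Y_n = (1 - \delta) Z^2$, so that $E[Y - Y_n] = \delta$, $\mathrm{Var}[Y_n] = 2(1 - \delta)^2$ and $\mathrm{Var}[Y - Y_n] = 2 \delta^2$; none of these choices come from the paper.

```python
import numpy as np

def sampling_errors(delta, n_samples, repetitions, rng):
    """Empirical L^2 sampling errors of the estimators of Prop. 2.2 and 2.3."""
    z2 = rng.standard_normal((repetitions, n_samples)) ** 2
    weak = delta                                              # |E[Y - Y_n]|
    type1 = np.abs(1.0 - ((1.0 - delta) * z2).mean(axis=1))   # |E[Y] - E_N[Y_n]|
    type2 = np.abs((delta * z2).mean(axis=1))                 # |E_N[Y - Y_n]|
    rms = lambda est: float(np.sqrt(np.mean((weak - est) ** 2)))
    return rms(type1), rms(type2)

def bounds(weak, variance, n_samples):
    """Lower and upper bound of Propositions 2.2 / 2.3 for a given variance."""
    v = variance / n_samples
    return -weak + np.sqrt(weak ** 2 + v), np.sqrt(v)

rng = np.random.default_rng(1)
delta, n_samples, repetitions = 2.0 ** -6, 100, 20000
rms1, rms2 = sampling_errors(delta, n_samples, repetitions, rng)
lo1, up1 = bounds(delta, 2.0 * (1.0 - delta) ** 2, n_samples)   # uses Var[Y_n]
lo2, up2 = bounds(delta, 2.0 * delta ** 2, n_samples)           # uses Var[Y - Y_n]
```

In this example the sampling error of the second estimator scales with $\delta$, while that of the first stays at the Monte Carlo level $N^{-1/2} \mathrm{Var}[Y_n]^{1/2}$ regardless of how small the weak error is.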
As a second step in the error analysis, we now consider upper and lower bounds for the
sampling error if multilevel Monte Carlo estimators are used instead of the corresponding
singlelevel ones, which we discussed above. The results are obtained in a similar way as
before.
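Proposition 2.4. The sampling error of approximating $|E[Y - Y_L]|$ by $|E[Y] - E^L[Y_L]|$ is
bounded from below by
\[ \bigl\| |E[Y - Y_L]| - |E[Y] - E^L[Y_L]| \bigr\|_{L^2(\Omega;\mathbb{R})}
\ge -|E[Y - Y_L]| + \Bigl( |E[Y - Y_L]|^2 + N_0^{-1} \mathrm{Var}[Y_0] + \sum_{\ell=1}^{L} N_\ell^{-1} \mathrm{Var}[Y_\ell - Y_{\ell-1}] \Bigr)^{1/2} \]
and from above by
\[ \bigl\| |E[Y - Y_L]| - |E[Y] - E^L[Y_L]| \bigr\|_{L^2(\Omega;\mathbb{R})}
\le \Bigl( N_0^{-1} \mathrm{Var}[Y_0] + \sum_{\ell=1}^{L} N_\ell^{-1} \mathrm{Var}[Y_\ell - Y_{\ell-1}] \Bigr)^{1/2}. \]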
Proof. The lower bound is again proven in the same way as in Proposition 2.2, where the only
difference is in the computation of $E[(E[Y] - E^L[Y_L])^2]$, which we include for completeness. We
obtain by the independence of the Monte Carlo estimators on different levels, their unbiasedness,
and by Lemma 2.1 that
\[ \begin{aligned}
E[(E[Y] - E^L[Y_L])^2]
&= (E[Y - Y_L])^2 + E[(E[Y_0] - E_{N_0}[Y_0])^2] + \sum_{\ell=1}^{L} E[(E[Y_\ell - Y_{\ell-1}] - E_{N_\ell}[Y_\ell - Y_{\ell-1}])^2] \\
&= |E[Y - Y_L]|^2 + N_0^{-1} \mathrm{Var}[Y_0] + \sum_{\ell=1}^{L} N_\ell^{-1} \mathrm{Var}[Y_\ell - Y_{\ell-1}].
\end{aligned} \]
For the upper bound we apply after the reverse triangle inequality the same arguments as in
the previous computation, which yield
\[ \begin{aligned}
\bigl\| |E[Y - Y_L]| - |E[Y] - E^L[Y_L]| \bigr\|^2_{L^2(\Omega;\mathbb{R})}
&\le \|E[Y_L] - E^L[Y_L]\|^2_{L^2(\Omega;\mathbb{R})} \\
&= N_0^{-1} \mathrm{Var}[Y_0] + \sum_{\ell=1}^{L} N_\ell^{-1} \mathrm{Var}[Y_\ell - Y_{\ell-1}].
\end{aligned} \]
This finishes the proof of the proposition.
Here the performance of the upper and lower bound depends on the choice of the sample
sizes for the different levels of the multilevel Monte Carlo estimator. In Theorem 1 in [18]
it is assumed that there are upper bounds for $|E[Y - Y_\ell]|$ and $\mathrm{Var}[Y_\ell - Y_{\ell-1}]$. If we assume
that we know the errors exactly, we can set $a_\ell = |E[Y - Y_\ell]|$ and $a_\ell^{2\eta} = \mathrm{Var}[Y_\ell - Y_{\ell-1}]$ in the
notation of that theorem in [18] and choose the sample sizes accordingly. This is made precise
in the following corollary, which states that the correct choice of samples leads to a sampling
error of the same size, up to a constant, as the error $|E[Y - Y_L]|$ which we would like to observe.
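Corollary 2.5. Choosing for a fixed level $L \in \mathbb{N}$ the sample sizes in the multilevel Monte
Carlo estimator $N_0 = \lceil |E[Y - Y_L]|^{-2} \rceil$ and $N_\ell = \lceil |E[Y - Y_L]|^{-2} \, \mathrm{Var}[Y_\ell - Y_{\ell-1}] \, \ell^{1+\epsilon} \rceil$ for
$\ell = 1, \dots, L$ and some $\epsilon > 0$ it holds that
\[ \bigl\| |E[Y - Y_L]| - |E[Y] - E^L[Y_L]| \bigr\|_{L^2(\Omega;\mathbb{R})} \ge \bigl( 2^{-1/2} (3 + \mathrm{Var}[Y_0])^{1/2} - 1 \bigr) |E[Y - Y_L]| \]
and
\[ \bigl\| |E[Y - Y_L]| - |E[Y] - E^L[Y_L]| \bigr\|_{L^2(\Omega;\mathbb{R})} \le \bigl( \mathrm{Var}[Y_0] + \zeta(1 + \epsilon) \bigr)^{1/2} |E[Y - Y_L]|, \]
where $\zeta$ denotes the Riemann zeta function. Therefore,
\[ \bigl\| |E[Y - Y_L]| - |E[Y] - E^L[Y_L]| \bigr\|_{L^2(\Omega;\mathbb{R})} \simeq |E[Y - Y_L]|, \]
i.e., the sampling error converges with the same rate as $|E[Y - Y_L]|$.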
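Proof. Let us first observe that for $x \ge 1$ we have that $x \le \lceil x \rceil \le 2x$. This implies with the
given choices of $N_\ell$, $\ell = 0, \dots, L$, that
\[ N_0^{-1} \mathrm{Var}[Y_0] + \sum_{\ell=1}^{L} N_\ell^{-1} \mathrm{Var}[Y_\ell - Y_{\ell-1}] \le |E[Y - Y_L]|^2 \Bigl( \mathrm{Var}[Y_0] + \sum_{\ell=1}^{L} \ell^{-(1+\epsilon)} \Bigr) \]
as well as
\[ N_0^{-1} \mathrm{Var}[Y_0] + \sum_{\ell=1}^{L} N_\ell^{-1} \mathrm{Var}[Y_\ell - Y_{\ell-1}] \ge 2^{-1} |E[Y - Y_L]|^2 \Bigl( \mathrm{Var}[Y_0] + \sum_{\ell=1}^{L} \ell^{-(1+\epsilon)} \Bigr). \]
We observe next that
\[ 1 \le \sum_{\ell=1}^{L} \ell^{-(1+\epsilon)} \le \zeta(1 + \epsilon) \]
for all $L \in \mathbb{N}$ and plug the obtained inequalities into the equations in Proposition 2.4 to finish
the proof of the corollary.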
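The sample size rule of Corollary 2.5 is easy to evaluate once the weak error and the level variances are known or estimated. The sketch below uses hypothetical values ($|E[Y - Y_L]| = 4^{-L}$, $\mathrm{Var}[Y_\ell - Y_{\ell-1}] = 4^{-\ell}$, $\mathrm{Var}[Y_0] = 1$, $\epsilon = 1/2$) purely for illustration; the function name `mlmc_sample_sizes` is likewise an assumption.

```python
import math

def mlmc_sample_sizes(weak_error, level_variances, eps):
    """N_0 = ceil(a^-2) and N_l = ceil(a^-2 Var[Y_l - Y_{l-1}] l^(1+eps)),
    where a = |E[Y - Y_L]| and level_variances[l-1] = Var[Y_l - Y_{l-1}]."""
    a2inv = weak_error ** -2
    n0 = math.ceil(a2inv)
    return [n0] + [math.ceil(a2inv * v * l ** (1 + eps))
                   for l, v in enumerate(level_variances, start=1)]

# Hypothetical decay rates, not values from the paper.
L, eps, var_y0 = 5, 0.5, 1.0
weak_error = 4.0 ** -L
level_variances = [4.0 ** -l for l in range(1, L + 1)]

sizes = mlmc_sample_sizes(weak_error, level_variances, eps)

# Squared upper bound of Proposition 2.4 with these sample sizes ...
variance_sum = var_y0 / sizes[0] + sum(v / n for v, n in zip(level_variances, sizes[1:]))
# ... compared with the bound used in the proof of Corollary 2.5.
upper = weak_error ** 2 * (var_y0 + sum(l ** -(1 + eps) for l in range(1, L + 1)))
```

Note that the lower bound in the proof relies on $\lceil x \rceil \le 2x$, which requires the arguments of the ceiling functions to be at least one.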
For completeness we include the equivalent statement to Proposition 2.3 for the multilevel
Monte Carlo estimator, but we remark that it is of no practical interest. This is due to
the fact that in particular $E_{N_0}[Y - Y_0]$ has to be computed, i.e., many samples of the exact
solution must be generated, which is computationally too expensive and destroys the idea of
multilevel Monte Carlo methods.
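Proposition 2.6. The sampling error of approximating $|E[Y - Y_L]|$ by $|E^L[Y - Y_L]|$ is bounded
from below by
\[ \bigl\| |E[Y - Y_L]| - |E^L[Y - Y_L]| \bigr\|_{L^2(\Omega;\mathbb{R})}
\ge -|E[Y - Y_L]| + \Bigl( |E[Y - Y_L]|^2 + N_0^{-1} \mathrm{Var}[Y - Y_0] + \sum_{\ell=1}^{L} N_\ell^{-1} \mathrm{Var}[Y_\ell - Y_{\ell-1}] \Bigr)^{1/2} \]
and from above by
\[ \bigl\| |E[Y - Y_L]| - |E^L[Y - Y_L]| \bigr\|_{L^2(\Omega;\mathbb{R})}
\le \Bigl( N_0^{-1} \mathrm{Var}[Y - Y_0] + \sum_{\ell=1}^{L} N_\ell^{-1} \mathrm{Var}[Y_\ell - Y_{\ell-1}] \Bigr)^{1/2}. \]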
Proof. The proof is again performed in the same way as that of Proposition 2.2, where it is
essential to derive
\[ E[(E^L[Y - Y_L])^2] = |E[Y - Y_L]|^2 + N_0^{-1} \mathrm{Var}[Y - Y_0] + \sum_{\ell=1}^{L} N_\ell^{-1} \mathrm{Var}[Y_\ell - Y_{\ell-1}]. \]
Due to the repetition in techniques and the rather theoretical nature of the claim we leave
further details of the proof to the interested reader.
3. Approximation of mild SPDE solutions
In this section we employ the framework of [17] in a simplified setting and recall some of the
results of that monograph. We provide a noise approximation result for a stochastic evolution
equation with multiplicative noise at the very end of this section. Let $H = L^2([0,1]; \mathbb{R})$ be the
space of square integrable functions on the unit interval $[0,1]$ with inner product $\langle \cdot, \cdot \rangle_H$ given
by $\langle v, w \rangle_H = \int_0^1 v(x) w(x) \, dx$, which is a real separable Hilbert space with orthonormal
basis $(e_j)_{j \in \mathbb{N}}$, where $e_j(x) = \sqrt{2} \sin(j \pi x)$. Let $Q \in L(H)$, where $L(H)$ is the space of all bounded
linear operators from $H$ to $H$, be a self-adjoint, positive definite operator of trace class. We
denote by $H_0 = Q^{1/2}(H)$ the Hilbert space with inner product $\langle \cdot, \cdot \rangle_{H_0} = \langle Q^{-1/2} \cdot, Q^{-1/2} \cdot \rangle_H$,
where $Q^{-1}$ denotes the pseudo-inverse of $Q$. Furthermore, we let $(\Omega, \mathcal{A}, (\mathcal{F}_t)_{t \in [0,T]}, P)$ be the
extension of the probability space in Section 2 with a normal filtration. We assume that
$W = (W(t))_{t \ge 0}$ is an $(\mathcal{F}_t)_{t \in [0,T]}$-adapted $Q$-Wiener process. In this framework we consider
for $t \in [0,T]$ the stochastic partial differential equation
\[ dX(t) - \Delta X(t) \, dt = G(X(t)) \, dW(t) \tag{3} \]
with initial condition $X(0) = X_0$, which we refer to as the one-dimensional heat equation
driven by multiplicative Wiener noise. Here we denote by $\Delta$ the Laplace operator with zero
boundary conditions. It holds that $-\Delta$ has eigenbasis $(e_j)_{j \in \mathbb{N}}$ with eigenvalues $\lambda_j = j^2 \pi^2$ and
$\Delta$ generates a $C_0$-semigroup of contractions denoted by $S = (S(t))_{t \ge 0}$ on $H$. The fractional
operator $(-\Delta)^{r/2} : \dot{H}^r \to H$ has domain
$\dot{H}^r = \{ v \in H : \|v\|_r^2 = \sum_{j=1}^{\infty} \lambda_j^r \langle v, e_j \rangle_H^2 < +\infty \}$ for $r \ge 0$.
It holds that $\dot{H}^r$ is a separable Hilbert space when equipped with the inner product
\[ \langle \cdot, \cdot \rangle_r = \bigl\langle (-\Delta)^{r/2} \cdot, (-\Delta)^{r/2} \cdot \bigr\rangle_H. \]
We impose further assumptions on the parameters of (3) in what follows, which are stronger
than Assumptions 2.13 and 2.17 in [17, Chapter 2] and hence guarantee the existence and
uniqueness of a mild solution
\[ X(t) = S(t) X_0 + \int_0^t S(t - s) G(X(s)) \, dW(s). \tag{4} \]
Assumption 3.1. Assume that the parameters of (3) satisfy:
(i) The trace class operator $Q$ is defined through the relation $Q e_j = \mu_j e_j$ where $\mu_j =
C_\mu j^{-\eta}$ for two constants $C_\mu > 0$ and $\eta > 1$.
(ii) Fix a parameter $r \in [0,1)$. The mapping $G : H \to L_{HS}(H_0; H)$ satisfies for a constant
$C > 0$
(a) $G(v) \in L_{HS}(H_0; \dot{H}^r)$ for all $v \in \dot{H}^r$,
(b) $\|(-\Delta)^{r/2} G(v)\|_{L_{HS}(H_0;H)} \le C(1 + \|v\|_r)$ for all $v \in \dot{H}^r$,
(c) $\|G(v_1) - G(v_2)\|_{L_{HS}(H_0;H)} \le C \|v_1 - v_2\|_H$ for all $v_1, v_2 \in H$, and
(d) $\|G(v) e_j\|_H \le C(1 + \|v\|_H)$ for all basis vectors $e_j \in H$ and $v \in H$,
where $L_{HS}(H_0; H)$ denotes the space of Hilbert–Schmidt operators from $H_0$ to $H$.
(iii) Assume that $X_0 \in \dot{H}^{1+r}$ is a deterministic initial value.
We remark that in the notation of Assumption 3.1(i), we may write $W$ in terms of its
Karhunen–Loève expansion
\[ W(t) = \sum_{j=1}^{\infty} \mu_j^{1/2} \beta_j(t) e_j, \tag{5} \]
where $(\beta_j)_{j \in \mathbb{N}}$ is a sequence of independent, real-valued Wiener processes.
In order to be able to simulate realizations of the mild solution (4), we approximate it
by a Galerkin finite element method in space and an implicit Euler–Maruyama scheme in
time. For this, let $(V_h)_{h \in (0,1]}$ be the nested sequence of finite-dimensional subspaces, where
$V_h \subset \dot{H}^1 \subset H$ is given for each $h$ by the family of continuous functions that are piecewise
linear on the intervals $[x_j, x_{j+1}]$ of an equidistant partition $(x_j)_{j=0}^{N_h}$ of $[0,1]$ defined by $x_j = jh$
for $j = 0, 1, \dots, N_h$ and zero on the boundary, where $N_h = 1/h$ is assumed to be an integer.
We define the discrete operator $-\Delta_h : V_h \to V_h$ on each $v_h \in V_h$ by letting $-\Delta_h v_h$ be the
unique element of $V_h$ such that
\[ \langle -\Delta v_h, w_h \rangle_H = \langle v_h, w_h \rangle_1 = \langle -\Delta_h v_h, w_h \rangle_H \]
for all $w_h \in V_h$.
For the discretization in time we define a uniform time grid $(t_j)_{j=0}^{N_k}$ with time step size
$k \in (0,1]$ by $t_j = jk$ for $j = 0, 1, \dots, N_k$, where $N_k = T/k$ is again assumed to be an integer.
To be able to implement this approximation scheme on a computer, one must also consider
a noise approximation, i.e., an approximation of the $Q$-Wiener process $W$. One way of doing
this is to truncate the Karhunen–Loève expansion (5), which has earlier been considered for
example in [15, 14, 4, 3, 13]. This leads to the Wiener process
\[ W^\kappa(t) = \sum_{j=1}^{\kappa} \mu_j^{1/2} \beta_j(t) e_j. \]
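Increments of the truncated expansion are straightforward to sample, since the increments of the $\beta_j$ over a time step of length $\Delta t$ are independent $N(0, \Delta t)$ variables. A minimal Python sketch follows; the function name `kl_increment`, its signature, and the default values $C_\mu = \eta = 5$ (the values used later in Section 4) are choices made for this illustration only.

```python
import numpy as np

def kl_increment(x, kappa, dt, rng, c_mu=5.0, eta=5.0):
    """Sample W^kappa(t + dt) - W^kappa(t) at the spatial points x.

    Uses mu_j = c_mu * j^(-eta) and e_j(x) = sqrt(2) sin(j pi x).
    """
    j = np.arange(1, kappa + 1, dtype=float)
    mu = c_mu * j ** (-eta)                              # eigenvalues of Q
    e = np.sqrt(2.0) * np.sin(np.pi * np.outer(j, x))    # e_j(x_i), shape (kappa, len(x))
    dbeta = np.sqrt(dt) * rng.standard_normal(kappa)     # increments of beta_j
    return (np.sqrt(mu) * dbeta) @ e

# Example: one increment on a grid with h = 1/8 and kappa = 8 modes.
x = np.linspace(0.0, 1.0, 9)
dw = kl_increment(x, kappa=8, dt=0.25, rng=np.random.default_rng(2))
```

Because every $e_j$ vanishes at $x = 0$ and $x = 1$, the sampled increment respects the zero boundary conditions up to rounding.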
The fully discrete implicit Euler–Maruyama approximation $X^j_{\kappa,h}$ of $X(t_j)$ is then given in
recursive form by
\[ X^j_{\kappa,h} - X^{j-1}_{\kappa,h} - k (\Delta_h X^j_{\kappa,h}) = P_h G(X^{j-1}_{\kappa,h}) (W^\kappa(t_j) - W^\kappa(t_{j-1})) \tag{6} \]
for $j = 1, \dots, N_k$ with initial condition $X^0_{\kappa,h} = P_h X_0$, where $P_h$ denotes the orthogonal
projection onto $V_h$. This scheme converges strongly with $O(k^{1/2})$ to the mild solution, which is
stated in the following theorem that combines [17, Theorem 3.14] with an additional noise
approximation.
Theorem 3.2. Under Assumption 3.1 for fixed $r \in [0,1)$ and the couplings $k^{1/2} \simeq h^{1+r}$,
$\kappa \simeq h^{-\beta}$ with $\beta(\eta - 1) = 2(1 + r)$, it holds that for all $p \ge 2$ there exists a constant $C > 0$
such that for all $k \in (0,1]$ and $j = 1, \dots, N_k$
\[ \|X(t_j) - X^j_{\kappa,h}\|_{L^p(\Omega;H)} \le C k^{1/2}. \]
Proof. The proof is the same as that of [19, Theorem 4.3] except that
\[ \|G_{\kappa,h}(s) e_j\|_H^2 \le C (\|X^j_{\kappa,h}\|_H^2 + 1) \]
is used in the estimate of $II_b$ in that proof, which does not change the final bound on $II_b$.
While strong approximations are well understood and proven for the considered framework,
weak convergence rates in the sense of bounds of the error
\[ |E[\varphi(X^j_h)] - E[\varphi(X(t_j))]|, \]
where $\varphi$ is a smooth functional, are still missing. Nevertheless, results on the convergence of
approximations of SPDEs driven by additive noise (cf., e.g., [1, 16]) as well as semi-discrete
approximations of either space (cf., e.g., [8, 2, 7]) or time (cf., e.g., [12]) suggest that the weak
convergence rate is twice the strong one. This is in accordance with the results obtained in
our simulations in Section 4, where we consider the case $\varphi = \|\cdot\|_H^2$. Our choice of the test
function $\varphi$ ensures that the weak error is bounded by the strong error (cf., [17, Chapter 6]
and [19, Chapter 5]).
4. Simulation
In this section simulation results that combine the theory of Section 2 and Section 3 are
presented, i.e., weak errors of an SPDE approximation are computed with Monte Carlo and
multilevel Monte Carlo methods and compared. In the setting of Assumption 3.1, we now
fix the parameters $T = 1$, $X(0, x) = x - x^2$ and $C_\mu = \eta = 5$ and consider two choices of the
operator $G : H \to L_{HS}(H_0; H)$.
The first operator $G_1$ is defined for $v \in H$ and $v_0 \in H_0$ by
\[ G_1(v) v_0 = \sum_{j=1}^{\infty} \langle v, e_j \rangle_H \langle v_0, e_j \rangle_H e_j, \]
while the so-called Nemytskii type operator $G_2$ is defined for $x \in [0,1]$ by
\[ (G_2(v) v_0)[x] = \sin(v(x)) v_0(x). \]
These operators satisfy Assumption 3.1(ii) for $r \in [0, \infty)$ and $r \in [0, 1/2)$ respectively, as shown
in [17, Section 6.4] and [17, Example 2.23].
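The choice $G = G_1$ admits an analytical solution $X$ of (3) and for this the identities
\[ X(t) = \sum_{j=1}^{\infty} \langle X_0, e_j \rangle_H \exp\Bigl( -\Bigl( \lambda_j + \frac{\mu_j}{2} \Bigr) t + \mu_j^{1/2} \beta_j(t) \Bigr) e_j, \tag{7} \]
\[ \|X(t)\|_H^2 = \sum_{j=1}^{\infty} \langle X_0, e_j \rangle_H^2 \exp\bigl( -(2 \lambda_j + \mu_j) t + 2 \mu_j^{1/2} \beta_j(t) \bigr), \tag{8} \]
\[ E[\|X(t)\|_H^2] = \sum_{j=1}^{\infty} \langle X_0, e_j \rangle_H^2 \exp\bigl( (-2 \lambda_j + \mu_j) t \bigr) \tag{9} \]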
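hold for all $t \in [0, T]$. The numerical approximations $X^{N_k}_{\kappa,h}$ of $X(T)$ are now computed by first
setting $X^0_{\kappa,h} = I_h X_0$ and then recursively solving the numerical equation
\[ X^j_{\kappa,h} - X^{j-1}_{\kappa,h} - k (\Delta_h X^j_{\kappa,h}) = G_i^h(X^{j-1}_{\kappa,h}) (W^\kappa(t_j) - W^\kappa(t_{j-1})) \quad \text{for } j = 1, \dots, N_k, \]
where the interpolation operator $I_h : H \to V_h$ is defined by
\[ I_h f(x) = \sum_{j=1}^{N_h - 1} f(x_j) \Phi_j(x) \]
and for $G_i^h : V_h \to L_{HS}(H_0; V_h)$, $i \in \{1, 2\}$, we set
\[ G_1^h(v_h) v_0 = \sum_{j=1}^{\kappa} \langle v_h, I_h e_j \rangle_H \langle v_0, e_j \rangle_H I_h e_j \]
as well as
\[ G_2^h(v_h) v_0 = I_h G_2(v_h) v_0. \]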
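For $G = G_2$, one step of the recursion can be sketched in matrix form. Testing the equation with the nodal basis functions of $V_h$ gives $(M + kA) x^j = M (x^{j-1} + g)$, where $M$ and $A$ are the standard mass and stiffness matrices of piecewise linear elements on a uniform grid and $g_i = \sin(x^{j-1}_i) \, \Delta W^\kappa(x_i)$ are the nodal values of the interpolated noise term. The code below is a sketch under these assumptions and not the authors' MATLAB implementation; the names `fem_matrices` and `euler_step_g2` and the dense linear algebra are illustrative choices.

```python
import numpy as np

def fem_matrices(n_h):
    """Mass and stiffness matrices of piecewise linear elements on the
    interior nodes x_1, ..., x_{n_h - 1} of a uniform grid with h = 1/n_h."""
    h, m = 1.0 / n_h, n_h - 1
    off = np.ones(m - 1)
    mass = h / 6.0 * (4.0 * np.eye(m) + np.diag(off, 1) + np.diag(off, -1))
    stiff = 1.0 / h * (2.0 * np.eye(m) - np.diag(off, 1) - np.diag(off, -1))
    return mass, stiff

def euler_step_g2(x_prev, dw_nodes, k, mass, stiff):
    """One implicit Euler-Maruyama step for G = G_2 in nodal coordinates:
    (M + k A) x_new = M (x_prev + sin(x_prev) * dW), where dw_nodes holds
    the values of the noise increment at the interior nodes."""
    rhs = mass @ (x_prev + np.sin(x_prev) * dw_nodes)
    return np.linalg.solve(mass + k * stiff, rhs)

# Example: one noise-free step from X_0(x) = x - x^2 with h = 1/8, k = h^2.
n_h = 8
mass, stiff = fem_matrices(n_h)
nodes = np.arange(1, n_h) / n_h
x0 = nodes - nodes ** 2                      # nodal values of I_h X_0
x1 = euler_step_g2(x0, np.zeros(n_h - 1), 1.0 / n_h ** 2, mass, stiff)
```

With a vanishing noise increment the step reduces to the implicit Euler method for the deterministic heat equation, so the discrete $H$-norm $x^\top M x$ decreases.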
The replacement of the operator $P_h$ with $I_h$ mirrors to a large extent the setting of [17,
Chapter 6], where the author notes that this is quite common in practice, and we note
that the simulation results below indicate that the order of convergence is not affected. For
refinement levels $\ell \in \mathbb{N}$ we set $k_\ell = 2^{-2\ell}$ and, as a shorthand notation, we write
$\hat{X}_\ell = X^{N_{k_\ell}}_{\kappa_\ell, h_\ell}$
for the end time evaluation of (6) with $h_\ell = k_\ell^{1/2}$ and $\kappa_\ell = k_\ell^{-1/2}$. With these choices, by
Theorem 3.2, we expect a strong convergence rate of order 1/2 in time and by the usual rule
of thumb that the weak rate of convergence is twice the strong one, we expect a weak rate of
order 1.
The following simulations were performed on the Glenn cluster at Chalmers Centre for
Computational Science and Engineering (C3SE) using the MATLAB Distributed Computing
Server. In all of them, we approximate $\|v\|_H^2$ for $v \in H$ by $N_{h_\ell}^{-1} \sum_{j=1}^{N_{h_\ell}} v(x_j)^2$.
In Figure 1(a) an approximation of the strong error $\|X(T) - \hat{X}_\ell\|_{L^2(\Omega;H)}$, i.e.,
\[ \Bigl( E_N\bigl[ \|X(T) - \hat{X}_\ell(T)\|_H^2 \bigr] \Bigr)^{1/2}, \]
is calculated for levels $\ell = 1, \dots, 7$, where we replace the exact solution $X$ with a reference
solution $\tilde{X}_L$. For $G = G_1$ the reference solution $\tilde{X}_L$ is given by (7) truncated at $j = \kappa_L = 2^L$,
while for $G = G_2$ we let $\tilde{X}_L = \hat{X}_L$, since we do not have access to an analytic solution. We
let $L = 8$ and take $N = 12 \cdot 10^3$ samples. It should be noted that the same realizations of the
$N$ $Q$-Wiener processes are used for the error computations on all levels. The observed error
rate is asymptotically $O(k^{1/2})$ and therefore consistent with Theorem 3.2.
Figure 1. Approximations of strong and weak errors using singlelevel Monte Carlo estimators.
(a) Strong error for $\ell = 1, \dots, 7$ with $N = 12 \cdot 10^3$ samples (log-log plot of strong errors
against time steps; curves: Order 1/2, G1 strong errors, G2 strong errors).
(b) Weak error for $\ell = 1, \dots, 5$ with $N = 3 \cdot 10^3$ samples (log-log plot of weak errors
against time steps; curves: Order 1, G1 weak errors I, G1 weak errors II, G2 weak errors I,
G2 weak errors II).
Next, we estimate the weak error and compare the performance of the singlelevel Monte
Carlo estimators of Propositions 2.2 and 2.3. In Figure 1(b) the weak error is approximated
with $N = 3 \cdot 10^3$ samples and $\ell = 1, \dots, 5$. For the weak error approximation according to
Proposition 2.2 with $Y = \|X(T)\|_H^2$ and $Y_\ell = \|\hat{X}_\ell\|_H^2$ we set
\[ e_{1,\ell} = \bigl| E[\|X(T)\|_H^2] - E_N[\|\hat{X}_\ell\|_H^2] \bigr|, \]
which we refer to as error of type I in what follows. Here we calculate $E_N[\|\hat{X}_\ell\|_H^2]$,
$\ell = 1, \dots, 5$, using separate sets of realizations of $N$ $Q$-Wiener processes for each level $\ell$.
Furthermore, we replace $E[\|X(T)\|_H^2]$ by (9) evaluated at $t = T$ and truncated at $j = 10^6$
in the case of $G = G_1$. In the case of $G = G_2$, we replace it instead by a reference solution
$E_N[\|\hat{X}_L\|_H^2]$ with $L = 8$ and $N = 10^4$, which is calculated on an independent set of $Q$-Wiener
processes. For the weak error approximation according to Proposition 2.3
\[ e_{2,\ell} = \bigl| E_N[\|X(T)\|_H^2 - \|\hat{X}_\ell\|_H^2] \bigr|, \]
called error of type II in what follows, the samples of $\|X(T)\|_H^2 - \|\hat{X}_\ell\|_H^2$, $\ell = 1, \dots, 5$, are
computed on the same set of $N$ $Q$-Wiener processes. In the case of $G = G_1$, we replace the
exact solution $\|X(T)\|_H^2$ with (8) evaluated at $t = T$ and truncated at $j = h_L^{-1}$. For $G = G_2$
we use again a reference solution $\|\hat{X}_L\|_H^2$. In said figure, i.e., Figure 1(b), we show the average
of these estimators
\[ e^M_{i,\ell} = M^{-1} \sum_{j=1}^{M} e^j_{i,\ell} \]
for $i = 1, 2$, where $(e^j_{i,\ell})_{j=1}^{M}$ is a sequence of independent copies of $e_{i,\ell}$ and $M = 10$, to see
how they perform in general. While the errors of type II first exceed and then approach twice
the strong order of convergence, the errors of type I do not. This is due to the limitation
of the convergence by the number of Monte Carlo samples from below as shown in
Proposition 2.2, that is to say, with a constant sample size $N$ we get a sampling error proportional
to $\mathrm{Var}(\|\hat{X}_\ell\|_H^2)^{1/2}$. This indicates that weak convergence results cannot be observed
satisfactorily with a naive Monte Carlo estimator in an acceptable time even for such
a relatively easy example. Details on the computational times are collected for all
estimators at the end of the example.
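For $G = G_1$ the reference value (9) can be evaluated directly, because the coefficients of $X_0(x) = x - x^2$ with respect to $e_j(x) = \sqrt{2} \sin(j \pi x)$ are available in closed form: $\langle X_0, e_j \rangle_H = 4 \sqrt{2} / (j \pi)^3$ for odd $j$ and zero for even $j$ (this follows from the Fourier sine series of $x(1 - x)$ and is not stated in the paper). The sketch below uses $\lambda_j = j^2 \pi^2$ and $\mu_j = C_\mu j^{-\eta}$ with $C_\mu = \eta = 5$; the function name and the default truncation at $10^4$ terms, rather than the $10^6$ terms used for the reference in the simulations, are assumptions made for illustration.

```python
import numpy as np

def second_moment_g1(t, n_terms=10_000, c_mu=5.0, eta=5.0):
    """Truncated series (9) for E[||X(t)||_H^2] with X_0(x) = x - x^2."""
    j = np.arange(1, n_terms + 1, dtype=float)
    lam = (j * np.pi) ** 2                   # eigenvalues of -Laplacian
    mu = c_mu * j ** (-eta)                  # eigenvalues of Q
    # <X_0, e_j>_H = 4 sqrt(2) / (j pi)^3 for odd j, 0 for even j
    coeff = np.where(j % 2 == 1, 4.0 * np.sqrt(2.0) / (j * np.pi) ** 3, 0.0)
    return float(np.sum(coeff ** 2 * np.exp((-2.0 * lam + mu) * t)))

initial_energy = second_moment_g1(0.0)       # equals ||X_0||_H^2 = 1/30 up to truncation
reference_value = second_moment_g1(1.0)      # E[||X(T)||_H^2] at T = 1
```

At $t = 0$ the series reduces to Parseval's identity for $X_0$, which gives a simple consistency check of the coefficients.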
For the errors of type II the rate of convergence seems to decrease for the last level. This
is explained by Proposition 2.3: in contrast to the type I errors, the sampling error is
proportional to $\mathrm{Var}(\|X(T)\|_H^2 - \|\hat{X}_\ell\|_H^2)^{1/2}$, which is bounded by the strong error (measured
in $L^4(\Omega; \mathbb{R})$), and therefore the rate of convergence starts to resemble $O(k^{1/2})$.
For the last set of simulations in Figure 2 we compare singlelevel type I errors to type I
weak errors obtained using the multilevel Monte Carlo estimator (2) instead of the naive
Monte Carlo approximation (1). For $\ell \in \mathbb{N}_0$ we set $k_\ell = k_0 2^{-2\ell}$, $h_\ell = k_\ell^{1/2}$, $\kappa_\ell = k_\ell^{-1/2}$ to
obtain a series of fully discrete approximations $\check{X}_\ell = X^{N_{k_\ell}}_{\kappa_\ell, h_\ell}$, where for computational reasons
we let $k_0 = 2^{-2}$. In the errors
\[ \hat{e}_L = \bigl| E[\|X(T)\|_H^2] - E^L[\|\check{X}_L\|_H^2] \bigr|, \]
we replace $E[\|X(T)\|_H^2]$ by the same quantities as for the type I errors in Figure 1(b). We
note that each multilevel estimate $E^L[\|\check{X}_L\|_H^2]$ is generated independently of one another
so that the type I errors become a natural comparison. Figure 2 shows the multilevel error
approximations for $L = 1, \dots, 5$ and the corresponding singlelevel errors of type I with sample
sizes $N = 3 \cdot 10^3$. We again show an average of these and let $M = 100$. From the figure we
observe that the multilevel Monte Carlo estimators show the weak convergence rate, while the
errors of type I fail. The latter errors are dominated by the sampling error as before, but for
the multilevel estimator we know from Corollary 2.5 that the sampling error is bounded from
above and below by the weak error. This explains why this approach succeeds in showing
the expected weak convergence rates. The fact that the errors of type I are in total smaller
than those obtained by the multilevel Monte Carlo simulation is due to the larger constant
in the error estimates of Corollary 2.5, which can be reduced by enlarging the overall number
of samples in the multilevel Monte Carlo method.
Figure 2. Average of 100 realizations of MLMC computed at levels $L = 1, \dots, 5$ with the
corresponding MC estimates of the weak error (log-log plot of weak errors against time steps;
curves: Order 1, G1 weak errors I, G1 weak errors MLMC I, G2 weak errors I, G2 weak errors
MLMC I).
To give the reader an idea of the computational complexity of the shown convergence plots,
we include the computing times, rounded to the nearest hour, on 8 computing nodes with
a total of 128 cores on the Glenn cluster of C3SE. The strong error plot in Figure 1(a) took
40 hours, while the weak error simulation of type II in Figure 1(b) took 102 hours. The
reference solution used for the weak singlelevel errors of type I and the multilevel Monte
Carlo errors for $G = G_2$ was computed in 13 hours. The costs for the weak errors of type
I in Figure 1(b) are negligible (i.e., less than one hour). The computation of the multilevel
errors in Figure 2 took 32 hours, while it took just 2 hours for the singlelevel errors. It is
important to note that the computation of the type I errors was quite cheap since we could
reuse the reference solution for $G = G_2$. One should also be aware that we would have needed
to increase the number of samples by at least a factor of $2^{12}$ to see the weak convergence for
the type I error of $G = G_1$, which would have increased the computational time to more than
8000 hours.
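The last estimate can be checked by a short calculation. It rests on two assumptions that are
ours and not stated explicitly in the text: the cost grows linearly in the number of samples,
and the 2 hours reported for the singlelevel errors serve as the baseline.

```latex
\[
  2\,\mathrm{h} \cdot 2^{12} = 8192\,\mathrm{h} > 8000\,\mathrm{h},
  \qquad
  \frac{1}{\sqrt{2^{12} N}} = 2^{-6}\,\frac{1}{\sqrt{N}} .
\]
```

The second identity uses the standard $N^{-1/2}$ decay of the Monte Carlo sampling error: a factor
of $2^{12}$ in the number of samples reduces the sampling error only by a factor of $2^{6} = 64$.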
In conclusion, we have seen in this section that the simulation of weak errors of SPDE
approximations causes severe problems, which we already expected from the theory in Section 2.
The use of a multilevel Monte Carlo estimator and a modified Monte Carlo estimator
finally led to the expected weak convergence plots due to the faster convergence of the sampling
error caused by the approximation of the expectation, as shown theoretically in Section 2. It is
important to point out that these are, to our knowledge, the first successful
simulations of weak errors for SPDEs driven by multiplicative noise.
Due to the computational cost of the SPDE simulations, we further illustrate the theoretical
results of Section 2 with the simulation of an ordinary stochastic differential equation. The
relative cheapness of such a simulation allows us to make the consequences of Section 2 even
clearer than above. Let us therefore consider in what follows the simple example of a geometric
Brownian motion in one dimension, i.e., the stochastic differential equation
(10) $dX(t) = \mu X(t)\,dt + \sigma X(t)\,dW(t)$
with initial condition $X(0) = X_0 \in \mathbb{R}$ and $t \in [0,T]$, where $\mu, \sigma \in \mathbb{R}$ and $W$ denotes a
one-dimensional Brownian motion. The solution to the geometric Brownian motion is known to
be
$X(t) = X_0 \exp\bigl((\mu - \sigma^2/2)\,t + \sigma W(t)\bigr)$
and the second moment can be computed explicitly to be
$E[|X(t)|^2] = X_0^2 \exp\bigl((2\mu + \sigma^2)\,t\bigr)$.
For the approximation let us consider an equidistant time discretization $(t_j)_{j=0}^{N_k}$ with time
step size $k \in (0,1]$ by $t_j = jk$ for $j = 0,1,\dots,N_k$, where $N_k = T/k$ is assumed to be an
integer. The Euler–Maruyama scheme is then given by the recursion
$X_j = \bigl(1 + k\mu + \sigma (W(t_j) - W(t_{j-1}))\bigr) X_{j-1}$,
started at the initial value $X_0$, where $X_j$ denotes the approximation of $X(t_j)$. It is known that
this scheme converges for the geometric Brownian motion with strong order $O(k^{1/2})$, i.e.,
$E[|X(t_j) - X_j|^2]^{1/2} \le C k^{1/2}$,
and with weak order $O(k)$, i.e., for sufficiently smooth test functions $\varphi : \mathbb{R} \to \mathbb{R}$ it holds that
$|E[\varphi(X(t_j)) - \varphi(X_j)]| \le C k$,
where $j = 0,1,\dots,N_k$. Here the constant $C$ does not depend on $k$.
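The scheme and the closed-form expressions above can be sketched in a few lines of Python. This
is a minimal illustration, not the code used for the paper: the function names, the NumPy-based
vectorization, the number of paths, and the range of grids are assumptions made here. The default
parameters $\mu = -0.5$, $\sigma = 1$, $X_0 = 1$, $T = 0.5$ are the values used in the simulation of this section.
The exact solution and the Euler–Maruyama approximation are driven by the same Brownian
increments, so that the pathwise difference entering the strong error is meaningful.

```python
import numpy as np


def gbm_exact(x0, mu, sigma, t, w_t):
    """Exact solution X(t) of (10), evaluated at the Brownian value(s) W(t) = w_t."""
    return x0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * w_t)


def gbm_second_moment(x0, mu, sigma, t):
    """Closed-form second moment E[|X(t)|^2] = X_0^2 exp((2 mu + sigma^2) t)."""
    return x0**2 * np.exp((2.0 * mu + sigma**2) * t)


def euler_maruyama(x0, mu, sigma, k, dw):
    """Euler-Maruyama value X_{N_k} for each path.

    dw has shape (n_paths, N_k) and holds the increments W(t_j) - W(t_{j-1}).
    Because the recursion is X_j = (1 + k mu + sigma dW_j) X_{j-1}, the final
    value is x0 times the product of the one-step factors.
    """
    return x0 * np.prod(1.0 + k * mu + sigma * dw, axis=1)


def strong_error(n_steps, n_paths, rng, x0=1.0, mu=-0.5, sigma=1.0, T=0.5):
    """Monte Carlo estimate of E[|X(T) - X_{N_k}|^2]^{1/2} on coupled paths."""
    k = T / n_steps
    dw = rng.normal(0.0, np.sqrt(k), size=(n_paths, n_steps))
    x_ex = gbm_exact(x0, mu, sigma, T, dw.sum(axis=1))  # W(T) is the sum of increments
    x_em = euler_maruyama(x0, mu, sigma, k, dw)
    return np.sqrt(np.mean((x_ex - x_em) ** 2))


rng = np.random.default_rng(0)  # fixed seed, for reproducibility of the sketch only
for j in range(1, 6):
    n_steps = 2**j  # assumed grid sizes for illustration
    print(n_steps, strong_error(n_steps, 2000, rng))
```

With these parameter values $2\mu + \sigma^2 = 0$, so the exact second moment equals $X_0^2 = 1$ for all $t$.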
In the simulation of the four estimators from Section 2, let us consider the geometric
Brownian motion (10) with $\mu = -0.5$, $\sigma = 1$, $X_0 = 1$, and $T = 0.5$. Furthermore, let
$N = 10^{4.5}$ be the number of samples in the Monte Carlo estimator and $M = 20$ in the notation
of the previous example with the same estimators as before. Then we obtain on time grids
with $2^j + 1$ grid points for $j = 1,\dots,8$ the convergence plots for the simulated strong and weak
errors which are presented in Figure 3. In the weak error simulation, the function $\varphi(x) = |x|^2$
is used. We observe that the strong error in Figure 3(a) converges as expected with $O(k^{1/2})$.
In Figure 3(b) one sees that the type I estimator of Proposition 2.2, $|E[|X(T)|^2] - E_N[|X_{N_k}|^2]|$,
which just does a Monte Carlo simulation on the approximate solution, only converges on the
first two grid points with the desired order before, as in the theory, the Monte Carlo error
dominates. At the same time, the type II estimator $|E_N[|X(T)|^2 - |X_{N_k}|^2]|$, which was
considered in Proposition 2.3, behaves a lot better. It converges with the desired order up
to the last two points, where the strong order of convergence dominates as predicted by the
theory. Both the multilevel Monte Carlo estimator of type I from Proposition 2.4 and of
type II from Proposition 2.6 converge with the desired order of convergence, but the absolute
errors are larger due to the larger constant in the overall error. This simple example, in which
all exact values are known and could be used in the computations,
clearly shows the behaviour that we expected from the theoretical upper and lower bounds
on weak error estimators in Section 2.
[Figure 3(a): log-log plot of the strong errors against the number of time steps. Legend:
Order 1/2, SDE strong errors.]
(a) Strong error for $\ell = 1,\dots,8$.
[Figure 3(b): log-log plot of the weak errors against the number of time steps. Legend: Order 1,
SDE weak errors I, SDE weak errors II, SDE weak errors MLMC I, SDE weak errors MLMC II.]
(b) Weak error for $\ell = 1,\dots,8$.
Figure 3. Approximations of strong and weak errors of the Euler–Maruyama
scheme for the geometric Brownian motion.
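The difference between the two singlelevel estimators can be reproduced qualitatively with the
following sketch. It is not the code used for Figure 3: the function names, the sample size, the
fixed seed, and the choice of $2^j$ time steps on grid $j$ are assumptions made here, the averaging
over $M$ repetitions is omitted, and the multilevel estimators are not covered. The type I error
compares the exact second moment with a sample mean of the approximation, while the type II
error averages the pathwise difference of exact and approximate solution on the same Brownian
paths.

```python
import numpy as np


def coupled_samples(n_steps, n_paths, rng, x0=1.0, mu=-0.5, sigma=1.0, T=0.5):
    """Return (X(T), X_{N_k}) for n_paths paths driven by the same Brownian increments."""
    k = T / n_steps
    dw = rng.normal(0.0, np.sqrt(k), size=(n_paths, n_steps))
    x_exact = x0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * dw.sum(axis=1))
    x_euler = x0 * np.prod(1.0 + k * mu + sigma * dw, axis=1)
    return x_exact, x_euler


def weak_error_type_1(x_euler, exact_moment):
    """|E[|X(T)|^2] - E_N[|X_{N_k}|^2]|: exact expectation minus a sample mean."""
    return abs(exact_moment - np.mean(x_euler**2))


def weak_error_type_2(x_exact, x_euler):
    """|E_N[|X(T)|^2 - |X_{N_k}|^2]|: sample mean of the pathwise difference."""
    return abs(np.mean(x_exact**2 - x_euler**2))


x0, mu, sigma, T = 1.0, -0.5, 1.0, 0.5
exact_moment = x0**2 * np.exp((2.0 * mu + sigma**2) * T)
rng = np.random.default_rng(0)  # fixed seed, for reproducibility of the sketch only
n_paths = 10_000  # illustrative sample size, not the value used in the text
for j in range(1, 9):
    x_exact, x_euler = coupled_samples(2**j, n_paths, rng)
    err_1 = weak_error_type_1(x_euler, exact_moment)
    err_2 = weak_error_type_2(x_exact, x_euler)
    print(2**j, err_1, err_2)
```

In the type II error the fluctuations of $|X(T)|^2$ and $|X_{N_k}|^2$ largely cancel because both are
computed from the same increments. A single run of this sketch is noisy, so averaging over several
repetitions, as done with $M$ in the text, is advisable before reading off convergence rates.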
References
[1] Adam Andersson, Raphael Kruse, and Stig Larsson. Duality in refined Sobolev–Malliavin spaces and weak
approximations of SPDE. Stoch. PDE: Anal. Comp., 4(1):113–149, 2016.
[2] Adam Andersson and Stig Larsson. Weak convergence for a spatial approximation of the nonlinear sto-
chastic heat equation. Math. Comp., 85(299):1335–1358, 2016.
[3] Andrea Barth and Annika Lang. Milstein approximation for advection-diffusion equations driven by mul-
tiplicative noncontinuous martingale noises. Appl. Math. Opt., 66(3):387–413, 2012.
[4] Andrea Barth and Annika Lang. Simulation of stochastic partial differential equations using finite element
methods. Stochastics, 84(2-3):217–231, 2012.
[5] Andrea Barth, Annika Lang, and Christoph Schwab. Multilevel Monte Carlo method for parabolic sto-
chastic partial differential equations. BIT, 53(1):3–27, 2013.
[6] Charles-Edouard Bréhier, Martin Hairer, and Andrew M. Stuart. Weak error estimates for trajectories of
SPDEs for spectral Galerkin discretization. arXiv:1602.04057 [math.PR], February 2016.
[7] Daniel Conus, Arnulf Jentzen, and Ryan Kurniawan. Weak convergence rates of spectral Galerkin ap-
proximations for SPDEs with nonlinear diffusion coefficients. arXiv:1408.1108 [math.PR], August 2014.
[8] Arnaud Debussche. Weak approximation of stochastic partial differential equations: the nonlinear case.
Math. Comp., 80(273):89–117, 2011.
[9] Arnaud Debussche and Jacques Printems. Weak order for the discretization of the stochastic heat equation.
Math. Comp., 78(266):845–863, 2009.
[10] Michael B. Giles. Improved multilevel Monte Carlo convergence using the Milstein scheme. In Alexander
Keller, Stefan Heinrich, and Harald Niederreiter, editors, Monte Carlo and quasi-Monte Carlo methods
2006. Selected papers based on the presentations at the 7th international conference ‘Monte Carlo and
quasi-Monte Carlo methods in scientific computing’, Ulm, Germany, August 14–18, 2006, pages 343–358.
Springer, 2008.
[11] Stefan Heinrich. Multilevel Monte Carlo methods. In Svetozar Margenov, Jerzy Wasniewski, and Pla-
men Y. Yalamov, editors, Large-Scale Scientific Computing, volume 2179 of Lecture Notes in Computer
Science, pages 58–67. Springer, 2001.
[12] Arnulf Jentzen and Ryan Kurniawan. Weak convergence rates for Euler-type approximations of semilinear
stochastic evolution equations with nonlinear diffusion coefficients. arXiv:1501.03539 [math.PR], January
2015.
[13] Arnulf Jentzen and Michael Röckner. A Milstein scheme for SPDEs. Foundations of Computational
Mathematics, 15(2):313–362, 2015.
[14] Peter E. Kloeden, Gabriel J. Lord, Andreas Neuenkirch, and Tony Shardlow. The exponential integrator
scheme for stochastic partial differential equations: Pathwise error bounds. J. Comput. Appl. Math.,
235(5):1245–1260, 2011.
[15] Mihály Kovács, Stig Larsson, and Fredrik Lindgren. Strong convergence of the finite element method with
truncated noise for semilinear parabolic stochastic equations with additive noise. Numerical Algorithms,
53(2-3):309–320, 2010.
[16] Mihály Kovács, Felix Lindner, and René Schilling. Weak convergence of finite element approximations
of linear stochastic evolution equations with additive Lévy noise. SIAM/ASA J. on Uncert. Quant.,
3(1):1159–1199, 2015.
[17] Raphael Kruse. Strong and Weak Approximation of Semilinear Stochastic Evolution Equations, volume
2093 of Lecture Notes in Mathematics. Springer, 2014.
[18] Annika Lang. A note on the importance of weak convergence rates for SPDE approximations in multilevel
Monte Carlo schemes. In Ronald Cools and Dirk Nuyens, editors, Monte Carlo and Quasi-Monte Carlo
Methods, MCQMC, Leuven, Belgium, April 2014, volume 163 of Springer Proceedings in Mathematics &
Statistics, pages 489–505, 2016.
[19] Andreas Petersson. Stochastic partial differential equations with multiplicative noise: Numerical simula-
tions of strong and weak approximation errors. Master’s thesis, University of Gothenburg, May 2015.
(Annika Lang)
Department of Mathematical Sciences
Chalmers University of Technology & University of Gothenburg
S–412 96 Göteborg, Sweden.
E-mail address:annika.lang@chalmers.se
(Andreas Petersson)
Department of Mathematical Sciences
Chalmers University of Technology & University of Gothenburg
S–412 96 Göteborg, Sweden.
E-mail address:andreas.petersson@chalmers.se