
UNCECOMP 2019

3rd ECCOMAS Thematic Conference on

Uncertainty Quantiﬁcation in Computational Sciences and Engineering

M. Papadrakakis, V. Papadopoulos, G. Stefanou (eds.)

Crete, Greece, 24-26 June 2019

REDUCED MODEL-ERROR SOURCE TERMS FOR FLUID FLOW

Wouter Edeling 1 and Daan Crommelin 1,2

1Centrum Wiskunde & Informatica, Scientiﬁc Computing Group

Science Park 123, 1098 XG Amsterdam, The Netherlands

e-mail: {Wouter.Edeling, Daan.Crommelin}@CWI.nl

2Korteweg-de Vries Institute for Mathematics, University of Amsterdam

Science Park 105-107, 1098 XG Amsterdam, The Netherlands

e-mail: D.T.Crommelin@uva.nl

Keywords: Model error, data-driven surrogate models, ocean ﬂow

Abstract. It is well known that the wide range of spatial and temporal scales present in

geophysical ﬂow problems represents a (currently) insurmountable computational bottleneck,

which must be circumvented by a coarse-graining procedure. The effect of the unresolved ﬂuid

motions enters the coarse-grained equations as an unclosed forcing term, denoted as the ’eddy

forcing’. Traditionally, the system is closed by approximate deterministic closure models, i.e.

so-called parameterizations. Instead of creating a deterministic parameterization, some recent

efforts have focused on creating a stochastic, data-driven surrogate model for the eddy forcing

from a (limited) set of reference data, with the goal of accurately capturing the long-term ﬂow

statistics. Since the eddy forcing is a dynamically evolving ﬁeld, a surrogate should be able to

mimic the complex spatial patterns displayed by the eddy forcing. Rather than creating such a

(fully data-driven) surrogate, we propose to precede the surrogate construction step by a proce-

dure that replaces the eddy forcing with a new model-error source term which: i) is tailor-made

to capture spatially-integrated statistics of interest, ii) strikes a balance between physical insight and data-driven modelling, and iii) significantly reduces the amount of training data that

is needed. Instead of creating a surrogate for an evolving ﬁeld, we now only require a surrogate

model for one scalar time series per statistical quantity-of-interest. Our current surrogate mod-

elling approach builds on a resampling strategy, where we create a probability density function

of the reduced training data that is conditional on (time-lagged) resolved-scale variables. We

derive the model-error source terms, and construct the reduced surrogate using an ocean model

of two-dimensional turbulence in a doubly periodic square domain.


1 INTRODUCTION

In the numerical simulation of coarse-grained turbulent ﬂow problems one has to cope with

small-scale processes which cannot be resolved directly on the numerical grid. The effect of the

unresolved eddy ﬁeld enters the resolved-scale equations as an unclosed forcing term, denoted

as the eddy forcing, which is highly complex, dynamic, and shows intricate spatio-temporal

correlations. Traditionally, the eddy forcing is approximated by deterministic closure models,

i.e. so-called parameterizations. In the context of geophysical ﬂows, such parameterizations are

based on e.g. the work of Gent-McWilliams [6], or through the inclusion of a tunable (hyper)

viscosity term meant to damp the smallest resolved scales of the model [11].

It is well known that no parameterization scheme is perfect, and attempts have been made to

improve their performance. For instance, the authors of [15] analysed the transfer of energy and

enstrophy in spectral space for a number of parameterizations, and compared their performance

to a high-ﬁdelity reference solution of a two-dimensional turbulent ﬂow case. They proposed a

deterministic ’energy ﬁxer’ scheme, based on adding a weighted vorticity pattern to the com-

puted vorticity ﬁeld. Recently, data-driven techniques have been applied as well. For instance

the recent work of [10] used artiﬁcial neural networks to learn the eddy forcing from a set of

high-ﬁdelity snapshots.

However, a general limitation of such deterministic approaches is their inability to repre-

sent the strong non-uniqueness of the unresolved scales with respect to the resolved scales

[1, 16, 12]. Since the resolved scales are generally deﬁned as the convolution of the full-scale

solution with some ﬁlter, multiple unresolved states can correspond to the same resolved solu-

tion. Thus, in general there is no one-to-one correspondence between the resolved-scale state

and the unresolved-scale state, and yet deterministic parameterizations do assume such corre-

spondence. As a result, stochastic methods for representing the unresolved scales have received

an increasing amount of attention. Early contributions to this topic in the context of ocean modelling include the work of [1], where the eddy forcing is replaced by a space-time correlated

random-forcing process. Other notable examples include the work of [9, 20, 7], who construct

probability density functions (pdfs) of the eddy forcing using a reference solution.

In this study, we also consider a stochastic surrogate method [17, 16], and as a performance

indicator we use the degree by which it is able to capture energy and enstrophy statistics. How-

ever, we refrain from an approach that is purely data-driven, i.e. one which attempts to learn the

eddy forcing directly from reference data. Instead, we replace the eddy forcing with a simpler

’model-error’ source term, which we parameterize based on physical arguments. Speciﬁcally,

we use the energy and enstrophy transport equations to derive a source term which tracks our

chosen target statistics. The only remaining unclosed part of our model-error term is repre-

sentative of the magnitude of these target statistics, i.e. scalars. As a result, the corresponding

surrogate model needs to represent only one (or a few) scalar quantities rather than the full eddy

forcing ﬁeld. This amounts to a large dimension reduction (in this study, a reduction by four

orders of magnitude), and as a consequence a large reduction in the amount of required training

data, while retaining accuracy in the statistics.

The article is organised as follows. In Section 2 we describe the governing equations and

multiscale decomposition. The model-error source term derivation and the surrogate method

are outlined in Section 3. Initial results are shown in Section 4, and ﬁnally the conclusion and

outlook are given in Section 5.


2 GOVERNING EQUATIONS

We study the same model as in [18], i.e. the forced-dissipative vorticity equations for two-

dimensional incompressible ﬂow. The governing equations read

\[
\frac{\partial \omega}{\partial t} + J(\Psi, \omega) = \nu \nabla^2 \omega + \mu (F - \omega), \qquad \nabla^2 \Psi = \omega. \tag{1}
\]

Here, ω is the vertical component of the vorticity, defined from the curl of the velocity field V as ω := e₃ · ∇ × V, where e₃ := (0, 0, 1)ᵀ. The stream function Ψ relates to the horizontal velocity components by the well-known relations u = −∂Ψ/∂y and v = ∂Ψ/∂x. As in [18], the forcing term is chosen as the single Fourier mode F = 2^{3/2} cos(5x) cos(5y). The system is fully periodic in the x and y directions over a period of 2πL, where L is a user-specified length scale, chosen as the earth's radius (L = 6.371 × 10⁶ [m]). The inverse of the earth's angular velocity Ω⁻¹ is chosen as a time scale, where Ω = 7.292 × 10⁻⁵ [s⁻¹]. Thus, a simulation time period of a single 'day' can now be expressed as 24 × 60² × Ω ≈ 6.3 non-dimensional time

units. Given these choices, (1) is non-dimensionalized, and solved using values of ν and µ chosen such that a Fourier mode at the smallest retained spatial scale is exponentially damped with an e-folding time scale of 5 and 90 days respectively. For more details on the numerical setup we refer to [18]. Furthermore, our Python source code for (1) can be downloaded from [4].

Finally, the key term in (1) is the Jacobian, i.e. the nonlinear advection term defined as
\[
J(\Psi, \omega) := \frac{\partial \Psi}{\partial x}\frac{\partial \omega}{\partial y} - \frac{\partial \Psi}{\partial y}\frac{\partial \omega}{\partial x}. \tag{2}
\]
It is this term that leads to the need for a closure model when (1) is discretized on a relatively coarse grid which lacks the resolution to capture all turbulent eddies.

2.1 Discretization

We solve (1) by means of a spectral method, where we apply a truncated Fourier expansion:
\[
\omega(x, y, t) = \sum_{\mathbf{k}} \hat{\omega}_{\mathbf{k}}(t)\, e^{i(k_1 x + k_2 y)}, \qquad
\Psi(x, y, t) = \sum_{\mathbf{k}} \hat{\Psi}_{\mathbf{k}}(t)\, e^{i(k_1 x + k_2 y)}. \tag{3}
\]
The sum is taken over the components k₁ and k₂ of the wave number vector k := (k₁, k₂)ᵀ, with −K₀ ≤ k_j ≤ K₀, j = 1, 2. These decompositions are inserted in (1), and solved for the Fourier coefficients ω̂_k, Ψ̂_k by means of the real Fast Fourier Transform. To avoid the aliasing problem in the nonlinear term (2), we use the pseudo-spectral method, such that in practice the maximum resolved wave number is K, where K ≤ 2K₀/3 [14].¹

To advance the solution in time we use the second-order accurate AB/BDI2 scheme, which results in the following discrete system of equations [14]
\[
\frac{3\hat{\omega}^{i+1}_{\mathbf{k}} - 4\hat{\omega}^{i}_{\mathbf{k}} + \hat{\omega}^{i-1}_{\mathbf{k}}}{2\Delta t} + 2\hat{J}^{i}_{\mathbf{k}} - \hat{J}^{i-1}_{\mathbf{k}} = -\nu k^2 \hat{\omega}^{i+1}_{\mathbf{k}} + \mu\left(\hat{F}_{\mathbf{k}} - \hat{\omega}^{i+1}_{\mathbf{k}}\right), \qquad
-k^2 \hat{\Psi}^{i+1}_{\mathbf{k}} - \hat{\omega}^{i+1}_{\mathbf{k}} = 0. \tag{4}
\]
¹We use N × N grids, with an even N = 2ᵖ (e.g. p = 7), such that N = 2K₀ [14].


Here, ∆t = 0.01 and Ĵⁱ_k is the Fourier coefficient of the Jacobian at time level i, computed with the pseudo-spectral technique, and k² := k₁² + k₂².
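For concreteness, a single update of (4) can be sketched as follows. This is a minimal illustration of the implicit-explicit structure of the scheme (Adams-Bashforth extrapolation of the Jacobian, implicit treatment of the linear terms), not the reference implementation [4]; the de-aliasing mask P and all variable names are our own.

```python
import numpy as np

def abbdi2_step(w_hat, w_hat_prev, J_hat, J_hat_prev, P, k_sq, dt, nu, mu, F_hat):
    """One AB/BDI2 step for the Fourier coefficients of (4).

    The Jacobian is extrapolated explicitly (2*J^i - J^{i-1}), while the
    viscous and forcing terms are treated implicitly at level i+1, so the
    update reduces to a pointwise division in spectral space.
    """
    rhs = (4.0 * w_hat - w_hat_prev) / (2.0 * dt) - (2.0 * J_hat - J_hat_prev) + mu * F_hat
    w_hat_next = P * rhs / (3.0 / (2.0 * dt) + nu * k_sq + mu)
    # Poisson part of (4): -k^2 psi_hat = w_hat (mean mode set to zero)
    psi_hat_next = np.where(k_sq > 0, -w_hat_next / np.maximum(k_sq, 1), 0.0)
    return w_hat_next, psi_hat_next
```

Because the Jacobian enters explicitly, only a diagonal (pointwise) solve is needed per wave number, which is what makes the scheme cheap per time step.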

2.2 Multiscale decomposition

As in [18], we apply a spectral filter in order to decompose the full reference solution into a resolved (R) and an unresolved component (U), i.e. we use
\[
\hat{\omega}^R_{\mathbf{k}} = P^R \hat{\omega}_{\mathbf{k}}, \qquad \hat{\omega}^U_{\mathbf{k}} = P^U \hat{\omega}_{\mathbf{k}}, \tag{5}
\]

where the projection operators Pᴿ and Pᵁ are depicted in Figure 1. Note that the full projection operator P := Pᴿ + Pᵁ also removes wave numbers due to the use of the pseudo-spectral method.

[Figure 1 panels: the Full, Resolved and Unresolved spectral-filter masks, plotted over wave numbers k₁ (horizontal) and k₂ (vertical).]

Figure 1: The spectral filter (black = 1, white = 0) of the full, resolved and unresolved solutions. Due to the fact that we use the real FFT algorithm, only part of the spectrum is computed, as Fourier coefficients with opposite values of k are complex conjugates in order to enforce real ω and Ψ fields [14].

Applying the resolved projection operator to the governing equations (1) results in the following resolved-scale transport equation
\[
\frac{\partial \omega^R}{\partial t} + P^R J(\Psi, \omega) = \nu \nabla^2 \omega^R + \mu\left(F^R - \omega^R\right). \tag{6}
\]
As mentioned, the key term is the Jacobian (2), since due to its nonlinearity, Pᴿ J(Ψ, ω) ≠ Pᴿ J(Ψᴿ, ωᴿ). We therefore write
\[
J(\Psi, \omega) - J\left(\Psi^R, \omega^R\right) =: r, \tag{7}
\]
such that r is the exact subgrid-scale term, commonly referred to as the 'eddy forcing' [1]. The resolved-scale equation (6) can now be written as
\[
\frac{\partial \omega^R}{\partial t} + P^R J\left(\Psi^R, \omega^R\right) = \nu \nabla^2 \omega^R + \mu\left(F^R - \omega^R\right) - r. \tag{8}
\]

We use the notation r := Pᴿ r for the sake of brevity. A snapshot of the resolved vorticity ωᴿ and corresponding resolved eddy forcing r is depicted in Figure 2. Notice the fine-grained character of the eddy forcing compared to the vorticity field.


Figure 2: A snapshot of the exact, reference vorticity field ωᴿ and the corresponding eddy forcing.

2.3 Prediction of climate statistics

Ultimately, our goal is to integrate (8) in time, such that we can compute the long-term climate statistics of the energy Eᴿ and enstrophy Zᴿ densities, defined as
\[
E^R := \frac{1}{2}\frac{1}{(2\pi)^2} \int_0^{2\pi}\!\!\int_0^{2\pi} \mathbf{V}^R \cdot \mathbf{V}^R \, dx\, dy = -\frac{1}{2}\left(\psi^R, \omega^R\right), \tag{9}
\]
\[
Z^R := \frac{1}{2}\frac{1}{(2\pi)^2} \int_0^{2\pi}\!\!\int_0^{2\pi} \left(\omega^R\right)^2 dx\, dy = \frac{1}{2}\left(\omega^R, \omega^R\right). \tag{10}
\]
Here Vᴿ is the two-dimensional vector of the resolved velocity components in the x and y directions. For conciseness, we use the short-hand notation
\[
(\alpha, \beta) = \frac{1}{(2\pi)^2} \int_0^{2\pi}\!\!\int_0^{2\pi} \alpha \beta \, dx\, dy \tag{11}
\]
to denote the integral of the product αβ normalized by the area of the flow domain. The derivation of the last equality of (9) can be found in Appendix A.
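On a uniform N × N grid over [0, 2π)², the normalization in (11) turns the inner product into a plain grid average, so (9)-(10) can be evaluated in a few lines. The function names below are our own, introduced only for illustration:

```python
import numpy as np

def inner(alpha, beta):
    """The short-hand (11): the (2*pi)^{-2}-normalized domain integral of
    alpha*beta, which on a uniform periodic grid is just the grid average."""
    return np.mean(alpha * beta)

def energy(psi_R, w_R):
    """Energy density (9): E^R = -(psi^R, w^R)/2."""
    return -0.5 * inner(psi_R, w_R)

def enstrophy(w_R):
    """Enstrophy density (10): Z^R = (w^R, w^R)/2."""
    return 0.5 * inner(w_R, w_R)
```

For the single mode ω = cos(x) with Ψ = −cos(x) (so that ∇²Ψ = ω), both densities evaluate to 1/4, which makes a convenient sanity check.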

3 EDDY-FORCING SURROGATE

We cannot integrate (8) since it is still unclosed (due to the ω and Ψ dependence of (7)), a problem which we aim to solve by creating a data-driven surrogate of r, denoted by r̃. For our present purpose, we define an 'ideal' surrogate r̃ for the eddy forcing as one which satisfies the following set of requirements:

1. Data-driven: In the absence of a single 'best' deterministic parameterization of r, we opt for a model inferred from a pre-computed database of high-fidelity reference data.

2. Stochastic: In general, the resolved scales are deﬁned as a convolution of the full solution

with some (spatial/spectral) ﬁlter. As a result there is no longer just a single unresolved-

scale ﬁeld that is consistent with the resolved-scale solution. This ambiguity provides us

with the motivation for a stochastic model for the unresolved, small-scale ﬁelds.

3. Correlated in space and time: As demonstrated by Figure 2, the reference eddy forcing

shows complex spatial structures. A surrogate of the full eddy forcing would ideally

reﬂect these as well.


4. Conditional on the resolved variables: The resolved and unresolved scales are in reality two-way coupled. Hence, the eddy-forcing surrogate should not be independent of the resolved solution.

5. Pre-computed & cheap: While the reference database can be computationally expensive

to compute, the resulting data-driven surrogate must be cheap.

6. Extrapolates well: To justify the cost of creating the reference database in the ﬁrst place,

the data-driven model must be able to predict the chosen quantity of interest well, sub-

stantially beyond the (time) domain of the data.

As mentioned, we will measure the performance of a surrogate model by its ability to accurately represent the statistics of (9)-(10). Thus, we do not expect the resolved-scale model forced by the surrogate to produce individual flow fields which are in absolute lockstep with the high-fidelity data, especially considering the stochastic nature of the surrogate.

One possible course of action, explored in e.g. [17, 10], is to directly create a full-field surrogate r̃(x, y; t) ∈ ℝᴺˣᴺ, using a database of reference snapshots in time of the exact eddy forcing (7). Here, N is the number of grid points in one spatial direction, typically 2⁷, 2⁸ or higher. Constructing a full-field, dynamic surrogate of a quantity as complex as the eddy forcing is a challenging task, and storing a potentially large amount of reference snapshots can lead to high memory requirements [17]. We therefore propose to precede the surrogate construction step with a procedure that significantly compresses the training data.

3.1 Reduced surrogate

Note that our statistical quantities of interest (9) and (10) are scalars. Instead of creating a full-field N × N surrogate r̃(x, y; t), we will first replace the exact r in (8) with a simpler alternative, where the unclosed component is reflective of the size of the statistical quantities we aim to approximate in the first place. A simple option is to specify
\[
-r(x, y; t) = \tau(t)\, \omega^R(x, y; t), \tag{12}
\]

where τ(t) is an unknown, time-varying scalar. Clearly, this choice is arbitrary, and (12) will not match the eddy forcing (7). Instead, we think of (12) as an example of a 'model-error term', meant to correct the unparameterized (r = 0) model in some sense. In our case, a deviation from the exact eddy forcing does not pose a problem because of the freedom that integrated quantities-of-interest give us, such that we only need our ωᴿ and Ψᴿ fields to approximate the truth in the weak sense of (9) and (10). We can examine the effect of (12) on the evolution equations of Eᴿ and Zᴿ, and subsequently combine physical insight with a data-driven approach to find the time series of τ that constrains their evolution to the reference values. A reduced surrogate now only needs to be constructed from this scalar time series, instead of from the full-field evolution of (7).

The evolution equation of Eᴿ (see Appendix A) satisfies
\[
\frac{dE^R}{dt} = -\left(\psi^R, \frac{\partial \omega^R}{\partial t}\right) = -2\nu Z^R - 2\mu U^R - 2\mu E^R + \left(\psi^R, r\right), \tag{13}
\]
where we denote the integral (Ψᴿ, F)/2 as Uᴿ. If we insert (12) into (13), the last term on the right-hand side becomes
\[
\left(\psi^R, r\right) = -\tau\left(\psi^R, \omega^R\right) = 2\tau E^R. \tag{14}
\]


Figure 3: The pdfs of the energy (left) and enstrophy (right), of the reduced (r = τωᴿ), reference (r given by (7)) and unparameterised (r = 0) solution.

The last equality follows from the definition (9). Thus, the physical insight is that (12) leads to the additional term 2τEᴿ, which either acts to produce or dissipate Eᴿ depending on the sign of τ. Let us denote the difference between the projected reference energy and Eᴿ as ∆E := E − Eᴿ, where E := −(PᴿΨ, ω)/2. Any quantity without superscript, e.g. E or ω, is a reference quantity computed from (1). Now, for the data-driven determination of the τ time series, we require τ to be positive when ∆E > 0, i.e. to increase production when Eᴿ is too low, and to dissipate energy when ∆E < 0. We parameterize τ via an analytic relationship which reflects this property:
\[
\tau := \tau_{\max} \tanh\left(\frac{\Delta E}{E^R}\right). \tag{15}
\]
Here, τ_max is a user-specified constant, which we set to one for now. During the training period, we can compute (15) every ∆t, building up a reference time series.
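The nudging scalar (15) is cheap to evaluate; a minimal sketch (with our own naming) is:

```python
import numpy as np

def tau(E_ref, E_R, tau_max=1.0):
    """The scalar (15): tau = tau_max * tanh(Delta_E / E^R), with
    Delta_E = E - E^R. Positive (production) when the resolved energy
    lies below the projected reference energy, negative (dissipation)
    otherwise, and saturating at +/- tau_max."""
    return tau_max * np.tanh((E_ref - E_R) / E_R)
```

The tanh saturation bounds the source term regardless of how large the energy discrepancy becomes, which keeps the correction in (12) well behaved.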

To test the validity of our approach, we run the system (8) for a simulation period of 8 years. Besides τ, at every ∆t we also sample the energy and enstrophy of the reference, reduced and unparameterised solution, i.e. using r given by (7), (12) and zero respectively². The energy and enstrophy probability density functions (pdfs) generated from those samples can be found in Figure 3. By virtue of (15), the energy pdfs of the reference and the reduced solution practically overlap. This demonstrates that it is possible to obtain statistically-equivalent energy solutions using training data reduced by a factor of N² compared to the full-field surrogate case³.

However, we have two quantities of interest, and (12) also has an effect on the Zᴿ equation (a term 2τZᴿ appears). Since we train τ to track the projected reference energy E, we cannot expect a perfect Zᴿ pdf, and in fact, Figure 3 shows that the situation does not improve upon the unparameterised model, which displays a large bias in Zᴿ values. Rather than trying to construct a different τ which is some compromise between accuracy in Eᴿ and Zᴿ, we opt for two separate time series, each of which acts on either the energy or enstrophy evolution equation alone.

²Note that no surrogate is used yet; we are generating a large set of training data.
³In the example of Figure 3, N² = 128² = 16384.


3.2 Orthogonal patterns

We replace our initial simple choice (12) with
\[
-r = \tau_E \Psi' + \tau_Z \omega', \tag{16}
\]

where Ψ′ and ω′ are patterns built from the resolved stream function and vorticity. We choose Ψ′ such that τ_E Ψ′ only acts on the Eᴿ equation, and produces no additional source term in the enstrophy equation. The converse must be true for the τ_Z ω′ term. This will allow us to train τ_E on ∆E alone, and τ_Z only on ∆Z := Z − Zᴿ. Since the Eᴿ and Zᴿ evolution equations are forced by −(Ψᴿ, ∂ωᴿ/∂t) and (ωᴿ, ∂ωᴿ/∂t) respectively (see (13) and Appendix A), this suggests a Gram-Schmidt type of approach to make Ψ′ orthogonal to (ωᴿ, ·) and likewise for ω′ and (Ψᴿ, ·). Setting

\[
\Psi' = \psi^R - \frac{\left(\psi^R, \omega^R\right)}{\left(\omega^R, \omega^R\right)}\, \omega^R \quad \text{and} \quad
\omega' = \omega^R - \frac{\left(\psi^R, \omega^R\right)}{\left(\psi^R, \psi^R\right)}\, \psi^R \tag{17}
\]
yields
\[
\left(\omega^R, \tau_E \Psi'\right) = 0 \quad \text{and} \quad \left(\psi^R, \tau_Z \omega'\right) = 0. \tag{18}
\]

The additional source term in the Eᴿ equation now becomes
\[
-\left(\psi^R, \tau_E \Psi'\right) = -\tau_E\left(\psi^R, \psi^R\right) + \tau_E \frac{\left(\psi^R, \omega^R\right)^2}{\left(\omega^R, \omega^R\right)} = 2\tau_E\left[\frac{\left(E^R\right)^2}{Z^R} - S^R\right] := 2\tau_E S'. \tag{19}
\]
Here, we defined the integrated square stream function as Sᴿ := (ψᴿ, ψᴿ)/2. Since (Eᴿ)²/Zᴿ − Sᴿ has the dimension of the squared stream function, we introduce the final shorthand notation S′ := (Eᴿ)²/Zᴿ − Sᴿ in (19). In a similar vein, (16) produces the following source term in the Zᴿ equation:
\[
2\tau_Z Z' \quad \text{with} \quad Z' := Z^R - \frac{\left(E^R\right)^2}{S^R}. \tag{20}
\]
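The orthogonality (18) is easy to verify numerically. A sketch of the Gram-Schmidt construction (17), using the grid-average approximation of the inner product (11) and our own naming:

```python
import numpy as np

def inner(a, b):
    # grid-average approximation of the normalized domain integral (11)
    return np.mean(a * b)

def orthogonal_patterns(psi_R, w_R):
    """Gram-Schmidt construction (17): Psi' carries no projection onto
    (w_R, .) and omega' none onto (psi_R, .), so that tau_E and tau_Z
    each act on a single evolution equation."""
    c = inner(psi_R, w_R)
    psi_p = psi_R - (c / inner(w_R, w_R)) * w_R
    w_p = w_R - (c / inner(psi_R, psi_R)) * psi_R
    return psi_p, w_p
```

For any pair of fields, (ωᴿ, Ψ′) and (ψᴿ, ω′) vanish by construction, up to floating-point round-off.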

We parameterise τ_E and τ_Z using the same procedure as in Section 3.1, only now we need to incorporate the sign of S′ and Z′ to correctly activate either the production or dissipation of Eᴿ and Zᴿ, i.e.
\[
\tau_E := \tau_{E,\max} \tanh\left(\frac{\Delta E}{E^R}\right) \cdot \mathrm{sgn}(S') \quad \text{and} \quad
\tau_Z := \tau_{Z,\max} \tanh\left(\frac{\Delta Z}{Z^R}\right) \cdot \mathrm{sgn}(Z'). \tag{21}
\]

Again, we leave the proper estimation of parameters for a later study, and simply set τ_{E,max} = τ_{Z,max} = 1. Furthermore, sgn(X) = 1 when X ≥ 0 and −1 otherwise. Repeating the simulation of Section 3.1, inserting (16) in (8) yields the results depicted in Figure 4. Now, both pdfs match the reference well. Only a very small discrepancy in the Eᴿ pdf can be observed, which might be fixed by tuning τ_{E,max}. The corresponding τ_E, τ_Z reference time series are shown in Figure 5.


Figure 4: The pdfs of the energy (left) and enstrophy (right), of the reduced (r = τ_EΨ′ + τ_Zω′), reference (r given by (7)) and unparameterised (r = 0) solution.

Figure 5: Training time series of τ_E and τ_Z over 500 days. Note that there seems to be a negative correlation between the two time series.


3.3 Surrogate construction

We will build on the resampling strategies as developed by [16, 2]. In general, these methods model the unresolved term at time t_{i+1} by sampling from the conditional probability distribution of the reference data. In our case, we keep the functional forms of (21), such that ∆E and ∆Z can be chosen as the unresolved terms in need of a surrogate model:
\[
\widetilde{\Delta E}_{i+1} \sim \Delta E_{i+1} \mid \mathcal{E}_i, \mathcal{E}_{i-1}, \cdots, \qquad
\widetilde{\Delta Z}_{i+1} \sim \Delta Z_{i+1} \mid \mathcal{Z}_i, \mathcal{Z}_{i-1}, \cdots \tag{22}
\]

Here, ∆Ẽᵢ₊₁ denotes the data-driven resampling surrogate at time tᵢ₊₁, whereas ∆Eᵢ₊₁ represents actual reference data from the training run, and likewise for ∆Z̃ᵢ₊₁. The sets of 'conditioning variables' 𝓔ᵢ, 𝓩ᵢ etc. contain variables from the resolved model. They can be (functions of) Eᴿ, S′ or any other (scalar) quantity, as long as we also have access to it outside the training period. Examples of these conditional distributions are ∆Eᵢ₊₁ | Eᴿᵢ and ∆Zᵢ₊₁ | ∆Z̃ᵢ, Zᴿᵢ. We could assume a Markov property (∆Eᵢ₊₁ | 𝓔ᵢ), or build in a larger memory. Note that by design, (22) already satisfies many of the properties listed in Section 3, e.g. it is data-driven, stochastic and conditioned on resolved variables.

The main challenges with this approach are twofold. Clearly, the first challenge concerns the actual formation of the conditional distribution, i.e. how to map the observed conditioning variables to plausible subsets of ∆Eᵢ₊₁ and ∆Zᵢ₊₁ samples from which ∆Ẽᵢ₊₁ and ∆Z̃ᵢ₊₁ can be randomly sampled. The second challenge concerns the proper choice of conditioning variables, which is somewhat reminiscent of the choice of 'features' in a machine-learning context.

3.4 Building the distribution

We will illustrate the approach using ∆E; the same procedure applies for ∆Z. To map 𝓔ᵢ to some subset of plausible ∆Eᵢ₊₁ values we use the so-called 'binning' approach of [16]. First, consider a snapshot sequence of ∆E,
\[
\Delta E_1^S = \{\Delta E_1, \Delta E_2, \cdots, \Delta E_i, \cdots, \Delta E_S\}, \tag{23}
\]
where i is the time index. In addition, we also have snapshots of corresponding conditioning variables
\[
\mathcal{E}_1^S = \{\mathcal{E}_1, \mathcal{E}_2, \cdots, \mathcal{E}_S\}. \tag{24}
\]

Let C be the total number of time-lagged conditioning variables used in (22). We then proceed by creating C-dimensional disjoint bins⁴, each bin spanning a unique conditioning-variable range, and containing a number of associated ∆E values, see Figure 6. Note that not all bins may contain samples, especially if two or more conditioning variables are used. If during prediction an empty bin is sampled, the data of the nearest bin (in the Euclidean sense) is used instead. Once a bin is selected by 𝓔ᵢ, the resulting subset of ∆E values can be sampled randomly, or one might use the local bin average instead, leading to a deterministic prediction.

⁴We used equidistant bins, but this is not a hard requirement.
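A minimal one-dimensional version of this binning resampler (C = 1, equidistant bins, nearest-bin fallback measured in bin-index distance; class and variable names are our own) could look as follows:

```python
import numpy as np

class BinningResampler:
    """Binning surrogate of Section 3.4 for one conditioning variable:
    reference Delta E samples are grouped by equidistant bins of the
    conditioning variable; prediction draws a random member of the bin
    selected by the observed conditioning value."""

    def __init__(self, cond, delta_E, n_bins=10, seed=42):
        self.edges = np.linspace(cond.min(), cond.max(), n_bins + 1)
        idx = np.clip(np.digitize(cond, self.edges) - 1, 0, n_bins - 1)
        self.bins = [delta_E[idx == b] for b in range(n_bins)]
        self.rng = np.random.default_rng(seed)

    def sample(self, c):
        b = int(np.clip(np.digitize(c, self.edges) - 1, 0, len(self.bins) - 1))
        if self.bins[b].size == 0:  # empty bin: fall back to the nearest non-empty one
            b = min((i for i, s in enumerate(self.bins) if s.size > 0),
                    key=lambda i: abs(i - b))
        return self.rng.choice(self.bins[b])
```

Replacing `rng.choice` by the bin mean gives the deterministic variant mentioned above.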


(a) Low correlation between ∆Eᵢ₊₁ and 𝓔ᵢ. (b) High correlation between ∆Eᵢ₊₁ and 𝓔ᵢ.

Figure 6: Two binning objects, with the reference ∆Eᵢ₊₁ data on the vertical axis and the conditioning variable 𝓔ᵢ on the horizontal axis. Vertical lines separate the different bins, and the black dots represent the local bin means.

3.5 Choice of conditioning variables

Ideally we would like the conditioning variables of (22) to display some correlation with ∆Eᵢ₊₁ and ∆Zᵢ₊₁. In this case, the range of plausible reference values in the selected subset is smaller. Consider the two binning objects depicted in Figure 6, each with one conditioning variable (∆Eᵢ₊₁ | 𝓔ᵢ). The binning object of Figure 6(a) shows considerably less correlation between 𝓔ᵢ and ∆Eᵢ₊₁ than its counterpart in Figure 6(b). As a result, each bin contains a larger spread in possible ∆E values, leading to noisier ∆Ẽᵢ₊₁ predictions.

We continue by drawing up a list of candidate conditioning variables, and computing the temporal correlation coefficients
\[
\rho(\Delta E_{i+1}, \mathcal{E}_i) = \frac{\mathrm{Cov}\left[\Delta E_{i+1}, \mathcal{E}_i\right]}{\sigma(\Delta E_{i+1})\,\sigma(\mathcal{E}_i)} \quad \text{and} \quad
\rho(\Delta Z_{i+1}, \mathcal{Z}_i) = \frac{\mathrm{Cov}\left[\Delta Z_{i+1}, \mathcal{Z}_i\right]}{\sigma(\Delta Z_{i+1})\,\sigma(\mathcal{Z}_i)} \tag{25}
\]
from a reference time series of 500 days. Here Cov(·, ·) is the covariance operator and σ(·) is the standard deviation. Specifically, we will select individual source terms from the Eᴿ and Zᴿ equations as candidate 𝓔ᵢ and 𝓩ᵢ, the rationale being that these will also (in part) drive the evolution equations of ∆E and ∆Z. The complete list, including the correlation coefficient values, is shown in Table 1. Previously undefined conditioning variables (occurring in the Zᴿ equation) are Vᴿ := (ωᴿ, F)/2 and Oᴿ := (∇²ωᴿ, ωᴿ)/2. This strategy for selecting candidate conditioning variables is reasonable, as many show substantial correlation with the reference data, hovering around the ±0.5 mark. Clear exceptions are Eᴿ (which correlates much less), and τ_E S′, τ_Z Z′, which show very high correlation.
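These coefficients are straightforward to compute from the stored scalar time series; a sketch of (25) with the one-step lag built in (our naming):

```python
import numpy as np

def lagged_corr(delta, cond):
    """Correlation coefficient (25) between the reference data at step
    i+1 and a candidate conditioning variable at step i."""
    a, b = delta[1:], cond[:-1]          # align Delta E_{i+1} with E_i
    return np.cov(a, b)[0, 1] / (a.std(ddof=1) * b.std(ddof=1))
```

Larger time lags are obtained by simply widening the shift between the two slices.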

4 RESULTS

This section contains the initial exploratory results of the methodology outlined in the preceding sections. For validation and training purposes we ran the reference model (1) for a simulation period of 8 years, storing reference data and conditioning variables every ∆t. Here, this amounts to roughly 1.8 × 10⁶ snapshots per variable. When predicting, the training data must be stored in memory to allow for fast resampling. If the reference snapshots are full fields, this can lead to high memory requirements [17]. Subsampling the reference data reduces the memory constraints, although this leads to a surrogate with an intrinsic time step that is larger


𝓔ᵢ, 𝓩ᵢ                                ρ(∆Eᵢ₊₁, 𝓔ᵢ)    ρ(∆Zᵢ₊₁, 𝓩ᵢ)
Zᴿ := (ωᴿ, ωᴿ)/2                       0.4017          0.336
Eᴿ := −(ψᴿ, ωᴿ)/2                      0.1401          0.0951
Uᴿ := (ψᴿ, F)/2                        0.5497          0.598
Sᴿ := (ψᴿ, ψᴿ)/2                      −0.5091         −0.4857
Vᴿ := (ωᴿ, F)/2                       −0.5467         −0.5965
Oᴿ := (∇²ωᴿ, ωᴿ)/2                    −0.4993         −0.4394
τ_E S′ := τ_E[(Eᴿ)²/Zᴿ − Sᴿ]           0.9484          0.8876
τ_Z Z′ := τ_Z[Zᴿ − (Eᴿ)²/Sᴿ]           0.8915          0.999

Table 1: Correlation coefficients.

than the ∆t of (4), and thus can only be updated after a certain number of ∆t time cycles [2]. A clear advantage of our current surrogate approach is that we can store the full 8-year reduced training set in memory, without the need for subsampling.

We subdivide the results into tests of increasing complexity:

T1: A one-way coupled simulation where the resolved equation (8) provides the conditioning variables, without replacing r = τ_E(∆E)Ψ′ + τ_Z(∆Z)ω′ in (8) with the surrogate r̃ = τ_E(∆Ẽ)Ψ′ + τ_Z(∆Z̃)ω′. The surrogates ∆Ẽ and ∆Z̃ are not extrapolated, i.e. they are constructed using the full 8-year reference data set, so no simulation outside the time period of the training data takes place.

T2: A two-way coupled simulation, still without surrogate extrapolation.

T3: A two-way coupled simulation with surrogate extrapolation.

4.1 Results T1

T1 serves as a verification of our code, as in this case the exact ∆E and ∆Z are still used in (21) to compute τ_E and τ_Z. Now, if implemented correctly, surrogates such as ∆Ẽᵢ₊₁ ∼ ∆Eᵢ₊₁ | (τ_E S′)ᵢ and ∆Z̃ᵢ₊₁ ∼ ∆Zᵢ₊₁ | (τ_Z Z′)ᵢ must follow the reference data closely, given the high correlations displayed in Table 1. This is confirmed by the results of Figure 7.

4.2 Results T2

T2 is the first real test of the surrogate method due to its two-way coupled nature. As a result, trajectories of ∆Ẽ and ∆Z̃ can no longer be expected to follow the reference data. Discrepancies between the exact (reduced) eddy forcing (16) and its surrogate will cause the model forced by the surrogate to develop its own dynamics. We reiterate here that our goal is to predict the time-averaged flow statistics, which might still be feasible if we are not in absolute lockstep with ∆E and ∆Z. Even two full-scale simulations with slightly different initial conditions will diverge from each other (due to their turbulent nature), yet can converge in a statistical sense.

We tested a variety of surrogates, which differed through the set of selected conditioning variables. All were Markovian in character, conditioned on variables from the previous time step alone. Thus far, almost all considered surrogates improved upon the Zᴿ bias of the unparameterized model, although they showed some varying performance amongst each other.


Figure 7: T1 time series for ∆E and ∆Z and their corresponding surrogates over a 50-day period. The ∆Ẽ surrogate is noisier due to the lower correlation with its conditioning variable (see Table 1).

Figure 8: The pdfs of the energy (left) and enstrophy (right), of the reduced surrogate (r̃ = τ_E(∆Ẽ)Ψ′ + τ_Z(∆Z̃)ω′), reference (r given by (7)) and unparameterised (r = 0) solution. The surrogates were both conditioned on Zᴿ, Eᴿ, Uᴿ, Sᴿ of the previous time step.

For brevity, we only show a representative sample of results. Consider the results of Figure 8, which shows the pdfs obtained using the surrogates ∆Eᵢ₊₁ | Zᴿᵢ, Eᴿᵢ, Uᴿᵢ, Sᴿᵢ and ∆Zᵢ₊₁ | Zᴿᵢ, Eᴿᵢ, Uᴿᵢ, Sᴿᵢ, with 10 bins per conditioning variable. As expected, the pdfs do not show the same (near) perfect overlap with the reference compared to the training case of Figure 4, but the match is still accurate. Surrogates conditioned on e.g. Zᴿ, Eᴿ, Uᴿ or Zᴿ, Uᴿ, Sᴿ showed fairly similar results. Somewhat degraded performance (although overall still better than r = 0) is obtained when conditioning on Eᴿ, Uᴿ, Sᴿ, see Figure 9. While the Zᴿ bias is still corrected for, the pdfs of the surrogate underestimate the variance. The only exception, which did not improve upon the unparameterized model, was when conditioning on τ_E S′ and τ_Z Z′, despite the high correlations of Table 1. A possible cause is that, when predicting, we are forced to condition on τ_E(∆Ẽ)S′ instead of τ_E(∆E)S′, as the latter is not available outside the training period. Perhaps using conditioning variables such as τ_E S′ and τ_Z Z′ should be viewed as some form of overfitting, leading to a surrogate which is unlikely to generalize well beyond the training set. A possible remedy might be to increase the time lag [17].


Figure 9: The pdfs of the energy (left) and enstrophy (right), of the reduced surrogate (r̃ = τ_E(∆Ẽ)Ψ′ + τ_Z(∆Z̃)ω′), reference (r given by (7)) and unparameterised (r = 0) solution. The surrogates were both conditioned on Eᴿ, Uᴿ, Sᴿ of the previous time step.

Figure 10: The pdfs of the energy (left) and enstrophy (right), of several extrapolated reduced surrogates (r̃ = τ_E(∆Ẽ)Ψ′ + τ_Z(∆Z̃)ω′), reference (r given by (7)) and unparameterised (r = 0) solution. The surrogates were both conditioned on Zᴿ, Eᴿ, Uᴿ, Sᴿ of the previous time step.

4.3 Results T3

Predictive capability outside the training set should be the goal of any data-informed numerical simulation tool. In our case, this goal concerns prediction outside the time interval covered by the training set. We take tentative steps in this direction by incrementally reducing the time interval of the training set for the $\Delta E_{i+1} \,|\, Z^R_i, E^R_i, U^R_i, S^R_i$ and $\Delta Z_{i+1} \,|\, Z^R_i, E^R_i, U^R_i, S^R_i$ surrogates, while keeping the simulation time $T_{sim}$ fixed at 8 years. Figure 10 shows the resulting pdfs, obtained using a training set spanning the first $T_{train} = \alpha T_{sim}$ years, with $\alpha \in \{0.9, 0.8, 0.7, 0.6, 0.5\}$. No significant deviation from the unextrapolated T2 test case is observed, which demonstrates the predictive capability of the surrogate method.
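The shrinking-training-window experiment can be sketched with a toy example. The following is a minimal illustration, not the implementation used for the paper: it substitutes a synthetic scalar conditioning series and increment series for the actual flow data, and a simple quantile-binned resampler for the surrogate conditioned on $Z^R_i, E^R_i, U^R_i, S^R_i$. All names (`build_conditional_sampler`, `c`, `dE`) are hypothetical.

```python
import numpy as np

def build_conditional_sampler(c, dE, n_bins=10):
    """Resample increments dE[i+1] conditional on the lagged variable c[i],
    using simple quantile bins (a stand-in for the paper's surrogate)."""
    edges = np.quantile(c[:-1], np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, c[:-1], side="right") - 1, 0, n_bins - 1)
    bins = [dE[1:][idx == b] for b in range(n_bins)]
    bins = [b if b.size > 0 else dE[1:] for b in bins]  # guard against empty bins

    def sample(c_now, rng):
        b = int(np.clip(np.searchsorted(edges, c_now, side="right") - 1, 0, n_bins - 1))
        return rng.choice(bins[b])

    return sample

rng = np.random.default_rng(42)
T = 2000  # hypothetical number of time steps in the full simulation
c = np.sin(0.01 * np.arange(T)) + 0.1 * rng.standard_normal(T)  # synthetic conditioning series
dE = 0.5 * c + 0.05 * rng.standard_normal(T)                    # synthetic increments

# Train on the first alpha*T steps, then resample over the full period,
# including the tail that lies outside the training window
for alpha in [0.9, 0.8, 0.7, 0.6, 0.5]:
    T_train = int(alpha * T)
    sampler = build_conditional_sampler(c[:T_train], dE[:T_train])
    dE_surr = np.array([sampler(c[i], rng) for i in range(T - 1)])
    print(f"alpha={alpha:.1f}: surrogate mean {dE_surr.mean():+.4f}, "
          f"reference mean {dE[1:].mean():+.4f}")
```

Comparing the surrogate statistics across the $\alpha$ values mimics the robustness check behind Figure 10, albeit for a toy signal rather than the vorticity model.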

Finally, we note that all results can be replicated via the source code and corresponding input files, available for download at [3].


5 CONCLUSION & OUTLOOK

We presented a method to create a stochastic surrogate model, conditioned on time-lagged observable variables, from a set of training data of a multiscale dynamical system. The novelty of our approach lies in the derivation of model-error source terms designed to track chosen spatially-integrated statistics of interest. We denote these as 'reduced' model-error terms, as they lead to a significant reduction in the amount of required training data. Although using less data might seem counterproductive, we argue that it leads to an easier surrogate construction. Furthermore, our reduced framework allows us to step away from a fully data-driven, physics-blind surrogate, and to inform part of our model-error term by the transport equations of the target statistics.

Future work includes further testing the extrapolative capability of the method. Another interesting research option would be to contrast the performance of our conditional time-lagged surrogate with machine-learning alternatives, such as random forests or neural networks. Recent relevant work also considered a combination of both approaches [13]. Finally, a further interesting avenue of future research is the a-priori incorporation of constraints from mathematical physics. For instance, when rewriting the eddy forcing in tensor format, certain constraints on the tensor shape can be found [19]. Such an approach opens up the possibility of efficient, physics-constrained uncertainty quantification, see e.g. [5] for examples in steady flow problems or [8] for large-eddy simulations.

ACKNOWLEDGEMENTS

This research is funded by the Netherlands Organization for Scientific Research (NWO) through the Vidi project "Stochastic models for unresolved scales in geophysical flows", and by the European Union Horizon 2020 research and innovation programme under grant agreement #800925 (VECMA project).

We also thank W.T.M. Verkley for making his vorticity equation source code available to us.

REFERENCES

[1] P.S. Berloff. Random-forcing model of the mesoscale oceanic eddies. Journal of Fluid

Mechanics, 529:71–95, 2005.

[2] D. Crommelin and E. Vanden-Eijnden. Subgrid-scale parameterization with conditional Markov chains. Journal of the Atmospheric Sciences, 65(8):2661–2675, 2008.

[3] W.N. Edeling. TAU_EZ - uncecomp branch (GitHub repository). https://github.com/wedeling/TAU_EZ/tree/uncecomp, 2019.

[4] W.N. Edeling. vorticity-solver (GitHub repository). https://github.com/wedeling/vorticity-solver, 2019.

[5] W.N. Edeling, G. Iaccarino, and P. Cinnella. Data-free and data-driven RANS predictions with quantified uncertainty. Flow, Turbulence and Combustion, 100(3):593–616, 2018.

[6] P.R. Gent and J.C. McWilliams. Isopycnal mixing in ocean circulation models. Journal of Physical Oceanography, 20(1):150–155, 1990.

[7] I. Grooms and L. Zanna. A note on ’toward a stochastic parameterization of ocean

mesoscale eddies’. Ocean Modelling, 113:30–33, 2017.


[8] L. Jofre, S.P. Domino, and G. Iaccarino. A framework for characterizing structural uncer-

tainty in large-eddy simulation closures. Flow, Turbulence and Combustion, 100(2):341–

363, 2018.

[9] P. Mana and L. Zanna. Toward a stochastic parameterization of ocean mesoscale eddies.

Ocean Modelling, 79:1–20, 2014.

[10] R. Maulik, O. San, A. Rasheed, and P. Vedula. Subgrid modelling for two-dimensional

turbulence using neural networks. Journal of Fluid Mechanics, 858:122–144, 2019.

[11] J.C. McWilliams. The emergence of isolated coherent vortices in turbulent ﬂow. Journal

of Fluid Mechanics, 146:21–43, 1984.

[12] T. Palmer and P. Williams. Stochastic physics and climate modelling. Cambridge University Press, Cambridge, UK, 2010.

[13] S. Pan and K. Duraisamy. Data-driven discovery of closure models. SIAM Journal on

Applied Dynamical Systems, 17(4):2381–2413, 2018.

[14] R. Peyret. Spectral methods for incompressible viscous ﬂow, volume 148. Springer Sci-

ence & Business Media, 2013.

[15] J. Thuburn, J. Kent, and N. Wood. Cascades, backscatter and conservation in numerical models of two-dimensional turbulence. Quarterly Journal of the Royal Meteorological Society, 140(679):626–638, 2014.

[16] N. Verheul and D. Crommelin. Data-driven stochastic representations of unresolved fea-

tures in multiscale models. Commun. Math. Sci, 14(5):1213–1236, 2016.

[17] N. Verheul, J. Viebahn, and D. Crommelin. Covariate-based stochastic parameterization of

baroclinic ocean eddies. Mathematics of Climate and Weather Forecasting, 3(1):90–117,

2017.

[18] W.T.M. Verkley, P.C. Kalverla, and C.A. Severijns. A maximum entropy approach to the

parametrization of subgrid processes in two-dimensional ﬂow. Quarterly Journal of the

Royal Meteorological Society, 142(699):2273–2283, 2016.

[19] S. Waterman and J.M. Lilly. Geometric decomposition of eddy feedbacks in barotropic

systems. Journal of Physical Oceanography, 45(4):1009–1024, 2015.

[20] L. Zanna, P. Mana, J. Anstey, T. David, and T. Bolton. Scale-aware deterministic and

stochastic parametrizations of eddy-mean ﬂow interaction. Ocean Modelling, 111:66–80,

2017.

A ENERGY AND ENSTROPHY EQUATIONS

For convenience, we reproduce certain relevant derivations regarding the $E^R$ and $Z^R$ transport equations from [18]. The energy (density) is defined as

$$E^R := \frac{1}{2}\frac{1}{(2\pi)^2}\int_0^{2\pi}\!\!\int_0^{2\pi} \mathbf{V}^R \cdot \mathbf{V}^R \, \mathrm{d}x\,\mathrm{d}y, \qquad (26)$$


where $\mathbf{V}^R$ is the vector containing the velocity components in the $x$ and $y$ directions. It can be rewritten as $E^R = -\left(\psi^R, \omega^R\right)/2$ via

$$\mathbf{V}^R \cdot \mathbf{V}^R = \nabla\psi^R \cdot \nabla\psi^R = \nabla\cdot\left(\psi^R \nabla\psi^R\right) - \psi^R \nabla^2\psi^R = \nabla\cdot\left(\psi^R \nabla\psi^R\right) - \psi^R \omega^R. \qquad (27)$$

The first equality follows from the definition $\mathbf{V}^R := \left(-\partial\psi^R/\partial y,\ \partial\psi^R/\partial x\right)^T$, while the second stems from the product rule applied to a scalar ($\psi^R$) and a vector ($\nabla\psi^R$):

$$\nabla\cdot\left(\psi^R \nabla\psi^R\right) = \nabla\psi^R \cdot \nabla\psi^R + \psi^R \nabla^2\psi^R. \qquad (28)$$

Finally, the last equality of (27) simply follows from the governing equations (1). The term $\nabla\cdot\left(\psi^R \nabla\psi^R\right)$ disappears when integrated over the spatial domain, after application of the divergence theorem in combination with the doubly periodic boundary conditions. This leaves $E^R = -\left(\psi^R, \omega^R\right)/2$. To obtain the energy equation, start with

$$\frac{\mathrm{d}E^R}{\mathrm{d}t} = \frac{1}{(2\pi)^2}\int_0^{2\pi}\!\!\int_0^{2\pi} \frac{\partial}{\partial t}\left(\frac{1}{2}\mathbf{V}^R \cdot \mathbf{V}^R\right) \mathrm{d}x\,\mathrm{d}y = \frac{1}{(2\pi)^2}\int_0^{2\pi}\!\!\int_0^{2\pi} \mathbf{V}^R \cdot \frac{\partial \mathbf{V}^R}{\partial t}\, \mathrm{d}x\,\mathrm{d}y. \qquad (29)$$

Similar to the analysis above, we use the relation $\left(\mathbf{V}^R \cdot \mathbf{V}^R\right)_t = \nabla\cdot\left(\psi^R \nabla\psi^R\right)_t - \left(\psi^R \omega^R\right)_t$ (where the subscript $t$ denotes $\partial/\partial t$) to obtain

$$\frac{\mathrm{d}E^R}{\mathrm{d}t} = -\left(\psi^R, \frac{\partial\omega^R}{\partial t}\right) = \left(\psi^R, P^R J\left(\psi^R, \omega^R\right)\right) - \nu\left(\psi^R, \nabla^2\omega^R\right) - \mu\left(\psi^R, F - \omega^R\right) + \left(\psi^R, r\right). \qquad (30)$$

Using integration by parts and the periodic boundary conditions it can be shown that the first term on the right-hand side satisfies $\left(\psi^R, P^R J\left(\psi^R, \omega^R\right)\right) = \left(J\left(\psi^R, \psi^R\right), \omega^R\right) = 0$, since the Jacobian of two equal arguments is zero [18]. Furthermore, using the self-adjoint nature of the Laplace operator, we have $\left(\psi^R, \nabla^2\omega^R\right) = \left(\nabla^2\psi^R, \omega^R\right) = \left(\omega^R, \omega^R\right)$. This leads to

$$\frac{\mathrm{d}E^R}{\mathrm{d}t} = -\nu\left(\omega^R, \omega^R\right) - \mu\left(\psi^R, F\right) + \mu\left(\psi^R, \omega^R\right) + \left(\psi^R, r\right), \qquad (31)$$

which equals (13). Using the same procedure, the evolution equation for the enstrophy reads

$$\frac{\mathrm{d}Z^R}{\mathrm{d}t} = \left(\omega^R, \frac{\partial\omega^R}{\partial t}\right) = \nu\left(\omega^R, \nabla^2\omega^R\right) + \mu\left(\omega^R, F\right) - \mu\left(\omega^R, \omega^R\right) - \left(\omega^R, r\right). \qquad (32)$$
