A comprehensive comparison of total-order estimators for
global sensitivity analysis
Arnald Puy∗1,2, William Becker3, Samuele Lo Piano4, and Andrea Saltelli5
1Department of Ecology and Evolutionary Biology, M31 Guyot Hall, Princeton University, New Jersey 08544, USA. E-Mail:
apuy@princeton.edu
2Centre for the Study of the Sciences and the Humanities (SVT), University of Bergen, Parkveien 9, PB 7805, 5020 Bergen, Norway.
3European Commission, Joint Research Centre, Via Enrico Fermi, 2749, 21027 Ispra VA, Italy
4School of the Built Environment, JJ Thompson Building, University of Reading, Whiteknights Campus, Reading, RG6 6AF, United
Kingdom
5Open Evidence Research, Universitat Oberta de Catalunya (UOC), Barcelona, Spain.
∗Corresponding author
Abstract
Sensitivity analysis helps identify which model inputs convey the most uncertainty to the model output. One of the most authoritative measures in global sensitivity analysis is the Sobol' total-order index, which can be computed with several different estimators. Although previous comparisons exist, it is hard to know which estimator performs best since the results are contingent on the benchmark setting defined by the analyst (the sampling method, the distribution of the model inputs, the number of model runs, the test function or model and its dimensionality, the weight of higher-order effects or the performance measure selected). Here we compare several total-order estimators in an eight-dimensional hypercube where these benchmark parameters are treated as random parameters. This arrangement significantly relaxes the dependency of the results on the benchmark design. We observe that the most accurate estimators are Razavi and Gupta's, Jansen's or Janon/Monod's for factor prioritization, and Jansen's, Janon/Monod's or Azzini and Rosati's for approaching the "true" total-order indices. The rest lag considerably behind. Our work helps analysts navigate the myriad of total-order formulae by reducing the uncertainty in the selection of the most appropriate estimator.
Keywords: Uncertainty analysis; sensitivity analysis; modeling; Sobol' indices; variance decomposition; benchmarking analysis
1 Introduction
Sensitivity analysis, i.e. the assessment of how much uncertainty in a given model output is conveyed by
each model input, is a fundamental step to judge the quality of model-based inferences [1–3]. Among the
many sensitivity indices available, variance-based indices are widely regarded as the gold standard because
they are model-free (no assumptions are made about the model), global (they account for interactions
between the model inputs) and easy to interpret [4–6]. Given a model of the form $y=f(\mathbf{x})$, $\mathbf{x}=(x_1,x_2,\ldots,x_i,\ldots,x_k)\in\mathbb{R}^k$, where $y$ is a scalar output and $x_1,\ldots,x_k$ are the $k$ independent model inputs, the variance of $y$ is decomposed into conditional terms as
$$V(y)=\sum_{i=1}^{k}V_i+\sum_{i}\sum_{i<j}V_{ij}+\ldots+V_{1,2,\ldots,k},\qquad(1)$$
where
$$V_i=V_{x_i}\!\left[E_{\mathbf{x}_{\sim i}}(y\mid x_i)\right],\qquad V_{ij}=V_{x_i,x_j}\!\left[E_{\mathbf{x}_{\sim i,j}}(y\mid x_i,x_j)\right]-V_{x_i}\!\left[E_{\mathbf{x}_{\sim i}}(y\mid x_i)\right]-V_{x_j}\!\left[E_{\mathbf{x}_{\sim j}}(y\mid x_j)\right]\qquad(2)$$
and so on up to the $k$-th order. The notation $\mathbf{x}_{\sim i}$ means all-but-$x_i$. By dividing each term in Equation 1 by the unconditional model output variance $V(y)$, we obtain the first-order indices for single inputs ($S_i$), pairs of inputs ($S_{ij}$), and for all higher-order terms. First-order indices thus provide the proportion of $V(y)$ caused by each term and are widely used to rank model inputs according to their contribution to the model output uncertainty, a setting known as factor prioritization [1].
Homma and Saltelli [7] also proposed the calculation of the total-order index $T_i$, which measures the first-order effect of a model input jointly with its interactions up to the $k$-th order:
$$T_i=1-\frac{V_{\mathbf{x}_{\sim i}}\!\left[E_{x_i}(y\mid\mathbf{x}_{\sim i})\right]}{V(y)}=\frac{E_{\mathbf{x}_{\sim i}}\!\left[V_{x_i}(y\mid\mathbf{x}_{\sim i})\right]}{V(y)}.\qquad(3)$$
When $T_i\approx 0$, it can be concluded that $x_i$ has a negligible contribution to $V(y)$. For this reason, total-order indices have been applied to distinguish influential from non-influential model inputs and reduce the dimensionality of the uncertain space, a setting known as factor fixing [1].
The most direct computation of $T_i$ is via Monte Carlo (MC) estimation because it does not impose any assumption on the functional form of the response function, unlike metamodeling approaches [8,9]. The Fourier Amplitude Sensitivity Test (FAST) may also be used to calculate $T_i$: it transforms the input variables into periodic functions of a single frequency variable, samples the model and analyses the sensitivity of the input variables using Fourier analysis in the frequency domain [10,11]. While an innovative approach, FAST is sensitive to the characteristic frequencies assigned to the input variables and is not a very intuitive method; for these reasons it has mostly been superseded by Monte Carlo approaches, or by metamodels when computational expense is a serious issue. In this work we focus on the former.
MC methods require generating a $(N,2k)$ base sample matrix with either random or quasi-random numbers (e.g. Latin Hypercube Sampling, Sobol' quasi-random numbers [12,13]), where each row is a sampling point and each column a model input. The first $k$ columns are allocated to an $A$ matrix and the remaining $k$ columns to a $B$ matrix, which are known as the "base sample matrices". Any point in either $A$ or $B$ can be indicated as $x_{vi}$, where $v$ and $i$ respectively index the row (from 1 to $N$) and the column (from 1 to $k$). Then, $k$ additional $A_B^{(i)}$ ($B_A^{(i)}$) matrices are created, where all columns come from $A$ ($B$) except the $i$-th column, which comes from $B$ ($A$). The numerator in Equation 3 is finally estimated using the model evaluations obtained from the $A$ ($B$) and $A_B^{(i)}$ ($B_A^{(i)}$) matrices. Some estimators may also use a third or more base sample matrices (i.e. $A,B,C,\ldots$), although the use of more than three matrices has recently been proven inefficient by Lo Piano et al. [14].
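To make the sampling design concrete, the following sketch builds the $A$, $B$ and $A_B^{(i)}$ matrices in base R. It is a minimal illustration, assuming plain random numbers rather than the quasi-random sequences discussed above, and illustrative values of $N$ and $k$:

```r
set.seed(42)
N <- 1000; k <- 4

# (N, 2k) base sample matrix: first k columns -> A, last k columns -> B.
base_sample <- matrix(runif(N * 2 * k), nrow = N)
A <- base_sample[, 1:k]
B <- base_sample[, (k + 1):(2 * k)]

# k additional A_B^(i) matrices: all columns from A except the i-th,
# which comes from B.
AB <- lapply(1:k, function(i) {
  M <- A
  M[, i] <- B[, i]
  M
})
```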
1.1 Total-order estimators and uncertainties in the benchmark settings
The search for efficient and robust total-order estimators is an active field of research [1,7,15–20]. Although some works have compared their asymptotic properties (i.e. [16]), most studies have promoted empirical comparisons where different estimators are benchmarked against known test functions and specific sample sizes. However valuable these empirical studies may be, Becker [21] observed that their results are very much conditional on the choice of model, its dimensionality and the selected number of model runs. It is hard to say from previous studies whether an estimator outperforming another truly reflects its higher accuracy or simply its better performance under the narrow statistical design of the study. Below we extend the list of factors which Becker [21] regards as influential in a given benchmarking exercise and discuss how they affect the relative performance of sensitivity estimators.
• The sampling method: the creation of the base sample matrices can be done using Monte Carlo (MC) or quasi-Monte Carlo (QMC) methods [12,13]. Compared to MC, QMC maps the input space more effectively, as it leaves smaller unexplored volumes (Fig. S1). However, Kucherenko et al. [22] observed that MC methods might help obtain more accurate sensitivity indices when the model under examination has important high-order terms. Both MC and QMC have been used when benchmarking sensitivity indices [15,23].
• The form of the test function: some of the most commonly used functions in SA are Ishigami and Homma [24]'s, the Sobol' G and its variants [23,25], Bratley and Fox [26]'s or the set of functions presented in Kucherenko et al. [22] [14,16,18,23]. Despite being analytically tractable, these functions capture only one possible interval of model behaviour, and the effects of nonlinearities and nonadditivities are typically unknown in real models. This black-box nature of models has become more of a concern in the last decades due to the increase in computational power and code complexity (which prevents the analyst from intuitively grasping the model's behaviour [27]), and to the higher demand for model transparency [3,28,29]. This renders the functional form of the model similar to a random variable [21], something not accounted for by previous works [14,16,18,23].
• The function dimensionality: many studies focus on low-dimensional problems, either by using test functions that only require a few model inputs (e.g. the Ishigami function, where $k=3$), or by using test functions with a flexible dimensionality but setting $k$ at a small value, e.g. $k\leq 8$ (Sobol' [25]'s G or Bratley and Fox [26]'s functions). This approach trades comprehensiveness for computational manageability: by neglecting higher dimensions, it is difficult to tell which estimator might work best in models with tens or hundreds of parameters. Examples of such models can be readily found in the Earth and Environmental Sciences domain [30], including the Soil and Water Assessment Tool (SWAT) model, where $k=50$ [31], or the Modélisation Environmentale-Surface et Hydrologie (MESH) model, where $k=111$ [32].
• The distribution of the model inputs: the large majority of benchmarking exercises assume uniformly distributed inputs, $p(\mathbf{x})\in U(0,1)^k$ [14,16,23,33]. However, there is evidence that the accuracy of $T_i$ estimators might be sensitive to the underlying model input distributions, to the point of overturning the model input ranks [34,35]. Furthermore, in uncertainty analysis (e.g. in decision theory), analysts may use distributions with peaks for the most likely values, derived, for instance, from an expert elicitation stage.
• The number of model runs: sensitivity test functions are generally not computationally expensive and can be run without much concern for computational time. This is frequently not the case for real models, whose high dimensionality and complexity might set a constraint on the total number of model runs available. Under such restrictions, the performance of the estimators of the total-order index depends on their efficiency (how accurate they are given the budget of runs that can be allocated to each model input). There are no specific guidelines as to which total-order estimator might work best under these circumstances [21].
• The performance measure selected: typically, a sensitivity estimator has been considered to outperform the rest if, on average, it displays a smaller mean absolute error (MAE), computed as
$$\mathrm{MAE}=\frac{1}{p}\sum_{v=1}^{p}\left(\frac{\sum_{i=1}^{k}|T_i-\hat{T}_i|}{k}\right),\qquad(4)$$
where $p$ is the number of replicas of the sample matrix, and $T_i$ and $\hat{T}_i$ are the analytical and the estimated total-order index of the $i$-th input. The MAE is appropriate when the aim is to assess which estimator better approaches the true total-order indices, because it averages the error for both influential and non-influential indices. However, the analyst might be more interested in using the estimated indices $\hat{T}=\{\hat{T}_1,\hat{T}_2,\ldots,\hat{T}_i,\ldots,\hat{T}_k\}$ to accurately rank parameters or screen influential from non-influential model inputs [1]. In such a context, the MAE may be best substituted or complemented with a measure of rank concordance between the vectors $r$ and $\hat{r}$, which reflect the ranks in $T$ and $\hat{T}$ respectively, such as Spearman's $\rho$ or Kendall's $W$ coefficient [21,36,37] (see the sketch after this list). It can also be the case that disagreements on the exact ranking of low-ranked parameters have no practical importance because the interest lies in the correct identification of top ranks only [30]. Savage [38] scores or other measures that emphasize this top-down correlation are then a more suitable choice.
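As a minimal illustration of these two families of performance measures, the sketch below computes the MAE of Equation 4 (with $p=1$ replica) and two rank-concordance coefficients in base R; `T_true` and `T_hat` are hypothetical index vectors, not values from this study:

```r
T_true <- c(0.60, 0.25, 0.10, 0.05)  # hypothetical analytical indices
T_hat  <- c(0.55, 0.30, 0.08, 0.02)  # hypothetical estimates

# Mean absolute error (Equation 4 with p = 1).
mae <- mean(abs(T_true - T_hat))

# Rank concordance between the two vectors.
cor(T_true, T_hat, method = "spearman")  # Spearman's rho
cor(T_true, T_hat, method = "kendall")   # Kendall's tau
```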
Here we benchmark the performance of eight different MC-based formulae available to estimate $T_i$ (Table 1). While the list is not exhaustive, they reflect the research conducted on $T_i$ over the last 20 years: from the classic estimators of Saltelli et al. [1], Homma and Saltelli [7], and Jansen [15] up to the new contributions by Janon et al. [16], Glen and Isaacs [17], Azzini and Rosati [33] and Razavi and Gupta [20,39]. In order to reduce the influence of the benchmarking design on the assessment of the estimators' accuracy, we treat the sampling method $\tau$, the underlying model input distribution $\phi$, the number of model runs $N_t$, the test function $\varepsilon$, its dimensionality and degree of non-additivity ($k$, $k_2$, $k_3$) and the performance measure $\delta$ as random parameters. This better reflects the diversity of models and sensitivity settings available to the analyst. By relaxing the dependency of the results on these benchmark parameters¹, we define an unprecedentedly large setting where all formulae can prove their accuracy. We therefore extend Becker [21]'s approach by testing a wider set of Monte Carlo estimators, by exploring a wider range of benchmarking assumptions and by performing a formal SA on these assumptions. The aim is therefore to provide a much more global comparison of available MC estimators than is available in the existing literature, and to investigate how the benchmarking parameters may affect the relative performance of estimators. Such information can help point to estimators that are not only efficient in a particular case study, but efficient and robust across a wide range of practical situations.
¹We refer to the set of benchmarking assumptions as benchmarking parameters or parameters. This is intended to distinguish them from the inputs of each test function generated by the metafunction, which we refer to as inputs.
2 Assessment of the uncertainties in the benchmarking parameters
In this section we formulate the benchmarking parameters as random variables and assess how the performance of the estimators depends on them by performing a sensitivity analysis. In essence this is a sensitivity analysis of sensitivity analyses [42], and a natural extension of a similar uncertainty analysis in a recent work by Becker [21]. The use of global sensitivity analysis tools to better understand the properties of estimators can give insights into how estimators behave in different scenarios, insights that are not available through analytical approaches.
2.1 The setting
The variability in the benchmark settings ($\tau$, $N_t$, $k$, $k_2$, $k_3$, $\phi$, $\varepsilon$, $\delta$) is described by probability distributions (Table 2). We assign uniform distributions (discrete or continuous) to each parameter. In particular, we choose $\tau\sim\mathcal{DU}(1,2)$ to check how the performance of $T_i$ estimators is conditioned by the use of Monte Carlo ($\tau=1$) or quasi-Monte Carlo ($\tau=2$) methods in the creation of the base sample matrices. For $\tau=2$ we use the Sobol' sequence scrambled according to Owen [43] to avoid repeated coordinates at the beginning of the sequence. The total number of model runs and inputs are respectively described as $N_t\sim\mathcal{DU}(10,1000)$ and $k\sim\mathcal{DU}(3,100)$ to explore the performance of the estimators in a wide range of ($N_t$, $k$) combinations. Given the sampling constraints set by the estimators' reliance on either a $B$, $B_A^{(i)}$, $A_B^{(i)}$ or $C_B^{(i)}$ matrix (Table 1), we modify the space defined by ($N_t$, $k$) to a non-rectangular domain (we provide more information on this adjustment in Section 2.2).
Table 1: Formulae to compute $T_i$. $f_0$ and $V(y)$ are estimated according to the original papers. For estimators 2 and 5, $f_0=\frac{1}{N}\sum_{v=1}^{N}f(A)_v$. For estimators 1, 2 and 5, $V(y)=\frac{1}{N}\sum_{v=1}^{N}\left[f(A)_v-f_0\right]^2$ [1, Eq. 4.16; 7, Eqs. 15, 20]. For estimator 3, $f_0=\frac{1}{N}\sum_{v=1}^{N}\frac{f(A)_v+f(A_B^{(i)})_v}{2}$ and $V(y)=\frac{1}{N}\sum_{v=1}^{N}\frac{f(A)_v^2+f(A_B^{(i)})_v^2}{2}-f_0^2$ [16, Eq. 15]. In estimator 4, $\langle f(A)_v\rangle$ is the mean of $f(A)_v$. We use a simplified version of the Glen and Isaacs estimator because spurious correlations are zero by design. As for estimator 7, we refer to it as pseudo-Owen given its use of a $C$ matrix and its identification with Owen [40] in Iooss et al. [41], from where we retrieve the formula. $V(y)$ in estimator 7 is computed as in estimator 3 following Iooss et al. [41], whereas $V(y)$ in estimator 8 is computed as in estimator 1.

No. | Estimator | Author
1 | $\dfrac{\frac{1}{2N}\sum_{v=1}^{N}\left[f(A)_v-f(A_B^{(i)})_v\right]^2}{V(y)}$ | Jansen [15]
2 | $\dfrac{V(y)-\frac{1}{N}\sum_{v=1}^{N}f(A)_v f(A_B^{(i)})_v+f_0^2}{V(y)}$ | Homma and Saltelli [7]
3 | $1-\dfrac{\frac{1}{N}\sum_{v=1}^{N}f(A)_v f(A_B^{(i)})_v-f_0^2}{V(y)}$ | Janon et al. [16], Monod et al. [19]
4 | $1-\dfrac{\frac{1}{N-1}\sum_{v=1}^{N}\left[f(A)_v-\langle f(A)_v\rangle\right]\left[f(A_B^{(i)})_v-\langle f(A_B^{(i)})_v\rangle\right]}{\sqrt{V\left[f(A)_v\right]V\left[f(A_B^{(i)})_v\right]}}$ | Glen and Isaacs [17]
5 | $1-\dfrac{\frac{1}{N}\sum_{v=1}^{N}f(B)_v f(B_A^{(i)})_v-f_0^2}{V(y)}$ | Saltelli et al. [1]
6 | $\dfrac{\sum_{v=1}^{N}\left[f(B)_v-f(B_A^{(i)})_v\right]^2+\left[f(A)_v-f(A_B^{(i)})_v\right]^2}{\sum_{v=1}^{N}\left[f(A)_v-f(B)_v\right]^2+\left[f(B_A^{(i)})_v-f(A_B^{(i)})_v\right]^2}$ | Azzini et al. [18] and Azzini and Rosati [33]
7 | $\dfrac{V(y)-\left[\frac{1}{N}\sum_{v=1}^{N}\left(f(B)_v-f(C_B^{(i)})_v\right)\left(f(B_A^{(i)})_v-f(A)_v\right)\right]}{V(y)}$ | pseudo-Owen
8 | $\dfrac{E_{\mathbf{x}^*_{\sim i}}\left[\gamma_{\mathbf{x}^*_{\sim i}}(h_i)\right]+E_{\mathbf{x}^*_{\sim i}}\left[C_{\mathbf{x}^*_{\sim i}}(h_i)\right]}{V(y)}$ | Razavi and Gupta [20,39] (see SM)
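As an illustration of how the formulae in Table 1 translate into code, the sketch below implements estimators 1 (Jansen) and 3 (Janon/Monod) in base R and checks them on a hypothetical additive toy model whose analytical total-order index for $x_1$ is 0.8. The toy model and sample sizes are ours, not from the study:

```r
# Estimator 1 (Jansen); V(y) estimated from the A runs.
jansen <- function(yA, yAB) {
  Vy <- mean((yA - mean(yA))^2)
  0.5 * mean((yA - yAB)^2) / Vy
}

# Estimator 3 (Janon/Monod); f0 and V(y) pooled over A and A_B^(i).
janon_monod <- function(yA, yAB) {
  f0 <- mean((yA + yAB) / 2)
  Vy <- mean((yA^2 + yAB^2) / 2) - f0^2
  1 - (mean(yA * yAB) - f0^2) / Vy
}

# Hypothetical toy model y = x1 + 0.5 * x2; T_1 = 0.8 analytically.
set.seed(1)
N <- 10^4
A <- matrix(runif(N * 2), ncol = 2)
B <- matrix(runif(N * 2), ncol = 2)
AB1 <- A; AB1[, 1] <- B[, 1]          # A_B^(1): column 1 taken from B
f <- function(X) X[, 1] + 0.5 * X[, 2]
jansen(f(A), f(AB1))                  # approx. 0.8
janon_monod(f(A), f(AB1))             # approx. 0.8
```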
For $\phi$ we set $\phi\sim\mathcal{DU}(1,8)$ to ensure an adequate representation of the most common shapes in the $(0,1)^k$ domain. Besides the normal distribution truncated at $(0,1)$ and the uniform distribution, we also take into account four beta distributions parametrized with distinct $\alpha$ and $\beta$ values and a logitnormal distribution (Fig. 1a). The aim is to check the response of the estimators under a wide range of probability distributions, including U-shaped distributions and distributions with different degrees of skewness.
We link each distribution in Fig. 1a to an integer value from 1 to 7. For instance, if $\phi=1$, the joint probability distribution of the model inputs is described as $p(x_1,\ldots,x_k)=U(0,1)^k$. If $\phi=8$, we create a vector $\boldsymbol{\phi}=\{\phi_1,\phi_2,\ldots,\phi_i,\ldots,\phi_k\}$ by randomly sampling the seven distributions in Fig. 1a, and use the $i$-th distribution in the vector to describe the uncertainty of the $i$-th input. This last case examines the behavior of the estimators when several distributions are used to characterize the uncertainty in the model input space.
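A minimal sketch of how $\phi$ can be operationalized in base R follows: each code maps to a quantile function applied column-wise to a sample in $(0,1)$; the truncated normal and logitnormal quantiles are written out explicitly. The helper names are ours, not from the study's code:

```r
q_truncnorm <- function(p, mean = 0.5, sd = 0.15, a = 0, b = 1) {
  Fa <- pnorm(a, mean, sd); Fb <- pnorm(b, mean, sd)
  qnorm(Fa + p * (Fb - Fa), mean, sd)   # N(0.5, 0.15) truncated at (0, 1)
}
q_logitnormal <- function(p) plogis(qnorm(p, 0, 3.16))

quantile_funs <- list(
  function(p) p,                  # 1: U(0, 1)
  q_truncnorm,                    # 2: truncated normal
  function(p) qbeta(p, 8, 2),     # 3: Beta(8, 2)
  function(p) qbeta(p, 2, 8),     # 4: Beta(2, 8)
  function(p) qbeta(p, 2, 0.8),   # 5: Beta(2, 0.8)
  function(p) qbeta(p, 0.8, 2),   # 6: Beta(0.8, 2)
  q_logitnormal                   # 7: Logitnormal(0, 3.16)
)

# phi = 8: draw one of the seven distributions per input at random.
set.seed(2)
k <- 4
phi <- sample(1:7, size = k, replace = TRUE)
X <- matrix(runif(10 * k), ncol = k)
for (i in 1:k) X[, i] <- quantile_funs[[phi[i]]](X[, i])
```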
Table 2: Summary of the parameters and their distributions. DU stands for discrete uniform.

Parameter | Description | Distribution
$\tau$ | Sampling method | $\mathcal{DU}(1,2)$
$N_t$ | Total number of model runs | $\mathcal{DU}(10,1000)$
$k$ | Number of model inputs | $\mathcal{DU}(3,100)$
$\phi$ | Probability distribution of the model inputs | $\mathcal{DU}(1,8)$
$\varepsilon$ | Randomness in the test function | $\mathcal{DU}(1,200)$
$k_2$ | Fraction of pairwise interactions | $U(0.3,0.5)$
$k_3$ | Fraction of three-wise interactions | $U(0.1,0.3)$
$\delta$ | Selection of the performance measure | $\mathcal{DU}(1,2)$
2.1.1 The test function
The parameter $\varepsilon$ operationalizes the randomness in the form and execution of the test function. Our test function is an extended version of Becker [21]'s metafunction, which randomly combines $p$ univariate functions in a multivariate function of dimension $k$. Here we consider the 10 univariate functions listed in Fig. 1b, which represent common responses observed in physical systems and in classic SA test functions (see Becker [21] for a discussion on this point). We note that an alternative approach would be to construct orthogonal basis functions which could allow analytical evaluation of the true sensitivity indices for each generated function; however, this extension is left for future work.
We construct the test function as follows:
1. Let us consider a sample matrix such as
$$M=\begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1i} & \cdots & x_{1k}\\
x_{21} & x_{22} & \cdots & x_{2i} & \cdots & x_{2k}\\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots\\
x_{v1} & x_{v2} & \cdots & x_{vi} & \cdots & x_{vk}\\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots\\
x_{N1} & x_{N2} & \cdots & x_{Ni} & \cdots & x_{Nk}
\end{pmatrix}\qquad(5)$$
where every point $\mathbf{x}_v=(x_{v1},x_{v2},\ldots,x_{vk})$ represents a given combination of values for the $k$ inputs and $x_i$ is a model input whose distribution is defined by $\phi$.
2. Let $u=\{u_1,u_2,\ldots,u_k\}$ be a $k$-length vector formed by randomly sampling with replacement the ten functions in Fig. 1b. The $i$-th function in $u$ is then applied to the $i$-th model input: for instance, if $k=4$ and $u=\{u_3,u_4,u_8,u_1\}$, then $f_3(x_1)=\frac{e^{x_1}-1}{e-1}$, $f_4(x_2)=(10-\frac{1}{1.1})^{-1}(x_2+0.1)^{-1}$, $f_8(x_3)=\sin(2\pi x_3)^2$, and $f_1(x_4)=x_4^3$. The elements in $u$ thus represent the first-order effects of each model input.
Figure 1: The metafunction approach. a) Probability distributions included in $\phi$: $U(0,1)$, $N_T(0.5,0.15,0,1)$, Beta(8,2), Beta(2,8), Beta(2,0.8), Beta(0.8,2), Logitnormal(0,3.16); $N_T$ stands for truncated normal distribution. b) Univariate functions included in the metafunction: $f_1(x)=x^3$ (cubic), $f_2(x)=1$ if $x>1/2$, otherwise 0 (discontinuous), $f_3(x)=(e^x-1)/(e-1)$ (exponential), $f_4(x)=(10-\frac{1}{1.1})^{-1}(x+0.1)^{-1}$ (inverse), $f_5(x)=x$ (linear), $f_6(x)=0$ (no effect), $f_7(x)=4(x-0.5)^2$ (non-monotonic), $f_8(x)=\sin(2\pi x)^2$ (periodic), $f_9(x)=x^2$ (quadratic), $f_{10}(x)=\cos(x)$ (trigonometric).
3. Let $V$ be a $(n,2)$ matrix, for $n=\frac{k!}{2!(k-2)!}$, the number of pairwise combinations between the $k$ inputs of the model. Each row in $V$ thus specifies an interaction between two columns in $M$. In the case of $k=4$ and the same elements in $u$ as defined in the previous example,
$$V=\begin{pmatrix}1&2\\1&3\\1&4\\2&3\\2&4\\3&4\end{pmatrix}\qquad(6)$$
e.g., the first row promotes $f_3(x_1)\cdot f_4(x_2)$, the second row $f_3(x_1)\cdot f_8(x_3)$, and so on until the $n$-th row. In order to follow the sparsity-of-effects principle (most variations in a given model output should be explained by low-order interactions [44]), the metafunction activates only a fraction of these effects: it randomly samples $\lfloor k_2 n\rfloor$ rows from $V$ and computes the corresponding interactions in $M$. $\lfloor k_2 n\rfloor$ is thus the number of pairwise interactions present in the function. We make $k_2$ an uncertain parameter described as $k_2\sim U(0.3,0.5)$ in order to randomly activate only between 30% and 50% of the available second-order effects in $M$.
4. Same as before, but for third-order effects: let $W$ be a $(m,3)$ matrix, for $m=\frac{k!}{3!(k-3)!}$, the number of three-wise combinations between the $k$ inputs in $M$. For $k=4$ and $u$ as before,
$$W=\begin{pmatrix}1&2&3\\1&2&4\\1&3&4\\2&3&4\end{pmatrix}\qquad(7)$$
e.g. the first row leads to $f_3(x_1)\cdot f_4(x_2)\cdot f_8(x_3)$, and so on until the $m$-th row. The metafunction then randomly samples $\lfloor k_3 m\rfloor$ rows from $W$ and computes the corresponding interactions in $M$. $\lfloor k_3 m\rfloor$ is therefore the number of three-wise interaction terms in the function. We also make $k_3$ an uncertain parameter described as $k_3\sim U(0.1,0.3)$ to activate only between 10% and 30% of all third-order effects in $M$. Note that $k_2>k_3$ because third-order effects tend to be less dominant than second-order effects (Table 2).
5. Three vectors of coefficients ($\alpha$, $\beta$, $\gamma$) of length $k$, $n$ and $m$ are defined to represent the weights of the first, second and third-order effects respectively. These coefficients are generated by sampling from a mixture of two normal distributions, $\Psi=0.3\,N(0,5)+0.7\,N(0,0.5)$. This coerces the metafunction into replicating the Pareto [45] principle (around 80% of the effects are due to 20% of the parameters), found to widely apply in SA [1,46].
6. The metafunction can thus be formalized as
$$y=\sum_{i=1}^{k}\alpha_i f_{u_i}\!\left(\phi_i(x_i)\right)+\sum_{i=1}^{\lfloor k_2 n\rfloor}\beta_i f_{u_{V_{i,1}}}\!\left(\phi_i(x_{V_{i,1}})\right)f_{u_{V_{i,2}}}\!\left(\phi_i(x_{V_{i,2}})\right)+\sum_{i=1}^{\lfloor k_3 m\rfloor}\gamma_i f_{u_{W_{i,1}}}\!\left(\phi_i(x_{W_{i,1}})\right)f_{u_{W_{i,2}}}\!\left(\phi_i(x_{W_{i,2}})\right)f_{u_{W_{i,3}}}\!\left(\phi_i(x_{W_{i,3}})\right).\qquad(8)$$
Note that there is randomness in the sampling of $\phi$, the univariate functions in $u$ and the coefficients in ($\alpha$, $\beta$, $\gamma$). The parameter $\varepsilon$ assesses the influence of this randomness by fixing the starting point of the pseudo-random number sequence used for sampling the parameters just mentioned. We use $\varepsilon\sim\mathcal{DU}(1,200)$ to ensure that the same seed does not overlap with the same value of $N_t$, $k$ or any other parameter, an issue that might introduce determinism in a process that should be stochastic. In Figs. S2–S3 we show the type of $T_i$ indices generated by this metafunction.
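The construction above can be condensed into a short base-R sketch. It is a minimal illustration with first- and second-order terms only (third-order terms follow the same pattern with `combn(k, 3)`); the function list matches Fig. 1b, the seed argument plays the role of $\varepsilon$, and the input matrix is assumed to be already transformed according to $\phi$:

```r
f_list <- list(
  function(x) x^3,                              # f1  cubic
  function(x) as.numeric(x > 0.5),              # f2  discontinuous
  function(x) (exp(x) - 1) / (exp(1) - 1),      # f3  exponential
  function(x) (10 - 1 / 1.1)^-1 * (x + 0.1)^-1, # f4  inverse
  function(x) x,                                # f5  linear
  function(x) 0 * x,                            # f6  no effect
  function(x) 4 * (x - 0.5)^2,                  # f7  non-monotonic
  function(x) sin(2 * pi * x)^2,                # f8  periodic
  function(x) x^2,                              # f9  quadratic
  function(x) cos(x)                            # f10 trigonometric
)

# Coefficient mixture Psi = 0.3 N(0, 5) + 0.7 N(0, 0.5).
r_psi <- function(n) ifelse(runif(n) < 0.3, rnorm(n, 0, 5), rnorm(n, 0, 0.5))

metafunction <- function(X, k2 = 0.4, epsilon = 1) {
  set.seed(epsilon)                     # fixes the random form of the function
  k <- ncol(X)
  u <- sample(1:10, k, replace = TRUE)  # one univariate function per input
  Fx <- sapply(1:k, function(i) f_list[[u[i]]](X[, i]))
  V <- t(combn(k, 2))                   # all pairwise combinations
  active <- V[sample(nrow(V), floor(k2 * nrow(V))), , drop = FALSE]
  alpha <- r_psi(k); beta <- r_psi(nrow(active))
  y <- drop(Fx %*% alpha)               # first-order terms
  for (r in seq_len(nrow(active)))      # active second-order terms
    y <- y + beta[r] * Fx[, active[r, 1]] * Fx[, active[r, 2]]
  y
}

y <- metafunction(matrix(runif(100 * 4), ncol = 4))
```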
Finally, we describe the parameter $\delta$ as $\delta\sim\mathcal{DU}(1,2)$. If $\delta=1$, we compute the Kendall $\tau$-b correlation coefficient between $\hat{r}$ and $r$, the estimated and the "true" ranks calculated from $\hat{T}$ and $T$ respectively. This aims at evaluating how well the estimators in Table 1 rank all model inputs. If $\delta=2$, we compute the Pearson correlation between $r$ and $\hat{r}$ after transforming the ranks to Savage scores [38]. This setting examines the performance of the estimators when the analyst is interested in ranking only the most important model inputs. Savage scores are given as
$$Sa_i=\sum_{j=i}^{k}\frac{1}{j},\qquad(9)$$
where $j$ is the rank assigned to the $j$-th element of a vector of length $k$. If $x_1>x_2>x_3$, the Savage scores would then be $Sa_1=1+\frac{1}{2}+\frac{1}{3}$, $Sa_2=\frac{1}{2}+\frac{1}{3}$, and $Sa_3=\frac{1}{3}$. The parameter $\delta$ thus assesses the accuracy of the estimators in properly ranking the model inputs, in other words, when they are used in a factor prioritization setting [1].
In order to also examine how accurate the estimators are in approaching the "true" indices, we run an extra round of simulations with the MAE as the only performance measure, which we compute as
$$\mathrm{MAE}=\frac{\sum_{i=1}^{k}|T_i-\hat{T}_i|}{k}.\qquad(10)$$
Note that, unlike Equation 4, Equation 10 does not make use of replicas. This is because the effect of the sampling is averaged out in our design by simultaneously varying all parameters across many different simulations.
2.2 The execution of the algorithm
We examine how sensitive the performance of total-order estimators is to the uncertainty in the benchmark parameters $\tau$, $N_t$, $k$, $k_2$, $k_3$, $\phi$, $\varepsilon$, $\delta$ by means of a global SA. We create an $A$, $B$ and $k-1$ $A_B^{(i)}$ matrices, each of dimension $(2^{11},k)$, using Sobol' quasi-random numbers. In these matrices each column is a benchmark parameter described with the probability distributions of Table 2 and each row is a simulation with a specific combination of $\tau$, $N_t$, $k$, $\ldots$ values. Note that we use $k-1$ $A_B^{(i)}$ matrices because we group $N_t$ and $k$ and treat them like a single benchmark parameter given their correlation (see below).
Our algorithm runs rowwise over the $A$, $B$ and $k-1$ $A_B^{(i)}$ matrices, for $v=1,2,\ldots,18{,}432$ rows. In the $v$-th row it does the following:
1. It creates five $(N_{t_v},k_v)$ matrices using the sampling method defined by $\tau_v$. The need for these five sub-matrices responds to the five specific sampling designs requested by the estimators of our study (Table 1). We use these matrices to compute the vector of estimated indices $\hat{T}_i$ for each estimator:
(a) An $A$ matrix and $k_v$ $A_B^{(i)}$ matrices, each of size $(N_v,k_v)$, $N_v=\lfloor\frac{N_{t_v}}{k_v+1}\rfloor$ (estimators 1–4 in Table 1).
(b) An $A$, $B$ and $k_v$ $A_B^{(i)}$ matrices, each of size $(N_v,k_v)$, $N_v=\lfloor\frac{N_{t_v}}{k_v+2}\rfloor$ (estimator 5 in Table 1).
(c) An $A$, $B$ and $k_v$ $A_B^{(i)}$ and $B_A^{(i)}$ matrices, each of size $(N_v,k_v)$, $N_v=\lfloor\frac{N_{t_v}}{2k_v+2}\rfloor$ (estimator 6 in Table 1).
(d) An $A$, $B$ and $k_v$ $B_A^{(i)}$ and $C_B^{(i)}$ matrices, each of size $(N_v,k_v)$, $N_v=\lfloor\frac{N_{t_v}}{2k_v+2}\rfloor$ (estimator 7 in Table 1).
(e) A matrix formed by $N_v$ stars, each of size $k_v(\frac{1}{\Delta h}-1)+1$. Given that we set $\Delta h$ at 0.2 (see Supplementary Materials), $N_v=\lfloor\frac{N_{t_v}}{4k+1}\rfloor$ (estimator 8 in Table 1).
The different sampling designs and the value of $k_v$ constrain the total number of runs $N_{t_v}$ that can be allocated to each estimator. Furthermore, given the probability distributions selected for $N_t$ and $k$ (Table 2), specific combinations of ($N_{t_v}$, $k_v$) lead to $N_v\leq 1$, which is computationally unfeasible. To minimize these issues we force the comparison between estimators to approximate the same $N_{t_v}$ value. Since the sampling design structure of Razavi and Gupta is the most constraining, we use $N_v=\frac{2(4k+1)}{k+1}$ (for estimators 1–4), $N_v=\frac{2(4k+1)}{k+2}$ (for estimator 5) and $N_v=\frac{2(4k+1)}{2k+2}$ (for estimators 6–7) when $N_v\leq 1$ in the case of Razavi and Gupta. This compels all estimators to explore a very similar portion of the ($N_t$, $k$) space, but $N_t$ and $k$ become correlated, which contradicts the requirement of independent inputs characterizing variance-based sensitivity indices [1]. This is why we treat ($N_t$, $k$) as a single benchmark parameter in the SA; a sketch of the resulting row allocation follows below.
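The sketch reproduces the row-allocation bookkeeping of designs (a)–(e) in base R, with $\Delta h=0.2$ for the Razavi and Gupta design; it illustrates the constraint only, not the exact correction applied when $N_v\leq 1$:

```r
rows_per_design <- function(Nt, k) {
  c(estimators_1_to_4 = floor(Nt / (k + 1)),      # A plus k A_B^(i) matrices
    estimator_5       = floor(Nt / (k + 2)),      # A, B plus k A_B^(i)
    estimators_6_7    = floor(Nt / (2 * k + 2)),  # A, B plus k A_B^(i) and B_A^(i)
    estimator_8       = floor(Nt / (4 * k + 1)))  # stars of size k(1/0.2 - 1) + 1
}
rows_per_design(Nt = 500, k = 10)  # e.g. 45, 41, 22 and 12 rows respectively
```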
2. It creates a sixth matrix, formed by an $A$ and $k_v$ $A_B^{(i)}$ matrices, each of size $(2^{11},k_v)$. We use this sub-matrix to compute the vector of "true" indices $T$, which could not be calculated analytically due to the wide range of possible functional forms created by the metafunction. Following Becker [21], we assume that a fairly accurate approximation to $T$ can be achieved with a large Monte Carlo estimation.
3. The distribution of the model inputs in these six sample matrices is defined by $\phi_v$.
4. The metafunction runs over these six matrices simultaneously, with its functional form and degree of active second and third-order effects set by $\varepsilon_v$, $k_{2_v}$ and $k_{3_v}$ respectively.
5. It computes the estimated sensitivity indices $\hat{T}_v$ for each estimator and the "true" sensitivity indices $T_v$ using the Jansen [15] estimator, which is currently best practice in SA.
6. It checks the performance of the estimators. This is done in two ways:
(a) If $\delta=1$, we compute the correlation between $\hat{r}_v$ and $r_v$ (obtained respectively from $\hat{T}_v$ and $T_v$) with Kendall tau, and if $\delta=2$ we compute the correlation between $\hat{r}_v$ and $r_v$ on Savage scores. The model output in both cases is the correlation coefficient $r$, with higher $r$ values indicating a better performance in properly ranking the model inputs.
(b) We compute the MAE between $\hat{T}_v$ and $T_v$. In this case the model output is the MAE, with lower values indicating a better performance in approaching the "true" total-order indices.
3 Results
3.1 Uncertainty analysis
Under a factor prioritization setting (i.e. when the aim is to rank the model inputs in terms of their contribution to the model output variance), the most accurate estimators are Jansen, Razavi and Gupta, Janon/Monod and Azzini and Rosati. The distribution of $r$ values (the correlation between estimated and "true" ranks) when these estimators are used is highly negatively skewed, with median values of $\approx 0.9$. Glen and Isaacs, Homma and Saltelli, Saltelli and pseudo-Owen lag behind and display median $r$ values of $\approx 0.35$, with pseudo-Owen ranking last ($r\approx 0.2$). The range of values obtained with these formulae is much more spread out and includes a significant number of negative $r$ values, suggesting that they overturned the true ranks in several simulations (Figs. 2a, S4).
Figure 2: Boxplots summarizing the results of the simulations. a) Correlation coefficient between $\hat{r}$ and $r$, the vectors of estimated and "true" ranks. b) Mean Absolute Error (MAE).
When the goal is to approximate the "true" indices, Janon/Monod, Jansen and Azzini and Rosati also offer the best performance. The median MAE obtained with these estimators is generally smaller than Glen and Isaacs' and pseudo-Owen's, and the distribution of MAE values is much narrower than that obtained with Homma and Saltelli, Saltelli or Razavi and Gupta. These three estimators are the least accurate and produce MAE values larger than $10^2$ in several simulations (Fig. 2b). The volatility of Razavi and Gupta under the MAE is reflected in the numerous outliers produced and sharply contrasts with its very good performance in a factor prioritization setting (Fig. 2a).
To obtain a finer insight into the structure of these results, we plot the total number of model runs $N_t$ against the function dimensionality $k$ (Fig. 3). This maps the performance of the estimators in the input space formed by all possible combinations of $N_t$ and $k$ given the specific design constraints of each formula. Under a factor prioritization setting, almost all estimators perform reasonably well at a very small dimensionality ($k\leq 10$, $r>0.7$), regardless of the total number of model runs available. However, some differences unfold at higher dimensions: Saltelli, Homma and Saltelli, Glen and Isaacs and especially pseudo-Owen swiftly become inaccurate for $k>10$, even with large values of $N_t$. Azzini and Rosati display a very good performance overall except at the upper ($N_t$, $k$) boundary, where most of the orange dots concentrate. The estimators of Jansen, Janon/Monod and Razavi and Gupta rank the model inputs almost flawlessly regardless of the region explored in the ($N_t$, $k$) domain (Fig. 3a).
With regard to the MAE, Janon/Monod, Jansen and Azzini and Rosati maintain their high performance regardless of the ($N_t$, $k$) region explored. The accuracy of Razavi and Gupta, however, drops at the upper-leftmost part of the ($N_t$, $k$) boundary, where most of the largest MAE scores are located (MAE $>10$). In the case of Saltelli and Homma and Saltelli, the largest MAE values concentrate in the region of small $k$ regardless of the total number of model runs, a domain in which they achieved a high performance when the focus was on properly ranking the model inputs.
The presence of a non-negligible proportion of model runs with $r<0$ suggests that some estimators significantly overturned the true ranks (Figs. 3a, S4). To better examine this phenomenon, we re-plot Fig. 3b with just the simulations yielding $r<0$ (Fig. S5). We observe that $r<0$ values not only appear in the region of small $N_t$, a foreseeable miscalculation derived from allocating an insufficient number of model runs to each model input: they also emerge at a relatively large $N_t$ and low $k$ in the case of pseudo-Owen, Saltelli and Homma and Saltelli. The Saltelli estimator actually concentrates most of the simulations with the lowest negative $r$ values in the $k<10$ zone (Fig. S5). This suggests that rank reversing is not an artifact of our study design as much as a by-product of the volatility of these estimators when stressed by the sources of computational uncertainty listed in Table 2. Such strain may lead these estimators to produce a significant fraction of negative indices or indices beyond 1, thus effectively promoting $r<0$.
We calculate the proportion of $T_i<0$ and $T_i>1$ in each simulation that yielded $r<0$. In the case of Glen and Isaacs and Homma and Saltelli, $r<0$ values are caused by the production of a large proportion of $T_i<0$ (25%–75%, the $x$ axis in Fig. 4). Pseudo-Owen and Saltelli suffer this bias too, and in several simulations they also generate a large proportion of $T_i>1$ (up to 100% of the model inputs, the $y$ axis in Fig. 4). The production of $T_i<0$ and $T_i>1$ is caused by numerical errors and fostered by the values generated at the numerator of Equation 3: $T_i<0$ may derive either from $E_{\mathbf{x}_{\sim i}}\!\left[V_{x_i}(y\mid\mathbf{x}_{\sim i})\right]<0$ (e.g. Homma and Saltelli and pseudo-Owen) or from $V_{\mathbf{x}_{\sim i}}\!\left[E_{x_i}(y\mid\mathbf{x}_{\sim i})\right]>V(y)$ (e.g. Saltelli), whereas $T_i>1$ derives from $E_{\mathbf{x}_{\sim i}}\!\left[V_{x_i}(y\mid\mathbf{x}_{\sim i})\right]>V(y)$ (e.g. Homma and Saltelli and pseudo-Owen) or $V_{\mathbf{x}_{\sim i}}\!\left[E_{x_i}(y\mid\mathbf{x}_{\sim i})\right]<0$ (e.g. Saltelli).
Figure 3: Number of runs $N_t$ against the function dimensionality $k$. Each dot is a simulation with a specific combination of the benchmark parameters in Table 2. The greener (blacker) the color, the better (worse) the performance of the estimator. a) Accuracy of the estimators when the goal is to properly rank the model inputs, e.g. a factor prioritization setting. b) Accuracy of the estimators when the goal is to approach the "true" total-order indices.
Figure 4: Scatterplot of the proportion of $T_i<0$ against the proportion of $T_i>1$, mapped against the model output $r$. Each dot is a simulation. Only simulations with $r<0$ are displayed.
To better examine the efficiency of the estimators, we summarize their performance as a function of the number of runs available per model input, $N_t/k$ [21] (Figs. 5, S6). This information is especially relevant for taking an educated decision on which estimator to use in the context of a high-dimensional, computationally expensive model. Even when the budget of runs per input is low [$(N_t/k)\in[2,20]$], Razavi and Gupta, Jansen and Janon/Monod are very good at properly ranking model inputs ($r\approx 0.9$), and are followed very closely by Azzini and Rosati ($r\approx 0.8$). Saltelli, Homma and Saltelli and Glen and Isaacs come after ($r\approx 0.3$), with pseudo-Owen scoring last ($r\approx 0.2$). When the $N_t/k$ ratio is increased, all estimators improve their ranking accuracy and some quickly reach the asymptote: this is the case of Razavi and Gupta, Janon/Monod and Jansen, whose performance becomes almost flawless from $(N_t/k)\in[40,60]$ onwards, and of Azzini and Rosati, which reaches its optimum at $(N_t/k)\in[60,80]$. The accuracy of the other estimators does not seem to fully stabilize within the range of ratios examined. In the case of Homma and Saltelli and Saltelli, their performance oscillates before plummeting at $(N_t/k)\in[200,210]$, $(N_t/k)\in[240,260]$ and $(N_t/k)\in[260,280]$ due to several simulations yielding large negative $r$ values (Fig. 5a).
Janon/Monod and Jansen are also the most efficient estimators when the MAE is the measure of choice, followed closely by Azzini and Rosati, Razavi and Gupta and Glen and Isaacs. Saltelli and Homma and Saltelli gain accuracy at higher $N_t/k$ ratios, yet their precision diminishes all the same from $(N_t/k)\in[200,210]$ onwards (Fig. 5b).
Figure 5: Scatterplot of the model output $r$ against the number of model runs allocated per model input ($N_t/k$). See Fig. S6 for a visual display of all simulations and Fig. S7 for an assessment of the number of model runs that each estimator has in each $N_t/k$ compartment.
3.2 Sensitivity analysis
When the aim is to rank the model inputs, the selection of the performance measure ($\delta$) has the highest first-order effect on the accuracy of the estimators (Fig. 6a). The parameter $\delta$ is responsible for between 20% (Azzini and Rosati) and 30% (Glen and Isaacs) of the variance in the final $r$ value. On average, all estimators perform better when the ranking is conducted on Savage scores ($\delta=2$), i.e. when the focus is on ranking the most important model inputs only (Figs. S8–S15). As for the distribution of the model inputs ($\phi$), it has a first-order effect on the accuracy of Azzini and Rosati ($\approx 10$%), Jansen and Janon/Monod ($\approx 15$%) and Razavi and Gupta ($\approx 20$%), regardless of whether the aim is factor prioritization ($r$) or approaching the "true" indices (MAE). The performance of these estimators drops perceptibly when the model inputs are distributed as Beta(8,2) or Beta(2,8) ($\phi=3$ and $\phi=4$, Figs. S8–S23), suggesting that they may be especially stressed by skewed distributions. The selection of random or quasi-random numbers during the construction of the sample matrix ($\tau$) also directly conditions the accuracy of several estimators. If the aim is to approach the "true" indices (MAE), $\tau$ conveys from 17% (Azzini and Rosati) to $\approx 30$% (Glen and Isaacs) of the model output variance, with all estimators except Razavi and Gupta performing better on quasi-random numbers ($\tau=2$, Figs. S16–S23). In a factor prioritization setting, $\tau$ is mostly influential through interactions. Interestingly, the proportion of active second and third-order interactions ($k_2$, $k_3$) does not alter the performance of any estimator in any of the settings examined.
To better understand the structure of the sensitivities, we compute Sobol' indices after grouping individual parameters in three clusters, which we define based on their commonalities: the first group includes ($\delta$, $\tau$) and reflects the influence of those parameters that can be defined by the sensitivity analyst during the setting of the benchmark exercise. The second combines ($\varepsilon$, $k_2$, $k_3$, $\phi$) and examines the overall impact of the model functional form, referred to as $f(\mathbf{x})$, which is often beyond the analyst's grasp. Finally, the third group includes ($N_t$, $k$) only and assesses the influence of the sampling design on the accuracy of the estimators (we assume that the total number of model runs, besides being conditioned by the computing resources at hand, is also partially determined by the joint effect of the model dimensionality and the use of either a $B$, $A_B^{(i)}$, $B_A^{(i)}$ or $C_B^{(i)}$ matrix) (Fig. 6b).
The uncertainty in the functional form of the model [$f(\mathbf{x})$] is responsible for approximately 20% of the variance in the performance of Azzini and Rosati, Janon/Monod or Jansen in a factor prioritization setting. For Glen and Isaacs, Homma and Saltelli, pseudo-Owen or Saltelli, $f(\mathbf{x})$ is influential only through interactions with the other clusters. When the MAE is the performance measure of interest, $f(\mathbf{x})$ has a much stronger influence on the accuracy of the estimators than the couple ($N_t$, $k$), especially in the case of Glen and Isaacs ($\approx 40$%). In any case, the accuracy of the estimators is significantly conditioned by interactions between the benchmark parameters. The sum of all individual $S_i$ indices plus the $S_i$ index of the ($N_t$, $k$) cluster only explains from $\approx 45$% (Saltelli) to $\approx 70$% (Glen and Isaacs) of the estimators' variance in ranking the model inputs, and from $\approx 24$% (pseudo-Owen) to $\approx 60$% (Razavi and Gupta) of the variance in approaching the "true" indices.
Figure 6: Sobol' indices. a) Individual parameters. b) Clusters of parameters. The cluster $f(\mathbf{x})$ includes all parameters that describe the uncertainty in the functional form of the model ($\varepsilon$, $k_2$, $k_3$, $\phi$). $N_t$ and $k$ are assessed simultaneously due to their correlation. Note that the MAE facet does not include the group ($\delta$, $\tau$) because $\delta$ (the performance measure used) is no longer an uncertain parameter in this setting.
4 Discussion and conclusions
Here we design an eight-dimensional setting in which variance-based total-order estimators can confront and prove their value in an unparalleled range of SA scenarios. By randomizing the parameters that condition their performance, we obtain a comprehensive picture of the advantages and disadvantages of each estimator and identify which particular benchmark factors make them more prone to error. Our work thus provides a thorough empirical assessment of state-of-the-art total-order estimators and contributes to defining best practices in variance-based SA. The study also aligns with previous works focused on testing the robustness of the tools available to sensitivity analysts, a line of inquiry that can be described as a sensitivity analysis of a sensitivity analysis (SA of SA) [42].
Our results support the assumption that the scope of previous benchmark studies is limited by the plethora of non-unique choices taken during the setting of the analysis [21]. We have observed that almost all decisions have a non-negligible effect: from the selection of the sampling method to the choice of the performance measure, the design prioritized by the analyst can influence the performance of the estimator in a non-obvious way, namely through interactions. The importance of non-additivities in conditioning performance suggests that the benchmarking of sensitivity estimators should no longer rely on statistical designs that change one parameter at a time (usually the number of model runs and, more rarely, the test function [14,16,18,20,23,33,39,40,42]). Such a setting reduces the uncertain space to a minimum and misses the effects that interactions between the benchmark parameters have on the final accuracy of the estimator. If global SA is the recommended practice to fully explore the uncertainty space of models, sensitivity estimators, being algorithms themselves, should be likewise validated [42].
Our approach also compensates for the scarcity of studies on the theoretical properties of estimators in the sensitivity analysis literature (see for instance [15,47]), and allows a more detailed examination of their performance than theoretical comparisons. Empirical studies like ours mirror the numerical character of sensitivity analysis when the indices cannot be analytically calculated, which is most of the time in "real-world" mathematical modeling.
Two recommendations emerge from our work: the estimators by Razavi and Gupta, Jansen, Janon/Monod or Azzini and Rosati should be preferred when the aim is to rank the model inputs. Jansen, Janon/Monod or Azzini and Rosati should also be prioritized if the goal is to estimate the "true" total-order indices. The drop in performance of Razavi and Gupta in the second setting may be explained by a bias at lower sample sizes, i.e. a consistent over-estimation of all total-order indices. This is because their estimator relies on a constant-mean assumption whose validity degrades with larger values of $\Delta h$ [20,39]. In order to remove this bias, $\Delta h$ should take very small values (e.g., $\Delta h=0.01$), which may not be computationally feasible. Since the direction of this bias is the same for all parameters, it only affects the calculation of the "true" total-order indices, not the capacity of the estimator to properly rank the model inputs.
It is also worth stating that Razavi and Gupta's is the only estimator studied here that requires the analyst to define a tuning parameter, $\Delta h$. In this paper we have set $\Delta h=0.2$ after some preliminary trials with the estimator; other works have used different values (e.g. $\Delta h=0.002$, $\Delta h=0.1$, $\Delta h=0.3$ [20,21,39]). Selecting the most appropriate value for a given tuning parameter is not an obvious choice, and this uncertainty can make an estimator volatile, as shown by Puy et al. [42] in the case of the PAWN index.
The fact that Glen and Isaacs, Homma and Saltelli, Saltelli and pseudo-Owen do not perform as well in properly ranking the model inputs and approaching the "true" total-order indices may be partially explained by their less efficient computation of elementary effects: by allowing the production of negative terms in the numerator, these estimators also permit the production of negative total-order indices, thus leading to biased rankings or sensitivity indices. In the case of Saltelli, the use of a $B$ matrix in the numerator and an $A$ matrix in the denominator exacerbates its volatility (Table 1, No. 5). Such inconsistency was corrected in Saltelli et al. [23].
The consistent robustness of Jansen, Janon/Monod and Azzini and Rosati makes their sensitivity to the uncertain parameters studied here almost negligible. They are already highly optimized estimators with not much room for improvement. Most of their performance is conditioned by the first and total-order effects of the model form jointly with the underlying probability distributions ($f(\mathbf{x})$ in Fig. 6b), as well as by their sampling design ($N_t$, $k$), which are in any case beyond the analyst's control. As for the rest, their accuracy might be enhanced by allocating a larger number of model runs per input (if computationally affordable) and, especially in the case of Homma and Saltelli, Saltelli and Glen and Isaacs, by restricting their use to low-dimensional models ($k<10$) and sensitivity settings that only require ranking the most important parameters (a restricted factor prioritisation setting [1]). Nevertheless, their substantial volatility is considerably driven by non-additivities, a combination that makes them hard to tame and should raise caution about their use in any modeling exercise.
Our results slightly differ from Becker [21]'s, who observed that Jansen outperformed Janon/Monod under a factor prioritization setting. We did not find any significant difference between these estimators. Although our metafunction approach is based on Becker [21]'s, our study tests the accuracy of the estimators in a larger uncertain space, as we also account for the stress introduced by changes in the sampling method $\tau$, the underlying probability distributions $\phi$ or the performance measure selected $\delta$. These differences may account for the slightly different results obtained between the two papers.
Our analysis can be extended to other sensitivity estimators (i.e. moment-independent measures such as entropy-based ones [48], the $\delta$-measure [49] or the PAWN index [50,51]). Moreover, it holds potential to be used as a standard crash test every time a new sensitivity estimator is introduced to the modeling community. One of its advantages is its flexibility: Becker [21]'s metafunction can easily be extended with new univariate functions or probability distributions, and the settings can be modified to check performance under different degrees of non-additivity or in a larger ($N_t$, $k$) space. With some slight modifications it should also allow the production of functions with dominant low-order or high-order terms, labeled as Type B and C by Kucherenko et al. [22]. This should prompt developers of sensitivity indices to severely stress their estimators so that the modeling community and decision-makers can fully appraise how they deal with uncertainties.
5 Code availability
The R code to replicate our results is available in Puy [52] and on GitHub (https://github.com/arnaldpuy/battle_estimators). The uncertainty and sensitivity analyses have been carried out with the R package sensobol [53], which also includes the test function used in this study.
6 Acknowledgements
We thank Saman Razavi for his insights on the behavior of the Razavi and Gupta estimator. This work has been funded by the European Commission (Marie Skłodowska-Curie Global Fellowship, grant number 792178 to A.P.).
References
[1] A. Saltelli, M. Ratto, T. Andres, F. Campolongo, J. Cariboni, D. Gatelli, M. Saisana, and S. Tarantola.437
Global Sensitivity Analysis. The Primer. Chichester, UK: John Wiley & Sons, Ltd, Dec. 2008. doi:438
10.1002/9780470725184.439
[2] A. Jakeman, R. Letcher, and J. Norton. “Ten iterative steps in development and evaluation of440
environmental models”. Environmental Modelling & Software 21.5 (May 2006), 602–614. doi:441
10.1016/j.envsoft.2006.01.004.442
[3] S. Eker, E. Rovenskaya, M. Obersteiner, and S. Langan. “Practice and perspectives in the validation of443
resource management models”. Nature Communications 9.1 (2018), 1–10. doi:10.1038/s41467-444
018-07811-9.445
[4] A. Saltelli. “Sensitivity analysis for importance assessment”. Risk Analysis 22.3 (June 2002), 579–446
590. doi:10.1111/0272-4332.00040.447
[5] B. Iooss and P. Lemaître. “A Review on Global Sensitivity Analysis Methods”. Uncertainty Man-448
agement in Simulation-Optimization of Complex Systems. Operations Research/Computer Science449
Interfaces Series, vol 59. Ed. by G. Dellino and C. Meloni. Boston: Springer, 2015, 101–122. doi:450
10.1007/978-1-4899-7547-8_5. arXiv: 1404.2405.451
[6] W. Becker and A. Saltelli. “Design for sensitivity analysis”. Handbook of Design and Analysis of452
Experiments. Ed. by A. Dean, M. Morris, J. Stufken, and D. Bingham. Boca Ratón: CRC Press,453
Taylor & Francis, 2015, 627–674. doi:10.1201/b18619.454
20
[7] T. Homma and A. Saltelli. “Importance measures in global sensitivity analysis of nonlinear models”.455
Reliability Engineering & System Safety 52 (1996), 1–17. doi:10.1016/0951-8320(96)00002-6.456
[8] L. Le Gratiet, S. Marelli, and B. Sudret. “Metamodel-Based Sensitivity Analysis: Polynomial Chaos457
Expansions and Gaussian Processes”. Handbook of Uncertainty Quantification. Cham: Springer458
International Publishing, 2017, 1289–1325. doi:10.1007/978-3- 319-12385- 1_38.459
[9] A. Saltelli, S. Tarantola, and K. P.-S. Chan. “A quantitative model-independent method for global460
sensitivity analysis of model output”. Technometrics 41.1 (Feb. 1999), 39. doi:10.2307/1270993.461
[10] R. I. Cukier, C. M. Fortuin, K. E. Shuler, A. G. Petschek, and J. H. Schaibly. “Study of the sensitivity462
of coupled reaction systems to uncertainties in rate coefficients. I Theory”. The Journal of chemical463
physics 59.8 (1973), 3873–3878.464
[11] R. I. Cukier, H. B. Levine, and K. E. Shuler. “Nonlinear sensitivity analysis of multiparameter model465
systems”. Journal of computational physics 26.1 (1978), 1–42.466
[12] I. M. Sobol’. “On the distribution of points in a cube and the approximate evaluation of integrals”.467
USSR Computational Mathematics and Mathematical Physics 7.4 (Jan. 1967), 86–112. doi:10.468
1016/0041-5553(67)90144-9.469
[13] I. M. Sobol’. “Uniformly distributed sequences with an additional uniform property”. USSR Compu-470
tational Mathematics and Mathematical Physics 16.5 (Jan. 1976), 236–242. doi:10.1016/0041-471
5553(76)90154-3.472
[14] S. Lo Piano, F. Ferretti, A. Puy, D. Albrecht, and A. Saltelli. “Variance-based sensitivity analysis: The473
quest for better estimators and designs between explorativity and economy”. Reliability Engineering474
& System Safety 206.October 2020 (Feb. 2021), 107300. doi:10.1016/j.ress.2020.107300.475
[15] M. Jansen. “Analysis of variance designs for model output”. Computer Physics Communications476
117.1-2 (Mar. 1999), 35–43. doi:10.1016/S0010-4655(98)00154-4.477
[16] A. Janon, T. Klein, A. Lagnoux, M. Nodet, and C. Prieur. “Asymptotic normality and efficiency478
of two Sobol index estimators”. ESAIM: Probability and Statistics 18.3 (2014), 342–364. doi:479
10.1051/ps/2013040. arXiv: arXiv:1303.6451v1.480
[17] G. Glen and K. Isaacs. “Estimating Sobol sensitivity indices using correlations”. Environmental481
Modelling and Software 37 (2012), 157–166. doi:10.1016/j.envsoft.2012.03.014.482
[18] I. Azzini, T. Mara, and R. Rosati. “Monte Carlo estimators of first-and total-orders Sobol’ indices”483
(2020). arXiv: 2006.08232.484
[19] H. Monod, C. Naud, and D. Makowski. Uncertainty and sensitivity analysis for crop models. Ed. by485
D. Wallach, D. Makowski, and J. W. Jones. Elsevier, 2006, 35–100.486
[20] S. Razavi and H. V. Gupta. “A new framework for comprehensive, robust, and efficient global487
sensitivity analysis: 2. Application”. Water Resources Research 52.1 (Jan. 2016), 440–455. doi:488
10.1002/2015WR017558. arXiv: 2014WR016527 [10.1002].489
21
[21] W. Becker. “Metafunctions for benchmarking in sensitivity analysis”. Reliability Engineering & System Safety 204 (2020), 107189. doi: 10.1016/j.ress.2020.107189.
[22] S. Kucherenko, B. Feil, N. Shah, and W. Mauntz. “The identification of model effective dimensions using global sensitivity analysis”. Reliability Engineering & System Safety 96.4 (Apr. 2011), 440–449. doi: 10.1016/j.ress.2010.11.003.
[23] A. Saltelli, P. Annoni, I. Azzini, F. Campolongo, M. Ratto, and S. Tarantola. “Variance based sensitivity analysis of model output. Design and estimator for the total sensitivity index”. Computer Physics Communications 181.2 (Feb. 2010), 259–270. doi: 10.1016/j.cpc.2009.09.018.
[24] T. Ishigami and T. Homma. “An importance quantification technique in uncertainty analysis for computer models”. Proceedings. First International Symposium on Uncertainty Modeling and Analysis 12 (1990), 398–403.
[25] I. M. Sobol’. “On quasi-Monte Carlo integrations”. Mathematics and Computers in Simulation 47.2-5 (Aug. 1998), 103–112. doi: 10.1016/S0378-4754(98)00096-2.
[26] P. Bratley and B. L. Fox. “ALGORITHM 659: implementing Sobol’s quasirandom sequence generator”. ACM Transactions on Mathematical Software (TOMS) 14.1 (1988), 88–100.
[27] E. Borgonovo and E. Plischke. “Sensitivity analysis: A review of recent advances”. European Journal of Operational Research 248.3 (2016), 869–887. doi: 10.1016/j.ejor.2015.06.032.
[28] A. Saltelli. “A short comment on statistical versus mathematical modelling”. Nature Communications 10.1 (2019), 8–10. doi: 10.1038/s41467-019-11865-8.
[29] A. Saltelli, G. Bammer, I. Bruno, E. Charters, M. Di Fiore, E. Didier, W. Nelson Espeland, J. Kay, S. Lo Piano, D. Mayo, R. Pielke Jr, T. Portaluri, T. M. Porter, A. Puy, I. Rafols, J. R. Ravetz, E. Reinert, D. Sarewitz, P. B. Stark, A. Stirling, J. van der Sluijs, and P. Vineis. “Five ways to ensure that models serve society: a manifesto”. Nature 582.7813 (June 2020), 482–484. doi: 10.1038/d41586-020-01812-9.
[30] R. Sheikholeslami, S. Razavi, H. V. Gupta, W. Becker, and A. Haghnegahdar. “Global sensitivity analysis for high-dimensional problems: How to objectively group factors and measure robustness and convergence while reducing computational cost”. Environmental Modelling and Software 111 (2019), 282–299. doi: 10.1016/j.envsoft.2018.09.002.
[31] F. Sarrazin, F. Pianosi, and T. Wagener. “Global Sensitivity Analysis of environmental models: Convergence and validation”. Environmental Modelling and Software 79 (2016), 135–152. doi: 10.1016/j.envsoft.2016.02.005.
[32] A. Haghnegahdar, S. Razavi, F. Yassin, and H. Wheater. “Multicriteria sensitivity analysis as a diagnostic tool for understanding model behaviour and characterizing model uncertainty”. Hydrological Processes 31.25 (2017), 4462–4476. doi: 10.1002/hyp.11358.
[33] I. Azzini and R. Rosati. “The IA-Estimator for Sobol’ sensitivity indices”. Ninth International Conference on Sensitivity Analysis of Model Output. Barcelona, 2019.
[34] M. J. Shin, J. H. A. Guillaume, B. F. W. Croke, and A. J. Jakeman. “Addressing ten questions about conceptual rainfall-runoff models with global sensitivity analyses in R”. Journal of Hydrology 503 (2013), 135–152. doi: 10.1016/j.jhydrol.2013.08.047.
[35] L. Paleari and R. Confalonieri. “Sensitivity analysis of a sensitivity analysis: We are likely overlooking the impact of distributional assumptions”. Ecological Modelling 340 (2016), 57–63. doi: 10.1016/j.ecolmodel.2016.09.008.
[36] C. Spearman. “The proof and measurement of association between two things”. The American Journal of Psychology 15.1 (Jan. 1904), 72. doi: 10.2307/1412159.
[37] M. G. Kendall and B. B. Smith. “The problem of m rankings”. The Annals of Mathematical Statistics 10.3 (Sept. 1939), 275–287. doi: 10.1214/aoms/1177732186.
[38] I. R. Savage. “Contributions to the theory of rank order statistics - the two sample case”. Annals of Mathematical Statistics 27 (1956), 590–615.
[39] S. Razavi and H. V. Gupta. “A new framework for comprehensive, robust, and efficient global sensitivity analysis: 1. Theory”. Water Resources Research 52.1 (Jan. 2016), 423–439. doi: 10.1002/2015WR017559.
[40] A. B. Owen. “Better estimation of small Sobol’ sensitivity indices”. ACM Transactions on Modeling and Computer Simulation 23.2 (2013), 1–17. doi: 10.1145/2457459.2457460.
[41] B. Iooss, A. Janon, G. Pujol, with contributions from B. Broto, K. Boumhaout, S. D. Veiga, T. Delage, R. E. Amri, J. Fruth, L. Gilquin, J. Guillaume, L. Le Gratiet, P. Lemaitre, A. Marrel, A. Meynaoui, B. L. Nelson, F. Monari, R. Oomen, O. Rakovec, B. Ramos, O. Roustant, E. Song, J. Staum, R. Sueur, T. Touati, and F. Weber. sensitivity: Global Sensitivity Analysis of Model Outputs. R package version 1.22.1, 2020.
[42] A. Puy, S. Lo Piano, and A. Saltelli. “A sensitivity analysis of the PAWN sensitivity index”. Environmental Modelling and Software 127 (2020), 104679. doi: 10.1016/j.envsoft.2020.104679. arXiv: 1904.04488.
[43] A. B. Owen. “Randomly permuted (t, m, s)-nets and (t, s)-sequences”. Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing. Lecture Notes in Statistics, vol. 106. 1995, 299–317.
[44] G. E. P. Box, J. S. Hunter, and W. G. Hunter. Statistics for Experimenters: Design, Innovation, and Discovery. Wiley, 2005.
[45] V. Pareto. Manuale di Economia Politica. Vol. 13. Società Editrice, 1906.
[46] G. E. P. Box and R. D. Meyer. “An analysis for unreplicated fractional factorials”. Technometrics 28.1 (1986), 11–18. doi: 10.1080/00401706.1986.10488093.
[47] I. Azzini, G. Listorti, T. A. Mara, and R. Rosati. “Uncertainty and Sensitivity Analysis for policy decision making. An introductory guide”. Luxembourg, 2020.
[48] H. Liu, W. Chen, and A. Sudjianto. “Relative entropy based method for probabilistic sensitivity analysis in engineering design”. Journal of Mechanical Design, Transactions of the ASME 128.2 (2006), 326–336. doi: 10.1115/1.2159025.
[49] E. Borgonovo. “A new uncertainty importance measure”. Reliability Engineering & System Safety 92.6 (2007), 771–784. doi: 10.1016/j.ress.2006.04.015.
[50] F. Pianosi and T. Wagener. “A simple and efficient method for global sensitivity analysis based on cumulative distribution functions”. Environmental Modelling and Software 67 (2015), 1–11. doi: 10.1016/j.envsoft.2015.01.004.
[51] F. Pianosi and T. Wagener. “Distribution-based sensitivity analysis from a generic input-output sample”. Environmental Modelling and Software 108 (2018), 197–207. doi: 10.1016/j.envsoft.2018.07.019.
[52] A. Puy. R code of “A comprehensive comparison of total-order estimators for global sensitivity analysis”. 2020. doi: 10.5281/zenodo.4946559.
[53] A. Puy, S. Lo Piano, A. Saltelli, and S. A. Levin. “sensobol: an R package to compute variance-based sensitivity indices” (Jan. 2021). arXiv: 2101.10103.
A comprehensive comparison of total-order estimators for global sensitivity analysis

Supplementary Materials

Arnald Puy∗1,2, William Becker3, Samuele Lo Piano4, and Andrea Saltelli2,3

1Department of Ecology and Evolutionary Biology, M31 Guyot Hall, Princeton University, New Jersey 08544, USA. E-Mail: apuy@princeton.edu
2Centre for the Study of the Sciences and the Humanities (SVT), University of Bergen, Parkveien 9, PB 7805, 5020 Bergen, Norway.
3European Commission, Joint Research Centre, Via Enrico Fermi, 2749, 21027 Ispra VA, Italy
4School of the Built Environment, University of Reading, JJ Thompson Building, Whiteknights Campus, Reading, RG6 6AF, United Kingdom

Contents
1 Razavi and Gupta’s estimator (VARS)
2 Figures

∗Corresponding author
1 Razavi and Gupta’s estimator (VARS)

Unlike the other total-order estimators examined in our paper, Razavi and Gupta’s VARS (for Variogram Analysis of Response Surfaces [1, 2]) relies on the variogram $\gamma(\cdot)$ and covariogram $C(\cdot)$ functions to compute what they call the VARS-TO, for VARS Total-Order index.

Let us consider a function of factors $x = (x_1, x_2, \ldots, x_k) \in \mathbb{R}^k$. If $x_A$ and $x_B$ are two generic points separated by a distance $h$, then the variogram is calculated as

$$\gamma(x_A - x_B) = \frac{1}{2} V\left[ y(x_A) - y(x_B) \right] \qquad (1)$$

and the covariogram as

$$C(x_A - x_B) = \mathrm{COV}\left[ y(x_A), y(x_B) \right] \qquad (2)$$

Note that

$$V\left[ y(x_A) - y(x_B) \right] = V\left[ y(x_A) \right] + V\left[ y(x_B) \right] - 2\,\mathrm{COV}\left[ y(x_A), y(x_B) \right] \qquad (3)$$

and, since $V[y(x_A)] = V[y(x_B)]$,

$$\gamma(x_A - x_B) = V[y(x)] - C(x_A, x_B) \qquad (4)$$

In order to obtain the total-order effect $T_i$, the variogram and covariogram are computed on all pairs of points spaced $h_i$ apart along the $x_i$ axis, with all other factors kept fixed. Equation 4 thus becomes

$$\gamma_{x^*_{\sim i}}(h_i) = V(y \mid x^*_{\sim i}) - C_{x^*_{\sim i}}(h_i) \qquad (5)$$

where $x^*_{\sim i}$ is a fixed point in the space of non-$x_i$. Razavi and Gupta [1, 2] suggest taking the mean value across the factors’ space on both sides of Equation 5, thus obtaining

$$E_{x^*_{\sim i}}\left[ \gamma_{x^*_{\sim i}}(h_i) \right] = E_{x^*_{\sim i}}\left[ V(y \mid x^*_{\sim i}) \right] - E_{x^*_{\sim i}}\left[ C_{x^*_{\sim i}}(h_i) \right] \qquad (6)$$

which can also be written as

$$E_{x^*_{\sim i}}\left[ \gamma_{x^*_{\sim i}}(h_i) \right] = V(y)\,T_i - E_{x^*_{\sim i}}\left[ C_{x^*_{\sim i}}(h_i) \right] \qquad (7)$$

and therefore

$$T_i = \frac{E_{x^*_{\sim i}}\left[ \gamma_{x^*_{\sim i}}(h_i) \right] + E_{x^*_{\sim i}}\left[ C_{x^*_{\sim i}}(h_i) \right]}{V(y)} \qquad (8)$$

The sampling scheme for VARS does not rely on the $A$, $B$, $A_B^{(i)}, \ldots$ matrices, but on star centers and cross sections. Star centers are $N$ random points sampled across the input space. For each star, $k$ cross sections of points spaced $h$ apart are generated, one per input, including and passing through the star center. Overall, the computational cost of VARS amounts to $N_t = N[k((1/h) - 1) + 1]$; with $N = 50$, $k = 3$ and $h = 0.1$, for instance, $N_t = 50[3 \times 9 + 1] = 1400$ runs.
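To make the estimator concrete, the following is a minimal, self-contained R sketch of VARS-TO under two simplifying assumptions: all inputs are uniform on [0, 1], and only the smallest lag h enters the variogram and covariogram (Razavi and Gupta's full framework integrates over a range of lags). The helper vars_to() is hypothetical, written for illustration only; it is not the authors' implementation.

# Bare-bones VARS-TO (Equation 8); illustrative sketch, not Razavi and
# Gupta's code. Assumes independent inputs, uniform on [0, 1].
vars_to <- function(fun, N = 50, k = 3, h = 0.1) {
  centers <- matrix(runif(N * k), nrow = N)  # N star centers
  gamma_i <- cov_i <- matrix(NA_real_, N, k)
  all_y <- numeric(0)
  for (n in seq_len(N)) {
    for (i in seq_len(k)) {
      # Cross section along x_i through the star center, points spaced h apart
      xi <- seq(centers[n, i] %% h, 1, by = h)
      X <- centers[rep(n, length(xi)), , drop = FALSE]
      X[, i] <- xi
      y <- fun(X)
      gamma_i[n, i] <- mean(diff(y)^2) / 2        # directional variogram at lag h
      cov_i[n, i]   <- cov(y[-length(y)], y[-1])  # directional covariogram at lag h
      all_y <- c(all_y, y)
    }
  }
  (colMeans(gamma_i) + colMeans(cov_i)) / var(all_y)  # Equation (8)
}

# Toy usage: an additive function with one interaction
set.seed(42)
toy <- function(X) X[, 1] + 2 * X[, 2] + X[, 1] * X[, 3]
round(vars_to(toy, N = 200, k = 3, h = 0.1), 2)

Each star contributes roughly 1/h points per input, so the cost of this sketch grows as Nk/h, in line with the formula above.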
2 Figures
[Figure S1 graphic: two panels titled Monte-Carlo and Quasi Monte-Carlo; axes X1 and X2, both on [0, 1].]
Figure S1: Examples of Monte-Carlo and Quasi Monte-Carlo sampling in two dimensions. N = 200.
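A figure of this kind can be reproduced in base R with the sketch below. Base R ships no Sobol' sequence generator, so a Halton sequence (radical-inverse construction) stands in here for the Sobol' quasi-random numbers used in the paper [25, 26]; packages such as randtoolbox provide proper Sobol' sequences.

# Halton low-discrepancy sequence: a stand-in for the Sobol' sequence,
# chosen so that this sketch needs no extra packages.
halton <- function(n, base) {
  sapply(seq_len(n), function(i) {
    f <- 1; r <- 0
    while (i > 0) {
      f <- f / base
      r <- r + f * (i %% base)
      i <- i %/% base
    }
    r
  })
}
N <- 200
mc  <- cbind(runif(N), runif(N))          # plain Monte-Carlo
qmc <- cbind(halton(N, 2), halton(N, 3))  # quasi Monte-Carlo (Halton)
oldpar <- par(mfrow = c(1, 2))
plot(mc,  xlab = "X1", ylab = "X2", main = "Monte-Carlo")
plot(qmc, xlab = "X1", ylab = "X2", main = "Quasi Monte-Carlo")
par(oldpar)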
[Figure S2 graphic: two histogram panels titled Σ_{i=1}^{k} S_i and (1/k) Σ_{i=1}^{k} (T_i > 0.05); x-axis: Proportion (0–1), y-axis: Count.]
Figure S2: Proportion of the total sum of first-order effects and of the active model inputs (defined as Ti > 0.05) after 1000 random metafunction calls with k ∈ (3, 100). Note how the sum of first-order effects clusters around 0.8 (thus evidencing the production of non-additivities) and how, on average, the number of active model inputs revolves around 10–20%, thus reproducing the Pareto principle.
[Figure S3 graphic: bar chart of Si and Ti for inputs X1–X17; y-axis: Sobol' index.]
Figure S3: Sobol' Ti indices obtained after a run of the metafunction with the following parameter settings: N = 10^4, k = 17, k2 = 0.5, k3 = 0.2, ε = 666. The error bars reflect the 95% confidence intervals after bootstrapping (R = 10^2). The indices have been computed with the Jansen [3] estimator.
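A plot of this type can be approximated with the sensobol package [53]. The sketch below is hedged in two ways: it assumes sensobol's sobol_matrices()/sobol_indices() interface and its first, total, boot and R arguments, which may differ across package versions, and it swaps the metafunction for the self-contained Ishigami test function [24].

# Jansen total-order indices with bootstrap confidence intervals, assuming
# the sensobol interface described in [53]; argument names may vary by version.
library(sensobol)

params <- paste0("X", 1:3)
N <- 2^10
ishigami <- function(X) {                  # Ishigami test function [24]
  Xt <- 2 * pi * X - pi                    # map [0,1) draws to (-pi, pi)
  sin(Xt[, 1]) + 7 * sin(Xt[, 2])^2 + 0.1 * Xt[, 3]^4 * sin(Xt[, 1])
}
mat <- sobol_matrices(N = N, params = params)  # QMC A, B, A_B matrices
y <- ishigami(mat)
ind <- sobol_indices(Y = y, N = N, params = params,
                     first = "jansen", total = "jansen",
                     boot = TRUE, R = 100)     # R = 100 bootstrap replicas
ind   # plot(ind) should yield a bar chart analogous to Figure S3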
[Figure S4 graphic: dot plot by estimator (Razavi and Gupta, Janon/Monod, Jansen, Azzini and Rosati, Glen and Isaacs, Saltelli, Homma and Saltelli, pseudo-Owen); x-axis: (1/N) Σ_{v=1}^{N} (r_v < 0), from 0.00 to 0.15.]
Figure S4: Proportion of model runs yielding r < 0.
[Figure S5 graphic: one panel per estimator (Jansen, pseudo-Owen, Saltelli, Azzini and Rosati, Glen and Isaacs, Homma and Saltelli, Janon/Monod); x-axis: Nt (0–1000), y-axis: k (0–100); colour scale: r from −1.00 to −0.25.]
Figure S5: Scatter of the total number of model runs Nt against the function dimensionality k only for r < 0.
[Figure S6 graphic: two panel blocks, (a) and (b), one row per estimator (Saltelli, Razavi and Gupta, pseudo-Owen, Jansen, Janon/Monod, Homma and Saltelli, Glen and Isaacs, Azzini and Rosati); x-axis: Nt/k from 10 to 300 (logarithmic); y-axis: r in (a), MAE in (b).]
Figure S6: Scatterplot of the correlation between Ti and T̂i (r) against the number of model runs allocated per model input (Nt/k).
[Figure S7 graphic: bar plot; x-axis: mean Nt/k ratio (compartments from 10 to 310); y-axis: number of simulations (1–1000); one bar colour per estimator (Azzini and Rosati, Glen and Isaacs, Homma and Saltelli, Janon/Monod, Jansen, pseudo-Owen, Razavi and Gupta, Saltelli).]
Figure S7: Bar plot with the number of simulations conducted in each of the Nt/k compartments assessed. All estimators have approximately the same number of simulations in each compartment.
[Figure S8 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: r; estimator: Azzini and Rosati.]
Figure S8: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S9 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: r; estimator: Glen and Isaacs.]
Figure S9: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S10 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: r; estimator: Homma and Saltelli.]
Figure S10: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S11 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: r; estimator: Janon/Monod.]
Figure S11: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S12 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: r; estimator: Jansen.]
Figure S12: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S13 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: r; estimator: pseudo-Owen.]
Figure S13: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S14 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: r; estimator: Razavi and Gupta.]
Figure S14: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S15 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: r; estimator: Saltelli.]
Figure S15: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S16 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: MAE (log scale); estimator: Azzini and Rosati.]
Figure S16: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S17 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: MAE (log scale); estimator: Glen and Isaacs.]
Figure S17: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S18 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: MAE (log scale); estimator: Homma and Saltelli.]
Figure S18: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S19 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: MAE (log scale); estimator: Janon/Monod.]
Figure S19: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S20 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: MAE (log scale); estimator: Jansen.]
Figure S20: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S21 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: MAE (log scale); estimator: pseudo-Owen.]
Figure S21: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S22 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: MAE (log scale); estimator: Razavi and Gupta.]
Figure S22: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
[Figure S23 graphic: one panel per model input (φ, δ, τ, k2, k3, ε); y-axis: MAE (log scale); estimator: Saltelli.]
Figure S23: Scatterplots of the model inputs against the model output. The red dots show the mean value in each bin (we have set the number of bins arbitrarily at 30).
References
[1] S. Razavi and H. V. Gupta. “A new framework for comprehensive, robust, and efficient global sensitivity analysis: 2. Application”. Water Resources Research 52.1 (Jan. 2016), 440–455. doi: 10.1002/2015WR017558.
[2] S. Razavi and H. V. Gupta. “A new framework for comprehensive, robust, and efficient global sensitivity analysis: 1. Theory”. Water Resources Research 52.1 (Jan. 2016), 423–439. doi: 10.1002/2015WR017559.
[3] M. Jansen. “Analysis of variance designs for model output”. Computer Physics Communications 117.1-2 (Mar. 1999), 35–43. doi: 10.1016/S0010-4655(98)00154-4.