Figure 1 - available via license: Creative Commons Attribution 4.0 International


## Source publication

Comparison studies of global sensitivity analysis (GSA) approaches are limited in that they are performed on a single model or a small set of test functions, with a limited set of sample sizes and dimensionalities. This work introduces a flexible ‘metafunction’ framework for benchmarking, which randomly generates test problems of varying dimensionali...

## Similar publications

Science education has acknowledged the need to consider modelling as a key process in teaching and learning. Although scientific modelling has been gaining traction within science education studies, there is relatively little research on the modelling of biological diversity. Our goal was to better understand teacher and students' modelling-related...

## Citations

... The Sobol method is carried out by means of the Saltelli-Jansen estimator to determine the first-order and total Sobol indices [42]. The application of the FAST method invokes the use of so-called metafunctions, which were proposed by Becker [9]. For these two methods, we considered three types of distributions of the input parameters, namely the uniform, gamma, and normal distributions. ...

... For the residual stresses, we apply a form that satisfies these conditions (9) and (10) ...

... There exists a variety of options for applying the FAST method to a given problem. The latest approach uses so-called metafunctions, which were proposed in [9], and we apply the metafunctions in their basic form ...

This paper deals with applying two main sensitivity analysis (SA) methods, namely the Sobol method and the Fourier Amplitude Sensitivity Test (FAST) method, to the problem of mixed extension, inflation, and torsion of a circular cylindrical tube in the presence of residual stress. The mechanical side of the problem was previously proposed by Merodio & Ogden (2016). The input parameters, in the form of the initial cylinder geometry, the amount of residual stress, the azimuthal stretch, the axial elongation, and the torsional strain, are distributed according to three probability distributions, namely the uniform, gamma, and normal distributions. In the present work, the Sobol and FAST methods are applied to determine which input parameters most influence the output variable. Our results are then assessed by computing the bias and standard deviation of the Sobol and FAST indices for each input parameter in the model.
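As a hedged illustration of the estimators referred to above, the sketch below implements Jansen-style first- and total-order Sobol' estimators on a Saltelli A/B/AB_i design. The toy model, sample size, and independent uniform inputs are illustrative assumptions, not the cited paper's actual setup.

```python
import numpy as np

def sobol_jansen(f, d, N, seed=None):
    """Jansen-style estimators of first- and total-order Sobol' indices
    on a Saltelli A/B/AB_i design with independent U(0, 1) inputs."""
    rng = np.random.default_rng(seed)
    A = rng.random((N, d))
    B = rng.random((N, d))
    yA, yB = f(A), f(B)
    var = np.var(np.concatenate([yA, yB]), ddof=1)  # total output variance
    Si, Ti = np.empty(d), np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]          # A with column i taken from B
        yABi = f(ABi)
        Si[i] = 1.0 - np.mean((yB - yABi) ** 2) / (2.0 * var)  # first order
        Ti[i] = np.mean((yA - yABi) ** 2) / (2.0 * var)        # total order
    return Si, Ti

# Toy additive model y = x1 + 2*x2 (x3 inert): analytical S = (0.2, 0.8, 0)
Si, Ti = sobol_jansen(lambda X: X[:, 0] + 2 * X[:, 1], d=3, N=2**13, seed=0)
```

For an additive model the first- and total-order indices coincide; a gap between `Ti[i]` and `Si[i]` would signal interactions involving input i.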

... After more than 50 years of development, modelers have at their disposal several SA procedures and a rich literature informing them which methods are most efficient in each specific SA setting [7,8]. We briefly mention here some of these routines without further description and direct the reader to existing references: ...

... To minimize the influence of the benchmarking design on the results of the analysis, we randomize the main factors that condition the accuracy of sensitivity estimators: the sampling method τ, base sample size N_s, model dimensionality d, form of the test function and distribution of model inputs φ [7,8]. We describe these factors with probability distributions selected to cover a wide range of sensitivity analysis settings, from low-dimensional, computationally inexpensive designs to complex, high-dimensional problems formed by inputs whose uncertainty is described by dissimilar mathematical functions (Fig. 3). ...
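The randomization described in this excerpt can be sketched as below; every name, range, and choice here is an illustrative assumption, not the authors' actual settings.

```python
import numpy as np

def draw_benchmark_setting(rng):
    """Draw one random benchmark configuration: sampling method tau,
    base sample size N_s, dimensionality d, and input distribution phi.
    Hypothetical options and ranges, for illustration only."""
    return {
        "sampling": rng.choice(["random", "lhs", "sobol"]),  # tau
        "N": int(2 ** rng.integers(4, 13)),                  # N_s in {16..4096}
        "d": int(rng.integers(2, 51)),                       # d in {2..50}
        "dist": rng.choice(["uniform", "normal", "beta"]),   # phi
    }

rng = np.random.default_rng(42)
settings = [draw_benchmark_setting(rng) for _ in range(1000)]
```

Averaging estimator performance over many such randomly drawn settings, rather than over a single fixed design, is what decouples the conclusions from any one benchmarking choice.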

... 3. We run a metafunction rowwise through both the Jansen and the Discrepancy matrix and produce two vectors with the model output, which we refer to as y_J and y_D respectively. Our metafunction, whose functional form is defined by (i), is based on the Becker metafunction [8] and randomizes over 13 univariate functions representing common responses in physical systems and in classic sensitivity analysis functions (from cubic, exponential and periodic to sinusoidal; see Fig. S2). A detailed explanation of the metafunction can be found in Becker [8] and in Puy et al. [7]. ...
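A drastically simplified, hypothetical sketch of a Becker-style metafunction: each input is passed through a randomly drawn univariate response, and the results are combined additively with random first-order coefficients plus a few random pairwise interactions. The real metafunction draws from 13 response types and randomizes further factors; the six functions and the interaction scheme below are assumptions for illustration.

```python
import numpy as np

# Small pool of univariate responses (the actual metafunction uses 13)
UNIVARIATE = [
    lambda x: x,                      # linear
    lambda x: x ** 2,                 # quadratic
    lambda x: x ** 3,                 # cubic
    lambda x: np.exp(x),              # exponential
    lambda x: np.sin(2 * np.pi * x),  # periodic
    lambda x: np.zeros_like(x),       # no effect
]

def make_metafunction(d, rng):
    """Build a random d-dimensional test function (illustrative sketch)."""
    funcs = rng.choice(len(UNIVARIATE), size=d)   # one response per input
    coef = rng.normal(size=d)                     # first-order coefficients
    pairs = [tuple(rng.choice(d, size=2, replace=False)) for _ in range(d // 2)]
    pcoef = rng.normal(size=len(pairs))           # interaction coefficients
    def f(X):
        U = np.column_stack([UNIVARIATE[k](X[:, i]) for i, k in enumerate(funcs)])
        y = U @ coef
        for (i, j), c in zip(pairs, pcoef):       # pairwise interactions
            y += c * U[:, i] * U[:, j]
        return y
    return f

rng = np.random.default_rng(0)
f = make_metafunction(d=5, rng=rng)
y = f(rng.random((100, 5)))   # evaluated row-wise on a 100 x 5 sample matrix
```

Running `f` row-wise over two different sample matrices would yield two output vectors, analogous to the y_J and y_D described in the excerpt.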

While sensitivity analysis improves the transparency and reliability of mathematical models, its uptake by modelers is still scarce. This is partially explained by its technical requirements, which may be hard to understand and implement by the non-specialist. Here we propose a sensitivity analysis approach based on the concept of discrepancy that is as easy to understand as the visual inspection of input-output scatterplots. Firstly, we show that some discrepancy measures are able to rank the most influential parameters of a model almost as accurately as the variance-based total sensitivity index. We then introduce an ersatz-discrepancy whose performance as a sensitivity measure matches that of the best-performing discrepancy algorithms, is simple to implement, easier to interpret and orders of magnitude faster.

... For example, Sudret (2008) and Crestaux et al. (2009) have shown that polynomial chaos-based estimators of Sobol' indices are much more efficient than Monte Carlo or quasi-Monte Carlo-based estimators (for smooth models and dimensions up to 20). Recently, Becker (2020) has shown that certain sample-based approaches can be more efficient than metamodel-based ones for screening with total Sobol' indices. However, the screening performance metrics of Becker (2020) are only based on input ranking. ...

... Recently, Becker (2020) has shown that certain sample-based approaches can be more efficient than metamodel-based ones for screening with total Sobol' indices. However, the screening performance metrics of Becker (2020) are based only on input ranking. By contrast, our practical purpose is to perform a so-called quantitative screening, which aims at providing both a correct screening and a good estimation of Sobol' indices. ...

Variance-based global sensitivity analysis, in particular Sobol' analysis, is widely used for determining the importance of input variables to a computational model. Sobol' indices can be computed cheaply based on spectral methods like polynomial chaos expansions (PCE). Another choice is the recently developed Poincaré chaos expansion (PoinCE), whose orthonormal tensor-product basis is generated from the eigenfunctions of one-dimensional Poincaré differential operators. In this paper, we show that the Poincaré basis is the unique orthonormal basis with the property that partial derivatives of the basis form again an orthogonal basis with respect to the same measure as the original basis. This special property makes PoinCE ideally suited for incorporating derivative information into the surrogate modelling process. Assuming that partial derivative evaluations of the computational model are available, we compute spectral expansions in terms of Poincaré basis functions or basis partial derivatives, respectively, by sparse regression. We show on two numerical examples that the derivative-based expansions provide accurate estimates for Sobol' indices, even outperforming PCE in terms of bias and variance. In addition, we derive an analytical expression based on the PoinCE coefficients for a second popular sensitivity index, the derivative-based sensitivity measure (DGSM), and explore its performance as an upper bound on the corresponding total Sobol' indices.
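The link between spectral coefficients and Sobol' indices that underlies both PCE and PoinCE can be illustrated as follows. The multi-indices and coefficients below are a toy example, and an orthonormal basis is assumed, so each squared coefficient contributes its share of the output variance.

```python
import numpy as np

def sobol_from_spectral(alphas, coefs):
    """Sobol' indices from the coefficients of an orthonormal spectral
    expansion: alphas are multi-indices, coefs the matching coefficients."""
    alphas = np.asarray(alphas)
    coefs = np.asarray(coefs, dtype=float)
    nonzero = alphas.any(axis=1)               # drop the constant term
    total_var = np.sum(coefs[nonzero] ** 2)    # variance = sum of c_alpha^2
    d = alphas.shape[1]
    first, total = np.empty(d), np.empty(d)
    for i in range(d):
        # terms involving only input i vs. terms involving input i at all
        only_i = nonzero & (alphas[:, np.arange(d) != i] == 0).all(axis=1)
        involves_i = alphas[:, i] > 0
        first[i] = np.sum(coefs[only_i] ** 2) / total_var
        total[i] = np.sum(coefs[involves_i] ** 2) / total_var
    return first, total

# Toy expansion in two inputs with one interaction term
alphas = [(0, 0), (1, 0), (0, 2), (1, 1)]
coefs = [5.0, 3.0, 2.0, 1.0]
S, T = sobol_from_spectral(alphas, coefs)
# S = [9/14, 4/14], T = [10/14, 5/14]
```

The same bookkeeping applies whatever the one-dimensional basis is, which is why Sobol' indices fall out of PCE and PoinCE coefficients alike.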

... In addition to comparing the two SA methods, the accuracy of sensitivity analysis methods also needs to be discussed in light of their numerical implementations, as this factor has a significant influence on the efficiency and accuracy of each method. A recent study (Becker, 2020) [...]. Furthermore, the majority of the reliability estimates of the factor rankings of VARS (especially IVARS50, the most comprehensive index) are higher than those of Sobol, lending more trust to the VARS results compared to the Sobol method. Our findings on the comparative evaluation of both methods are consistent with previous studies (Alipour et al., 2022; Razavi & Gupta, 2016a), where VARS was also recognized as better than Sobol or other global sensitivity analysis methods. ...

Bioretention cell (BC) design variation significantly changes its hydrological dynamics at the unit scale, leading to major changes in its target design goals when the designed unit is replicated at the catchment scale. Improved understanding of the behavior of BC design parameters under different rainfall conditions is critical for the effective implementation of BCs to achieve design goals. This paper illustrates and compares two global sensitivity analysis methods (VARS and Sobol) for identifying influential parameters and their variations in hydrological dynamics (model responses) under different rainfall conditions and perturbation scales, so as to quantify their uncertainty and reliability for ranking influential factors. From the application of both sensitivity analysis methods, six of the seventeen parameters, namely conductivity, berm height, vegetation volume, suction head, porosity and wilting point, showed significant sensitivities to surface infiltration, surface outflow and peak flow. In addition, soil thickness, conductivity slope and field capacity (nine parameters in total) were categorized as influential to storage outputs. Conductivity and vegetation volume were ranked as the most influential parameters, followed by berm height and suction head, porosity and wilting point. The analysis also demonstrated that the behavior of design parameters towards model responses changes significantly with different rainfall conditions and perturbation scales, as well as with the sensitivity analysis method used. In particular, this study indicates that VARS is preferred over other sensitivity analysis approaches, including the (variance-based) Sobol method, because of its higher accuracy, reliability, and computational efficiency.

... Usually, k ≫ k t ≫ k s due to the general dominance of low-order effects in mathematical models and the preeminence of the Pareto principle (c. 80% of the effects are conveyed by c. 20% of the parameters) (27)(28)(29). Models live in the space set by k t and k s and not in that nominally defined by k, which may be artificially large if the model includes a non-negligible number of noninfluential parameters. The space defined by k t and k s cannot be simplified without modifying the model's behavior, and thus, it is irreducibly complex (30). ...

Mathematical models are getting increasingly detailed to better predict phenomena or gain more accurate insights into the dynamics of a system of interest, even when there are no validation or training data available. Here, we show through ANOVA and statistical theory that this practice promotes fuzzier estimates because it generally increases the model's effective dimensions, i.e., the number of influential parameters and the weight of high-order interactions. By tracking the evolution of the effective dimensions and the output uncertainty at each model upgrade stage, modelers can better ponder whether the addition of detail truly matches the model's purpose and the quality of the data fed into it.

... Several attempts have been made in the literature to provide an overview of available GSA methods [8], [9], [10], but to the authors' knowledge they are limited to a small number of dimensions and compare only a few methods. While many papers suggest how to work with many samples in low dimensions, not much has been done for the high dimensional situation. ...

... Most related work is limited to a small number of dimensions (typically less than five) and compares only two or three methods. A recent work [10] proposes a meta-function to benchmark different GSA methods in their ability to find ''ground truth'' sensitivity indices. However, the ''ground truth'' was computed with a large sample size and only allows comparison of specific SA methods, since not all SA methods provide similar information (first-order sensitivity scores) as output, making it difficult to compare these algorithms. ...

... Figure 10 shows the mean sensitivity of each parameter for different sample sizes and methods. From these plots, it is immediately apparent that some algorithms do not perform well for relatively small sample sizes (below 2^10 = 1024 samples). The robustness of the results under different random seeds is also worth noting for the model-based approaches, which have much larger variance than, for example, Sobol. ...

Explainable Artificial Intelligence (XAI) is an increasingly important field of research required to bring AI to the next level in real-world applications. Global sensitivity analysis methods play an important role in XAI, as they can provide an understanding of which (groups of) parameters have high influence in the predictions of machine learning models and the output of simulators and real-world processes. In this paper, we conduct a survey into global sensitivity methods in an XAI context and present both a qualitative and a quantitative analysis of these methods under different conditions. In addition to the overview and comparison, we propose an open source application, GSAreport, that allows you to easily generate extensive reports using a carefully selected set of global sensitivity analysis methods depending on the number of dimensions and samples, to gain a deep understanding of the role of each feature for a given model or data set. We finally present the methods discussed in a complex real-world application of genomic prediction and draw conclusions about when to use which GSA methods.


... If the inputs are correlated, the use of Shapley coefficients may be a good alternative [8]. Other methods are also available, and a rich literature informs the analyst of the most efficient estimators in each of the available SA approaches [9,10]. ...

... We focus on this setting because most often, and especially for high-dimensional models such as those in the environmental sciences/climate domains, the interest lies in properly identifying the top ranks only [24]. Firstly, and in order to relax the dependency of the results on the benchmarking design, we compare the discrepancy measures and the Jansen estimator on a meta-model based on the Becker [10] metafunction. Our meta-model randomizes the model functional form over 13 different univariate functions representing common responses in physical systems, from linear to cubic and trigonometric (Fig. S1) [10]. ...

... Firstly, and in order to relax the dependency of the results on the benchmarking design, we compare the discrepancy measures and the Jansen estimator on a meta-model based on the Becker [10] metafunction. Our meta-model randomizes the model functional form over 13 different univariate functions representing common responses in physical systems, from linear to cubic and trigonometric (Fig. S1) [10]. It also randomizes over the model dimensionality d, the sampling method, the underlying (continuous) input distributions and the strength of higher-order interactions [14]. ...

While Sensitivity Analysis (SA) improves the transparency and reliability of mathematical models, its uptake by modelers is still scarce. This is partially explained by its technical requirements, which may be hard to decipher and interpret for the non-specialist. Here we draw on the concept of discrepancy and propose a sensitivity measure that is as easy to understand as the visual inspection of input-output scatterplots. Numerical experiments on classic SA functions and on meta-models suggest that the symmetric $L_2$ discrepancy measure is able to rank the most influential parameters almost as accurately as the variance-based total sensitivity index, one of the most established global sensitivity measures.
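The idea above can be sketched as follows, assuming Hickernell's closed form for the symmetric L2 discrepancy and a simple min-max rescaling of the output; this is a simplified reading of the approach, not the authors' exact implementation. Inputs whose (x_i, y) scatterplot shows structure depart further from a uniform cloud and therefore score a higher discrepancy.

```python
import numpy as np

def sym_l2_discrepancy(P):
    """Symmetric L2 discrepancy of points P in [0, 1]^d (closed form)."""
    n, d = P.shape
    term1 = (4.0 / 3.0) ** d
    term2 = (2.0 / n) * np.sum(np.prod(1 + 2 * P - 2 * P ** 2, axis=1))
    diff = np.abs(P[:, None, :] - P[None, :, :])   # pairwise |x_ik - x_jk|
    term3 = (2.0 ** d / n ** 2) * np.sum(np.prod(1 - diff, axis=2))
    return np.sqrt(term1 - term2 + term3)

def rank_inputs(X, y):
    """Discrepancy of each rescaled (x_i, y) cloud; higher = more influential."""
    y01 = (y - y.min()) / (y.max() - y.min())
    return np.array([sym_l2_discrepancy(np.column_stack([X[:, i], y01]))
                     for i in range(X.shape[1])])

rng = np.random.default_rng(1)
X = rng.random((256, 3))
y = 5 * X[:, 0] + 0.1 * X[:, 1]   # x3 inert; x1 dominates
scores = rank_inputs(X, y)        # scores[0] is the largest
```

The double sum in `term3` makes the closed form O(n^2 d); the "ersatz-discrepancy" mentioned in the abstract is precisely a cheaper stand-in for this kind of computation.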

... Becker's metafunction can be called in sensobol with metafunction() and its current implementation includes cubic, discontinuous, exponential, inverse, linear, no-effect, non-monotonic, periodic, quadratic and trigonometric functions. We direct the reader to Becker (2020) and Puy et al. (2022) for further information. We benchmark sensobol and sensitivity as follows: ...

The R package sensobol provides several functions to conduct variance-based uncertainty and sensitivity analysis, from the estimation of sensitivity indices to the visual representation of the results. It implements several state-of-the-art first- and total-order estimators and allows the computation of up to fourth-order effects, as well as of the approximation error, in a swift and user-friendly way. Its flexibility also makes it appropriate for models with either a scalar or a multivariate output. We illustrate its functionality by conducting a variance-based sensitivity analysis of three classic models: the Sobol' (1998) G function, the logistic population growth model of Verhulst (1845), and the spruce budworm and forest model of Ludwig, Jones, and Holling (1976).