Analysing the impact of multiple stressors in aquatic biomonitoring data:
a cookbook with applications in R
Christian K. Feld1, Pedro Segurado2 & Cayetano Gutiérrez-Cánovas3
1 Aquatic Ecology and Centre for Water and Environmental Research, University Duisburg-Essen,
45117 Essen, Germany, christian.feld@uni-due.de
2 Forest Research Centre (CEF), School of Agriculture, University of Lisbon, Tapada da Ajuda
1349-017, Lisbon, psegurado@isa.ulisboa.pt
3 Catchment Research Group, Cardiff University. School of Biosciences, The Sir Martin Evans
Building, Museum Avenue, Cardiff CF10 3AX (UK), GutierrezCanovasC@cardiff.ac.uk
Author for correspondence
Christian K. Feld, Dept. of Aquatic Ecology, Faculty of Biology, University Duisburg-Essen,
Universitaetsstrasse 5, 45141 Essen, phone: +49 201 183-4390, christian.feld@uni-due.de
Keywords
Analytical framework, Boosted Regression Trees, Random Forest, Generalised Linear Modelling,
Freshwater ecosystems, Water Framework Directive
Highlights
1. Biomonitoring schemes such as the EU Water Framework Directive result in hundreds of
thousands of samples taken from about 120,000 water bodies in Europe.
2. This data allows for unprecedented analysis of the biological impacts of multiple pressures, yet
an analytical framework addressing biomonitoring data in particular is missing.
3. Here we present a 'cookbook' for multiple stressor analysis that provides such an analytical
framework, accompanied by guidance on the analysis and the interpretation of results.
4. Annotated R scripts allow users to 'cook' the analysis with their own data.
5. Simulated data revealed that reliable rankings of stressor hierarchy and interactions are
achieved if the number of sample sites is ≥150 and stressor gradient lengths encompass ≥75% of the
observable gradient in nature.
Abstract
Multiple stressors threaten biodiversity and ecosystem integrity, imposing new challenges to
ecosystem management and restoration. Ecosystem
managers are required to address and mitigate the impact of multiple stressors, yet the knowledge
required to disentangle multiple-stressor effects is still incomplete. Experimental studies have
advanced the understanding of single and combined stressor effects, yet a robust analytical
framework to address the impact of multiple stressors based on monitoring data is lacking. Since
2000, the monitoring of Europe's waters has resulted in a vast amount of biological and
environmental (stressor) data for about 120,000 water bodies. This data is rarely
exploited in the multiple-stressor context, probably because of its rather heterogeneous nature:
stressors vary and are mixed with broad-scale proxies of environmental stress (e.g. land cover),
missing values and zero-inflated data limit the application of statistical methods, and biological
indicators are often aggregated (e.g. taxon richness) and do not respond stressor-specifically. Here,
we present a 'cookbook' to analyse the biological response to multiple stressors using data from
biomonitoring schemes. Our cookbook includes guidance for the analytical process and the
interpretation of results. The 'cookbook' is accompanied by scripts, which allow the user to run a
stepwise analysis based on his/her own data in R, an open-source language and environment for
statistical computing and graphics. Using simulated and real data, we show that the recommended
procedure is capable of identifying stressor hierarchy (importance) and interaction in large
datasets. We recommend a minimum of 150 independent observations and a minimum
stressor gradient length of 75% (of the most relevant stressor's gradient in nature) to
reliably rank stressor importance, detect relevant interactions and estimate their standardised
effect size. We conclude with a brief discussion of the advantages and limitations of this protocol.
Graphical abstract
[Graphical abstract: flow diagram of the three analytical steps — (1) data validation and preparation
(outliers, transformation, standardisation, collinearity), (2) exploratory analysis with Random Forest
and Boosted Regression Trees (stressor and interaction candidates, response plots), and (3) quantitative
analysis with GLM/GLMM stressor response models (global model and link function, model ranking and
top models, model averaging and validation, SES and significance).]
1 Introduction
The increasing variety of environmental pressures threatening or damaging ecosystems is
triggering interest in exploring the biological response to multiple stressors. Global change
drivers now impose concurrent pressures on more than 40% of Europe's waters (unpublished results
based on the EU WISE database), as a result of combined point source or diffuse pollution, water
abstraction, invasive species or climate change (EEA 2012). This has significant adverse effects
on aquatic ecosystems, through changes in biodiversity, biological integrity and ecological
functioning, all of which are likely to modify the provision of essential ecosystem services. Multiple
pressures—and the multiple stressors resulting thereof— constitute a challenge for river basin
managers and restoration ecologists (Townsend et al. 2008).
The European Water Framework Directive (WFD; 2000/60/EC) was launched in 2000 to set
a framework for the management of Europe's waters. Since 2000, more than 300 different systems
to assess ecological status have been developed and implemented in Member States'
monitoring schemes (Birk et al. 2012). During two management cycles, a tremendous amount of
monitoring data has been generated for about 120,000 European water bodies. However, there is
a lack of robust analytical frameworks and guidance to analyse this data in the context of multiple
stressors. Here, we aim to bridge this gap and present a stepwise approach to analyse multiple
stressor effects based on data from biomonitoring schemes.
An environmental pressure (e.g. point source or diffuse pollution) can cause several stressors (e.g.
enhanced concentrations of ammonia, nitrite, nitrate, SRP, fine sediment, pesticides). A stressor is
a measurable variable that exceeds its range of normal variation and thereby adversely affects
individual taxa, community composition or ecosystem functioning (Matthaei et al. 2010). When
biological systems are exposed to various stressors, the biological or ecological response is
difficult to predict, particularly when stressors interact (Townsend et al. 2008). Generally, stressors
may linearly add to each other (additive response) or interact in different ways. Interacting
stressors can have a larger (synergism) or smaller (antagonism) combined effect than the sum of
their individual effects (Crain et al. 2008, Piggott et al. 2015). Both synergistic and antagonistic
interactions constitute a great challenge for ecosystem managers, because interacting stressors
should be jointly addressed by management measures. Disentangling the effects of multiple
stressors is thus of paramount importance for managing Europe's waters and related ecosystems.
Despite the enormous progress made in the study of multiple stressors in recent years (see
Noges et al. 2015 for a recent review), current river basin management plans are still focused on
single stressors (Birk et al. 2014). Compared to experimental studies, biomonitoring data is
analysed less consistently, most likely due to a lack of unified guidance. The knowledge of
suitable analytical methods is scattered and not always available to and digestible by practitioners
in the field of river basin management and restoration.
Here, we present a protocol for multi-stressor analysis based on survey data. The protocol forms a
stepwise 'cookbook', to guide the reader through the data preparation, multiple-stressor analyses,
and interpretation and communication of the effects of multiple stressors on biological and
ecological response variables. We test the cookbook using simulated and real data to detect
limitations and provide recommendations for the user. An earlier version of the cookbook has been
developed for a data evaluation workshop in Tulcea, Romania, held from July 12th–17th 2015 with
scientists of the EU-funded research project 'MARS' (www.mars-project.eu, Hering et al. 2015).
This cookbook is based on R, an open-source language and environment for statistical computing
and graphics (R core team 2016). Along with the core text, we also provide the full R code and
examples (see Supplementary Material, Appendix 1) to enable the reader to "cook" the multiple-
stressor analysis based on his/her own data. We do not intend to provide a comprehensive review
of the available tools to analyse ecological responses to multiple stressors, but present a flexible
and open framework, which may be developed further by the user.
2 General prerequisites for multiple-stressor analysis
2.1 Terminology: pressure or stressor?
The term 'pressure' is part of the Driver-Pressure-State-Impact-Response (DPSIR) scheme that
was developed to communicate the relationship of socio-economic and environmental policies in
Europe (EEA 1999). A pressure (e.g. point source pollution) is a direct effect of a driver (e.g.
human demand for sanitation) (EC 2014). Contrastingly, the term 'stressor' is much more specific
and addresses a measurable environmental variable that, as a result of an anthropogenic
pressure, changes and adversely affects biological or ecological integrity (sensu Matthaei et al.
2010). The terms pressure and stressor are not interchangeable: a single pressure (e.g. diffuse
pollution) may comprise several stressors (e.g. enhanced concentrations of nitrate, nitrite,
ammonia, phosphorus, fine sediments, pesticides, other pollutants) that are very likely to act in
concert if the pressure is present. From an ecological point of view, the interpretation of the effects of
multiple stressors on biology is straightforward, because mechanistic relationships can be
hypothesised (Poff 1997). This is more difficult with the broader-defined pressures,
which often renders the interpretation of pressure effects on biology a challenge.
Analysing pressures and stressors in the same statistical model is of limited value if both are strongly
correlated. Hence, the selection of appropriate pressure and stressor variables for multiple-stressor
analyses is of paramount importance.
2.2 Check the presence of multiple stressors in the data
It may seem trivial, but multiple-stressor assessment requires the presence of multiple stressors in
the underlying data. This means that the designated stressor variables need to encompass an
environmentally relevant gradient length, i.e. values that include the gradient's end points occurring
in the targeted ecosystem. Compiling information about stressor gradient lengths can help
estimate whether this criterion is met. Identifying stressor gradients in multiple-stressor datasets is quite
straightforward. For example, Principal Components Analysis (PCA) and other multivariate
techniques provide a means to identify stressor gradients in the data, including their correlation
with each other (e.g. Johnson and Hering 2009). Box-and-whisker plots are helpful to display
individual stressor gradient lengths and their distributions. Social Network Analysis
provides a more sophisticated method to analyse and display the presence of multiple stressors
and their relationships in a dataset (Ban et al. 2014).
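As a minimal sketch of this gradient check (not part of the original Box code), assuming a data frame
stressors that holds only numeric stressor variables (name hypothetical), the inspection could look as follows:

# Hypothetical data frame 'stressors' with one numeric column per stressor
summary(stressors)                      # ranges of each stressor gradient
boxplot(scale(stressors), las = 2)      # gradient lengths on a comparable (z-scored) scale
pca <- prcomp(stressors, scale. = TRUE) # PCA on standardised stressors
summary(pca)                            # variance explained by the main gradients
biplot(pca)                             # co-varying stressors load on the same axis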
2.3 Select useful and meaningful biological indicators
Biological indicators reflect multifaceted aspects of ecosystems and their biodiversity (e.g. taxonomic,
functional and phylogenetic diversity). In an ideal setting, the biological response variables used for
multiple-stressor analysis should be mechanistically relatable to the stressor variables in the
analysis, which applies to biological and ecological response traits (Townsend and Hildrew 1994,
Poff 1997, Statzner and Bêche 2010). Response traits constitute useful biological indicators in
multiple-stressor analyses (Lange et al. 2014) and provide a potential mechanistic link to stressors
(see Table 1 in Dolédec and Statzner 2008 for an example). Trait data at species level is
increasingly available (e.g. http://www.freshwaterecology.info/, Schmidt-Kloiber and Hering
2015); however, trait information is still incomplete for many rare and small taxa. Understanding
mechanistic relationships is supported by null models that help distinguish between taxonomic and
functional changes along stressor gradients, for example, to detect niche-based sorting (Mouillot et
al. 2013, Bruno et al. 2016).
More often than traits, however, structural community-based indicators (e.g. taxon richness,
diversity indices, number of Ephemeroptera-Plecoptera-Trichoptera taxa) are used in biomonitoring
schemes (for a review of aquatic methods, see Birk et al. 2012), probably because they are simple
to calculate. Although structural community indicators can exhibit multiple-stressor effects (e.g.
Townsend et al. 2008), such metrics have notorious shortcomings and a low capacity to predict
changes in ecosystem function (McGill et al. 2006, Mouillot et al. 2013, Gagic et al. 2015). Diversity
metrics may fail to detect stressor effects at all, because species turnover along a stressor gradient
may render biodiversity metrics unaffected along the gradient (e.g. Feld et al. 2013). For a detailed
discussion about biomonitoring procedures, see Bonada et al. (2006).
3 The cookbook
The cookbook presents a stepwise analytical procedure, which reflects the typical flow of statistical
analysis. We recommend following the stepwise approach, because subsequent steps are
conditional on previous ones. Yet, advanced users may enter the workflow at a later step.
The cookbook starts with methods to preliminarily check the data quality and consistency. We then
present methods to check for the presence and importance (hierarchy) of multiple stressors in a
dataset. Using the most relevant stressor candidates, we quantify individual and multiple stressor
effects through generalised linear models. The basic code of each step is comprehensively
presented in three sections (Box 1–3), which should allow the reader to easily follow the basic
structure of analysis in R. The full annotated R code and all functions used are available with
Appendix 1 of the Supplementary Material.
Figure 1. Analytical steps of the cookbook. (1Note: GLMM not illustrated in the main text body, see
Supplementary Material, Appendix 4 for details).
3.1 Data validation and preparation
The first step is to ensure the requirements for data quality and consistency for the subsequent
analyses are met. Outliers and extreme observations, missing data, various stressor gradients with
different gradient lengths and erroneous values need to be detected and handled appropriately.
Outlier analysis
Outliers are observations with extraordinarily high or low values for a given variable and may
reflect abnormal conditions present during sampling or sample processing in the lab. Outliers may
also reflect simple errors in data handling (e.g. typos during data entry, confusion during data
import/export operations, wrong decimal separator). As outliers can affect multiple-stressor effect
sizes, it is important to detect them at an early stage of the analysis and handle them appropriately.
The standard summary and boxplot functions of R are sufficient in most situations to
detect outliers and extreme values (Box 1). In addition, the R package outliers (Komsta 2011)
provides useful methods and formal tests to detect and evaluate outliers.
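A minimal sketch of this screening step, assuming a data frame dat with a numeric stressor column
nitrate (both names hypothetical); the Grubbs test shown here is one of several tests offered by the
outliers package:

library(outliers)
summary(dat$nitrate)              # min/max reveal implausible values
boxplot(dat$nitrate)              # points beyond the whiskers are potential outliers
grubbs.test(dat$nitrate)          # formal test for a single extreme value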
Data transformation and normalisation
Data transformation aims to bring continuous data closer to a normal distribution by downweighting
the influence of high values of a given variable (Box 1). Common transformations are the square root
and the logarithm. Logit transformation is recommended for proportional (%) values (Warton
and Hui 2011), which is implemented in R, for example, with the logit function of the package
car (Fox and Weisberg 2011).
Standardisation of explanatory variables means rescaling all variables to a comparable
numerical range. For example, pH values range from 0–14, nitrate concentrations may range from
0–300 mg/L, while proportional land use ranges from 0–100%. The variables would reveal different
effect sizes just because of their different numerical scaling. Therefore, standardisation is a
prerequisite for the comparison of effects sizes, i.e. to obtain standardised effect sizes (SES) (Box
1). A common standardisation of continuous variables is to recalculate values to achieve mean=0
and SD=1 for the variable (also known as z-transformation; see Grueber et al. 2011 for methods
applicable to categorical variables). If transformation is applied to a variable too, standardisation
should follow transformation.
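A minimal sketch of transformation followed by standardisation, assuming hypothetical columns
nitrate (mg/L) and agriculture (% land use) in a data frame dat:

library(car)                                   # for logit()
dat$nitrate_t     <- log10(dat$nitrate + 1)    # log-transform right-skewed concentrations
dat$agriculture_t <- logit(dat$agriculture, percents = TRUE)  # logit for % data (Warton and Hui 2011)
# z-standardise after transformation (mean = 0, SD = 1)
dat$nitrate_z     <- as.numeric(scale(dat$nitrate_t))
dat$agriculture_z <- as.numeric(scale(dat$agriculture_t))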
Independence of explanatory variables: collinearity and variance inflation
Generalised linear modelling requires explanatory variables to be independent, i.e. the variables
must not be highly correlated with each other. Collinearity describes this issue and can be easily
quantified for continuous variables using Pearson’s or Spearman’s correlation coefficients. For a
graphical interpretation, we recommend using R’s plot function pairs (Box 1).
The Variance Inflation Factor (VIF) is another way to quantify collinearity (Box 1). This method also
accounts for non-linear relationships, which may remain undetected using correlation analysis. A
VIFs >8 indicates variance-inflated variables (Zuur et al. 2007), while this threshold can be found
lower in other references. If more than one variables exhibit high VIFs, it is recommended to
exclude them stepwise, starting with the variable that has the highest VIF. This procedure is
repeated until all variable's VIFs are <8. The stepwise procedure is crucial, because the exclusion
of one variable influences the collinearity of all remaining variables in the analysis.
The package usdm (Naimi 2015) has a function vif to calculate collinearity (Box 1). The function
vifstep automatically performs the stepwise deletion of collinear variables until the VIF of all
remaining variables is below a threshold th, which can be defined by the user.
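A minimal sketch of this VIF screening, assuming a data frame stressors with the candidate
explanatory variables (name hypothetical); the threshold th = 8 follows Zuur et al. (2007):

library(usdm)
vif(stressors)                            # VIF of each candidate variable
vs <- vifstep(stressors, th = 8)          # stepwise exclusion until all VIFs are below 8
vs                                        # variables excluded and VIFs of those retained
stressors_kept <- exclude(stressors, vs)  # data frame with the retained variables only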
The predictor selection process should be carefully checked to avoid
removing predictors with an evident mechanistic link to the biotic indicators. We recommend accounting
for natural filters when modelling responses to multiple stressors (see Supplementary Material,
Appendix 3).
3.2 Exploratory analysis of pressure/stressor hierarchy and potential interactions
Multi-stressor data can easily comprise several tens of variables, if physico-chemistry, hydrology,
morphology and land and water uses are covered simultaneously. With very many explanatory
variables, it is recommended to reduce these and focus on the most influential ones. This becomes
particularly crucial if the number of observations (=samples) in the dataset is not much
higher than the number of explanatory variables (i.e. stressors). As a rule of thumb, the number of
observations should be at least ten times the number of variables (e.g. Harrell 2001). The
number of multiple-stressor interactions to be considered should also be kept manageable. With ten
stressors, 45 pairwise interactions are already possible. Ultimately, only the most influential and
meaningful interactions are of interest for the quantification of their effect size.
Collinearity analysis and exploratory analysis (Random Forest or Boosted Regression Trees,
explained in the following sub-sections) can help to decide which variables to exclude, bearing
always in mind to keep those with a stronger biological relationship.
3.2.1 Stressor hierarchy analysis using Random Forests (RF)
Random Forest (RF) analysis is a flexible, non-parametric regression tool belonging to the family of
Classification And Regression Tree analyses (CART). RF can handle a large number of predictors
even in small datasets with a low number of observations. Both explanatory and response
variables can be heterogeneous, i.e. continuous, categorical and nominal (binary) variables are
allowed in the same model. RF is suited to analyse non-linear relationships and complex
interactions (Breiman 2001) and can handle missing values (NAs in R jargon).
In brief, RF fits a number of models (regression trees) to bootstrapped data subsets (typically two
thirds of the observations "in the bag"=learning data) and tests the results against the remaining
observations ("out-of-bag"=test data). The trees recursively split the data according to binary
divisions based on predictor thresholds (e.g. pH<7) for each of the bootstrapped datasets. Then,
the resultant trees are combined to produce a final model (Box 2). For methodological details, see
Breiman (2001), Cutler et al. (2007) and Strobl et al. (2009).
There are several R packages implementing functions to conduct RF analysis, however with
different capabilities for multiple-stressor analysis (see Supplementary Material, Appendix 2, Table
S2.1 for a summary). The function rfsrc() of the package randomForestSRC (Ishwaran and
Kogalur 2016) is able to identify both stressor and interaction hierarchy, even for datasets with
missing values (Ishwaran et al. 2014). The package party (Hothorn et al. 2006) provides the
function cforest() that ranks predictors according to their relevance as single and interactive
terms in the model, using a measure of variable importance, corrected by predictor inter-
correlations (Strobl et al. 2008). This correction reduces the importance of irrelevant predictors,
which show a high correlation with the putative, main stressors. For datasets with missing values,
only unconditional variable importance can be calculated (Hapfelmeier et al. 2014). The package
rfPermute (Archer 2016) contains a permutation-based function with the same name to provide
both variable importance and significance for each predictor, but handles only datasets without
missing values. For each final model, these functions provide measures of goodness-of-fit (R2) and
cross-validation (e.g. root-mean-square error, RMSE).
Each RF function shows genuine advantages. For this cookbook, we chose to use rfsrc(),
because of its capability to rank, identify and visualise stressor interactions (Box 2). To perform a
RF analysis, we need to define several arguments to adjust the analysis. We should specify a
'seed' using the function set.seed(), which is used for random number generation. Using the
same 'seed' number allows an analysis to be repeated with exactly the same randomisation. The code to
run a RF analysis using the function rfsrc() is shown in Box 2.
The function rfsrc() requires the conventional syntax "response ~ predictor" (to be read
as "response is a function of the predictor[s]"). The original variable importance procedure
(Breiman 2001) is estimated when importance="permute", while importance="random"
uses a different algorithm (Ishwaran et al. 2008, 2010).
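A minimal sketch of such an RF run (variable names hypothetical; see Box 2 and Appendix 1 for the
full code), assuming a data frame dat with the response richness and the stressor columns:

library(randomForestSRC)
set.seed(1234)                                    # reproducible randomisation
rf <- rfsrc(richness ~ ., data = dat,
            ntree = 1000, importance = "permute") # permutation-based variable importance
rf                                                # error rate / % variance explained
rf$importance                                     # stressor hierarchy (importance ranking)
find.interaction(rf, method = "vimp")             # pairwise interaction screening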
3.2.2 Stressor interaction analysis using Boosted Regression Trees (BRT)
Like RF, Boosted Regression Tree analysis (BRT) can handle binary, ordinal and continuous
explanatory variables in the same analysis, accommodate collinear data, handle non-linear
variables with missing values and identify interactions between descriptors (Elith et al. 2008).
Running BRT in R requires installing the packages gbm (Ridgeway 2015) and dismo (Hijmans et al.
2016). The function gbm.step() runs a BRT based on several arguments to be set by the user
(Box 2). The argument family defines the link function and error distribution to be used for the
response variable (Elith et al. 2008). It should be set as "gaussian" for continuous variables and
"poisson" for counts (e.g. number of species); for other functions see the Supplementary Material,
Appendix 2, Table S2.2. The argument tree.complexity (tc) defines the way interactions are
fitted, with tc=1 for purely additive models, tc=2 for models including two-way interactions, and so
on. The learning.rate determines the overall number of trees calculated, with smaller values
increasing the number of trees. A typical starting point is a learning.rate=0.005. Each
individual BRT should be run several times to ensure stable results. To increase the comparability
of the results of different models, it is recommended to build each model on a comparable number
of regression trees, usually at least 1,000 trees (Elith et al. 2008).
Also like RF, BRT internally splits the data into a training dataset and a validation dataset. Both
are used for cross validation. The argument bag.fraction determines the proportion of the data
to be kept "in the bag" for model training at each step. With e.g. bag.fraction=0.7, 70% of the
data are randomly chosen to train the model.
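A minimal sketch of such a BRT call with the settings discussed above, assuming the response is in
column 1 and the stressors in columns 2–8 of a hypothetical data frame dat:

library(gbm)
library(dismo)
set.seed(1234)
brt <- gbm.step(data = dat,
                gbm.x = 2:8,            # columns holding the stressor variables
                gbm.y = 1,              # column holding the response variable
                family = "gaussian",    # continuous response
                tree.complexity = 2,    # allow two-way interactions
                learning.rate = 0.005,
                bag.fraction = 0.7)     # 70% of the data used to train each tree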
The model summary statistics are accessible with the function summary() and include information
on the deviance explained in the response variable, the contribution of each explanatory variable to
the explained deviance (Fig. B2.1) and cross-validation statistics. The function gbm.plot()
plots fitted response values against each explanatory variable in the analysis, after accounting for
the effects of all other variables. These partial dependence plots (Fig. B2.2) provide a useful tool to
visualise a response variable along a stressor's gradient (Clapcott et al. 2012, Feld 2013). If
present, response thresholds with dramatic increases/decreases along a gradient can be easily
detected.
The interaction function gbm.interactions() allows the user to identify and rank interactions among the
descriptor variables in the analysis (Fig. B2.3). These results inform the order of explanatory
variables and their interaction terms to be included in linear regression models (Feld et al. 2016; see
section 3.3). This approach allows a high level of standardisation in the analysis of the effects of
multiple stressors.
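Continuing the hypothetical BRT object brt from the sketch above, the exploration of results could
look as follows:

summary(brt)                      # relative contribution of each stressor (cf. Fig. B2.1)
gbm.plot(brt, n.plots = 6)        # partial dependence plots (cf. Fig. B2.2)
int <- gbm.interactions(brt)      # pairwise interaction strengths
int$rank.list                     # interactions ranked by size (cf. Fig. B2.3)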
A major disadvantage of BRTs (and probably other machine-learning techniques), however, is that
they require at least about 100–150 observations (samples) to deliver stable and reliable results
(section 5.1).
3.2.3 Limitations— repeated measures in the dataset: the issue of spatial and temporal
autocorrelation
Users should be cautious if the data includes dependent observations. This can be repeated
measures at the same site (temporal dependence) or observations from different but closely
located sites (spatial dependence, also known as spatial autocorrelation). We do not address these
issues in detail here, but briefly present potential solutions in the Supplementary Material,
Appendix 3.
3.3 Quantitative analysis of pressure/stressor effects and interactions
This step aims to quantify and test the significance of single and combined effects of the stressor
candidates identified by RF and/or BRT analysis. To determine if these candidates have significant
additive or interactive effects and to compare the effects, we need to generate standardised effect
sizes (SES) (section 3.1). Although complex interactions of three or more stressors may occur in
nature, we encourage users to consider and test only pairwise interactions, to keep the number of
interactions manageable and to be able to interpret and communicate the results. Higher-order
interactions (e.g. interactions of three stressors) are worth testing only if there is a priori evidence or
knowledge that supports such interactions.
The family of linear models offers a very well-known set of regression techniques to obtain SES of
explanatory (stressor) variables (Supplementary Material, Appendix 2, Table S2.1). Generalised
Linear Models (GLM) are able to handle different types of quantitative (e.g. count, continuous,
proportional) or binary response variables, through various link functions (e.g. gaussian, poisson,
binomial). With a categorical response variable, we recommend using Cumulative Link Models
(CLM) or the extension suited to account for random effects (Cumulative Link Mixed Models,
CLMM) (McCullagh 1980, Agresti 2002, Christensen 2015). Various packages implement functions
to run linear models in R (Supplementary Material, Appendix 2, Table S2.1), yet differing in
syntaxes and capabilities. Most of them may be used interchangeably in multiple-stressor analysis.
In the following, we illustrate each step of the analysis using one option for illustration
purposes. We encourage users to become familiar with the methodology provided here before
exploring other functions.
The quantification of multiple-stressor effects is divided into three steps:
1. Set-up initial model and define link function appropriate for the response variable.
2. Multi-model inference and model averaging.
3. Validate the final model by checking model assumptions.
Detailed descriptions of linear model techniques cannot be given here, but are available elsewhere
(e.g. McCullagh and Nelder 1989, Burnham and Anderson 2002, Zuur et al. 2009).
3.3.1 Initial model and link function
GLM analysis requires the descriptor variables (stressors and natural covariates, if applicable) to
be temporally and spatially uncorrelated. A generic function to perform GLM in R is glm() (Box 3).
If an interaction term is included in a model, it is mandatory to also include the additive terms of
the respective variables. If interactions turn out to be significant in the final model, the respective
additive effects need to be maintained in the model too, even if these effects are not significant.
The link function is set by the argument family and depends on the kind of response variable in
the model (Supplementary Material, Appendix 2, Table S2.2). When modelling count or binary
(presence/absence) response variables, we often find the variance of the response variable to be
much greater than its mean. This is referred to as overdispersion and may render flawed models.
Overdispersion may also be caused by zero-inflated response variables, outliers and other effects
remaining unaddressed (e.g. missing natural covariates, interaction of descriptors, random effects,
inappropriate link function). To estimate overdispersion in GLM, we calculate the dispersion parameter
theta (Box 3). With theta between 2 and 15, penalised quasi-likelihood models are appropriate. To set these
link functions in our analysis we should specify family=quasipoisson for count data and
family=quasibinomial for binary data. For highly overdispersed data (theta>15), a negative
binomial link would be more appropriate, which is fit by the function glm.nb() of the R package
MASS (Venables and Ripley 2002).
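A minimal sketch of this check for a count response (hypothetical names richness, s1, s2); the
dispersion statistic shown here (Pearson chi-square divided by the residual degrees of freedom) is
one common way to estimate theta:

library(MASS)                                   # for glm.nb()
m_pois <- glm(richness ~ s1 * s2, data = dat, family = poisson)
theta  <- sum(residuals(m_pois, type = "pearson")^2) / df.residual(m_pois)
theta                                           # ~1: Poisson is adequate
# theta between 2 and 15: quasi-likelihood
m_quasi <- glm(richness ~ s1 * s2, data = dat, family = quasipoisson)
# theta > 15: negative binomial
m_nb    <- glm.nb(richness ~ s1 * s2, data = dat)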
3.3.2 Multi-model inference and model averaging
In multiple-stressor modelling, the primary goal is to determine the importance and significance of
various additive and combined (interactive) stressor terms. Traditionally, model selection is guided
by several model parameters: goodness-of-fit (e.g., R2), Akaike's Information Criterion (AIC, Akaike
1973) and similar metrics. This strategy aims to identify the most adequate and parsimonious
model, i.e. the best model incorporating the least descriptor variables. However, this procedure can
miss some important terms, if there is no overwhelming support for the selected model, while
alternative models with similar AIC values are ignored. Multi-model inference is a more recent
method of model selection, which provides consensus models and variable importance across a
set of "top models", instead of focussing on a single model (Grueber et al. 2011). Here, we suggest
using multi-model inference to select the final model(s) for testing multiple-stressor effects.
The R package MuMIn (Bartoń 2016) offers a useful set of functions to perform the steps required
from model ranking to averaging. First, a global GLM (full) model including all the variables of
interest is run (Box 3).
Second, the function dredge() automatically runs all models possible with different combinations
of stressor variables based on a minimum and maximum number of stressors defined by the user.
For each model, SES of additive and interaction terms, AIC values and differences for the
comparison to the lowest AIC value (delta, ΔAIC) and the AIC weight (AICw=probability of being
the "best" model) can be obtained, together with additional model performance parameters (e.g.
goodness-of-fit, F-values).
Third, if we do not find overwhelming support for a single model (i.e. AICw<0.9), we can select a
set of "top models" that potentially best describe multiple-stressor patterns using the function
get.models(). The argument subset allows the output to be restricted based on pre-defined
thresholds of ΔAIC or cumulative AICw. We can then set a sequence of thresholds (e.g. ΔAIC≤2,
ΔAIC≤4, ΔAIC≤6, cumulative AICw≤0.95) and inspect the results. This incremental change of
ΔAIC or cumulative AICw increases the chance that the best model is included in the set, however at
the cost of including less relevant models and variables. We recommend testing different
thresholds, to be able to decide which threshold is most adequate (Box 3).
Fourth, we may derive an averaged model using the function model.avg(). There are two
methods to average model coefficients and their errors (Burnham and Anderson 2002). The natural
method provides a weighted mean for each variable coefficient and its errors, based on AICw
values, but only over the models in which the variable occurs; models that do not contain the variable are
neglected. In contrast, the zero-method calculates a weighted mean for coefficients and errors in
which coefficients of variables absent from a model are substituted by zero (Box 3). Although there
is no clear criterion suggesting which model-averaging method performs better in the multi-
stressor context, the zero-method might provide a more conservative and meaningful view of
parameter SES (Grueber et al. 2011).
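A minimal sketch of steps one to four, assuming a hypothetical global model with stressors s1–s3
and one interaction in a data frame dat; note that dredge() requires na.action = na.fail so that all
candidate models are fitted to the same observations:

library(MuMIn)
options(na.action = "na.fail")                   # required by dredge()
m_global <- glm(richness ~ s1 + s2 + s3 + s1:s2, data = dat, family = poisson)
dd  <- dredge(m_global)                          # all submodels, ranked by AICc
top <- get.models(dd, subset = delta <= 2)       # "top models" within delta AIC of 2
avg <- model.avg(top)
summary(avg)                                     # 'full' = zero-method, 'conditional' = natural method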
3.3.3 Model validation and checking of model assumptions
In order to represent a statistically valid linear model, its residuals need to exhibit normality and
homoscedasticity (i.e. homogeneous variability of residuals along the response gradient), and must be
free of strong influential observations and of spatial or temporal autocorrelation (e.g.
Zuur et al. 2009). For averaged models, we should perform model validation for all the models we
are going to average. It might be convenient to adjust ΔAIC or cumulative AICw thresholds, if less
relevant models show clear violations of these important assumptions. Model residuals can be
extracted using the function resid() (Box 3).
Homoscedasticity can be checked by plotting the model's residuals against fitted response values
and is fulfilled, if no obvious pattern (e.g. change in the residual's range or aggregated data points)
is observable along the fitted values. The sites' coordinates are required to test the residuals'
spatial autocorrelation. The R function moran.test() of the package spdep generates Moran's I,
an estimate of the spatial autocorrelation among sites (Box 3). A significant result of this
test indicates a strong spatial autocorrelation in the data and suggests accounting for (spatial)
random effects, which can be achieved using GLMM (Supplementary Material, Appendix 3).
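A minimal sketch of these checks for a fitted model m and a two-column matrix coords of site
coordinates (both hypothetical); the neighbourhood definition (here, the eight nearest sites) is an
assumption that should be adapted to the study design:

library(spdep)
res <- resid(m)
plot(fitted(m), res)                      # homoscedasticity: no pattern expected
qqnorm(res); qqline(res)                  # approximate normality of residuals
nb <- knn2nb(knearneigh(coords, k = 8))   # neighbourhood: 8 nearest sites
lw <- nb2listw(nb, style = "W")
moran.test(res, lw)                       # significant Moran's I indicates spatial autocorrelation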
4 Understanding multiple-stressor effects
Understanding the effects of multiple stressors is not straightforward, if strong interactions appear
to be influential on the biological or ecological response variable. Various reasons may render a
good understanding and interpretation of interactions difficult. Using response variables that
mechanistically link to the explanatory (stressor) variables can facilitate the interpretation and
communication of interactions. Trait-based indicators offer adequate features to allow an
ecologically sound interpretation of their response (Lange et al. 2014). Similarly, using explanatory
(stressor) variables that mechanistically link to biological response variables can facilitate the
interpretation of stressor interactions. This link is probably much more difficult to identify when
rather indirect stressor variables or proxies thereof are used (e.g. % land cover). Ecologically
meaningful and interpretable results are more likely to be obtained, if directly measurable stressors
(e.g. nutrient concentrations, flow indices, sediment contents or physical habitat measures) are
included in the analysis. Even then, causality cannot be obtained from survey data, because
unknown or unaddressed stressors may interfere with the stressors in the analysis and influence
their SES.
Meaningful and easily digestible graphical illustrations can facilitate the interpretation of results,
even by the untrained practitioner. Appropriate illustrations may directly facilitate decision making,
helping the end user to delineate specific management actions. Here, we in particular address this
latter point, with a focus on pairwise interactions. In the absence of significant interactions (i.e.
additive effects only), partial response plots of the response indicator against each stressor
gradient can illustrate the individual stressor effects on the response variable (Box 2, Fig. B2.5).
With significant interactions, the effect of one stressor is conditional on the position along the
gradient of the other stressor. For instance, the adverse effect of fine sediment addition on benthic
invertebrate taxon richness is stronger under raised temperature (Piggott et al. 2012).
The options to display interactions graphically depend on the numerical scale of the stressor
variables. With one continuous and another categorical stressor, we can simply display the fitted
response as a function of the continuous variable, but with one response function line for each
level of the second categorical stressor.
Generally, lines showing different slopes illustrate interaction effects (Fig. 2a). With opposing signs
for the individual stressor effects, the lines are usually crossed. For two continuous stressor
variables, there are three alternatives (Fig. 2b–d): (1) we may discretise one stressor and use the
aforementioned procedure for a continuous and a discrete stressor variable. Although quite
straightforward, this option requires the definition of threshold values for each state of the
discretised variable, which introduces a subjective element and may influence the outcome; (2) a
two-dimensional contour plot shows the fitted (colour-coded) response values against a surface
defined by both stressors. The pattern in the coloured gradient then visualises the interaction (Fig.
2b); (3) a three-dimensional surface shows the fitted response values as a function of both
stressors (Fig. 2c). In comparison to the two-dimensional contour plot, the three-dimensional
surface indicates an interaction through the curvilinear deformation of the surface, i.e. how it is
tilted and bent in 3-D space; with purely additive effects the surface would be a flat plane (Fig. 2c).
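A minimal sketch of option (1)/(Fig. 2a), assuming a fitted GLM m of the form richness ~ s1 * s2 on a
data frame dat with standardised stressors s1 and s2 (hypothetical names): the fitted response is
predicted along s1 while s2 is held at its 10th, 50th and 90th percentiles.

s1_seq <- seq(min(dat$s1), max(dat$s1), length.out = 100)
s2_lev <- quantile(dat$s2, probs = c(0.1, 0.5, 0.9))
plot(dat$s1, dat$richness, col = "grey", xlab = "Stressor 1", ylab = "Indicator")
for (i in seq_along(s2_lev)) {
  nd <- data.frame(s1 = s1_seq, s2 = s2_lev[i])
  lines(s1_seq, predict(m, newdata = nd, type = "response"), lty = i)
}
legend("topright", lty = 1:3, legend = c("10th", "50th", "90th"), title = "Stressor 2")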
A more quantitative graphical representation of interactions is possible based on the SESs of
single and combined stressor effects in linear models. Piggott et al. (2015) offer a comprehensive
classification of multiple-stressor effects observable with controlled experiments. Here we show a
simplified version to illustrate SESs, but without a control group (Fig. 2d). With individual stressor
coefficients of the same sign, but an interaction coefficient of the opposite sign, the interaction
most likely is antagonistic. In case all individual and interaction coefficients have the same sign, the
interaction is synergistic. If individual stressor coefficients have opposed signs, the interaction most
likely is opposing too, no matter the sign of the coefficient of the interacting term. An opposing
effect can also arise, if the interaction coefficient is high in relation to the individual stressor
coefficients, irrespective of their signs.
Figure 2: Different options to illustrate two-way interactions of two continuous stressor variables:
(a) line plots showing the fitted linear response to one stressor, while fixing the second stressor at
the 10th, 50th and 90th percentiles; (b) two-dimensional surface plots, (c) three-dimensional plots and (d)
bar plots illustrating the respective regression coefficients (SES) of single and interaction terms.
[Figure panels, left to right: synergistic, antagonistic, opposing and additive stressor combinations.]
5 Testing the cookbook
5.1 Sensitivity analysis using simulated data
Using simulated data allowed us to test specifically whether the analytical approach presented in
this cookbook can correctly (i) identify the stressor hierarchy, (ii) detect important stressor
interactions and (iii) estimate the SESs of individual stressors and their interactions. The simulation
of a large number of datasets with both varying pre-defined sample sizes and gradient lengths also
allowed us to estimate the influence of both key criteria on the results.
The simulation of multi-stressor data applied a linear model based on four single putative stressors
(s1–s4) and three pairwise stressor interactions (s1:s2, s1:s3 and s2:s4, representing synergistic,
antagonistic and opposing interactions, respectively). The individual stressor hierarchy was set a
priori to s1>s2>s3=s4, while interaction strength was set to s1:s3>s1:s2>s2:s4. In addition, a set of
spurious stressors (s5–s12) and irrelevant stressors (s13–s20) was generated. The former showed
moderate correlations with the putative stressors (rmax=0.44–0.47), while the latter were
uncorrelated with the response variable (rmax=-0.03–0.03) (see Supplementary Material, Appendix
4 for details on the simulations).
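A minimal sketch of how such data can be simulated; the coefficients and correlation structure here
are purely illustrative and are not those used in Appendix 4:

set.seed(1234)
n  <- 5000
s1 <- rnorm(n); s2 <- rnorm(n); s3 <- rnorm(n); s4 <- rnorm(n)  # putative stressors
s5  <- 0.45 * s1 + sqrt(1 - 0.45^2) * rnorm(n)                  # spurious stressor, r ~ 0.45 with s1
s13 <- rnorm(n)                                                 # irrelevant stressor
# response with hierarchy s1 > s2 > s3 = s4 (s4 with opposed sign) and three pairwise interactions
y <- 1.0 * s1 + 0.6 * s2 + 0.3 * s3 - 0.3 * s4 +
     0.2 * s1 * s2 - 0.4 * s1 * s3 + 0.1 * s2 * s4 + rnorm(n, sd = 1)
sim <- data.frame(y, s1, s2, s3, s4, s5, s13)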
Using the linear model simulations, we generated a global dataset comprising 5,000 independent
observations (hereafter, called sites), with one response variable y and 20 virtual stressors s1–s20.
Random subsets were taken from this global set comprising 25, 50, 75, 100, 150, 300 and 500
sites. Each subset was generated with four different gradient lengths for the main stressor s1: full
gradient length (100%) and 75%, 50% and 25% thereof. We defined gradient length using stressor
s1, because it was the most relevant predictor for both individual and combined effects. This
procedure resulted in 28 different combinations of sample size and gradient length. We then
applied the analytical approach outlined before (see Supplementary Material, Appendix 4, for
details) and ran 100 model simulations for each combination. This resulted in altogether 4,000
models: 2,800 RF models and 1,200 BRT models; BRTs were run on subsets with ≥150 sites only,
because of their cross-validation procedure, which divides a dataset into ten subsets of equal size
(Elith et al. 2008). The minimum number of 150 sites ensured that each subset comprised at least
15 sites, which we defined here as sufficient to ensure reliable results.
Simulations revealed that for RF models both sample size and gradient length strongly influence
stressor ranking (Supplementary Material, Appendix 4, Table S4.1). These results were consistent
with the BRT model results (Supplementary Material, Appendix 4, Table S4.2). In general, the
models’ capability to correctly rank the putative stressors and interactions within the top variables
increased with datasets ≥150 sites and gradient length ≥75%. The main putative stressor s1 was
consistently ranked first, if both criteria for sample size and gradient length were met, and ranged
within the top four single stressors otherwise. To a larger degree, sample size and gradient length
influenced the ranking of stressors s2 and s3. Stressor s4 was ranked within the top four with ≥100
sites.
Overall, models based on gradient length of ≥75% of the full gradient showed a consistently
greater capacity to identify the most important interactions within the top three of combined terms,
while those models based on both shorter gradients (50% and 25%) commonly failed. The
interaction s1:s3 was most often correctly detected with ≥100 sites and gradient length ≥75%. The
combination s1:s2 showed a similar pattern, while the combination s2:s4 was rarely identified
among the top three interactions in the RF models. Similar results were found for BRT models
(Supplementary Material, Appendix 4, Fig. S4.1).
We used multi-model inference to estimate single and combined stressor SES for the variables
ranking best in the RF and BRT models. For each combination of sample size and gradient length,
and for each of the variables identified by RF and BRT models, we fitted a global model using the
top-four single stressors and the top-three ranked interactions (see Appendix 4 for details). The
main goal of this analysis was to compare the SES estimated by model averaging with those initially
used to produce the simulated data, considered as the benchmark (Appendix 4, eq. S4.1). This
deviation is expressed as percent SES error in the following (Appendix 4, eq. S4.3). Overall, and
consistent with RF and BRT models, a larger sample size and longer gradient length reduced the
error percentage in SES estimation (Fig. 2 and 3). For s1, s2, s3 and s1:s3 the error percent in
SES estimation was remarkably reduced with ≥150 sites and ≥75% gradient length. However, the
error percent in SES estimation even increased with sample size and gradient length for s4 and the
interactions s1:s2 and s2:s4, while for s4, models based on partial gradients (<100%) generally
performed better. The error percent in SES estimation of the interaction s1:s2 substantially
decreased with a gradient length ≥75%.
RF-informed GLM models revealed a steep increase in the mean goodness-of-fit with sample sizes
ranging from 25–150 sites (R2=0.60–0.80, Fig. 3). Goodness-of-fit was generally higher for models with
the full s1 stressor gradient, yet, surprisingly, the shortest gradient (25%) also led to high values of
goodness-of-fit. BRT-based results displayed a similar pattern (Supplementary Material, Appendix
4, Fig. S4.2).
Figure 2. Mean percent error in SES estimation for different sample sizes and stressor gradient
lengths of the putative stressors s1–s4 and the three interaction terms for GLM models informed by
RF results on stressor hierarchy and interactions. See Supplementary Material, Appendix 4 for
methodological details.
Figure 3. Mean goodness-of-fit (R2) for the linear models based on the variables selected by RF,
for different sample sizes and gradient lengths. See Supplementary Material, Appendix 4 for
methodological details.
5.2 Testing the cookbook using monitoring data from the Sorraia basin, Portugal
The Sorraia river is the largest tributary of the Tagus river and its basin covers an area of 7,730
km². The Sorraia Valley is one of the largest areas of irrigated crops in Portugal. Therefore, major
anthropogenic impacts relate to hydrological (irrigation, flow regulation, damming) and nutrient
stress. The cookbook is applied here mainly to investigate the interplay between groups of stressors
and their single and combined effects on native fish species richness.
The dataset comprises 205 sites, sampled between 1995 and 2005 within the EU-funded project
EFI+ (http://efi-plus.boku.ac.at). Sites were sampled by electrofishing during low-flow periods
employing standard European methods (CEN, 2003), allowing fish composition and
abundance to be estimated. A Soil and Water Assessment Tool (SWAT) model (Arnold et al. 1998) was
implemented for the whole Tagus basin (time frame: 1996–2015), to model site-specific stressor
variables related to hydrologic stress (total annual flow, mean duration of high/low pulse events,
number of high/low pulse events) and nutrient stress (total phosphorus, total nitrogen). The
stressor variables of each site were modelled for the year in which the site was electrofished. One may argue that
% agriculture in the upstream catchment in fact is a proxy of different environmental stressors (e.g.
nutrient enrichment, water abstraction, sediment pollution) rather than a stressor in itself. However,
as % land cover is a common "stressor" variable used in many ecosystem assessments, we
decided to include it here too, to account for this common source of data heterogeneity. Further, its
correlation with the other stressors was low (Pearson's rmax=0.41), which suggests that %
agriculture reflects additional stressors in this case study.
After screening and preparing the data (see Supplementary Material, Appendix 5 for details),
stressor hierarchy was investigated using BRT. BRT was preferred over RF, since BRT revealed a
higher goodness-of-fit, although both methods showed a similar ranking of predictors. The top-four
stressors affecting native fish species richness were: annual mean flow, % agriculture in the
catchment upstream, annual mean total phosphorus and % forest in the main channel’s
catchment area. We included distance to source as the most influential natural environmental
covariate, to account for the stream size-dependent natural variation in the data. Important
pairwise interactions were: annual mean flow with % agriculture and distance to source with %
agriculture. The latter might be trivial, as % agriculture is likely to increase along the longitudinal
river continuum, yet it might be worth further investigating this relationship. A GLM full model with a
poisson link function was then run including the top four individual stressor variables and both
interactions. We followed the multi-model inference approach as described in section 3.3.2, but
with models constrained to a maximum of four terms (excluding the intercept), to reduce the
complexity of results.
The final model included the environmental covariate (distance to source), two individual stressors
(annual mean flow and % agriculture upstream) and one interaction term (annual mean flow with %
agriculture upstream). SES of distance to source and annual mean flow were positive, while %
agriculture upstream had a negative SES. The interaction is opposing, since the two involved
stressors' SES had opposing signs. With regard to native fish species richness, this means that
an increase in % agriculture in the upstream catchment attenuates the strong positive effect of
annual mean flow, until native fish species richness eventually decreases, when a high annual mean
flow exacerbates the negative effect of % agriculture (Fig. 4). For very low values of annual mean
flow, the model predicts a slightly positive effect of % agriculture upstream. The observed effects
are independent of distance to source, which was included in the final model as an additive term.
The model outcome might be interpreted as follows: first, there is a negative effect of agriculture on
native fish richness which, however, is well pronounced only under good conditions for mean
annual flow (10–100 m3/s); second, native fish species richness strongly increases with annual
mean flow, if % agriculture in the catchment upstream is low.
Figure 4: Interaction plot showing the response of native fish richness in the Sorraia basin,
Portugal, against combinations of annual mean flow and % agriculture upstream. Percent
agriculture was fixed at three intensities: the 10th, 50th and 90th percentiles. Original flow values
(m3/s) are shown on a logarithmic scale below the log-transformed and standardised axis, to allow
for a direct interpretation of flow.
6 Summary and outlook
The study of multiple-stressor effects in real ecosystems has been scattered and limited
compared with single-stressor studies. Most evidence of multiple-stressor effects originates from
mesocosm experiments conducted in New Zealand (e.g. Townsend et al. 2008,
Matthaei et al. 2010, Lange et al. 2014, Wagenhoff et al. 2011, Piggott et al. 2012). Although these
studies stimulate and inform further multiple-stressor studies, experimental results are difficult to
transfer to the "real" conditions, i.e. the multiple-stressor condition at a specific site of a given
stream or lake. These conditions cannot be controlled and more often than not the actual stressors
operating at a site remain unknown.
In this contribution, we provide the first framework especially suited to analysing the biological
response to multiple stressors using biomonitoring data. The framework is highly standardisable
and allows the effects of multiple stressors in biomonitoring data to be assessed. It was effective in
detecting the most relevant stressors and their combinations in simulated and real monitoring
datasets. The guiding text, a real case study example and the annotated R code are combined in
this cookbook, to help users start analysing their own datasets.
Data from field surveys and standard monitoring schemes, such as the EU WFD monitoring data,
have a large potential to improve our knowledge and understanding of multiple stressors. One
potential limitation is to adequately describe and interpret the multiple-stressor situation, i.e. using
direct stressors rather than indirect proxies thereof. The evidence provided by experimental studies
along with the best-available biomonitoring data can help disentangle stressor effects and their
causal relations to broad-scale human pressures, such as land use. The same applies to biological
response variables, which ideally can be hypothetically and mechanistically linked to the stressors
a priori. With this setting, our cookbook facilitates the identification of the importance of individual
stressors and their combined effects (stressor hierarchy). However, both aspects are conditional on
the response variable, which underpins the importance of selecting meaningful biological indicators.
The methods to identify stressor hierarchy and interactions are manifold. As explorative tools, we
recommend using techniques belonging to the family of Classification and Regression Tree
(CART) analyses, namely Random Forest (RF) and Boosted Regression Tree (BRT) analysis.
These techniques show optimal features for data exploration, as they are very flexible and handle
variables of different nature, missing values, non-linear responses, and non-parametric data. Our
simulations revealed that both techniques can reliably identify stressor hierarchy and interactions in
large datasets (≥ 150 observations). With smaller datasets (<100 observations), we recommend
using RF, yet we should note that stressor hierarchy and stressor effects may not be reliably
identified with much fewer than 100 independent observations. Besides sample size, the length of
individual stressor gradients turned out to influence the correct identification of stressor
hierarchy and standardised effect sizes (SES). We recommend checking stressor gradient
lengths prior to hierarchy analysis, for example, using a box plot or histogram to check the range of
values and its distribution. Our simulation analysis revealed reliable results with a gradient length
≥75% of the full gradient.
Once identified, the stressor hierarchy and interactions inform the selection and order of variables
to determine the SES. We recommend using generalised linear modelling (GLM) in a multi-model
inference framework (Grueber et al. 2011), which is a suitable and transparent method to derive
SES of multiple stressors. Again, we suggest ≥150 independent observations and main
stressors covering ≥75% of the full gradient's length, to achieve low errors and a high goodness-of-
fit (expressed as R2) of the final averaged model.
Quantified stressor effects and their interactions can then inform a thoughtful interpretation, which is likely to be challenging for stressors whose effects on the biological response variable are not easily traceable. We also present graphical methods and an example that illustrates pairwise stressor interactions in particular, to help interpret the ecological meaning behind them.
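As an illustration only (not the graphics of the case study itself; y, s1 and s2 are placeholder names), a pairwise interaction can be visualised by predicting the response along one stressor at a low and a high level of the second stressor:
int.glm <- glm (y ~ s1 * s2, family=gaussian, data=my.data)
s1.seq <- seq (min(my.data$s1), max(my.data$s1), length.out=100)
s2.lev <- quantile (my.data$s2, c(0.1, 0.9)) # low and high level of s2
new.dat <- expand.grid (s1=s1.seq, s2=s2.lev)
new.dat$pred <- predict (int.glm, newdata=new.dat, type="response")
plot (pred ~ s1, data=new.dat[new.dat$s2==s2.lev[1], ], type="l",
!ylim=range(new.dat$pred), ylab="Predicted response")
lines (pred ~ s1, data=new.dat[new.dat$s2==s2.lev[2], ], lty=2)
legend ("topright", legend=c("low s2", "high s2"), lty=1:2)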
Overall, the analytical framework provided with this cookbook should allow users to 'cook' their own data step by step. Although we focus on freshwater ecosystems, the cookbook should be applicable to terrestrial and marine ecosystems too. The accompanying R scripts facilitate this with numerous annotations of the code, guidance on interpreting the results and examples of real-case application. The Supplementary Material provides additional background knowledge and presents an extension of the methodology to data with random effects, such as repeated measures of the same sites or spatially autocorrelated samples.
7 Acknowledgements
This work is part of the MARS project (Managing Aquatic ecosystems and water Resources under
multiple Stress) funded under the 7th EU Framework Programme, Theme 6 (Environment including
Climate Change), Contract No.: 603378 (http://www.mars-project.eu).
8 References
2000/60/EC. Directive 2000/60/EC of the European Parliament and of the Council of 23 October 2000 establishing a framework for Community action in the field of water policy. Official Journal of the European Communities L 327, 1–72.
Agresti A. (2002). Categorical Data Analysis (2nd ed.). Wiley.
Akaike H. (1973). Information theory and an extension of the maximum likelihood principle. In: B.N.
Petrov (ed.): Proceedings of the Second International Symposium on Information Theory
Budapest: Akademiai Kiado, 267–281.
Archer E. (2016). Estimate permutation p-values for importance metrics. R package version 2.0.
(accessed June 2016).
Arnold J.G., Srinivasan R., Muttiah R.S. and Williams J.R. (1998). Large area hydrologic modelling and assessment part I: model development. Journal of American Water Resources Association 34, 73–89.
Ban S.S., Graham N.A.J. and Connolly S.R. (2014). Evidence for multiple stressor interactions and
effects on coral reefs. Global Change Biology 20, 681–697.
Bartoń K. (2016). MuMIn: Multi-Model Inference. R package version 1.15.6. https://cran.r-project.org/web/packages/MuMIn/index.html (accessed June 2016).
Birk S., Bonne W., Borja A., Brucet S., Courrat A., Poikane S., Solimini A., van de Bund W.,
Zampoukas N. and Hering D. (2012). Three hundred ways to assess Europe’s surface waters:
An almost complete overview of biological methods to implement the Water Framework
Directive. Ecological Indicators 18, 31–41.
Bolker B.M., Brooks M.E., Clark C.J., Geange S.W., Poulsen J.R., Stevens M.H.H. and White J.-
S.S. (2009). Generalized linear mixed models: a practical guide for ecology and evolution.
Trends in Ecology and Evolution 24, 127–135.
Bonada N., Prat N., Resh V.H. and Statzner B. (2006). Developments in aquatic insect
biomonitoring: A comparative analysis of recent approaches. Annual Review of Entomology 51,
495–523.
Breiman L. (2001). Random forests. Machine Learning 45, 5–32.
Bruno D., Gutiérrez-Cánovas C., Sánchez-Fernández D., Velasco J. and Nilsson C. (2016).
Impacts of environmental filters on functional redundancy in riparian vegetation. Journal of
Applied Ecology, 53, 846–855.
Burnham K.P. & Anderson D.R. (2002). Model Selection and Multimodel Inference: A Practical
Information–Theoretic Approach, 2nd edn. Springer, Berlin.
CEN (Comité Européen de Normalisation) (2003). Water Quality – Sampling of Fish with Electricity.
CEN, European Standard—EN 14011:2003 E, Brussels, Belgium.
Christensen R.H.B. (2015). Analysis of ordinal data with cumulative link models: estimation with the R-package ordinal. R package vignette.
Clapcott J.E., Collier K.J., Death R.G., Goodwin E.O., Harding J.S., Kelly D.J., Leathwick J.R. and
Young R.G. (2012). Quantifying the relationships between land-use gradients and structural and
functional indicators of stream ecological integrity. Freshwater Biology 57, 74–90.
Crain C.M., Kroeker, K. and Halpern B.S. (2008). Interactive and cumulative effects of multiple
human stressors in marine systems. Ecology Letters 11, 1304–1315.
Cutler D.R., Edwards Jr. T.C., Beard K.H., Cutler A., Hess K.T., Gibson J., Lawler J.J. (2007).
Random Forests for Classification in Ecology. Ecology 88, 2783–2792.
Dolédec S. and Statzner B. (2008). Invertebrate traits for the biomonitoring of large European
rivers: an assessment of specific types of human impact. Freshwater Biology 53, 617–634.
EC (European Commission) (2014). WFD Reporting Guidance 2016. Final Draft 6.0.3, 8 December
2015, 402pp. http://www.adbpo.it/PianoAcque2015/Elaborato_12_RepDatiCarte_3mar16/
PdGPo2015_All123_Elab_12_DocRif_3mar16/PdGPo2015_Bibliografia__Elab_0/WFD_Reporti
ngGuidance_vers6_03.pdf (accessed in June 2016).
EEA (1999). Environment in the European Union at the turn of the century. European
Environmental Agency, Copenhagen.
EEA (2012). European waters — current status and future challenges. Synthesis. EEA Report No
9. European Environmental Agency, Copenhagen.
Elbrecht V., Beermann A.J., Goessler G., Neumann J., Tollrian R., Wagner R., Wlecklik A., Piggott
J.J., Matthaei C.D. and Leese F. (2016). Multiple-stressor effects on stream invertebrates: a
mesocosm experiment manipulating nutrients, fine sediment and flow velocity. Freshwater
Biology 61, 362–375.
Elith J., Leathwick J.R. and Hastie T. (2008). A working guide to boosted regression trees. Journal
of Animal Ecology 77, 802–813.
Feld C.K. (2013). Response of three lotic assemblages to riparian and catchment-scale land use:
implications for designing catchment monitoring programmes. Freshwater Biology 58, 715–729.
Feld C.K., de Bello F. & Dolédec S. (2013). Biodiversity of traits and species both show weak
responses to hydromorphological alteration in lowland river macroinvertebrates. Freshwater
Biology 59, 233–248.
Feld C.K., Birk S., Eme D., Gerisch M., Hering D., Kernan M., Maileht K., Mischke U., Ott I.,
Pletterbauer F., Poikane S., Salgado J., Sayer C.D., van Wichelen J. and Malard F. (2016).
Disentangling the effects of land use and geo-climatic factors on diversity in European
freshwater ecosystems. Ecological Indicators 60, 71–83.
Fox J. and Weisberg S. (2011). An R Companion to Applied Regression, Second Edition.
Thousand Oaks CA: Sage. URL: http://socserv.socsci.mcmaster.ca/jfox/Books/Companion
(accessed June 2016).
Gagic V., Bartomeus I., Jonsson T., Taylor A., Winqvist C., Fischer C, Slade E.M., Steffan-
Dewenter I., Emmerson M., Potts S.G., Tscharntke T., Weisser W. and Bommarco R. (2015).
Functional identity and diversity predict ecosystem functioning better than species-based
indices. Proceedings of the Royal Society B 282, 20142620.
Grueber C.E., Nakagawa S., Laws R.J. and Jamieson I.G. (2011). Multimodel inference in ecology
and evolution: challenges and solutions. Journal of Evolutionary Biology, 24, 699–711.
Hapfelmeier A., Hothorn T., Ulm K and Strobl C. (2014). A new variable importance measure for
random forests with missing data. Statistics and Computing 24, 21–34.
Harrell F.E. (2001). Regression modeling strategies: with applications to linear models, logistic
regression, and survival analysis. Springer.
Hering D., Carvalho L., Argillier C., Beklioglu M., Borja A., Cardoso A.C., Duel H., Ferreira T.,
Globevnik L., Hanganu J., Hellsten S., Jeppesen E., Kodeš V., Solheim A.L., Nõges T.,
Ormerod S., Panagopoulos Y., Schmutz S., Venohr M. and Birk S. (2015). Managing aquatic ecosystems and water resources under multiple stress: an introduction to the MARS project. Science of the Total Environment 503–504, 10–21.
Hijmans R.J., Phillips S., Leathwick J. and Elith J. (2016). dismo: Species Distribution Modeling. R
package version 1.0-15. http://CRAN.R-project.org/package=dismo (accessed June 2016).
Hothorn T., Buehlmann P., Dudoit S., Molinaro A. and van der Laan M. (2006). Survival
Ensembles. Biostatistics, 7, 355–373.
Ishwaran H. and Kogalur U.B. (2016). Random Forests for Survival, Regression and Classification
(RF-SRC), R package version 2.2.0.
Ishwaran H., Gerds T.A., Kogalur U.B., Moore R.D., Gange S.J. and Lau B.M. (2014). Random
survival forests for competing risks. Biostatistics, 15, 757–773.
Ishwaran H. (2007). Variable Importance in Binary Regression Trees and Forests, Electronic
Journal of Statistics 1, 519–537.
Ishwaran H., Kogalur U.B., Blackstone E.H. and Lauer M.S. (2008). Random survival forests. The
Annals of Applied Statistics 2, 841–860.
Ishwaran H., Kogalur U.B., Gorodeski E.Z., Minn A.J. and Lauer M.S. (2010). High-dimensional
variable selection for survival data. Journal of the American Statistical Association 105, 205–
217.
Jaccard J. and Turrisi R. (2003). Interaction effects in multiple regression (2nd ed.). Thousand
Oaks, Sage, CA.
Johnson R.K. and Hering D. (2009). Response of river inhabiting organism groups to gradients in
nutrient enrichment and habitat physiography. Journal of Applied Ecology 46, 175–186.
Komsta L. (2011). outliers: Tests for outliers. R package version 0.14. http://CRAN.R-
project.org/package=outliers (accessed June 2016).
Laliberté E., Wells J.A., DeClerck F., Metcalfe D.J., Catterall C.P., Queiroz C. et al. (2010). Land-
use intensification reduces functional redundancy and response diversity in plant communities.
Ecology Letters 13, 76–86.
Lange K., Townsend C.R. and Matthaei C.D. (2014). Can biological traits of stream invertebrates
help disentangle the effects of multiple stressors in an agricultural catchment? Freshwater
Biology 59, 2431–2446.
Matthaei C.D., Piggott J.J. and Townsend C.R. (2010). Multiple stressors in agricultural streams:
interactions among sediment addition, nutrient enrichment and water abstraction. Journal of
Applied Ecology 47, 639–649.
McCullagh P. (1980). Regression models for ordinal data. Journal of the Royal Statistical Society,
Series B 42, 109–142.
McCullagh P. and Nelder J.A. (1989). Generalized linear models. Chapman and Hall, London.
McGill B.J., Enquist B.J., Weiher E. and Westoby M. (2006). Rebuilding community ecology from
functional traits. Trends in Ecology and Evolution 21, 178–185.
Mouillot D., Graham N.A.J., Villéger S., Mason N.W.H. and Bellwood D.R. (2013). A functional
approach reveals community responses to disturbances. Trends in Ecology and Evolution 28,
167–177.
Naimi B. (2015). usdm: Uncertainty Analysis for Species Distribution Models. R package version
1.1-15. http://CRAN.R-project.org/package=usdm (accessed June 2016).
Nõges P., Argillier C., Borja A., Garmendia J.M., Hanganu J., Kodeš V., Pletterbauer F., Sagouis
A. and Birk S. (2016). Quantified biotic and abiotic responses to multiple stress in freshwater,
marine and ground waters. The Science of the Total Environment 540, 43–52.
Piggott J.J., Townsend C.R. and Matthaei C.D. (2015). Reconceptualizing synergism and
antagonism among multiple stressors. Ecology and Evolution 5, 1538–1547.
Piggott J.J., Lange K., Townsend C.R. and Matthaei C.D. (2012). Multiple Stressors in Agricultural
Streams: A Mesocosm Study of Interactions among Raised Water Temperature, Sediment
Addition and Nutrient Enrichment. PLoS ONE 7, e49873.
Poff N.L. (1997). Landscape filters and species traits: towards mechanistic understanding and
prediction in stream ecology. Journal of the North American Benthological Society 16, 391–409.
R Core Team (2016). R: A language and environment for statistical computing. R Foundation for
Statistical Computing, Vienna, Austria. URL: https://www.R-project.org/ (accessed June 2016).
Ridgeway G (2015). gbm: Generalized Boosted Regression Models. R package version 2.1.1.
http://CRAN.R-project.org/package=gbm (accessed June 2016).
Ridgeway G. (2012). Generalized Boosted Models: A guide to the gbm package. R package vignette (accessed June 2016).
Sánchez-Montoya M.M., Vidal-Abarca M.R. and Suárez M.L. (2010). Comparing the sensitivity of
diverse macroinvertebrate metrics to a multiple stressor gradient in Mediterranean streams and
its influence on the assessment of ecological status. Ecological Indicators 10, 896–904.
Schmidt-Kloiber A. and Hering D. (2015). www.freshwaterecology.info - an online tool that unifies,
standardises and codifies more than 20,000 European freshwater organisms and their
ecological preferences. Ecological Indicators 53, 271–282.
Statzner B. and Bêche L.A. (2010). Can biological invertebrate traits resolve effects of multiple
stressors on running water ecosystems? Freshwater Biology 55, 80–119.
Strobl C., Boulesteix A.-L., Kneib T., Augustin T. and Zeileis A. (2008). Conditional variable
importance for random forests. BMC Bioinformatics 9, 307.
Strobl C., Malley J. and Tutz G. (2009). An introduction to recursive partitioning: rationale,
application, and characteristics of classification and regression trees, bagging, and random
forests. Psychological Methods 14, 323–348.
Suding K.N., Lavorel S., Chapin F.S., Cornelissen J.H.C., Diaz S., Garnier E. et al. (2008). Scaling
environmental change through the community-level: a trait-based response-and-effect
framework for plants. Global Change Biology 14, 1125–1140.
Townsend C.R. and Hildrew A.G. (1994). Species traits in relation to a habitat templet for river
systems. Freshwater Biology 31, 265–275.
Townsend C.R., Uhlmann S.S. and Matthaei C.D. (2008). Individual and combined responses of
stream ecosystems to multiple stressors. Journal of Applied Ecology 45, 1810–1819.
Venables W.N. and Ripley B.D. (2002). Modern Applied Statistics with S. Fourth Edition. Springer,
New York.
Wagenhoff A., Townsend C.R., Phillips N. and Matthaei C.D. (2011). Subsidy-Stress and Multiple-
Stressor Effects along Gradients of Deposited Fine Sediment and Dissolved Nutrients in a
Regional Set of Streams and Rivers: Sediment and Nutrients in Streams. Freshwater Biology
56, 1916–1936
Warton D.I. and Hui F.K.C. (2011). The arcsine is asinine: the analysis of proportions in ecology.
Ecology 92, 3–10.
Zuur A., Ieno E.N. and Smith G.M. (2007). Analyzing Ecological Data. Springer, New York.
Zuur A., Ieno E.N., Walker N., Saveliev A.A. and Smith G.M. (2009). Mixed effects models and
extensions in ecology with R. Springer, New York.
Box 1: Stepwise preparation and validation of the data to ensure basic requirements for the
analytical procedure are met. See the Supplementary Material, Appendix 1 for the full annotated R
code.
Outlier analysis
Outliers can be detected numerically or graphically. The numerical check is done with R’s summary
statistics:
summary (my.data$var) # replace "my.data" by your object's name and "var"
!by the variable's (column) name of the respective variable in your data
The function summary returns the minimum and maximum value, 25th, 50th (median) and 75th
percentile and the mean for each variable. The minimum and maximum value may give a good hint
of the presence of outliers.
A common graphical method is available with the function boxplot (Fig. B1.1):
boxplot (my.data$var) # this function is only valid for continuous variables
!and at least N=8 objects (rows) in your dataset
With the following modification, you can combine numerical and graphical methods and return
outliers, maximum and minimum values:
boxplot(my.data$var)$out # prints the values of the outliers
max(boxplot(my.data$var)$out) # prints the maximum outlier value
min(boxplot(my.data$var)$out) # prints the minimum outlier value
By default, boxplot flags an observation as an outlier when it lies more than 1.5 times the interquartile range (i.e. the range between the 25th and 75th percentiles) below the 25th or above the 75th percentile.
> summary (my.data$catchment.size)
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.46 31.27 120.00 326.10 241.00 6400.00
Figure B1.1: Numerical and graphical check of outliers in an
environmental variable (catchment size [km2] of river sites).
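For illustration, the same rule can be reproduced by hand (a minimal sketch; note that boxplot uses Tukey's hinges, so results may differ marginally from the quantiles used here):
q <- quantile (my.data$var, c(0.25, 0.75), na.rm=TRUE) # 25th and 75th percentile
iqr <- q[2] - q[1] # interquartile range
my.data$var[my.data$var < q[1] - 1.5*iqr | my.data$var > q[2] + 1.5*iqr]
# returns the values flagged as potential outliers by the 1.5 * IQR rule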
Transformation and normalisation
Common transformations to achieve normal distribution in continuous variables include square-
root- and log10-transformation (Fig. B1.2):
sqrt.var <- sqrt (my.data$var) # creates a new object containing the
!sqrt-transformed values of var
log.var <- log10 (my.data$var+1) # creates a new object containing
!log-transformed values of var; note that log10 of zero is not defined,
!which is why you need to add a constant value to var
With proportional (%) variables, logit-transformation is preferable to arcsine-square-root transformation (Warton and Hui 2011) (Fig. B1.3):
logit.var <- logit (my.data$var) # logit() is provided by the package car;
!see Fox and Weisberg (2011) for details on how to use the function
Various methods are available to normalise a variable. A common approach is z transformation,
which converts a variable to values with mean=0 and SD=1. The R function scale can be used to
z-transform a continuous variable:
norm.var <- scale (my.data$var) # the function by default centres
!and standardises (normalises) a given variable var
Figure B1.2: Distribution of untransformed (left), square-root-transformed (centre) and log10-
transformed catchment sizes. With log10 transformation, normality is approached, but not yet
achieved (Shapiro Wilk test: W=0.9216, p<0.001). Nevertheless, this procedure can help
achieve normal distribution of regression residuals, which is the main goal here.
Figure B1.3: Distribution of untransformed (left), logit-transformed (centre) and z-transformed
(normalised) values of a proportional variable (e.g. % macroinvertebrate shredders).
Collinearity and VIF
Collinearity can be checked numerically or graphically. A numerical method is the calculation of
correlation coefficients for each pair of variables:
cor (my.data) # calculates pairwise Pearson correlation coefficients for
!all variables of the object my.data; note that the function
!is only applicable to numerical variables
The graphical function pairs draws a matrix of scatterplots for each pair of variables, which also allows non-linear relationships to be detected. A more convenient extension of the function additionally displays correlation coefficients (Pearson), a histogram showing the distribution of each variable and a smoother (red line) in each scatterplot, to better identify the relationships between variables (Fig. B1.4):
pairs (my.data) # note that with many variables, individual plots might
!get too small for proper interpretation
pairs (my.data, diag.panel=panel.hist, upper.panel=panel.smooth,
!lower.panel=panel.cor) # calculates Pearson correlation coefficients
!by default
In order to apply the histogram and correlation panels (panel.hist and panel.cor, defined below; the smoother panel.smooth is part of base R), the following code must be run first (copy and paste it into the R console and press Enter):
panel.hist <- function(x, ...) # histogram for the diagonal panels
{
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(usr[1:2], 0, 1.5))
  h <- hist(x, plot = FALSE)
  breaks <- h$breaks; nB <- length(breaks)
  y <- h$counts; y <- y/max(y)
  rect(breaks[-nB], 0, breaks[-1], y, col = "cyan", ...)
}
panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor, ...) # correlation panel
{
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  r <- abs(cor(x, y)) # absolute Pearson correlation
  txt <- format(c(r, 0.123456789), digits = digits)[1]
  txt <- paste0(prefix, txt)
  if(missing(cex.cor)) cex.cor <- 0.8/strwidth(txt)
  text(0.5, 0.5, txt, cex = cex.cor * r) # font size proportional to correlation strength
}
Figure B1.4: Standard and extended pairplot of numerical variables to detect correlations
between variables. Note that correlation coefficient's font size is proportional to correlation
strength.
Variance inflation factors (VIF) additionally account for collinearity involving more than two variables, i.e. a predictor that is largely explained by a linear combination of the other predictors. The R package usdm contains a function vif:
vif (my.data) # here applied to a set of benthic invertebrate habitat and feeding preferences
Variables VIF
1 pAkal 7.754461
2 pPelal 5.590338
3 pPsammal 3.992673
4 pPhytal 4.035367
5 pStones 1.367112
6 pGrazer 3.658217
7 pXylo 1.220878
8 pShredder 5.170925
9 pGatherer 4.556690
10 pFiltA 3.912285
11 pFiltP 8.415196
The function vifstep performs an automated stepwise exclusion of variance-inflated variables,
conditional on a threshold th entered by the user:
vifstep (my.data, th=7) # threshold set to VIF<7
2 variables from the 11 input variables have collinearity problem:
pFiltP pAkal
After excluding the collinear variables, the linear correlation coefficients
ranges between:
min correlation ( pGatherer ~ pGrazer ): 0.01339895
max correlation ( pFiltA ~ pPelal ): 0.7127248
---------- VIFs of the remained variables --------
Variables VIF
1 pPelal 3.599328
2 pPsammal 3.322414
3 pPhytal 1.631246
4 pStones 1.296565
5 pGrazer 1.427984
6 pXylo 1.069608
7 pShredder 1.764175
8 pGatherer 2.583278
9 pFiltA 3.291911
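To continue the analysis with the retained variables only, the vifstep() result can be stored and used to drop the collinear variables (a minimal sketch, assuming your usdm version provides the function exclude()):
vif.res <- vifstep (my.data, th=7) # store the vifstep result
my.data.red <- exclude (my.data, vif.res) # data frame without the variables
!flagged as collinear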
Box 2: Multiple-stressor hierarchy and interaction.
Random Forest (RF)
Run a Random Forest analysis using the package randomForestSRC (Ishwaran and Kogalur
2016):
set.seed (1234) # sets a numerical starting point; will be set randomly
!if not set by the user
my.rf <- rfsrc (response ~ ., mtry=5, ntree=2000, importance="random",
!data=my.data) # function to run RF
my.rf # output of model details, e.g. goodness-of-fit, out-of-bag (OOB) error
In this example, the dot in response ~ . tells the function to include all predictors in the dataset. The mtry parameter sets the number of variables randomly selected as candidates at each node split. For regression, mtry should roughly equal the number of predictors divided by three, while for classification the square root of the number of predictors is a useful default (see the sketch below). ntree is the number of trees to be fitted. The parameter importance sets the way in which variable importance is calculated.
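As a minimal sketch of these rules of thumb (assuming the response occupies the first column of my.data):
n.pred <- ncol (my.data) - 1 # number of predictor variables
floor (n.pred/3) # rule-of-thumb mtry for regression
floor (sqrt(n.pred)) # rule-of-thumb mtry for classification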
The rfsrc results are stored in the object my.rf (e.g. number of trees, goodness-of-fit). It is
recommended to run several models, starting with a high number of trees (e.g. 5000) and then
refine the models, for example, based upon the OOB error against the number of trees (Fig. B2.1; the functions gg_error(), gg_vimp() and gg_partial() used below require the package ggRandomForests):
plot (gg_error (my.rf)) # plot the OOB error rate against the number of trees
Figure B2.1: OOB error rate against number of
trees in a Random Forest model.
To estimate the variables' importance, we can use the function gg_vimp():
my.rf.vimp <- gg_vimp (my.rf) # provides the predictor's importance
my.rf.vimp
plot (my.rf.vimp) # visualises the predictor’s importance (Fig. B2.2)
Figure B2.2: Bar plot of predictor's importance in a Random Forest model. Large positive numbers of importance indicate a high predictive capacity of a variable, whereas small or negative values (red bars) mark irrelevant predictors in the dataset.
To plot the response variable against each predictor variable, we can generate partial dependence plots (Fig. B2.3):
my.rf.part.plot <- plot.variable (my.rf, partial=TRUE, sorted=FALSE,
!show.plots=FALSE)
gg.part <- gg_partial (my.rf.part.plot)
plot (gg.part, xvar=names(my.data[,-1]), panel=TRUE, se=TRUE)
Alternatively, one can plot the marginal response with a smooth fitted line (Figure not shown here):
plot.variable (my.rf, partial=TRUE, smooth.lines=TRUE) # plots the
!marginal fitted response of each predictor variable
The selection of the best stressor candidates can be informed by the function max.subtree()
(Ishwaran et al. 2010):
md.obj <- max.subtree (my.rf)
md.obj$topvars # extracts the names of the variables in the object md.obj
The function find.interaction() ranks interactions according to their importance. It is suggested to rank interactions only for the most important predictors, for example, those hypothesised a priori or those delivered by max.subtree(my.rf), using the argument xvar.names. In this example, we only select the most relevant subset detected by max.subtree(my.rf):
my.rf.interaction <- find.interaction (my.rf, xvar.names=md.obj$topvars,
!importance="random", method="vimp", nrep=3) # method="vimp"
!computes the additive and combined effects of each pair of stressors in
!the response variable
A large positive or negative difference between the additive and the combined stressor effects can indicate a potential interaction (Ishwaran 2007).
Figure B2.3: Partial dependence plots showing the response variable (yhat) against each predictor (s1–s20) in a Random Forest model. The strongest response in this example is observable on the first row.
Boosted Regression Trees (BRT)
Run Boosted Regression Trees using the packages gbm (Ridgeway 2015) and dismo (Hijmans et
al. 2016):
my.brt <- gbm.step (data=my.data, gbm.x=c(1:3, 5), gbm.y=6,
!family="gaussian", tree.complexity=5, learning.rate=0.005,
!bag.fraction=0.7) # with 5-way interactions and 70% of the data used to
!train the model
Descriptor variables are defined by the parameter gbm.x, which is set to the variables in columns
1–3 and 5 of the object my.data in this example. The family parameter is set to "gaussian" for a
continuous response variable (column 6 of my.data).
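For other response types, the family argument needs to be adjusted accordingly; as a hypothetical sketch, a presence/absence (0/1) response in column 6 could be modelled with family="bernoulli" (counts would take "poisson"):
my.brt.bin <- gbm.step (data=my.data, gbm.x=c(1:3, 5), gbm.y=6,
!family="bernoulli", tree.complexity=5, learning.rate=0.005,
!bag.fraction=0.7)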
sum.my.brt <- summary (my.brt) # create an object of the summary
!statistics
sum.my.brt # tabular overview of descriptors ranked by their contribution
!to the deviance explained in the model (Fig. B2.4)
summary (my.brt) # bar plot of the descriptors' contributions to the
!deviance explained in the model (Fig. B2.4)
The functions gbm.plot() and gbm.plot.fits() generate partial dependence plots with normalised and fitted values, respectively. With gbm.plot(), the y-axes are centred around a mean of zero and plotted on the scale of the linear predictor, while the original scale of the response variable is maintained with gbm.plot.fits(). The parameter smooth=TRUE adds a smoother line to the plots, write.title=FALSE suppresses the plot title. With plot.layout=c(2, 2), the output is arranged in a matrix of 2 x 2 plots (as there are only four descriptor variables (gbm.x) in the analysis):
plot.my.brt <- gbm.plot (my.brt, smooth=TRUE, n.plots=4, write.title=FALSE,
!plot.layout=c(2, 2)) # partial dependence plot with a fitted line
!and a smoother line overlaid, y-axis normalised (Fig. B2.5)
plot.my.brt.fits <- gbm.plot.fits (my.brt) # partial dependence plot with fitted
!values plotted, y-axis with original response variable's scale
Interact