
Big data opportunities and challenges for assessing multiple stressors across scales in aquatic ecosystems

Abstract

Aquatic ecosystems are under threat from multiple stressors, which vary in distribution and intensity across temporal and spatial scales. Monitoring and assessment of these ecosystems have historically focussed on collection of physical and chemical information and increasingly include associated observations on biological condition. However, ecosystem assessment is often lacking because the scale and quality of biological observations frequently fail to match those available from physical and chemical measurements. The advent of high-performance computing, coupled with new earth observation platforms, has accelerated the adoption of molecular and remote sensing tools in ecosystem assessment. To assess how emerging science and tools can be applied to study multiple stressors on a large (ecosystem) scale and to facilitate greater integration of approaches among different scientific disciplines, a workshop was held on 10–12 September 2014 at the Sydney Institute of Marine Sciences, Australia. Here we introduce a conceptual framework for assessing multiple stressors across ecosystems using emerging sources of big data and critique a range of available big-data types that could support models for multiple stressors. We define big data as any set or series of data, which is either so large or complex, it becomes difficult to analyse using traditional data analysis methods.
K. A. Dafforn, E. L. Johnston, A. Ferguson, C. L. Humphrey, W. Monk, S. J. Nichols, S. L. Simpson, M. G. Tulbure and D. J. Baird
Evolution and Ecology Research Centre, School of Biological, Earth and Environmental Sciences,
University of New South Wales, Sydney, NSW 2052, Australia.
Sydney Institute of Marine Sciences, Mosman, NSW 2088 Australia.
Office of Environment and Heritage, PO Box A290, Sydney South, NSW 1232, Australia.
Environmental Research Institute of the Supervising Scientist, PO Box 461, Darwin, NT 0801,
Canadian Rivers Institute, Faculty of Forestry and Environmental Management, University of
New Brunswick, PO Box 4400, Fredericton, NB, E3B 5A3, Canada.
Institute for Applied Ecology and MDBfutures Collaborative Research Network, University of
Canberra, Canberra, ACT 2601, Australia.
CSIRO Land and Water, Centre for Environmental Contaminants Research, Locked Bag 2007,
Kirrawee, NSW 2232, Australia.
Environment Canada @ Canadian Rivers Institute, Department of Biology, University of New
Brunswick, PO Box 4400, Fredericton, NB, E3B 5A3, Canada.
Corresponding author. Email
Marine and Freshwater Research
Journal compilation © CSIRO 2015 Open Access
Received 12 March 2015, accepted 20 July 2015, published online 22 October 2015
The states of many aquatic ecosystems, freshwater, estuarine and marine, are under threat from a complex array of pressures and stressors, which vary in their distribution and intensity across temporal and spatial scales (Burton and Johnston 2010; Harris and Heathwaite 2012). Drivers of change include human actions (e.g. intensification of industries and land use), climate change and natural variability, and these drivers influence how various pressures and associated stressors (e.g. contaminants) modify the state of an ecosystem at different scales (Baird et al. 2015). Existing ecosystem monitoring and assessment programs focus on collection of data to support management goals, yet the data collected are variable in quality and quantity. These data may be insufficient to support more comprehensive risk assessments of ecosystem degradation resulting from multiple stressors occurring over broad spatial and temporal scales (Dafforn et al. 2014).
Historically, monitoring and assessment of aquatic ecosystems have focussed on collection of information on water quality, in terms of descriptions of physical and chemical conditions, but increasingly habitat (including environmental flows and riparian conditions) and biota are included (Norris and Thoms 1999; Dafforn et al. 2012). Data gathering has traditionally focussed at the site scale, with data subsequently aggregated to inform larger-scale assessments, e.g. at the ecosystem scale (Borja et al. 2008). For example, the European Union Water Framework Directive (WFD) aims to manage multiple ecosystem components at the river-basin scale, with integration across political and administrative boundaries (Borja et al. 2009). Within such assessments, consideration of multiple stressors is generally either a specific or implied goal, yet the approaches employed generally lack a truly systematic focus in identifying the potential for combined effects of multiple stressors and apportioning cause.
Identifying and ranking key stressors pose a challenge in
natural ecosystems. In such multi-stressed systems, untangling
the effects of individual drivers can require many observations.
Inductive approaches based on ‘expert judgement’ are often
used but may lead to incorrect diagnoses, with the more obvious
or visible agents being causally implicated without clear mech-
anistic evidence. In a recent review of the state of the science in
cumulative impacts assessment, Halpern and Fujita (2013)
promoted an approach developed for assessment of multiple
stressor impacts on coastal areas (Halpern et al. 2008), which
organises pressures and associated stressors to gauge system
vulnerability. This approach has seen widespread use across
aquatic ecosystem types, including river (Vörösmarty et al.
2010), Great Lake (Allan et al. 2013) and marine assessments
(Ban et al. 2010).
The Halpern et al. (2008) approach focuses on assessing
vulnerability of specific habitat types to specified human activi-
ties (pressures) and their associated stressors. However, vulner-
ability is a difficult concept to capture, particularly within larger
systems, owing to the diffuse nature of key ecosystem elements,
and a general lack of pervasive data by which to encompass their
essential properties. By relying on a classification system based
heavily on expert judgment as a method to rank and weight
stressors, it suffers from a lack of objectivity, as it focuses on
potential exposure to specific hazards rather than cumulatively
quantifying their effects. Furthermore, it presumes substantial
local knowledge among those experts for its application in
specific situations, whereas stressors are defined in broad,
overlapping categories, conflating potential risks and potentially
leading to confounding errors. Other strategies developed in the
United States and Australia seek to address the uncertainty
inherent in large-scale assessments of the cumulative impacts
of multiple disparate stressors, by incorporating Bayesian meth-
ods into ecological risk assessment (Wiegers et al. 1998; Pollino
et al. 2007; Ayre and Landis 2012; Bayliss et al. 2012). These
models are often flexible with the potential to integrate qualitative
knowledge (e.g. expert opinion) with quantitative primary data
(Bayliss et al. 2012). This goes some way to reduce uncertainty,
which is inherent in complex ecological systems (Ayre and
Landis 2012). The incorporation of large primary datasets into
ecological risk assessment to inform environmental regulation is
addressed in detail by Van den Brink et al. (2015).
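As a minimal illustration of how a Bayesian approach can combine qualitative knowledge with quantitative primary data, the sketch below encodes a hypothetical expert opinion about the probability that a site is impaired as a Beta prior and updates it with monitoring counts. The prior parameters, the counts and the function names are assumptions made for illustration only; the studies cited above use Bayesian networks rather than this single-parameter model.

```python
def update_beta(prior_a, prior_b, impaired, unimpaired):
    """Conjugate Beta-binomial update: return the posterior (a, b)."""
    return prior_a + impaired, prior_b + unimpaired


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def beta_variance(a, b):
    """Variance of a Beta(a, b) distribution."""
    return a * b / ((a + b) ** 2 * (a + b + 1))


# Hypothetical expert prior: about a 30% chance of impairment,
# weighted as if it were worth 10 observations.
prior = (3.0, 7.0)

# Hypothetical monitoring data: 12 of 20 surveyed sites classed as impaired.
posterior = update_beta(*prior, impaired=12, unimpaired=8)

print("prior mean:", beta_mean(*prior))
print("posterior mean:", beta_mean(*posterior))
print("variance reduced:", beta_variance(*posterior) < beta_variance(*prior))
```

The posterior mean sits between the expert prior and the observed proportion, and its variance shrinks as observations accumulate, which is the sense in which primary data reduce reliance on expert opinion alone.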
The advent of high-performance computing systems, cou-
pled with new earth observation platforms, has accelerated the
adoption of geographical information systems (GIS) and remote
sensing tools in ecosystem assessment (Pettorelli et al. 2014).
Their use in interpretation of scale-dependent, ecological phe-
nomena is an emerging area, and here we explore synergies with
key areas of stress ecology. Moreover, the increasing availabili-
ty of molecular techniques for environmental applications is
now providing a vast pool of data relevant to interpreting
changes in ecosystem health (e.g. organism biodiversity, abun-
dance and impacts to function) (Chariton et al. 2015). These
diverse and often large data sources provide us with increased
information on potential drivers, stressors and biological
responses at a scale never previously realised. The increased
availability of such data can help improve the ranking of
stressors by providing more observations of individual drivers
and their associated stressors and affected responses, and reduce
the need for decisions based on expert opinion alone.
As indicated above and in Baird et al. (2015), a key focus of
the 2014 Sydney workshop was to find and report commonality
among diverse approaches developed by aquatic scientists
working on marine and freshwater systems. Here we propose
a conceptual framework for assessing multiple stressors across
ecosystems using emerging sources of large datasets often
termed ‘big data’. These data include satellite, mapping, geo-
physical and monitoring data sources. We define big data as any
set or series of data, which is either so large or complex that it
becomes difficult to analyse using traditional data analysis
methods (Hampton et al. 2013). This includes the new data
being generated by advances in ecological monitoring, biomo-
nitoring science, ecogenomics and earth observational technol-
ogies. These big datasets have the potential to address questions
over larger spatial and temporal scales and be used to investigate
complex environmental problems. We give examples and cri-
tique a range of available big-data types that could support
models for multiple stressors and the assessment of their effects.
The two complementary papers from the workshop addressed
how multiple stressors can be better evaluated with ecogenomic
tools and analyses (Chariton et al. 2015), and with the incorpo-
ration of big data into ecological risk assessment to inform
environmental regulation (Van den Brink et al. 2015).
Ecosystem monitoring across scales
Environmental monitoring across ecosystems requires consid-
eration of both the temporal and spatial scales at which stressors
may be acting in different systems. In terms of spatial scale, this
can vary substantially across systems, from the nano- to the
bioregion scale. For example, the Australian and New Zealand
water quality guidelines seek to distinguish the information
sought and acquired, and associated tradeoffs, at broad-scale
(regional) and finer scales respectively (ANZECC/ARMCANZ 2000).
Broad-scale monitoring is applied at catchment, regional or
larger scales, using, typically, rapid-assessment methods. Rapid
assessment is typically applied over wide geographical areas for
first-pass determination of the extent of a problem or potential
problem (broad-scale land-use issues, diffuse-source effluent
discharges or information for State of Environment Reporting),
screening of sites, or to assess results from large-scale remedia-
tion efforts (ANZECC/ARMCANZ 2000). These methods do
not provide detailed quantitative information for a site, but are
cost-effective and quick enough to generate sufficient data for
initial management purposes, and assist managers to decide
what further types of information may be required, and where to
direct efforts. In Australia, the most developed rapid-biological-
assessment method is the Australian River Assessment System
(AUSRIVAS), a River Invertebrate Prediction and Classifica-
tion System (RIVPACS)-type predictive modelling method for
freshwater systems using macroinvertebrate communities in
rivers and streams. RIVPACS and similar approaches have been
developed or implemented in the EU, UK and Canada (Turak
et al. 1999; Clarke et al. 2003; Davy-Bowker et al. 2006; Hargett
et al. 2007). Relying only on rapid-assessment methods risks
gathering too little site-level detail to detect small changes
resulting from disturbance or management actions, which
generally prevents the magnitude of a response from being
determined.
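RIVPACS-type assessments are commonly summarised as an observed/expected (O/E) ratio: the number of taxa collected at a test site divided by the number expected there under reference condition. The sketch below shows that calculation under stated assumptions. The occurrence probabilities and the sampled taxa are hypothetical, the 0.5 probability cut-off is used here as an illustrative default, and the function is not the AUSRIVAS software itself.

```python
def observed_expected_ratio(predicted_probs, observed_taxa, threshold=0.5):
    """Return (O, E, O/E) for one test site.

    predicted_probs: dict mapping taxon name to the modelled probability
        of occurrence at the site under reference condition.
    observed_taxa: set of taxon names actually collected at the site.
    threshold: only taxa with probability >= threshold are scored.
    """
    scored = {t: p for t, p in predicted_probs.items() if p >= threshold}
    expected = sum(scored.values())
    if expected == 0:
        raise ValueError("no taxa meet the probability threshold")
    observed = sum(1 for taxon in scored if taxon in observed_taxa)
    return observed, expected, observed / expected


# Hypothetical model predictions for one site (not real AUSRIVAS output).
probs = {
    "Baetidae": 0.9,
    "Leptophlebiidae": 0.8,
    "Gripopterygidae": 0.7,
    "Elmidae": 0.6,
    "Chironomidae": 1.0,
    "Corixidae": 0.2,  # below the threshold, so not scored
}
found = {"Chironomidae", "Baetidae", "Corixidae"}

o, e, oe = observed_expected_ratio(probs, found)
print(f"O = {o}, E = {e:.1f}, O/E = {oe:.2f}")
```

An O/E value well below 1 indicates that fewer of the expected taxa were found than the reference model predicts, but, as noted above, such a first-pass score does not by itself quantify the magnitude of the response or identify its cause.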
Finer-scale monitoring is typically required for sites of
particular interest (e.g. sites of high conservation value, major
developments or point-sources of particular potential concern).
Attention to good experimental design and quantitative sam-
pling is prescribed in the monitoring requirements, aiming to
result in replicated and representative community-structure
data, but adequate methods to achieve this do not always exist.
Two key features distinguish finer-scale and broad-scale
monitoring programs: the provision of detailed quantitative
and accurate assessments of selected indicators but at limited
(fine) spatial scales, for reasons of high cost; and the provision of
less accurate first-pass assessments of broad-scale indicators,
but at greater (broad) spatial scales (ANZECC/ARMCANZ
2000). Inference at finer scale is typically directed at detecting
stressors associated with the putative disturbance in question.
If these studies are not long-term and are spatially constrained,
they may not detect or diagnose trends in unforeseen stressors
(e.g. stressors acting at broader scales, including upon reference sites).
The temporal scale of monitoring programs is also important in detecting anthropogenic impacts against background variation associated with natural cycles in environmental stressors. Two examples of major environmental stressors (light and freshwater inputs) that interact to drive the biological responses of an aquatic system are described in Fig. 1. Seasonal variation in the availability of light and nutrients (as delivered by freshwater inputs) is the principal driver of new primary production in aquatic ecosystems (Harding 1994). As such, the nature of biological responses (primary and secondary production) and resultant ecosystem structure are shaped by the relative timing in the annual cycles of these two environmental stressors. These dynamics of multiple stressors vary considerably among different systems, with some subject to greater environmental variability than others. Therefore, the choice of temporal scale for monitoring each system is important in order to capture this variability. Systems that experience more stochastic freshwater inputs exhibit much higher variability in biological responses because of variability in the timing of freshwater delivery relative to optimal light or temperature. For example, arid systems tend to exhibit dramatic boom-bust responses to flooding events, which may occur as infrequently as once every 2–3 decades (e.g. Lake Eyre in central Australia; Kingsford et al. 1999) (Fig. 1). Human stressors (e.g. flow regulation) may cause a shift in the natural range of variation in key environmental stressors, causing significant changes in biological function and ecosystem structure (Fig. 1; Baldwin et al. 2013).
[Figure 1: x-axis, months Jan to Dec; curves for temperature/solar radiation and fresh water inputs; panels arranged by freshwater flow variability: high latitude systems (freshwater inputs driven by snow melt; seasonal freshwater input, low inter-annual variability), sub-tropical systems (freshwater inputs driven by synoptic weather patterns) and arid systems (freshwater inputs driven by major storms; stochastic freshwater input, high inter-annual variability); a further curve marks the change in freshwater input due to flow regulation.]
Fig. 1. The extremes in freshwater flow variation relative to seasonal variation in temperature and solar radiation, which tend to co-vary. The shaded area in each curve is intended to indicate the range of observations throughout the year. In high latitude systems, freshwater inputs occur as a spring ‘freshet’ driven by snow melts (e.g. Prinsenberg 1980; Déry et al. 2005), with little inter-annual variation in either the timing or magnitude of flows (as indicated by the green shading). In contrast, freshwater flows to arid systems typically occur in response to major storm events, which may occur across a wider seasonal range. These storms tend to have multi-year to decadal return intervals, hence inter-annual variability in freshwater flow and biological responses is high. Anthropogenic stressors (e.g. flow regulation) may cause significant changes in the frequency, timing and magnitude of freshwater flows (indicated by the dashed line), resulting in a cascade of impacts throughout the ecosystem (Leigh et al. 2010).
Environmental stressors and biological indicators measured
at a particular point in time and space integrate processes over
variable periods of time, preceding the time of collection. An
understanding of this is fundamental to the interpretation of
monitoring data. Measurements should be collected over care-
fully selected timescales so that the data will provide a time-
integrated history of the system (Table 1). The requirement for
data gathered at daily, monthly, or yearly intervals etc. should be
considered in any study and these scales incorporated in ecosys-
tem-based approaches, so that we can identify and describe the
nature of any changes observed. For example, salinity measured
within an estuary is dependent on the state of the tide (timescale
of hours) and recent freshwater inflows (weeks); therefore,
sampling may need to include hourly and weekly replication.
Furthermore, nutrient concentrations are dependent on tide and
freshwater inflows, but also on internal recycling processes
(e.g. benthic fluxes); therefore, they integrate over seasonal
timescales (Table 1). In general, sediment properties integrate
over longer timescales than pelagic properties and therefore
some properties such as grain sizes, metals and organic contami-
nants may require less frequent sampling. However, factors that
modify the bioavailability of contaminants, for example redox
conditions, may vary over short timescales (Simpson et al. 2005).
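The salinity example above can be made concrete with a short sketch that averages the same hourly record at daily and weekly resolution, so that tidal (hours) and inflow-driven (weeks) variation can be examined separately. The record is synthetic: the tidal amplitude, the timing and size of the freshwater pulse, and the `aggregate` helper are all assumptions for illustration, not field data or a prescribed method.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def aggregate(readings, key):
    """Group (timestamp, value) pairs by key(timestamp) and average each group."""
    groups = defaultdict(list)
    for ts, value in readings:
        groups[key(ts)].append(value)
    return {k: mean(v) for k, v in sorted(groups.items())}


# Synthetic hourly salinity for 28 days (assumed values): a semidiurnal
# tidal oscillation superimposed on a freshwater pulse during week 3.
start = datetime(2014, 9, 1)
readings = []
for hour in range(28 * 24):
    ts = start + timedelta(hours=hour)
    tide = 3.0 * math.sin(2 * math.pi * hour / 12.42)
    inflow_drop = 10.0 if 14 <= hour // 24 < 21 else 0.0
    readings.append((ts, 30.0 + tide - inflow_drop))

daily = aggregate(readings, key=lambda ts: ts.date())
weekly = aggregate(readings, key=lambda ts: (ts - start).days // 7)

for week, value in weekly.items():
    print(f"week {week}: mean salinity {value:.1f}")
```

In this synthetic record the weekly means expose the freshwater pulse, whereas the tidal signal is only visible in the hourly values; sampling at a single resolution would miss one of the two processes.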
The spatial and temporal scale at which organisms respond to
environmental stressors (i.e. whether they are pulse or press
phenomena; Lake 2000) and the subsequent influence on
ecological processes will depend on the system and the organ-
ism type (e.g. microbe, invertebrate, vertebrate) (Fig. 2). In
general, we expect organism size and life span will determine
their response at different scales, and thus their suitability as
indicators for monitoring at various spatial and temporal scales.
In part, this is because of a general increasing sensitivity to
environmental conditions with decreasing size and greater
ability to adapt rapidly (Hilty and Merenlender 2000; McKinley
et al. 2011; Sun et al. 2012; Dafforn et al. 2014), and the
increasing ability to disperse greater distances with increasing
size thus increasing the spatial extent of environmental exposure
(Bowler and Benton 2005). Hence, the temporal and spatial
scale at which data are required in order to detect ecological
change (from single or multiple stressors), should always be
determined by the nature of the driver e.g. pulse or press, and the
process or organism(s) of interest (Fig. 2).
Using the example of estuarine monitoring, biological responses (of phytoplankton, invertebrates, or vertebrates) to the major environmental stressors or drivers (e.g. light, temperature and freshwater flow) will vary significantly within an annual cycle (Fig. 3). Observed patterns in biotic composition in time and space will reflect (1) the timing of freshwater flow or nutrients relative to seasonal variations of temperature and light; and (2) the individual lifespans and life-histories of the biota. Phytoplankton respond to nutrient delivery as turbidity reduces, temperatures rise and photosynthetically active radiation increases (Fig. 3). The initial spring bloom collapses once nutrients are depleted; however, remineralisation of organic matter in the sediments during summer fuels subsequent blooms in late summer and autumn. Invertebrate communities (pelagic and benthic) exhibit classic predator–prey dynamics, with seasonal peaks in late summer–autumn (… et al. 1986). Fish communities tend to exhibit more inter-annual variation compared with invertebrate communities. Given such spatial and temporal discontinuity between responses of different taxa to environmental stressors and the differential sensitivity of these taxa, a range of stressors require monitoring over different scales in order to identify the causes of change in a multi-stressed system, and to limit errors of …
Table 1. Variability in pelagic and sediment properties over different time scales (from minutes to decades)
POPs, persistent organic pollutants (e.g. dioxins, PCBs); HOCs, hydrophobic organic contaminants (e.g. polycyclic aromatic hydrocarbons); Pesticides, the more hydrophilic pesticides
Timescale columns: Decades, Years, Months, Weeks, Days, Hours, Minutes
Waters (pelagic): Salinity; Light; Temperature; Chlorophyll; Most contaminant inputs
Sediments (benthic): Particle size; Organic matter; Nutrients or sulfides; Metals or POPs
Ecological models are widely used in ecosystem monitoring to understand and predict changes over spatial and temporal scales among systems. Such modelling approaches require cross-system building blocks that may incorporate physics (e.g. geomorphology, hydrodynamics and temperature), chemistry (e.g. water quality parameters) and biology (e.g. genes, cells, organisms, species, populations, assemblages and communities). Furthermore, ecological functions and processes are integral to building comprehensive models, and also need to be captured in their outputs. Such functions include photosynthesis, chemosynthesis and decomposition. Processes that might be measured and influenced on a local scale include recruitment, immigration, emigration, competition, facilitation and predation. Regional processes include climate, evolution and dispersal. The process of building a model, therefore, can require extensive information on inputs (e.g. the physical and chemical components), and the outputs (e.g. biological health components) have the potential to guide future monitoring efforts. For example, heuristic or conceptual models are frequently used to generate hypotheses and identify gaps or knowledge requirements, whereas statistical or empirical models often investigate relationships whose accuracy can only be improved through extensive data calibration. Information incorporated into models can allow us to communicate important relationships between pressures and stressors. Models also facilitate scaling-up of measurements, such that information collected on smaller scales can be extrapolated to make predictions on larger scales, both temporally and spatially. Models are, however, only as useful as the information available to build and calibrate them. Therefore, care is needed to ensure there are sufficient data (e.g. quantity, quality, form) to calibrate and test model …
[Figure 2: temporal axis from seconds to centuries (seconds, hours, days, weeks, seasons, years, decades, centuries); spatial axis from 1 µm to 1000 km (1 µm, 1 mm, 1 m, 1 km, 10 km, 100 km, 1000 km).]
Fig. 2. The spatial and temporal scales over which drivers (identified as pulse or press in nature) change and how these vary between systems and taxonomic groups (microbes, invertebrates, vertebrates and ecosystems) (e.g. Lake 2000).
[Figure 3: curves for temperature or light and for freshwater flow, with the biological response curves described in the caption.]
Fig. 3. Biological responses to the major environmental stressors in an aquatic system over many years. Coloured lines represent increases in abundance of phytoplankton (dark green, initial spring bloom; light green, late summer or autumn bloom), invertebrates (brown) and vertebrates (red) in response to temperature or light (yellow) and freshwater flow (blue).
Detecting human-induced change
Approaches that seek to assess human-induced change require a
comparison with some notion of baseline or pre-impact state to
identify impacts and targets for improvements. Practically, and
in recognition that the best-available natural state typically has
some degree of minimal disturbance, while still maintaining
ecological integrity, this is often described as a ‘reference state’.
Human activities may drive environmental conditions beyond
the reference state by changing the range, frequency or spatial
scale of conditions (Fig. 1), or by introducing an entirely new
condition (e.g. a new contaminant or invasive species). Such
extensions or modifications to conditions would represent the
presence of new or potential stressors and would be measurable
as changes in the range of parameters (e.g. temperature, pH,
salinity, light, nutrients, sedimentation, hydrology, disease, turbidity,
invasive species and toxicants). Informative data sources
may also be available at the driver level (e.g. catchment land use,
rainfall, shipping activity, fishing and mining). These may be
useful proxies for predicting and identifying potential stressors
during early conceptual modelling. Sampling of stressors and
ecological responses should be targeted to the spatial and tem-
poral scale of the expected response to that activity or pressure,
and a reference set of environmental and biological conditions
must be ascertained over a spatial and temporal scale relevant to
that ecosystem type.
Prediction and early detection of possible effects of human
and natural stressors are important components of effective
monitoring programs so that substantial and ecologically impor-
tant impacts can be avoided. Early information enhances the
options for management; the information may be as specific as
bioaccumulating contaminants downstream of a point source or,
through broad-scale coverage, pinpointing of potential ‘hot-spots’
that would otherwise be missed (e.g. ANZECC/ARMCANZ
2000). Monitoring targets require clear articulation, for example,
about whether we want to know that systems are functioning in a
sustainable way or how they are functioning relative to a refer-
ence condition. Ecosystem-based approaches to understanding
cumulative impacts first require basic research, including base-
line monitoring, to detect changes caused by humans that are
outside the range of natural variability.
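One simple way to express "outside the range of natural variability" is to compare test-site measurements with percentile bounds derived from reference observations. The sketch below does this with the standard library only. The turbidity values, the 5th and 95th percentile bounds and the function names are illustrative assumptions rather than a guideline procedure.

```python
from statistics import quantiles


def reference_range(reference_values, lower_pct=5, upper_pct=95):
    """Return (lower, upper) percentile bounds of the reference observations."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles 1..99.
    cuts = quantiles(reference_values, n=100, method="inclusive")
    return cuts[lower_pct - 1], cuts[upper_pct - 1]


def flag_departures(test_values, bounds):
    """Return the test values that fall outside the reference bounds."""
    lower, upper = bounds
    return [v for v in test_values if v < lower or v > upper]


# Hypothetical baseline turbidity (NTU) from reference sites.
reference_turbidity = [
    2.1, 2.4, 2.8, 3.0, 3.3, 3.5, 3.9, 4.2, 4.4, 4.8,
    5.1, 5.5, 5.9, 6.3, 6.8, 7.4, 8.0, 8.9, 10.2, 11.7,
]
# Hypothetical measurements from a potentially impacted site.
test_turbidity = [4.6, 38.0, 5.2, 0.9]

bounds = reference_range(reference_turbidity)
flagged = flag_departures(test_turbidity, bounds)
print("reference bounds:", bounds)
print("values outside reference range:", flagged)
```

The usefulness of such a check depends on the reference set covering the spatial and temporal scales relevant to the ecosystem type; a short or spatially narrow baseline would give bounds that understate natural variability.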
Distinguishing multiple stressors
The mode of impact should ideally be sampled over consistent
spatial and temporal scales for stressors that are known to have
similar mechanisms of action. Distinguishing the effects of
multiple stressors will require multiple lines of evidence,
regardless of the size of the available data streams. Observa-
tional data may be incorporated into several data-driven models
that can assess the relative importance of different human
and environmental stressors (e.g. Artificial Neural Networks
and Boosted Regression Trees; Chariton et al. 2015). These
modelling approaches are predicated on substantial observa-
tional-data holdings. For this reason, quality and quantity of
such data will determine the strength of any predictions.
Experimental studies of interacting stressors may provide evi-
dence for causal interactions. Historical data and expert opinion
are often used to complement empirical data to populate
Bayesian approaches (Van den Brink et al. 2015). However,
studies attempting to demonstrate a causal relationship in
environmental systems face challenges associated with multiple
stressors, natural variability, the difficulty of performing rig-
orous experiments, and the issue of time and money required to
undertake such studies. Transparent and logical methods are
needed to synthesise and evaluate the vast pool of evidence from
the multiple studies in existing scientific research (see Norris
et al. 2012). If the maintenance of broader ecosystem integrity
(diversity, function and process) is the goal of management, then
the smallest spatial and temporal scale of variation relevant to
any one driver or ecological response should determine the
finest scale of data needed. Not all data sources may need to be
collected at this scale.
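To illustrate how a data-driven model can rank stressors by relative importance, the sketch below implements a deliberately small gradient-boosting routine that uses single-split regression trees (stumps) and squared-error loss, and attributes to each stressor the share of error reduction achieved by splits on it. The data are synthetic, and the stressor names, effect sizes, number of rounds and learning rate are assumptions. It is a teaching sketch of the boosted-regression-tree idea, not the implementation used in the cited studies.

```python
import random


def best_stump(X, residuals):
    """Return (gain, feature, threshold, left_mean, right_mean) for the
    single split that most reduces the squared error of the residuals."""
    n = len(residuals)
    total = sum(residuals)
    total_sq = sum(r * r for r in residuals)
    base_sse = total_sq - total * total / n
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            prev_i, cur_i = order[k - 1], order[k]
            left_sum += residuals[prev_i]
            if X[prev_i][j] == X[cur_i][j]:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            sse = total_sq - left_sum ** 2 / k - right_sum ** 2 / (n - k)
            gain = base_sse - sse
            if best is None or gain > best[0]:
                threshold = (X[prev_i][j] + X[cur_i][j]) / 2
                best = (gain, j, threshold, left_sum / k, right_sum / (n - k))
    return best


def relative_importance(X, y, rounds=60, learning_rate=0.3):
    """Boost stumps on the residuals; return each feature's share of total gain."""
    n = len(y)
    prediction = [sum(y) / n] * n
    importance = [0.0] * len(X[0])
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, prediction)]
        gain, j, threshold, left, right = best_stump(X, residuals)
        importance[j] += gain
        prediction = [
            p + learning_rate * (left if row[j] <= threshold else right)
            for p, row in zip(prediction, X)
        ]
    total = sum(importance)
    return [value / total for value in importance]


# Synthetic data: taxon richness declines strongly with nutrients, weakly
# with metals, and is unrelated to turbidity (all values are assumptions).
random.seed(1)
X = [[random.random() for _ in range(3)] for _ in range(200)]
y = [30 - 8 * nutrients - 4 * metals + random.gauss(0, 0.5)
     for nutrients, metals, turbidity in X]

importance = relative_importance(X, y)
for name, value in zip(["nutrients", "metals", "turbidity"], importance):
    print(f"{name}: {value:.2f}")
```

Because the ranking is learned entirely from the observations, its reliability depends on the quantity and quality of the data, and a high importance score indicates association rather than demonstrated cause.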
Shifting baselines
Apart from the complexities of multiple stressor effects, dealing
with broad-scale changes in environmental conditions, such as
the response to climate change, is a significant challenge for
biological monitoring. Changes in baseline conditions mean that
users of monitoring approaches that refer to reference conditions
may need to be aware of any changes in the reference condi-
tions themselves as a consequence of altered climatic conditions.
Studies examining trends in reference-site condition show that
sites can remain within a stable reference condition for some time
(Metzeling et al. 2002; Nichols et al. 2010; T. B. Reynoldson,
unpubl. data, 2006). However, these findings cannot be accepted
as a general conclusion (Reynoldson and Wright 2000), but
rather highlight the need for more extensive datasets to describe
long-term trends in ecological condition, particularly in light of
predicted climate changes. Combining these data with modern
statistical approaches, GIS and remote sensing tools can allow a
detailed understanding of the effects of climate alone and in
combination with multiple impacts on ecological condition.
Ecosystem function measured at a particular point in time
integrates a history of complex interactions between environmental
and human stressors that may span decades. In the example
provided in Fig. 4 (Glibert et al. 2011), changes in
the trophic structure of San Francisco Bay have arisen from (1)
changes in nutrient loadings to the system, which have altered
the nitrogen : phosphorus ratios; (2) the introduction of an
invasive bivalve Corbula amurensis; and (3) decadal cycles in
freshwater flow. Complex interactions between biogeochemical
processes, ambient nutrient ratios and predator–prey dynamics
have resulted in a dramatic decline in pelagic-based foodchains,
the establishment of invasive macrophytes (Egeria densa), and
an increase in the abundance of toxic cyanobacteria. The
trajectories of change for any ecosystem component (e.g. the
pelagic fish Delta smelt, Hypomesus transpacificus; Fig. 4)
occur over decadal timescales, and can only be fully understood
with reference to interactions among environmental drivers and
their flow-on effects on phytoplankton and zooplankton com-
munity composition (Glibert et al. 2011). As such, detection of
the true impacts resulting from these or other human-induced
stressors (e.g. chemical contaminants) may require multiple data
streams collected over long periods.
[Fig. 4 panels: time series (1980–2010) of nutrient ratios (DIN:DIP, wt:wt), phytoplankton, zooplankton and fish (Delta smelt, starry flounder, largemouth bass), annotated with the removal of phosphates from detergents, the wastewater treatment plant coming online, an increase in WWTP, the invasion by Corbula, and the start of the pelagic organism decline.]
Fig. 4. Changes in the trophic structure of San Francisco Bay have arisen from (1) changes in nutrient loadings to the system, which have altered the
nitrogen : phosphorus ratios; (2) the introduction of an invasive bivalve Corbula amurensis; and (3) decadal cycles in freshwater flow. The graph
symbols are coloured according to when data were collected relative to distinct periods of freshwater flows and anthropogenic stressors (see bar at top of
hydrograph panel). All data are log-transformed. Phytoplankton (cells mL⁻¹), zooplankton (individuals m⁻³), and fish (abundance based on trawl and
seine net indices). See Glibert et al. (2011) for full discussion of the influence of nutrient stoichiometry over foodweb structure. Copyright 2011, from
Ecological Stoichiometry, Biogeochemical Cycling, Invasive Species, and Aquatic Food Webs: San Francisco Estuary and Comparative Systems by
Glibert et al. (2011). Reproduced by permission of Taylor and Francis Group, LLC (
New big-data sources for biomonitoring
Significant advances in ecological monitoring, biomonitoring
science, ecogenomics and earth observational technologies
provide us with new opportunities to address problems across
large spatial and temporal scales that were previously
intractable. Broadly classified, datasets can provide input to
models as drivers, stressors and state (defined in Baird et al.
2015), with some data sources informing more than one of these
categories (Tables 2, 3). These big datasets now provide the
opportunity to directly link drivers, stressors and state, with
much readier detection and diagnosis of human-induced chan-
ges for a broad spectrum of scenarios. Short-term, point-source
impacts (e.g. contaminant inputs from stormwater overflow)
can be continuously monitored by water quality sondes and
passive samplers, which are complemented by DNA or RNA-
based biomonitoring data (Table 2). These sources provide the
fine-scale long-term monitoring data at relevant temporal
scales, required to detect and diagnose trends related to these
Table 2. Examples of new data applications for typical or high-profile drivers and pressures upon riverine systems
Driver or pressure | Ecological processes affected | Big data sources to improve assessments
Flood impacts | Biodiversity; Re-colonisation; Succession; Nutrient dynamics | Spatial extent; Changes in geomorphology; Turbidity
Drought impacts | Changes in biological community structure; Biodiversity; Traits structure; Dispersal; Migration; Succession; Primary production; Water quality | Wetted area; Flow; Temperature; Catchment vegetation; Riparian and wetland vegetation; Chlorophyll
Short-term, point-source impacts | Biodiversity; Water quality; Re-colonisation; Connectivity; Dispersal and migration (mixing zones); Erosion and sedimentation | Continuous monitoring (sondes) of standard analytes; On-line ion-specific analytes; Passive samplers; DNA/RNA-based biomonitoring data (e.g. macroinvertebrates); UAV and telemetry
Long-term, diffuse-source impacts | Biodiversity; Water quality; Re-colonisation; Connectivity; Dispersal and migration (mixing zones); Erosion and sedimentation | Satellite (Landsat, MODIS); Land use mapping; Water catchment management; Soil geology; Biodiversity (GBIF)
Biological invasions | Biodiversity; Pathogens and disease; Water quality; Dispersal or migration (e.g. barriers caused by weed invasion) | eDNA (fish, etc.); Extent of weeds (remotely sensed)
Land-use change | Processes: decomposition, erosion, sedimentation, nutrient dynamics; Effects on biota or biodiversity; Water quality | Land clearing; Land use mapping; Vegetation type mapping; Erosion (extent, turbidity); Habitat fragmentation
Climate change | Changes in biological community structure; Biodiversity; Traits structure; Dispersal or migration; Succession; Primary production (P/R); Water quality; Nutrient cycling | DNA or RNA-based biomonitoring data; Habitat loss; Vegetation type; Erosion (extent, turbidity); Habitat fragmentation; Hydrology
Table 3. Existing geospatial data types currently available for assessing multiple stressors across different temporal and spatial scales for aquatic ecosystems: scales, forms and reliability
Data types are classified by use in measures of 1, drivers; 2, stressors; and 3, states. Web links include general and specific examples. Reliability: 1, information used directly from measurement source;
2, information derived from model using data from measurement; 3, industry source; 4, data quality relies on reporting frequency and object or target (possibly patchy or infrequently collected); 5, potentially
outdated data, collected from a significant time ago; and 6, quality depends on training undertaken (citizen science). Form: QN, quantitative; QL, qualitative; PR, processing required; P, processed; R, raw.
Data types are further classified according to the aquatic ecosystem they can be applied to (Y, yes; P, proxy)
Use Data types Example data source Spatial resolution Temporal resolution Reliability Form Ocean Coast Estuary Lake River
1 Landsat 30 m 16 days 1 PR, QN Y Y Y Y Y
1 MODIS 250–1000 m 1 day 1 PR, QN Y Y Y Y Y
,1.1 km 3 days 1 PR, QN Y Y Y Y Y
1.1 km 1 day
1 Digital elevation models: 10 m–90 m Periodic 1 PR, QN Y Y Y Y Y
Regional DEM
1 Very high resolution: 2–5 m 1 day 1 PR, QN Y Y Y Y Y
IKONOS http://www.satimagingcorp.
QUICKBIRD http://www.satimagingcorp.
1 LiDAR 2–5 m 1 day 1 PR, QN Y Y Y Y Y
1 SWOT (in 2019)
50–100 m Twice every 21 days 1 PR, QN Y Y Y Y Y
1 Water observations from
LANDSAT (,25 m) Historical archives 1 PR, QN Y Y Y Y Y
1 Earth Observing System Data
and Information System
(EOSDIS) Multiple Multiple 1 PR, QN Y Y Y Y Y
1 Sentinel missions (2014/
1 (LiDAR)–30 m ,1 h–27 days PR, QN Y Y Y Y Y
1, 3 Land cover
1 km 1 year 2 P, QN P P P P P
1 Land use
1 km 1 year 2,4 P, QN P P P P P
1, 3 Vegetation data
0.5–1 km 1 day-1 year 2 P, QN P P P P
1, 3 Vegetation type
1–100 km 1–10 years 2 P, QL P P P
1, 2 Ocean Data
Multiple Multiple 2,4 P, QN Y
1, 2 Water quality 100 m–1 km 1 day–1 year 2,4 P, QN Y Y Y Y Y
1, 2, 3 Water and catchment man-
agement utility
2, 3 Lakes (GloboLakes,
100 m–100 km mixed 2,4 P, QN Y
1 Census data
5–100 km 4 years 5 P, QN P P P P
1 Infrastructure
1–100 m Periodic 5 P, QN P P P P
1, 2 Agriculture and forestry 30 m–1 km 1 year 4 P, QN P P P P
2 Pesticide spraying 30 m–1 km 1 year 4 P, QL P P P P
1 Fires
1 km–100 km 1 day 4 P, QN P P P P
3 Biodiversity (GBIF) Point data to large scale Point data 4 P, QN Y Y Y Y Y
1 Soil geology
1–10 km 10–30 years 5 P, QN P P P
1 Climate
Observed Point data Daily 1 P, QN Y Y Y Y Y
1–10 km 1 day–10 years 2 P, QN Y Y Y Y Y
1 Evapotranspiration
5 km 1 day–1 year 2 P, QN Y Y Y Y
1 Air quality Point data 1 h–28 days 1 P, QN Y Y
1 Ocean global monitoring
Point data 1 day 1 P, QN Y Y
1 Ocean thermal monitoring
(sea surface temperature)
1 km 1 day 2 P, QN Y
1 Land thermal monitoring
(land surface temperature)
1 km 1 day 2 P, QN Y
1 Bathymetry data
2 Airborne (contaminants,
Point data 1 day–1 year 3 P, QN
2 Flux tower data (GG, carbon
Point data 1 day–1 year 3 P, QN
1, 2 Geological database and
metadata Point data Point data 4 P, QL Y Y Y Y
1, 2 Ice mapping and data
Point data–1 km 1 day-seasonal 4 P, QN Y Y Y Y
1 Snow melt
1 km 1 week–1 month 4 P, QN
1 Snow cover http://modis-snow-ice.gsfc.
1 km 1 week–1 month 4 P, QN
1, 3 UAVs (e.g. wildlife monitor-
ing, physical habitat,
Point data Demand 1,2 P, PR, QN, QL
Monitoring - government
2 Chemical/water quality mon-
itoring networks
Point data Demand 1,3 P, QN Y
2 Coliforms (beach watch,
water utilities, sewerage
Point data 1–7 days 1,3 P, QN Y
Water use (e.g. abstractions,
>10 km 5 years 1 P, QN Y
2 Biomonitoring networks
Point data Highly variable 4,6 P, QN Y
1,2 Hydrometric networks Point data 1 day 1 P, QN Y
2 Coastal/estuarine monitoring 100 m–100 km 1 day–1 year 4 Y Y
2 Fish stocks, plankton
10–100 km 1 year 4 P, QN Y Y Y Y
1, 2, 3 Broad biological, physical,
and socioeconomic datasets.
Multiple 1 day to 1 year Mixed Mixed Y Y Y Y
Monitoring - research and
citizen science
2, 3 Research and site assessment
Point data 4,5 QL, R Y Y Y
3 Marine mammal tracking Point data Demand 4 P, QN Y Y Y
3 Biological Records Centre Highly variable Highly variable 4,5,6 QN-QL, R-P Y Y
3 CBEMN Highly variable Highly variable 4,6 QL, R Y Y
3 Bird count Point data 1 year 4,6 P, QN Y Y Y
3 Bird migration Point data 1 year 4,6 P, QN Y
3 Frog watch http://frogwatch.fieldscope.
org/v3 (USA)
Point data 1 year 4,6 P, QN Y
english/frogwatch/ (Canada)
3 Vegetation phenology Point data Highly variable Y Y
3 Apps (Whale, Koala)
Point data Highly variable 4,6 QL, R
3 Grasshoppers iRecord
Point data Highly variable 4,6 QL, R
1,2,4,5 P, QN Y
1 UK Environmental change
network 2,4,5
1, 3 AUSRIVAS 1,2,4,5 P, QN Y
Anthropogenic activities and
1 Marine parks, and levels
One off 4,5 P, QN Y Y Y Y Y
1 Fragmentation 1 m–10 km 1 year 2,4,5 P, QL
human-induced changes. Similarly, many data sources are now
able to identify potential impacts from land-use change
(Table 2) and include land use and vegetation mapping over
large spatial scales (Table 3). Furthermore, these human-
induced changes need to be considered in relation to climate
change, and shifting baselines may create challenges for
assigning a reference state. The extensive datasets now available
to describe long-term trends in ecological condition (e.g. sat-
ellite data on sea surface temperature and chlorophyll-a, map-
ping of vegetation type and habitat loss together with ongoing
biodiversity surveys; Tables 2, 3) will be crucial for interpreting
impacts, for example, as a consequence of habitat shifts induced
by climate warming.
Earth observation technologies: data opportunities,
challenges, and needs
Most earth observation techniques provide information on drivers
and sometimes state. They range from global to sub-metre
in spatial resolution and span time scales of more than three decades
at daily to annual resolution. These data streams are too numerous to
list exhaustively, and here we provide examples of data types
and streams that are available or becoming available in the
near future (Table 3). The liberalisation of satellite data policy
(e.g. Landsat archive) and the advent of new high-performance
computing capabilities (e.g. the National Computational Infra-
structure, home to the southern hemisphere’s fastest super-
computer and file systems,, accessed 7 August
2015) enable us to systematically process high volumes of
geospatial data across large scales (Tulbure and Broich 2013).
The trends in environmental applications for big data have
led to the development and usage of open-data repositories
(e.g. DataONE:, accessed 7 August
2015), and usage of open-source software for data analysis
(rOpenSci initiative:, accessed 7 August
2015). Moreover, emerging technologies (e.g. drones) and
new satellites, sensors for characterising broad classes of phy-
toplankton (e.g. flow cytometry in the marine environment:
specifications-aoa/, accessed 7 August 2015), and processing
platforms (European Space Agency’s Sentinels, NASA’s SWOT,
Tandem-X and -L) will continue to expand our capabilities to
monitor (observe, analyse and quantify) earth-systems dynamics
in the bio-, litho-, hydro- and cryosphere. The increasingly higher
spatial and temporal resolution provided by earth observation
technologies will provide biological, physical and chemical
information at scales required to improve biological monitoring
and assessment and the discrimination of multiple stressors.
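To give a sense of the data volumes implied by these resolutions, the following minimal sketch estimates how many single-band pixel observations per year a sensor delivers over a region, using the nominal spatial resolutions and revisit intervals listed for Landsat and MODIS in Table 3. The catchment area and the function name are illustrative assumptions; cloud cover, swath overlap and multiple spectral bands are ignored.

```python
def observations_per_year(area_km2, pixel_size_m, revisit_days):
    """Approximate single-band pixel observations per year over a region."""
    pixels = area_km2 * 1_000_000 / (pixel_size_m ** 2)  # km^2 -> m^2, then pixels
    revisits = 365 / revisit_days
    return pixels * revisits


# Nominal values from Table 3 (MODIS at its finest listed resolution, 250 m)
sensors = {
    "Landsat": {"pixel_size_m": 30, "revisit_days": 16},
    "MODIS": {"pixel_size_m": 250, "revisit_days": 1},
}

# Hypothetical catchment of 10 000 km^2 (assumption for illustration only)
area_km2 = 10_000
estimates = {
    name: observations_per_year(area_km2, **spec) for name, spec in sensors.items()
}
for name, n in estimates.items():
    print(f"{name}: about {n:.2e} pixel observations per year")
```

Even for a single band and a modest catchment, both sensors yield tens to hundreds of millions of observations per year, which is why systematic processing depends on high-performance computing.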
The requirements for spatial and temporal resolution of data
will differ depending on assessment purpose (e.g. annual pre-
dictions for large bioregions or daily predictions for an estuary)
and whether they are intended for input to models or as proxies
of state. Drivers include, for example, human activities (industry
activity, development, urbanisation) and climate events
(drought, storms), and require information on spatial intensity
(e.g. of industry activity) and temporal duration of change in
state. For example, baseline geology does not change signifi-
cantly on a temporal scale of seasons and years, but land use
changes seasonally for agriculture, and rainfall and river levels
1 Protected areas
1 Groundwater mapping Point data 1 month–1 year 4 P, QN Y
1 Irrigation networks 10 m–10 km 1 month–1 year 4 P, QN Y Y
1 Stormwater networks 1 m–1 km 1 month–1 year 4 P, QN Y Y Y
1 Shipping traffic, volumes or
1 km–1000 km 1 month–1 year 4 P, QN Y Y Y Y Y
1 Ballast waters 1 km–1000 km 1 month–1 year 4 P, QN Y Y
1 Dredging 1 km–1000 km 1 month–1 year 4 P, QN Y Y
1 Tourism 1 km–1000 km 1 month–1 year 4 P, QN Y Y Y Y Y
1 Hazardous substance
1 km–10000 km 1 month-1 year 4 P, QN Y Y Y Y Y
change from daily to weekly. Geospatial maps, remote sensing,
and monitoring networks can already provide much of these data
for regional geology, land cover and river levels respectively.
Websites like ‘’ increase the accessibility to water
and catchment-management-utility tools (databases and mod-
els) for broad datasets (e.g. climate, land use, soil hydrology,
stream flow, urban water, water balance, water quality).
We are well placed to make use of existing spatial and
temporal geospatial data for biological monitoring, but there
are still data challenges, which include: (1) availability of
metadata; (2) data ‘openness’ in terms of data portals; and (3)
public access and ownership of the data. Large amounts of
high-value, distributed information are collected by individual
researchers, but this information needs to be shared before its
collective power can be harnessed (Hampton et al. 2013).
Furthermore, in many parts of the world, big data are proprietary;
data are held in different formats and in different
repositories that are not necessarily compatible with one another;
and existing data (e.g. Landsat) still need pre-processing
before being readily usable. Many existing datasets are proxies
rather than direct measurements of the process of interest
(e.g. land cover as a proxy for the spatial scale of human activity
and related drivers). Monitoring of the state of the environment
(i.e. ecosystem health) is also not yet feasible at the optimal
resolution for desired temporal and spatial scales. In the context
of ecosystem risk assessment, there are few geospatial datasets
that provide both structural and functional biological measure-
ments and also few that provide the high frequency needed for
chemical measurements (e.g. pesticides). As a result, ecosystem
risk assessment remains largely reliant on proxies rather than
direct measurements.
Regarding chemical stressors, routine monitoring of the
>10 000 man-made industrial and household chemicals is not
feasible in many countries. However, improved use of proxies
(e.g. basing loads of pesticides on agriculture intensity, or heavy
metals on industry discharge data) and improvements in pas-
sive-sampling techniques to provide time-integrated concen-
trations of all major classes of chemical stressors (e.g. Vrana
et al. 2005) may increasingly provide suitable model inputs for
chemical stressors. Improvements in data networks already have
the potential to provide real-time information on habitat stres-
sors such as river-water levels as well as physical–chemical
stressors such as suspended solids concentrations and water
quality (electrical conductivity, pH, dissolved oxygen etc.).
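The time-integrated concentrations obtained from passive samplers can be illustrated with a short calculation. For an integrative sampler operating in the linear uptake phase, the time-weighted average (TWA) water concentration is commonly estimated as the accumulated analyte mass divided by the product of the sampling rate and the deployment time. The sketch below assumes that simple model; the function name and the numerical values are hypothetical.

```python
def twa_concentration(mass_ng, sampling_rate_l_per_day, days_deployed):
    """Time-weighted average concentration (ng/L), assuming linear uptake."""
    if sampling_rate_l_per_day <= 0 or days_deployed <= 0:
        raise ValueError("sampling rate and deployment time must be positive")
    return mass_ng / (sampling_rate_l_per_day * days_deployed)


# Hypothetical deployment: 84 ng accumulated over 28 days at 0.2 L/day
c_twa = twa_concentration(mass_ng=84.0, sampling_rate_l_per_day=0.2, days_deployed=28)
print(f"TWA concentration: {c_twa:.1f} ng/L")
```

In practice the sampling rate is compound- and condition-specific and must be calibrated, so a value such as the one used here should not be transferred between studies.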
Continuous or remotely sensed environmental data are a key
requirement for improved stressor-response modelling and
(with or without telemetry) early detection capability. Continuous
(in situ) or remotely sensed stressor and response measures
or surrogates include at least electrical conductivity, water
temperature, turbidity, chlorophyll a and dissolved oxygen (DO). Continuous or
remote-monitoring techniques for specific contaminants are
also improving, but remain limited in coverage and by often inadequate
detection limits (e.g. Vrana et al. 2005; Dong et al. 2015).
To couple with earth observation technologies, inputs to
models require greater use of proxies (e.g. changes in land use
providing a proxy for changes in pesticide inputs, or
metal-industry type and density providing a proxy for metal-contaminant inputs).
Stressor and state vary more temporally and spatially than the
drivers; however, most data are snapshots in space and time,
which reflect ‘here’ and ‘now’. Thus, currently the assumptions
based on best professional judgement often dictate the outcomes
when using these data to extrapolate models to larger temporal
and spatial scales.
Observation technologies for monitoring
stressors (e.g. contaminants) and more detailed aspects of
state (e.g. biodiversity, abundance) are far fewer than those for
monitoring the drivers. Remote sensing of key wetland-plant
species, forms and community types is a notable exception for
habitat and biodiversity assessment (e.g. Adam et al. 2010).
Currently there is a strong reliance on local datasets with small
coverage of species. However, improvements in image-analysis
systems, often associated with videography and passive acoustic
monitoring, are expected to improve monitoring of larger organisms
(e.g. birds (LaRue et al. 2014) and even marine mammals
(Fujioka et al. 2014)). Next-generation sequencing (NGS,
‘omics’, ecogenomics techniques) approaches are also increasing
the availability of biological information for monitoring
programs, and may lead to semi-automated biodiversity
monitoring in some environments (e.g. DNA metabarcoding,
where multiple species in a bulk sample can be identified
from modern or ancient DNA in an automated fashion; Taberlet
et al. 2012). Here we present an overview of molecular tools
for bioassessment, but for a detailed discussion of emerging
ecogenomic technologies and analyses please see Chariton et al.
(2015) in this volume.
Molecular tools: data opportunities, challenges, and needs
The rapid development of molecular techniques for taxonomic
identification, and the advent of next-generation DNA or RNA
sequencing (NGS) technologies, along with associated
advancement in data generation and analysis methods, have
made molecular analyses both fast and cost-effective (Shokralla
et al. 2012). Thus, an opportunity now exists to develop a
for biomonitoring at all scales, which is both time and
cost-efficient. Recent studies have used NGS approaches to
study biodiversity in various habitats and taxonomic groups,
from all domains of life (Hajibabaei 2007). The potential to use
these technologies to extract species-level information on key
bioindicator insect species from aquatic-biomonitoring samples
was demonstrated using a combination of cytochrome c oxidase
subunit I (COI) DNA barcodes linked to a locally generated barcode
reference library (Hajibabaei et al. 2011).
The advent of NGS tools, which can extract information
rapidly and cost-effectively from environmental samples, offers
the promise of a solution to the problems associated with the
time-consuming sorting and identification of organisms, and the
workflow bottleneck it presents for aquatic-monitoring pro-
grams. Moreover, the data generated from molecular methods
have new properties (Baird and Hajibabaei 2012). They contain
a mix of ‘named’ operational taxonomic units (OTUs) (i.e. those
DNA sequences that can be linked to a Linnean taxonomic name
from a relevant database such as GenBank or the Barcode of
Life Database) and ‘unnamed’ OTUs (i.e. DNA sequences that
can be placed in a phylogenetic context but have not previously
been deposited in a database) (Baird and Hajibabaei 2012).
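The distinction between ‘named’ and ‘unnamed’ OTUs can be sketched as a lookup against a reference library. In the minimal example below, an OTU is treated as named when its best match in the library meets an identity threshold; the threshold value, the toy sequences, the taxon labels and the function names are all hypothetical, and the sequences are assumed to be pre-aligned to equal length. Real workflows rely on alignment-based searches against databases such as GenBank or the Barcode of Life Database.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def label_otus(otus, reference, threshold=0.9):
    """Split OTUs into 'named' (best reference hit >= threshold) and 'unnamed'."""
    named, unnamed = {}, []
    for otu_id, seq in otus.items():
        best_name, best_score = None, 0.0
        for name, ref_seq in reference.items():
            score = identity(seq, ref_seq)
            if score > best_score:
                best_name, best_score = name, score
        if best_score >= threshold:
            named[otu_id] = best_name
        else:
            unnamed.append(otu_id)
    return named, unnamed


# Toy reference library and OTU sequences (hypothetical, 20 bp each)
reference = {
    "Taxon A": "ACGTACGTACGTACGTACGT",
    "Taxon B": "TTGCATTGCATTGCATTGCA",
}
otus = {
    "OTU1": "ACGTACGTACGTACGTACGA",  # one mismatch against Taxon A
    "OTU2": "GGGGGCCCCCGGGGGCCCCC",  # no close match in the library
}
named, unnamed = label_otus(otus, reference)
print("named:", named)
print("unnamed:", unnamed)
```

Unnamed OTUs are not discarded in this scheme; they remain available for phylogenetic placement and for later naming as reference libraries grow.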
DNA-based information that could be extracted from biomonitoring
samples offers a substantial improvement in the immediacy, accuracy
and quantity of information that can be obtained, potentially
without sacrificing current biomonitoring infrastructure investment
(Baird and Hajibabaei 2012).
How to deal with these new sources and types of information
within the biomonitoring context presents a major challenge,
which we discuss briefly here (but see Chariton et al. 2015). The
large datasets that are generated by NGS create a disparity
between the small sample sizes and large numbers of measure-
ments, which may increase the false detection (Type 1 error)
rate. A diverse range of tools exists to address these statistical
difficulties, including the commonly used false discovery rate
(FDR) corrections, which compare ranked P values against progressively relaxed Bonferroni-type thresholds
(Benjamini and Hochberg 1995; Waite and Campbell 2006).
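As an illustration of how such a correction operates, the sketch below implements the step-up procedure of Benjamini and Hochberg (1995) alongside a plain Bonferroni correction for comparison. The P values are invented for the example and the function names are ours.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Reject only P values at or below alpha divided by the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Hypothetical P values from six tests (e.g. six OTUs against one stressor)
p_values = [0.001, 0.008, 0.012, 0.041, 0.20, 0.74]
bh = benjamini_hochberg(p_values)
bf = bonferroni(p_values)
print("Benjamini-Hochberg rejections:", sum(bh))
print("Bonferroni rejections:", sum(bf))
```

With many thousands of OTUs tested at once, the rank-dependent threshold retains more true responses than a single Bonferroni cut-off while still controlling the expected proportion of false discoveries.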
Post-sampling strategies for molecular data to assign taxonomic
information to sequences are also diverse and the attempts to
standardise processing and analyses are challenging because
of the rapid development of this field. A range of processing
software packages exists to remove duplicate sequences intro-
duced during the PCR amplification process, to cluster
sequences based on similarity and more accurately estimate
phylogenetic composition. Denoising and chimera removal are
additional steps to correct problems in sequence data and
improve biological accuracy. These different approaches to
preparing the raw sequences will affect diversity estimates to
some extent, but the overall trends, which explain relationships
with drivers or stressors, remain the same (Morgan et al. 2013).
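Two of the processing steps described above, removing duplicate sequences and clustering by similarity, can be sketched in a few lines. This is a deliberately simplified illustration, not the algorithm of any particular software package: reads are assumed to be pre-aligned to equal length, similarity is a simple position-wise identity, and the threshold and toy reads are hypothetical.

```python
from collections import Counter


def dereplicate(reads):
    """Collapse identical reads; return (sequence, count) pairs, most abundant first."""
    return Counter(reads).most_common()


def identity(a, b):
    """Fraction of matching positions between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(unique_reads, threshold=0.9):
    """Assign each unique read to the first centroid it matches, else start a new cluster."""
    clusters = []  # each entry: {"centroid": sequence, "size": total reads}
    for seq, count in unique_reads:
        for cluster in clusters:
            if identity(seq, cluster["centroid"]) >= threshold:
                cluster["size"] += count
                break
        else:
            # No existing centroid is similar enough, so this read seeds a new cluster
            clusters.append({"centroid": seq, "size": count})
    return clusters


# Toy reads (hypothetical, 20 bp each)
reads = (
    ["ACGTACGTACGTACGTACGT"] * 5
    + ["ACGTACGTACGTACGTACGA"] * 2  # one mismatch from the first sequence
    + ["TTGCATTGCATTGCATTGCA"] * 3
)
unique_reads = dereplicate(reads)
clusters = greedy_cluster(unique_reads)
for c in clusters:
    print(c["centroid"], c["size"])
```

Because abundant sequences are processed first and become centroids, changing the threshold or the processing order changes the number of clusters, which is one reason diversity estimates differ between pipelines.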
Furthermore, many of the problems that exist for processing
molecular data are somewhat ephemeral as platforms rapidly
evolve. This ephemerality points to other drawbacks of using
rapidly developing technologies: the potential difficulty in
comparing across studies and the potential for longer-term
redundancy. When different studies use alternative marker
genes or primer sets to generate amplicons, meta-analyses must
be carried out based on either existing taxonomy (likely of
a fairly low resolution), or by potentially time-consuming
re-analysis of data against a curated database containing all
required elements (e.g. full-length 18S rRNA gene sequences,
which cover different variable regions used in different studies).
However, such databases are rapidly being generated (e.g. SILVA,
BOLD) and the computational requirements for re-analysis
are being continually improved.
Work has begun on molecular freshwater biomonitoring
(Hajibabaei et al. 2011; Pilgrim et al. 2011; Carew et al.
2013), integrating these new approaches with existing frame-
works; their application in large-scale monitoring is the future
for bioassessment. Other studies in aquatic systems explored the
use of NGS to estimate biodiversity in marine meiofaunal
communities (Creer et al. 2010) and similar approaches were
used on marine sediments, in order to study the response of
Australian coastal biota to contaminants (Chariton et al. 2010;
Sun et al. 2012; Sun et al. 2013; Chariton et al. 2014; Dafforn
et al. 2014). Another Australian study involving chironomid
midges (Chironomidae) demonstrated that most of these species
could be identified by using COI and Cytb amplicon sequences
that were matched against DNA reference databases of Sanger
sequenced voucher species (Carew et al. 2013). In the context
of international river biomonitoring, molecular analysis suc-
cessfully obtained species-level information from samples
to compare patterns of taxon occurrences in urbanised and
conservation landscapes (Hajibabaei et al. 2011). These studies
demonstrate the potential to extract rich biodiversity data from
biomonitoring samples using NGS technologies.
The methods for biomonitoring of stream condition have
struggled to meet the need to diagnose the causes of
poor condition. Environmental managers require objective
methods to determine why stream condition is degraded, to
enable appropriate remedial actions. New developments in
diagnostic-biomonitoring indices, which are based on the inver-
tebrate traits (not just their taxonomic identity), are promising
(Schäfer et al. 2011). Such diagnostic indices may require
taxonomic identification at the lowest taxonomic unit possible.
This is because of variation in the tolerance to stressors exhib-
ited by various taxa within the coarser groupings, such as
taxonomic families. Molecular techniques could generate the
species-level information required to deliver timely break-
throughs in terms of diagnostic assessment or the large-scale
adoption of diagnostic tools (Jones 2008; Baird and Hajibabaei
2012). DNA techniques and the increase in data generated could
increase our ability to diagnose factors degrading aquatic
ecosystems.
Basing aquatic ecosystem management purely on biomoni-
toring data, i.e. the present practice, limits inference on stressor
effects on ecosystem services (Tolonen et al. 2014). There has
been an increasing emphasis on using ecological-trait analyses
to determine which stressors are influencing the functioning of
aquatic communities (Schäfer et al. 2011). Multi-taxa responses
to human stressors (Johnston et al. 2008; Johnston et al. 2009)
are required to assess the likely functional consequences of
species loss. Future research may also reveal functional genes
associated with suites of taxa, and thus directly assess ecosys-
tem-functional consequences using molecular analysis of bio-
monitoring data.
The ecological values associated with aquatic health are
based on maintenance of ecological integrity (including ecosys-
tem functioning). However, the links between data (usually
based on taxonomic richness) and ecosystem-function data
(based on functional traits) are not always demonstrated.
Molecular analysis provides information on a wide range of
biota, both living and dead. There is, therefore, an increased
amount of information available from molecular techniques,
which can allow us to distinguish responses over longer time
periods and greater spatial scales, as well as differentiating
between multiple stressors. Although the potential for molecular
approaches to sample living and dead organisms creates a useful
time-integrated response to stressors, these data may provide
challenges when interpreting a recent impact, as any lethal
effects may be masked. However, it is also possible to retrieve
only the signature of ‘active’ organisms using RNA approaches
rather than DNA.
Case study: using big-data sources to improve biomonitoring
As discussed above, there are two distinct approaches to
bioassessment programs: broad-scale monitoring, and targeted,
finer-scale, intervention monitoring. However, in some circumstances
the focus on one without the other will make it
difficult for managers to understand and tackle major management
issues such as the effects of climate change, significant
changes in land use on waterways (including intervention
programs) and the widespread effects of multiple stressors.
If only broad-scale monitoring were undertaken, the data may
be of insufficient detail to detect small changes resulting
from management actions at specific locations. Finer-scale
intervention monitoring alone may be effective for detecting
and assessing specific (especially point-source) human disturbances.
However, if such finer-scale monitoring were the only
means of data collection for broader land management issues, then
a lack of relevant long-term data across broader scales could
obscure shifts in baseline conditions. This, in turn, could obscure
responses to management actions, or fail to detect unforeseen
impacts of major stressors at those broader scales. Without broad-
scale, surveillance monitoring of aquatic conditions, any unex-
pected declines in river health may pass undetected and hinder
swift remediation. We examine some typical or high-profile
drivers and pressures upon riverine systems and monitoring
programs at finer and broad-scales (Fig. 5) to identify potential
improvements in approaches offered by ecogenomics and
earth observation and other remote-data-acquisition technologies.
Improving the habitat template for RIVPACS approaches
The reference-condition approach to biological assessment is
widely used in many countries to detect human-induced change
in freshwater systems. For example, in the United Kingdom
(RIVPACS; Wright et al. 2000), Australia (AUSRIVAS;
Simpson and Norris 2000), Canada (CABIN; Reynoldson et al.
2000; Armanini et al. 2013), and the United States (where a mix
of multi-metric and RIVPACS-type assessments are used;
Hawkins et al. 2010), RIVPACS-type approaches use models to
predict the site-specific benthic-invertebrate assemblages expected
at sites (i.e. the assemblage that should be there in the absence
of human-induced stress). New spatial tools are becoming
increasingly available, particularly GIS and remote sensing
tools, along with an array of catchment-scale map layers
describing attributes, such as geology, land use, vegetation type,
and climate (Frazier et al. 2012). These data offer alternative
approaches to identifying reference sites (Yates and Bailey
2010) and are sources of potential new predictor variables
(Armanini et al. 2013), which could potentially improve pre-
dictive ability of the models (Ostermiller and Hawkins 2004).
Armanini et al. (2013) constructed a similar RIVPACS-type
predictive model for Atlantic Canada that, instead of relying on
local, ground-measured habitat variables, used GIS-based
variables and still yielded a robust model.
Armanini et al. (2013) highlighted the advantages in this approach
for future large-scale implementation of river biomonitoring,
i.e. a standardised approach with global application.
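RIVPACS-type assessments are often summarised as an observed-to-expected (O/E) ratio, in which the expected assemblage is derived from the model-predicted probability of capturing each taxon at a site in the absence of stress. The sketch below assumes that formulation, with a capture-probability cut-off of 0.5 as an adjustable parameter; the taxa, probabilities and function name are hypothetical and do not come from any of the models cited above.

```python
def observed_over_expected(capture_probs, observed_taxa, p_min=0.5):
    """O/E ratio over taxa whose predicted capture probability is >= p_min."""
    expected_taxa = {t: p for t, p in capture_probs.items() if p >= p_min}
    expected = sum(expected_taxa.values())
    if expected == 0:
        raise ValueError("no taxa meet the probability threshold")
    observed = sum(1 for t in expected_taxa if t in observed_taxa)
    return observed / expected


# Hypothetical model output for one test site (taxon -> probability of capture)
capture_probs = {
    "Baetidae": 0.9,
    "Leptophlebiidae": 0.8,
    "Gripopterygidae": 0.7,
    "Chironomidae": 0.6,
    "Elmidae": 0.3,  # below the cut-off, so excluded from both O and E
}
observed_taxa = {"Baetidae", "Chironomidae", "Elmidae"}
oe = observed_over_expected(capture_probs, observed_taxa)
print(f"O/E = {oe:.2f}")
```

Values near 1 indicate that the observed assemblage resembles the predicted reference assemblage, whereas lower values suggest taxon loss. Replacing ground-measured predictors with GIS-derived ones changes how the capture probabilities are predicted, not how the ratio is computed.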
Broad-scale monitoring
Improving empirical models
Successful strategies for sustainable water use will depend on
our ability to describe long-term trends and responses to stressors
in natural ecological systems in a cost-effective yet rigorous
way. A major challenge is assessing the level of environmental
change in natural systems before these transformations irre-
versibly change fundamental ecological processes. Broad-scale
assessments that use macroinvertebrate data, such as RIVPACS-
type approaches, may trade-off attributes of sensitivity (to
impact detection) for spatial coverage (ANZECC/ARMCANZ
2000). This may be exacerbated if spatial coverage of reference
[Fig. 5 schematic: axes are spatial scale and temporal scale of monitoring (single, months, years, decades, centuries); plotted elements are BACI studies; point source impacts; CSG, large coal mines and AMD; drought impacts; landuse change; climate change impacts; long term studies; targeted site-region-specific monitoring; and national monitoring sentinel reference sites.]
Fig. 5. Spatial-temporal scales at which monitoring should be conducted to distinguish the effects of typical or
high-profile drivers and pressures. Adapted from S. J. Nichols, L. A. Barmuta, B. C. Chessman, P. E. Davies,
F. J. Dyer, E. T. Harrison, C. P. Hawkins, I. Jones, B. J. Kefford, S. Linke, R. Marchant, L. Metzeling, K. Moon,
R. Ogden, M. Peat, T. B. Reynoldson and Ross M. Thompson, unpubl. data.
sites used to develop the predictive models was not adequate to
capture the variability in the ‘reference condition’. For example,
jurisdictional models may encompass enormous geographical
areas with significant climate variation (e.g. in Australia, the
entire states of Western Australia and Queensland, encom-
passing tropical and temperate climes). If the resolution of
sampling sites does not capture these variations, the implica-
tions for models may be that test sites are not correctly matched
to a reference condition, thus hindering the sensitivity to detect
impact. Thus, accuracy in broad-scale assessments can poten-
tially be improved with greater density of reference sites in
datasets for model development. Currently, obtaining such data
involves expense in accessing multiple sites for sampling as well
as the time-consuming process of sorting and identifying the
invertebrates from the sample matrix collected at each reference
site. Together, these costs can constrain timely water quality
assessment (Haase et al. 2010).
Increasingly, it is becoming difficult to obtain the taxonomic
expertise required to accurately identify freshwater or marine
macroinvertebrate taxa, particularly to species level (Buyck
1999). Traditional taxonomic approaches can also introduce
errors in the sample sorting and identification phase that affect
data quality. Potential sources of error can include: failure to
remove all desired organisms in a sample or sub-sample and bias
in the selection of specimens from the sample (Humphrey et al.
2000); inability to identify damaged specimens; and errors in
species identifications (a particular problem with early instars).
Most bioassessment programs have developed quality-control
procedures to minimise these sources of error, but such procedures
cannot eliminate them and they add to program cost. In freshwater
bioassessment, the costs are balanced against the need for sufficient
information to distinguish impaired sites (Jones 2008), but because of
the need to minimise bioassessment costs and to expedite the process,
only a sub-set of the collected sample is used and taxonomic
identification is further restricted to coarse (e.g. family-level)
resolution. There is an urgent need to develop sensitive,
cost-effective, robust bioassessment sampling and processing methods
that can extract sufficient information for bioassessment and enable
the advancement of diagnostic assessment tools. Traditional
morphological taxonomy cannot generate the timely and robust data
required for new breakthroughs in bioassessment techniques
(Baird and Hajibabaei 2012; Dafforn et al. 2014).
Development of sensitive, more cost-effective, robust bioas-
sessment sampling and processing methods may also help in
improving model performance by removing socio-political
constraints impeding a common approach to regional model
development, for example:
(1) Model boundaries should be based on natural physio-
graphic, bioregional or drainage boundaries.
(2) Within the improved model boundaries described in (1),
standardising methods across jurisdictions would enhance the
ability to combine datasets and extend models for reliable,
broader State of Environment assessments and reporting.
Such standardisation would also reduce inter-operability
biases and errors at various steps in the protocol including,
for example, taxonomic identifications. Standardisation
could also consider not only adopting a more common
predictor-variable suite but also reducing reliance on
on-ground measured variables, which are more prone to
subjectivity in selection and measurement.
Broad-scale bioassessment application in large countries
such as Australia and Canada brings with it remoteness of
sampling sites and access problems for extended periods
of the year. Any capacity to reduce expensive access costs
(e.g. helicopter use) and time in the field will result in more sites
sampled (i.e. increase reference-site density, see above). This
may be achieved through remotely acquired proxies for ground-
measured environmental variables and by removing on-ground
processing of biological samples.
RIVPACS-type predictive models provide a powerful plat-
form to demonstrate improved performance through the refine-
ments offered by the new science tools listed above (Table 3).
Many of the weaknesses identified above may potentially be
redressed.
Improving the biological response measures
Next-generation molecular sequencing technologies, with
associated new data generation and analysis methods, are rap-
idly developing for taxonomic identifications. The potential to
use these technologies to extract species-level information on
key freshwater biomonitoring species from samples has been
demonstrated (Baird and Hajibabaei 2012). Potential applications
that redress the limitations identified above include:
(i) Rapidity of data generation.
(ii) Generation of local barcode reference libraries.
(iii) Removal of errors associated with traditional taxonomic
identifications based on morphology.
(iv) Generation of highly sensitive species-level information
from multiple taxonomic groups, with improved diagnostic
value. This includes a basis for trait development.
(v) Applicability to impact and conservation assessments,
including detection of cryptic invasive biota.
(vi) Identification of functional genes.
(vii) Cost-effective, systematic monitoring of test sites and
sentinel or reference sites over time to assess effects of
broad-scale land-use changes and climate change, as
described above.
(viii) The collection of biological data sets on a scale to match
that of environmental data.
Demonstration programs for potential improvements to
RIVPACS or AUSRIVAS models are now feasible, based on
the approaches described above.
Improving databases defining drivers (human actions),
pressures, stressors and state (habitat)
All impact, conservation and biodiversity assessments con-
ducted at broad scales will benefit from the availability of
continuous or remotely sensed environmental data and biolog-
ical data collected at a similar scale. Driver-Pressure-State
measures derived from the same databases should ensure that
test-site assessments more reliably and accurately reflect human
agents of change. Further, a move from spot measurements of
water physical–chemical variables in particular, to continuous
or remotely sensed measures or surrogates, combined with
discharge or water-availability data, will invariably lead to new
or improved empirical water-quality models. Accuracy and
flexibility in water-quality objective setting, when this process
is based upon characteristics of the reference condition
(e.g. ANZECC/ARMCANZ 2000), are markedly improved
when water-quality variables vary naturally and in predictable
ways with discharge.
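To illustrate how a predictable relationship with discharge could feed into objective setting, the sketch below fits a log-log concentration-discharge relationship to reference-site data and uses it to estimate the value expected at a given flow. This is one simple, assumed form of such an empirical model, not a method described in the paper; the function names and all data values are hypothetical.

```python
import math


def fit_log_log(discharge, concentration):
    """Least-squares fit of log10(C) = intercept + slope * log10(Q)."""
    xs = [math.log10(q) for q in discharge]
    ys = [math.log10(c) for c in concentration]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("discharge values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def expected_concentration(q, intercept, slope):
    """Concentration expected under reference condition at discharge q."""
    return 10 ** (intercept + slope * math.log10(q))


# Hypothetical reference-site data: a solute that dilutes as flow increases.
reference_q = [1, 10, 100, 1000]   # discharge, arbitrary units
reference_c = [200, 100, 50, 25]   # concentration, arbitrary units
intercept, slope = fit_log_log(reference_q, reference_c)

# A test-site reading can then be judged against the flow-adjusted expectation
# rather than against a single fixed value.
observed = 180
expected = expected_concentration(10, intercept, slope)
print(round(slope, 3), round(expected, 1), round(observed / expected, 2))
```

The design choice here is to compare observations with a flow-specific expectation, so that natural dilution or concentration with discharge is not mistaken for impact.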
Early detection capabilities are also enhanced with new
technologies and data collected at the broad scale, which can also
help to overcome problems of accessing remote areas. For
example, the improved stressor-response relationships described above
(empirical or conceptual models) may be combined with
remotely sensed or telemetered data (e.g. water quality) to identify
emerging stressor hot-spots that require more detailed study.
Similarly, using molecular data, new biological invasions pos-
ing ongoing threats to ecosystems can be identified at an early
stage, before significant spread and before control becomes
expensive or impossible.
Finer-scale intervention monitoring
Many of the improvements in approach offered by ecogenomics
and earth observation and other remote-data-acquisition tech-
nologies identified above, including those for broad-scale
assessments, equally apply at the finer-scale. For biological-
response measures where species-level data for many taxo-
nomic groups become available, there remain challenges in
acquiring relative abundance information from new molecular-
sequencing technologies. Overcoming such challenges may
provide additional information to further improve impact detec-
tion, and better inform management decisions.
Fine-scale intervention monitoring may be better placed for
the application of technologies and approaches, which are
currently prohibitively expensive to consider at the broad-scale.
Trialling of the methods, including deployment, testing, calibration
and modelling, may lead to new or improved conceptual and
empirical models, which may be scaled-up for broader-scale
consideration and assessment. These techniques may include
continuous measurements from sondes, passive samplers and
on-line monitoring of a variety of physical (e.g. dissolved oxygen
(DO), salinity) and chemical stressors. New unmanned aerial vehicle (UAV), drone,
transmitter and camera technologies (e.g. Turner 2014) might
also be better tested at such finer scales. These methods provide
greater resolution in some key variables including habitat
characterisation and may be powerful tools in monitoring the
success of local restoration efforts. Early detection capability is
also enhanced at this spatial scale. For example, with improved
stressor-response relationships using new technologies and data
sources, combined with telemetry, it is possible to determine
when limits (e.g. water-quality limits) are reached in compliance
and licence settings.
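As a minimal sketch of how telemetered data could flag when a limit is reached, the function below scans a continuous series for the first sustained run of readings above a limit. Requiring several consecutive exceedances is an assumption made here to avoid reacting to single sensor spikes; the function name, the limit and the readings are hypothetical.

```python
def first_sustained_exceedance(readings, limit, min_consecutive=3):
    """Return the index where the first sustained exceedance begins.

    readings: sequence of telemetered values in time order.
    limit: upper limit for the variable (e.g. a licence condition).
    min_consecutive: number of consecutive readings above the limit
        needed before an exceedance is reported (an assumed rule).
    Returns None if no sustained exceedance occurs.
    """
    run_start = None
    run_length = 0
    for i, value in enumerate(readings):
        if value > limit:
            if run_length == 0:
                run_start = i
            run_length += 1
            if run_length >= min_consecutive:
                return run_start
        else:
            run_length = 0
    return None


# Hypothetical sonde series: one isolated spike, then a sustained exceedance.
series = [5, 12, 6, 11, 12, 13, 4]
alert_index = first_sustained_exceedance(series, limit=10)
print(alert_index)
```

For a variable with a lower limit, such as dissolved oxygen, the same logic applies with the comparison reversed.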
Biological monitoring and assessment have traditionally
implemented local-scale site collections to accumulate data and
information about pressures on the natural environment and
relate these to potential stressors. In the past, sampling efforts to
measure physical, chemical and ecological change in the envi-
ronment have been constrained by costs. The increasing
deployment of satellites and collection of large-scale geospatial
data are now allowing us to comprehensively consider a wider
range of drivers of biogeographical patterns (e.g. geology and
climate). Furthermore, biological data-streams have traditionally
been poor, but the advent of high-throughput genomic-sequencing
techniques now allows us to collect biological datasets
on a similar or greater scale to physical and chemical data.
Many important gaps and research areas still remain to be
explored by big data techniques. Remote sensing and molecular
tools provide large datasets of chemical, physical and now
biological measures for improved model calibration and predic-
tion. In future, they will also need to provide spatially and
temporally explicit information about functions and processes,
which thus far have been difficult to quantify over large spatial
and temporal scales, but are crucial building blocks in risk-
assessment models. Furthermore, environmental and ecological
changes increasingly need to be considered in the context
of global climate change. Hence, we require more extensive
datasets to describe long-term trends in ecological condition, so
that we can differentiate human impacts under shifting baselines.
Although proxies for pesticide and contaminant inputs (e.g. land use
and industry types) can provide useful information for risk
assessments, these assessments can be improved with more direct
measurements, and we call for increased use and development of tools
such as passive samplers, traits-based analysis and metabarcoding.
These direct measurements should be coupled with experimental studies
of interacting stressors to provide evidence for causal interactions.
Big-data tools will continue to evolve to address these
research needs, and strategies to tackle the challenges associated
with big-data sources are at the cutting edge of research. Thus,
the unique opportunities provided by emerging molecular tools
and remote-sensing technologies seem likely to revolutionise
biological monitoring and assessment in the future.
Acknowledgements
The authors thank the CSIRO’s Oceans and Atmosphere Flagship, Sydney
Institute of Marine Science and the New South Wales Office of Environment
and Heritage for their financial support of the workshop ‘New diagnostics
for multiply stressed marine and freshwater systems: integrating models,
ecoinformatics and Big Data’. D. J. Baird acknowledges support from the
Natural Sciences and Engineering Research Council of Canada’s Discovery
Grant program. This paper represents SIMS Publication Number 168.
References
Adam, E., Mutanga, O., and Rugege, D. (2010). Multispectral and hyper-
spectral remote sensing for identification and mapping of wetland
vegetation: a review. Wetlands Ecology and Management 18, 281–296.
Allan, J. D., McIntyre, P. B., Smith, S. D. P., Halpern, B. S., Boyer, G. L.,
Buchsbaum, A., Burton, G. A., Campbell, L. M., Chadderton, W. L., and
Ciborowski, J. J. H. (2013). Joint analysis of stressors and ecosystem
services to enhance restoration effectiveness. Proceedings of the National
Academy of Sciences of the United States of America 110, 372–377.
ANZECC/ARMCANZ (2000). Australian and New Zealand Guidelines
for Freshwater and Marine Water Quality. (Australian and New
Zealand Environment and Conservation Council and Agriculture
and Resource Management of Australia and New Zealand: Canberra,
Armanini, D. G., Monk, W. A., Carter, L., Cote, D., and Baird, D. J. (2013).
Towards generalised reference condition models for environmental
assessment: a case study on rivers in Atlantic Canada. Environmental
Monitoring and Assessment 185, 6247–6259. doi:10.1007/S10661-012-
Ayre, K. K., and Landis, W. G. (2012). A Bayesian approach to landscape
ecological risk assessment applied to the Upper Grande Ronde Water-
shed, Oregon. Human and Ecological Risk Assessment: An International
Journal 18, 946–970. doi:10.1080/10807039.2012.707925
Baird, D. J., and Hajibabaei, M. (2012). Biomonitoring 2.0: a new paradigm
in ecosystem assessment made possible by next-generation DNA
sequencing. Molecular Ecology 21, 2039–2044. doi:10.1111/J.1365-
Baird, D. J., Van den Brink, P. J., Chariton, A. A., Dafforn, K. A., and
Johnston, E. L. (2015). New diagnostics for multiply stressed marine and
freshwater ecosystems: integrating models, ecoinformatics and big data.
Marine and Freshwater Research, in press. doi:10.1071/MF15330
Baldwin, D. S., Colloff, M. J., Rees, G. N., Chariton, A. A., Watson, G. O.,
Court, L. N., Hartley, D. M., King, A. J., Wilson, J. S., and Hodda, M.
(2013). Impacts of inundation and drought on eukaryote biodiversity
in semi-arid floodplain soils. Molecular Ecology 22, 1746–1758.
Ban, N. C., Alidina, H. M., and Ardron, J. A. (2010). Cumulative impact
mapping: advances, relevance and limitations to marine management
and conservation, using Canada’s Pacific waters as a case study. Marine
Policy 34, 876–886. doi:10.1016/J.MARPOL.2010.01.010
Bayliss, P., van Dam, R. A., and Bartolo, R. E. (2012). Quantitative
ecological risk assessment of the Magela Creek Floodplain in Kakadu
National Park, Australia: comparing point source risks from the Ranger
Uranium Mine to diffuse landscape-scale risks. Human and Ecological
Risk Assessment: An International Journal 18, 115–151. doi:10.1080/
Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery
rate: a practical and powerful approach to multiple testing. Journal of the
Royal Statistical Society B. Methodological 57, 289–300.
Borja, A., Bricker, S. B., Dauer, D. M., Demetriades, N. T., Ferreira, J. G.,
Forbes, A. T., Hutchings, P., Jia, X., Kenchington, R., Marques, J. C., and
Zhu, C. (2008). Overview of integrative tools and methods in assessing
ecological integrity in estuarine and coastal systems worldwide. Marine
Pollution Bulletin 56, 1519–1537. doi:10.1016/J.MARPOLBUL.2008.
Borja, A., Bald, J., Franco, J., Larreta, J., Muxika, I., Revilla, M., Rodríguez,
J. G., Solaun, O., Uriarte, A., and Valencia, V. (2009). Using multiple
ecosystem components, in assessing ecological status in Spanish
(Basque Country) Atlantic marine waters. Marine Pollution Bulletin
59, 54–64. doi:10.1016/J.MARPOLBUL.2008.11.010
Bowler, D. E., and Benton, T. G. (2005). Causes and consequences of animal
dispersal strategies: relating individual behaviour to spatial dynamics.
Biological Reviews of the Cambridge Philosophical Society 80,
205–225. doi:10.1017/S1464793104006645
Burton, G. A., and Johnston, E. L. (2010). Assessing contaminated sedi-
ments in the context of multiple stressors. Environmental Toxicology
and Chemistry 29, 2625–2643. doi:10.1002/ETC.332
Buyck, B. (1999). Taxonomists are an endangered species in Europe. Nature
401, 321. doi:10.1038/43762
Carew, M. E., Pettigrove, V. J., Metzeling, L., and Hoffmann, A. A. (2013).
Environmental monitoring using next generation sequencing: rapid
identification of macroinvertebrate bioindicator species. Frontiers in
Zoology 10, 45. doi:10.1186/1742-9994-10-45
Chariton, A. A., Court, L. N., Hartley, D. M., Colloff, M. J., and Hardy, C. M.
(2010). Ecological assessment of estuarine sediments by pyrosequen-
cing eukaryotic ribosomal DNA. Frontiers in Ecology and the Environ-
ment 8, 233–238. doi:10.1890/090115
Chariton, A. A., Ho, K. T., Proestou, D., Bik, H., Simpson, S. L., Portis,
L. M., Cantwell, M. G., Baguley, J. G., Burgess, R. M., Pelletier, M. M.,
Perron, M., Gunsch, C., and Matthews, R. A. (2014). A molecular-based
approach for examining responses of eukaryotes in microcosms to
contaminant-spiked estuarine sediments. Environmental Toxicology
and Chemistry 33, 359–369. doi:10.1002/ETC.2450
Chariton, A., Sun, M. Y., Gibson, J., Webb, J. A., Leung, K. M. Y., Hickey,
C. W., and Hose, G. C. (2015). Emergent technologies and analytical
approaches for understanding the effects of multiple stressors in aquatic
environments. Marine and Freshwater Research, in press. doi:10.1071/
Clarke, R. T., Wright, J. F., and Furse, M. T. (2003). RIVPACS models for
predicting the expected macroinvertebrate fauna and assessing the
ecological quality of rivers. Ecological Modelling 160, 219–233.
Creer, S., Fonseca, V. G., Porazinska, D. L., Giblin-Davis, R. M., Sung, W.,
Power, D. M., Packer, M., Carvalho, G. R., Blaxter, M. L., and Lambs-
head, P. J. D. (2010). Ultrasequencing of the meiofaunal biosphere:
practice, pitfalls and promises. Molecular Ecology 19, 4–20. doi:10.1111/
Dafforn, K. A., Simpson, S. L., Kelaher, B. P., Clark, G. F., Komyakova, V.,
Wong, C. K. C., and Johnston, E. L. (2012). The challenge of choosing
environmental indicators of anthropogenic impacts in estuaries.
Environmental Pollution 163, 207–217. doi:10.1016/J.ENVPOL.
Dafforn, K. A., Baird, D., Chariton, A., Sun, M., Brown, M., Simpson, S.,
Kelaher, B., and Johnston, E. (2014). Chapter one faster, higher and
stronger? The pros and cons of molecular faunal data for assessing
ecosystem condition. Advances in Ecological Research 51, 1–40.
doi:10.1016/B978-0-08-099970-8.00003-8
Davy-Bowker, J., Clarke, R., Johnson, R., Kokes, J., Murphy, J., and
Zahrádková, S. (2006). A comparison of the European Water Framework
Directive physical typology and RIVPACS-type models as alternative
methods of establishing reference conditions for benthic macroinverte-
brates. In ‘The Ecological Status of European Rivers: Evaluation and
Intercalibration of Assessment Methods’. (Eds M. Furse, D. Hering,
K. Brabec, A. Buffagni, L. Sandin, and P. M. Verdonschot.) pp. 91–105.
(Springer: Dordrecht, Netherlands.)
Déry, S. J., Salomonson, V. V., Stieglitz, M., Hall, D. K., and Appel, I.
(2005). An approach to using snow areal depletion curves inferred from
MODIS and its application to land surface modelling in Alaska.
Hydrological Processes 19, 2755–2774. doi:10.1002/HYP.5784
Dong, Z., Lewis, C. G., Burgess, R. M., and Shine, J. P. (2015). The
Gellyfish: An in-situ equilibrium-based sampler for determining multi-
ple free metal ion concentrations in marine ecosystems. Environmental
Toxicology and Chemistry 34, 983–992. doi:10.1002/ETC.2893
Frazier, P., Ryder, D., McIntyre, E., and Stewart, M. (2012). Understanding
riverine habitat inundation patterns: remote sensing tools and techni-
ques. Wetlands 32, 225–237. doi:10.1007/S13157-011-0229-9
Fujioka, E., Soldevilla, M. S., Read, A. J., and Halpin, P. N. (2014). Integration
of passive acoustic monitoring data into OBIS-SEAMAP, a global bio-
geographic database, to advance spatially explicit ecological assessments.
Ecological Informatics 21, 59–73. doi:10.1016/J.ECOINF.2013.12.004
Glibert, P. M., Fullerton, D., Burkholder, J. M., Cornwell, J. C., and Kana, T. M.
(2011). Ecological stoichiometry, biogeochemical cycling, invasive
species, and aquatic food webs: San Francisco Estuary and comparative
systems. Reviews in Fisheries Science 19, 358–417. doi:
Haase, P., Pauls, S. U., Schindehütte, K., and Sundermann, A. (2010). First
audit of macroinvertebrate samples from an EU Water Framework
Directive monitoring program: human error greatly lowers precision of
assessment results. Journal of the North American Benthological Society
29, 1279–1291. doi:10.1899/09-183.1
Hajibabaei, M. (2007). The barcode of life initiative. Journal of Phycology
43, 20.
Hajibabaei, M., Shokralla, S., Zhou, X., Singer, G. A. C., and Baird, D. J.
(2011). Environmental barcoding: a next-generation sequencing
approach for biomonitoring applications using river benthos. PLoS
One 6, e17497. doi:10.1371/JOURNAL.PONE.0017497
Halpern, B. S., and Fujita, R. (2013). Assumptions, challenges, and
future directions in cumulative impact analysis. Ecosphere 4, art131.
Halpern, B. S., Walbridge, S., Selkoe, K. A., Kappel, C. V., Micheli, F.,
D’Agrosa, C., Bruno, J. F., Casey, K. S., Ebert, C., Fox, H. E., Fujita, R.,
Heinemann, D., Lenihan, H. S., Madin, E. M. P., Perry, M. T., Selig,
E. R., Spalding, M., Steneck, R., and Watson, R. (2008). A global map
of human impact on marine ecosystems. Science 319, 948–952.
Hampton, S. E., Strasser, C. A., Tewksbury, J. J., Gram, W. K., Budden, A.
E., Batcheller, A. L., Duke, C. S., and Porter, J. H. (2013). Big data and
the future of ecology. Frontiers in Ecology and the Environment 11,
156–162. doi:10.1890/120103
Harding, L. W. (1994). Long-term trends in the distribution of phytoplank-
ton in Chesapeake Bay: roles of light, nutrients and streamflow. Marine
Ecology Progress Series 104, 267–291. doi:10.3354/MEPS104267
Hargett, E. G., ZumBerge, J. R., Hawkins, C. P., and Olson, J. R. (2007).
Development of a RIVPACS-type predictive model for bioassessment of
wadeable streams in Wyoming. Ecological Indicators 7, 807–826.
Harris, G. P., and Heathwaite, A. L. (2012). Why is achieving good
ecological outcomes in rivers so difficult? Freshwater Biology 57,
91–107. doi:10.1111/J.1365-2427.2011.02640.X
Hawkins, C. P., Olson, J. R., and Hill, R. A. (2010). The reference condition:
predicting benchmarks for ecological and water-quality assessments.
Journal of the North American Benthological Society 29, 312–343.
Hilty, J., and Merenlender, A. (2000). Faunal indicator taxa selection for
monitoring ecosystem health. Biological Conservation 92, 185–197.
Humphrey, C., Storey, A., and Thurtell, L. (2000). AUSRIVAS: operator
sample processing errors and temporal variability implications for
model sensitivity. In ‘Proceedings of an International Workshop held in
Oxford, UK on 16th September 1997’. (Eds J. F. Wright, D. W. Sutcliffe,
and M. T. Furse.) Record number: 0900386622, pp. 143–163. (Freshwa-
ter Biological Association: Ambleside, UK.)
Johnston, C. A., Ghioca, D. M., Tulbure, M., Bedford, B. L., Bourdaghs, M.,
Frieswyk, C. B., Vaccaro, L., and Zedler, J. B. (2008). Partitioning
vegetation response to anthropogenic stress to develop multi-taxa
wetland indicators. Ecological Applications 18, 983–1001. doi:10.1890/
Johnston, C. A., Zedler, J. B., Tulbure, M. G., Frieswyk, C. B., Bedford,
B. L., and Vaccaro, L. (2009). A unifying approach for evaluating the
condition of wetland plant communities and identifying related stressors.
Ecological Applications 19, 1739–1757. doi:10.1890/08-1290.1
Jones, F. C. (2008). Taxonomic sufficiency: The influence of taxonomic
resolution on freshwater bioassessments using benthic macroinverte-
brates. Environmental Reviews 16, 45–69. doi:10.1139/A07-010
Kingsford, R., Curtin, A., and Porter, J. (1999). Water flows on Cooper
Creek in arid Australia determine ‘boom’ and ‘bust’ periods for
waterbirds. Biological Conservation 88, 231–248. doi:10.1016/S0006-3207
Lake, P. S. (2000). Disturbance, patchiness, and diversity in streams. Journal
of the North American Benthological Society 19, 573–592. doi:10.2307/
LaRue, M. A., Lynch, H. J., Lyver, P. O. B., Barton, K., Ainley, D. G.,
Pollard, A., Fraser, W. R., and Ballard, G. (2014). A method for
estimating colony sizes of Adélie penguins using remote sensing
imagery. Polar Biology 37, 507–517. doi:10.1007/S00300-014-1451-8
Leigh, C., Sheldon, F., Kingsford, R. T., and Arthington, A. H. (2010).
Sequential floods drive ‘booms’ and wetland persistence in dryland
rivers: a synthesis. Marine and Freshwater Research 61, 896–908.
McKinley, A. C., Miskiewicz, A., Taylor, M. D., and Johnston, E. L. (2011).
Strong links between metal contamination, habitat modification
and estuarine larval fish distributions. Environmental Pollution 159,
1499–1509. doi:10.1016/J.ENVPOL.2011.03.008
McQueen, D. J., Post, J. R., and Mills, E. L. (1986). Trophic relationships in
freshwater pelagic ecosystems. Canadian Journal of Fisheries and
Aquatic Sciences 43, 1571–1581. doi:10.1139/F86-195
Metzeling, L., Robinson, D., Perriss, S., and Marchant, R. (2002). Temporal
persistence of benthic invertebrate communities in south-eastern
Australian streams: taxonomic resolution and implications for the use
of predictive models. Marine and Freshwater Research 53, 1223–1234.
Morgan, M. J., Chariton, A. A., Hartley, D. M., Court, L. N., and Hardy, C.
M. (2013). Improved inference of taxonomic richness from environmen-