Can big data tame a "naughty" world?
Jennifer Ann Salmond
School of Environment, University of Auckland
Marc Tadaki
Department of Geography, University of British Columbia
Mark Dickson
School of Environment, University of Auckland
Key Messages
The implications of the big data revolution for the environmental sciences are potentially significant
and require critical interrogation.
Thematic examination of big data definitions can encourage scientists to consider how big
environmental data may alter environmental scientific priorities.
In the environmental sciences, big data are most valuable when complementary to (and conversing
with) traditional data and approaches.
The big data revolution is changing the way data is produced, analyzed, and valued. In the environmental sciences, big data has made it onto the agenda through calls to utilize the current data "deluge" more effectively and a desire for more complete measurement. However, a wider philosophical and ethical critique of big data is needed to assess its utility for environmental explanation. We distil three definitions relevant to the environmental sciences, focusing on the characteristics that make data "big," the methods of analysis used, and the models of explanation favoured by big data analysts. We critically interrogate the new priorities implicit within big environmental data, and for a historical analogue we compare the big data moment in the environmental sciences to the period in the 1970s when systems theory was being invoked as a paradigmatic shift. Like systems theory, big data is poised to become the new lingua franca of many fields of scientific inquiry. Here we echo Barbara Kennedy's caution that whilst new methods of analysis seem fascinating and promissory, scientists must always be accountable to the "naughty" world in which we live, rather than the clean abstractions that we seek to generate.
Keywords: Environmental science, big data, methodology, critical physical geography
Les mégadonnées peuvent-elles apprivoiser un monde « réfractaire » ?

La révolution des mégadonnées est en voie de modifier la manière dont on produit, analyse et valorise le savoir. Les sciences de l'environnement accordent aux mégadonnées une attention prioritaire par la promotion d'un emploi plus efficace du « déluge » de données et le souhait de se doter d'une mesure plus exhaustive. C'est toutefois sur la base d'une critique philosophique et éthique élargie des mégadonnées que leur utilité en matière environnementale peut être établie. Trois définitions pertinentes pour les sciences de l'environnement sont élaborées à partir des éléments qui distinguent les mégadonnées, les méthodes d'analyse utilisées et les modèles d'explication privilégiés par les chercheurs qui se servent des mégadonnées. Nous examinons d'un œil critique les nouvelles priorités issues des mégadonnées environnementales et nous comparons, en établissant un parallèle historique, l'époque des mégadonnées dans les sciences de l'environnement à celle des années 1970 alors qu'on alléguait que la théorie des systèmes entraînait un changement paradigmatique. À l'instar de la théorie des systèmes, les mégadonnées sont sur le point de devenir la nouvelle langue véhiculaire de nombreux domaines d'activités scientifiques. Nous reprenons ici l'avertissement émis par Barbara Kennedy qui soulignait que bien que les nouvelles méthodes d'analyse puissent paraître fascinantes et prometteuses, cela ne dispense pas les scientifiques de toujours rendre compte du monde « réfractaire » dans lequel nous vivons, plutôt que des abstractions pures auxquelles nous aspirons.

Correspondence to / Adresse de correspondance: Dr. Jennifer Salmond, School of Environment, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand. Email/Courriel: j.salmond@auckland.ac.nz

The Canadian Geographer / Le Géographe canadien 2017, xx(xx): 1-12. DOI: 10.1111/cag.12338. © 2017 Canadian Association of Geographers / L'Association canadienne des géographes
Mots clés : sciences de l'environnement, mégadonnées, méthodologie, géographie physique critique
Introduction: Big data, big claims?
The exponential increase in our ability to acquire,
store, transmit, and analyze data has led various
commentators to suggest that a world of big data^i
has arrived, a world in which research questions can
be answered by data directly, without reference to
theoretical frameworks (Miller and Goodchild 2015).
By covering massive spatial scales and diverse
scientific domains, big data and accompanying
analytical techniques offer the possibility of identi-
fying new patterns and predictors from the chaos
and complexity of human and environmental pro-
cesses (Death 2015; O'Sullivan and Manson 2015). Big
data, it is claimed, has the potential to provide
unprecedented insights into environmental systems
and human behaviour and offer an improved basis
for decision making in a new era of data-informed
policy (White et al. 2015). Big data has also been touted as key to "solving the city" (Lehrer 2010) and "revolutionizing" understandings of climate change vulnerability (Ford et al. 2016), potentially leading to "miraculous solutions to well-worn problems" (Crang 2015, 351). With the data deluge upon us, some claim that machine learning analyses of big data may be able to succeed where "science has failed" in solving complex environmental problems (Death 2015, 595). In this emerging infrastructure and analytics of big data, a "fourth paradigm in scientific thought" has been proposed (Hey et al. 2009; Kitchin 2014), heralding the "end of theory" (Anderson 2008) and requiring an across-the-board revaluing of scientific practices (Elliott et al. 2016).
Amid this optimism, environmental scientists
have proceeded with collecting and analyzing big
data. Often such studies focus on case studies
which highlight the potential value of combining
social and biophysical data to enhance our under-
standing of complex environmental problems. A
range of different topics and approaches have
been explored, using passively and actively
acquired social and environmental datasets, ap-
plied to problems at a variety of different temporal
and spatial scales. For example, Fleming et al.
(2014) combine climate and health data to propose
new models to evaluate the impacts of climate
change on human populations and the work by
Chariton et al. (2016) shows how big data can be
used to analyze multiple environmental stressors
in aquatic environments. Other studies have
combined data from social media and biophysical
elements to generate information about how
humans interact with the environment, presenting
this information in ways that can directly support
decision making: for example, in conservation
(Levin et al. 2015; Verma et al. 2016); climate
change adaptation (Ford et al. 2016); air pollution
abatement (Steinle et al. 2013); and planning
(Dunkel 2015). More commonly, however, we
either see authors proposing new analytical tools
to organize environmental big datasets more
efficiently (e.g., Baumann et al. 2016; Chariton
et al. 2016), or proposing rules for organizing data
into categories that facilitate both wider use and
application (e.g., La Salle et al. 2016). Agenda-
setting discussions on the value of big data and
future research directions are also increasingly
common (e.g., Pimm et al. 2015; White et al. 2015;
Laurance et al. 2016 in ecology; Viles 2016 in
geomorphology). Across all of these applications
of big data in the environmental sciences, the
question needs to be asked: what makes these
applications distinctive (or new) approaches to
scientific inquiry? If big data is to become more
than just a buzzword with which to channel and
secure research funding, it demands careful
intellectual scrutiny.
^i We refer to "big data" as both a singular noun (big data as a discourse) and as a descriptive plural (data that exhibit "big" qualities). Sometimes the sentence subject is "big data," whereas in other situations the (big) data themselves are the subject.
In this paper, we synthesize and explore what big
data means (or might mean) for physical geography
and the environmental sciences. In some ways,
environmental data has already been bigfor some
time and there is no clear single moment in time or
space at which a step transition from the use of large
datasets to big datasets can be identified. Large
global climate datasets and reanalyses, for example,
have been used to make headlines since the 1980s
and have provided a major impetus for political
action on climate change. From this perspective, a
sceptical scientist might wonder whether big data
represents anything new at all, or whether it is
just another "fashion" in environmental science
(Sherman 1996). Therefore, in this paper, we aim to
distil what is really new and distinct about big data
as relates to the environmental sciences.
We proceed by distinguishing three thematic
definitions of big data that have been developed
in the (mostly social science) literature. We then
apply and explore these themes as they manifest in
the environmental sciences, in order to identify
and critically interrogate the value commitments,
assumptions, and challenges emerging with big
environmental data. We synthesize relevant cri-
tiques of big data from human geography, connect-
ing these insights to examples and concerns in the
environmental sciences. Then, we step back from
the big data revolution to situate ourselves as
conscious actors in power-laden processes that
are actively reconfiguring environmental science.
We excavate some disciplinary wisdom about para-
digm shifts in physical geography by drawing a
historical comparison with the putative shift in the
1970s towards systems theory as a unifying frame-
work for physical geography. By revisiting the
critique of systems theory posed by Barbara
Kennedy (1979), we consider how physical geogra-
phers might ground their responses to such para-
digm shifts.
What is big data?
Despite widespread usage of the term "big data" across the sciences and humanities, industry, and government, there appears to be little consensus as to what constitutes "big" data (Graham and Shelton 2013; O'Sullivan and Manson 2015; Kitchin and McArdle 2016). In the earth and environmental sciences, "big data" often refers loosely to data acquired by automated methods that are far faster and more voluminous than traditional techniques, and which are analyzed using data-driven methodologies (e.g., Li et al. 2015; Moosavi et al. 2015; Gabrys 2016). Technological
advances in measurement and telemetry have made
automated mass data acquisition possible even in
the most dynamic and remote environments.
Unprecedented amounts of data are available to
describe the earth-atmosphere system and its
interaction with human activities across previously
unimaginable temporal and spatial scales (Hsu et al.
2015; Krause et al. 2015; Schroeder and Taylor 2015;
Ziegler et al. 2015; Viles 2016).
In the midst of this expansion in data-gathering activities the concept of big data appeared, yet even in hindsight it is not obvious when this happened or what makes data "big" rather than simply "many." For example, the huge volumes of remote sensing data from satellites, seismic data, turbulence data, or ecological data are not in any sense "small" in scale, volume, or scope, but neither could they be usefully considered "big data." Therefore, to understand what is at stake when scientists and funding bodies invoke "big data," we should begin by clarifying its difference from what we term "ordinary" data.
While big data does not enjoy a consensus
definition, it is often characterized by (i) the unique
qualities and inherent structural characteristics of
the data itself; (ii) the (new) processes and techni-
ques required to transform numbers into knowl-
edge or actionable science; and (iii) a particular
way of making claims about the worlda way of
doing science (Kitchin 2014). We discuss each in
turn.
i) Inherent characteristics of big data
Big data is frequently differentiated from ordinary
data by three Vs: volume, variety, and velocity
(Miller and Goodchild 2015). Big data is usually
characterized by continuous, often real-time, flows
of multiple data sources that are varied in complex-
ity, type, and origin, and represent multiple scales in
time and space. The increasing volumes of automat-
ically generated social data are now frequently being
harnessed and combined with physical data in
scientific studies. For example, Levin et al. (2015)
use satellite data and social media (in the form of
photos from Flickr) to inform conservation studies,
and Fleming et al. (2014) use social, economic, and
health data to analyze the relations between climate
change and human health. The inclusion of both
social and physical parameters often distinguishes
many "big" datasets. Further "Vs" have also been
suggested (Baumann et al. 2016; Kitchin and McArdle
2016), including: viability, value, and veracity. These
pertain to the raw, unchecked, often uncalibrated
nature of data flows that prohibit standard quality
control mechanisms. These additional Vs are rele-
vant for environmental applications; for example, big
datasets that include air quality parameters may
include crowd-sourced data or modelled parameters
which have larger uncertainties than ordinary data
(Gura 2013; Mayer-Schonberger and Cukier 2013). In
summary, big data usually (a) contains information
about populations rather than samples, and (b) is
messy (unverified, uncalibrated, lacking quality
control) and lacks metadata (Mayer-Schonberger
and Cukier 2013).
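To make the veracity concern concrete, the sketch below (in Python, with entirely synthetic numbers; no dataset from the studies cited here is used) illustrates the kind of co-location check against a reference monitor that ordinary quality control presupposes but that crowd-sourced air quality streams often arrive without.

```python
# A minimal sketch (hypothetical, synthetic data) of a veracity check:
# comparing low-cost sensor readings against a co-located reference monitor
# and deriving a simple linear correction. Sensor behaviour and values are
# invented for illustration only.
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical hourly PM2.5 (ug/m3): reference monitor vs. low-cost sensor
reference = rng.gamma(shape=4.0, scale=3.0, size=500)            # "true" values
low_cost = 1.3 * reference + 4.0 + rng.normal(0, 2.5, size=500)  # drifted, noisy copy

# Quantify the systematic bias before any correction
raw_bias = np.mean(low_cost - reference)

# Fit a simple linear calibration (gain and offset) against the reference
gain, offset = np.polyfit(reference, low_cost, deg=1)
corrected = (low_cost - offset) / gain
corrected_bias = np.mean(corrected - reference)

print(f"raw mean bias:       {raw_bias:+.2f} ug/m3")
print(f"corrected mean bias: {corrected_bias:+.2f} ug/m3")
```

Such a correction is only possible when a reference instrument is available; the point of the "messiness" argument above is that big environmental data often arrive without any such anchor or metadata.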
Big datasets can also be characterized by their
source. Often big data is collected opportunistically
by third parties using automated data gathering
techniques and can therefore be multi-purpose,
non-disciplinary, multi-institutional, and multi-
national. Kitchin (2013) distinguishes between
specific sources of big data: directed sources
(measured digitally by a human operator); auto-
mated sources (inherent function of a device or
system such as a mobile phone); and volunteered
sources (gifted by users such as citizen scientists).
Environmental science applications have primarily focused on directed sources, but the potential of automated, volunteered, and crowd-sourced data is growing, such that environmental scientists require an awareness of the nature and limitations of social science datasets.
ii) How data are handled
Some definitions of big data emphasize the tools
and techniques required to put the data to work.
This approach draws attention to the (changing)
balance of decision making between human ana-
lysts and non-human technologies. In the face of
the sensor-driven data delugeit has become
apparent that traditional analytical tools and
techniques are insufficient (Death 2015; Elliott
et al. 2016). In their place, new methods of data
computation, storage, management, representa-
tion, and statistical analysis have emerged. One
set of computational technologies that distin-
guishes big data analysis is the increasing reliance
on statistics and non-linear systems identification
(e.g., genetic algorithms and machine learning) to
reveal relations and patterns, infer dependencies,
and predict outcomes (Goldberg and Holland
1988). Big datasets must be handled carefully as
the data may violate the underlying assumptions
of statistical tests. For example, many big datasets
are based on self-selected populations (e.g., social
media or cell phone users) and thus are not truly
random samples. New protocols for quantifying
the bias in information derived from user-gener-
ated and volunteered geographic data are required
to manage this resource effectively (Dickinson
et al. 2010; Goodchild and Li 2012).
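As an illustration of what such a protocol might involve, the following sketch uses hypothetical report counts and region names to quantify how far a set of volunteered observations departs from an area-proportional expectation; it is an illustrative calculation, not a published protocol from the works cited above.

```python
# A minimal sketch (hypothetical counts) of one way to quantify sampling bias
# in volunteered observations: compare where reports come from against where
# they would fall if observers were spread in proportion to each region's
# area. Region names and numbers are illustrative only.
from scipy import stats

regions = ["urban", "peri-urban", "rural", "remote"]
area_share = [0.05, 0.15, 0.45, 0.35]      # expected share of observations by area
observed = [620, 240, 110, 30]             # volunteered reports actually received

total = sum(observed)
expected = [share * total for share in area_share]

chi2, p_value = stats.chisquare(observed, f_exp=expected)
ratios = {r: o / e for r, o, e in zip(regions, observed, expected)}

print(f"chi-square = {chi2:.1f}, p = {p_value:.2e}")
for region, ratio in ratios.items():
    print(f"{region:>10}: {ratio:.2f}x the expected density of reports")
```

The representation ratios, rather than the test statistic alone, are what an analyst would carry forward, for example as weights or as a caveat on any claim that the volunteered sample describes the whole landscape.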
iii) Big data as a way of doing science
While the above definitions of big data are useful
guides, they are not determinative of what counts as
big data, because the term itself possesses a kind of
social capital. What are the conditions that might
motivate the use of the "big data" label?
Human geographers have highlighted that big
data is often characterized by a mindset that values
certain types of knowledge claims (Graham and
Shelton 2013; Miller and Goodchild 2015). Such a
mindset prioritizes: (i) positivistic approaches to
data analysis unbiased by prior assumptions, expe-
rience, or theory; (ii) pseudo-empirical approaches
that assume that datascapes accurately and
completely represent the world; and (iii) a generalist
view of the world that can be reduced to large-scale
comparisons of multiple parameters and issues
across multiple scales. If we place big data into a
historical and disciplinary context, it could be
argued that it represents a paradigmatic shift in
scientific explanation analogous to systems theory
of the 1960s and 1970s (e.g., Kennedy and Chorley
1971); big data conjures images of a world of data,
presenting the promise of a global master dataset
where anything can be drawn out and compared to
anything else. When Anderson (2008) provocatively
declared the "end of theory," he was referring to the
possibility of testing any conceivable scientific
hypothesis with big data, thus ending the need for
subject specialisms. In the environmental sciences,
Elliott et al. (2016) suggest that the era of theory-
driven hypothesis testing should give way to a
more iterative relationship with massive datasets.
Whereas the systems theory of the 1970s called for
all experts to become systems experts, big data
suggests that we will no longer need environmental
scientists, only data scientists.
For the environmental sciences, this universalist
mindset may have particular appeal for areas of the
discipline suffering from physics envy(OSullivan
and Manson 2015). Prestigious science journals
appear to favour publications presenting global
datasets, where the emphasis is on "big" empirical
claims about as many geographical regions and as
many subject domains as possible. Funding agen-
cies are being urged to promote the collection of
datasets that can be shared, used for multiple
purposes, or (re)analyzed in multiple ways to ensure
value for money (Costello and Wieczorek 2014;
Specht et al. 2015). According to this third, socio-
logical definition of big data, the value of these
perceived differences in "big" datasets for geo-
graphical and environmental sciences lies in a
social, cultural, and political framework that values
objective, quantitative, controlled, replicable data
leading to universal claims and globalized knowl-
edge, over subjective and descriptive accounts and
local knowledge (Clifford 2009).
How might big data shape the study of
earth-atmosphere systems and their
human interactions?
How might these three thematic definitions of big
data affect the theoretical, methodological, and
institutional priorities of environmental science?
In this section, we engage in a critical physical
geographical analysis of big data, by describing and
interrogating how environmental big data may
be beginning to initiate significant changes to the
priorities and conduct of environmental science
(e.g., see Lane 2016, this issue). We invoke a broad
definition of critical physical geography as a
commitment to expose and interrogate the relation-
ships between values, science, and environmental
outcomes (Tadaki et al. 2015). This definition
extends the sub-disciplinary approach proposed
by Lave et al. (2014), which emphasizes explicit
linkages between social theory and biophysical
approaches. Our objectives are to identify and
critically interrogate the scientific practices associ-
ated with big data, and examine how "big data" as a
research framing may be involved in producing
particular societal values and outcomes. We con-
tribute from a scientific perspective to emerging
conversations about the politics of big data in the
environmental sciences (Gabrys 2016).
First, we consider how big data is poised to affect
the observational priorities of environmental sci-
ence, and what this means for the analysis and
understanding of environmental systems. Second,
we examine how a turn to big data is likely to
emphasize the value of (if not require) new analyti-
cal techniques, and we reflect on what the centrali-
zation of expertise implied by big data means for
subject area experts. Third, we consider how the
current focus on big data mimics the movement
towards systems theory in physical geography in
the 1970s. If big data represents a way of doing
science, historical reflection can help us situate and
understand what is at stake as we move towards a
new paradigm in science.
Changing observational priorities: Organizing
the world for big data
Most environmental science projects have tradition-
ally been question-driven, with data collected in
order to answer a specific place-based environmen-
tal question. However, with new technologies it is
tempting to measure everywhere and anywhere,
everything and anything, because it is cheap, easy,
possible, and might one day be useful to someone.
This transition may seem generic, and
perhaps harmless, but it raises the prospect of
(i) embedding a rupture between the scientist and
the environment they study; (ii) making the con-
ditions of data collection invisible or unavailable for
interrogation; and (iii) concealing the theoretical
choices underlying data collection behind questions
of technology and measurement. Rather than heralding the "end of theory" (Anderson 2008), big data could instead promote an ignorance of theory.
Instituting a rupture between scientist and
environment. The prospect of big data-driven
field campaigns has the potential to alter the
who/what/where/when/why of environmental
observation. For the earth and environmental
sciences these concerns are most tractable in
relation to fieldwork (Church 2013). Big data
collection in the field conveys an image of a team
of technicians entering the field with the mandate to
collect data about everything (or more realistically,
as much as possible given time and technological
constraints), or perhaps an army of citizen
scientists and volunteers providing a continuous
flow of data from low-cost sensors. Here, big data
collectors may not need to understand the system
they are studying, or make decisions about when,
where, or what to report; instead decisions might be
constrained by expert instruction (e.g., guidebooks),
available secondary sources (e.g., cell phones), or
standardized observational technologies.
However, Church (2013, 184) cautions that although "[r]ecent technological developments have enhanced our ability to comprehend the landscape system," the scientist's efforts to understand complex and emergent environmental processes will surely require comprehensive field experience if we are to regain "the whole landscape view of the early field workers." Changes in financial, temporal, and spatial constraints on data collection imply that the experiences and theoretical frameworks traditionally employed by experts to choose the study site, the timing of the field campaign, and the parameters measured will no longer be directly connected to, or in iterative conversation with, the point of data collection. The personal experiences of scientists have previously acted to pre-emptively filter the data collected, and while such experiences bias the dataset, at least the bias is knowable and explicit as a part of the inductive process.
A turn towards big data also implies an increasing
shift away from question-driven data collection
towards a situation where questions are modified
to suit the data. Shearmur (2015) observes that big
data are rarely suited to addressing a specific
environmental question. Instead, researchers may
have to ask different questions and adapt their
research to thedata available. That is, bigdatabases
effectively subsidize research that can make use of
existing data. This could lead to scientific methods
increasingly being aligned with the data available,
rather than data collected to satisfy methodological
requirements or advance theoretical understanding
(Miller and Goodchild 2015). Research becomes more
opportunistic and responsive rather than planned. In
this way, the data can shape research, and research might be constrained by (non-expert) data collectors in ways not previously experienced (Shearmur 2015).
Concealing context of data acquisition. Unlike
datasets that are collected personally or by a group
of close colleagues, end-users of big datasets may
know very little about the data: who collected it,
how it was collected, when, where, and why. If metadata
are lacking, a user might assume that all data within
a set are created equally and have the same
uncertainty. Big data approaches often emphasize
that uncertainty can be reduced by over-sampling,
that is, by collecting a sample of a large enough size
(Shearmur 2015). This can lead to the assumption
that big data lack observational bias, or that
observational bias is removed by over-sampling.
A consequence of the rupture between scientist
and environmental big data is that the consistency
of the data may be unknown, due to difficulties of
calibrating large numbers of instruments and/or
controlling for differences in data collection meth-
odologies. Whilst such concerns are not new to large data projects with "ordinary" data (see, for example, Specht et al. (2015) for discussion in ecology; Soranno et al. (2015) in lake ecosystems; and Hsu et al. (2015) in experimental geomorphology), the issues are magnified in big datasets where physical
environments and social environments become
intertwined. Whilst developing a culture of more
transparent exchange and aggregation could en-
hance environmental science, we can also debate
whether a trend towards the homogenization of
environmental observation to suit big data formats
is even desirable.
It could be argued that a big data scattergun
approach to data collection improves the likelihood
of a positive outcome from the field experiment, by
reducing sampling bias in time and space and
reducing the probability that a critical parameter
was not measured. However, biases are not simply
removed, so much as shifted around. For instance,
new sampling biases are created by technological
limitations on data collection: for example, GPS
and cell phone coverage, flight paths for drones, and
assumptions about what can and should be
counted.
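A minimal simulation makes the point that over-sampling cannot substitute for representative sampling: if a hypothetical sensor network only reaches the warmer half of a landscape, the estimated mean stays biased no matter how large the sample grows. The numbers below are synthetic and purely illustrative.

```python
# A minimal sketch of the argument above: enlarging a sample shrinks random
# error but leaves systematic bias untouched. Here a hypothetical network
# only "sees" the warmer half of a landscape, and no amount of extra data
# recovers the true mean. All values are synthetic.
import numpy as np

rng = np.random.default_rng(0)
true_mean = 15.0                                        # true landscape-average temperature (deg C)

for n in (100, 10_000, 1_000_000):
    temperatures = rng.normal(true_mean, 5.0, size=n)
    observed = temperatures[temperatures > true_mean]   # coverage limited to accessible (warm) sites
    estimate = observed.mean()
    print(f"n = {n:>9,d}: estimated mean = {estimate:5.2f} "
          f"(bias = {estimate - true_mean:+.2f})")
```

The estimated mean converges, but it converges to the wrong value: the bias introduced by where the data can be collected persists at every sample size.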
The illusion of theory-free data. Data
acquisition is always undertaken within some
theoretical framework (Odoni and Lane 2010;
Rhoads and Thorn 2011). Even "big" datasets
contain value-based assumptions that are made by
someone (e.g., environmental scientists, industry,
commercial analysts, operators) about what kinds
of data are valuable. Advances in low-cost sensors
and high-resolution observational technologies
reduce the need for the field scientist to choose
between time and space, but this displaces rather
than removes decisions regarding the quality and
representativeness of data (e.g., Krause et al. 2015).
Changes in observational priorities implied by big
data can be described as a shift from strategic,
theory-justified measurement (data scarcity)
towards conditions of a massive flow of data (data
deluge) where scientists become receivers and
manipulators of data, rather than producers of
data. However, by rendering the theory of observation and measurement invisible or unavailable to analysts and users, big data approaches throw the external validity of their analyses into doubt, and with it the claim that big data really can (or indeed should) herald the "end of theory."
Changing analytical priorities: From landscapes
to datascapes
The discourse about big data emphasizes particular
kinds of large scale, quantitative, positivistic anal-
yses (Kitchin 2014), and this re-valuing has the
potential to change perceptions about what con-
stitutes rigorous analysis in the environmental
sciences (Elliott et al. 2016). As such, big data may
(i) push explanation towards particular modes and
(ii) frame analysis as a task for reductionist
computation rather than human judgement,
thereby reducing landscapes to datascapes, which
(iii) lead to a re-configuring of what constitutes
environmental expertise.
Big data for environmental analysis suggests a
shift (return) to positivist forms of explanation.
Users of big data would probably accept that big
data prioritizes large-n datasets as well as the
generation of broad claims about diverse popu-
lations, domains, and socio-biophysical contexts
(Kitchin 2014). Such a positivist empirical
approach enables new, perhaps unexpected, pre-
dictors to be identified from the apparent chaos
and complexity of a vast number of measure-
ments and possible relationships among variables
(e.g., Krause et al. 2015; Ziegler et al. 2015; Elliott
et al. 2016). Here, everything that can be codified
as data can be compared with everything else.
Instead of biasing our understanding through the
explicit use of pre-existing theories, all possible
correlations between variables can be tested and
quantified (Anderson 2008). However, the short-
comings of reductionist and positivist ap-
proaches have been identified and critiqued by
many scientists, including physical geographers
(Rhoads and Thorn 2011; Slaymaker 2016). Cor-
relation is not the same as causation, and most
physical geographers would agree that automated
pattern recognition run on environmental data is
no substitute for an embodied and theoretically
reflexive engagement with the biophysical envi-
ronment (Church 2013).
A second shift invoked by big data involves a
movement in environmental scientists' roles to-
wards the analysis of correlations automatically
highlighted by algorithms, the aim being to
identify patterns that make physical sense and
develop theoretical arguments and empirical hy-
potheses to test these (Peters et al. 2014).
Unguided, automated exploration of big datasets
is perceived as a productive way to analyze big
data and compare multiple variables across space
and time (e.g., Death 2015; Krause et al. 2015;
Pagano et al. 2016). However, results from auto-
mated techniques can be misleading if they are not
interpreted within the context of existing knowl-
edge frameworks and the limitations of the dataset
(OSullivan and Manson 2015).
Dickson and Perry (2016) make this point in the
context of a coastal landslide dataset. In their
example, a comparatively small dataset on coastal
landslides and potential landslide drivers was
extracted from digital elevation models, aerial
photographs, and fieldwork. Three machine-
learning approaches were then used to automatically detect the likely controls on landslide failure. All methods agreed and overall suggested the potential to correctly predict a high proportion (>85%) of landslides. This approach has consid-
erable potential for coastal management, and
similar analyses of big datasets are likely to yield
similar (apparent) success, but analysts need to
look more deeply when reporting results. Dickson
and Perry (2016) caution that important ques-
tions need to be asked about error sources,
including absent or missing data. For instance,
in coastal landslide studies, sites prone to
landsliding (e.g., cliffs undercut by strong wave
action) may also be the sites of most rapid
evidence removal, meaning that preservation
bias influences the dataset. This type of issue
could easily be overlooked in a burst of big data
collection and automated analysis.
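The sketch below is not the Dickson and Perry (2016) workflow; it is a synthetic-data illustration of how an automated landslide classifier can report high apparent skill while saying nothing about the failures that preservation bias has already removed from the inventory. All predictors, coefficients, and the censoring rule are invented for the example.

```python
# A minimal sketch (synthetic data, hypothetical predictors) of an automated
# landslide classification, with preservation bias built in: some of the most
# exposed failures never make it into the inventory at all.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(1)
n = 2_000

# Hypothetical predictors of the kind extracted from DEMs and imagery
slope = rng.uniform(5, 60, n)             # degrees
wave_exposure = rng.uniform(0, 1, n)      # relative index
lithology_strength = rng.uniform(0, 1, n)

# Synthetic "truth": failure more likely on steep, exposed, weak-rock cliffs
logit = 0.08 * slope + 2.0 * wave_exposure - 4.0 * lithology_strength - 3.0
landslide = rng.random(n) < 1 / (1 + np.exp(-logit))

# Preservation bias: assume evidence of many highly exposed failures is erased
kept = ~(landslide & (wave_exposure > 0.8) & (rng.random(n) < 0.7))
X = np.column_stack([slope, wave_exposure, lithology_strength])[kept]
y = landslide[kept]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)
model = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_train, y_train)
print(f"apparent accuracy: {accuracy_score(y_test, model.predict(X_test)):.2f}")
# High accuracy here describes the surviving inventory, not the censored
# population of failures that erosion has already removed.
```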
A third shift in analytical priorities implied by big
data involves a reconfiguring of what constitutes an
environmental expert and what a valid and authori-
tative analysis looks like. If landscapes can be
converted into datascapes, then what is needed is
not analysis of landscapes but analysis of data-
scapes. Further, these changes in observational and
methodological priorities could lead towards the
prioritization of reanalysis of open source/public
datasets over field science, triggering the rise of the
"data scientist" (Levy 2015) and the death of the
environmental expert (Death 2015). There is no
doubt that technicians, volunteers, and citizen
scientists, together with industrial and commercial
data analysts, can each play valuable and important
roles in collection and analysis of big datasets, but
we agree with Pagano et al. (2016) that the human
judgment of the environmental scientist remains
critical to meaningful interpretation.
As a potential consequence of this reconfigura-
tion of environmental expertise, it is important to
consider how big data will affect research fund-
ing, data infrastructure (with associated
path-dependencies), and the centralization of
knowledge. Funding bodies such as governments,
industries, and civil society actors are demanding
information that is temporally and spatially
specific to particular problems but also "predictive,
prescriptive, and scalable" (Hampton et al.
2013, 156). Here the big data revolution could see
a channelling of investment into those projects,
disciplines, and investigators who convince fun-
ders that their approach is "big" and that "big is
good" (see Gabrys 2016). Already, some environ-
mental scientists are championing a move to-
wards big data taking particular (mandated)
forms (Baumann et al. 2016). Further, while big
data is already shaping the terrain of science
funding, this process is creating new inequalities
in environmental science at the global and local
scales. Any putative shift from one way of doing
things to another is going to have winners and
losers; the questions that scientists may need to
ask themselves are: "who is gaining scientific
authority from the big data revolution, who is
losing, and what does this mean?" Not only does
big data shift prestige and authority towards
generalist data experts, but this landscape of
expertise is uneven across the globe. Data
experts, and the environmental experts they
may displace, may not be equally distributed,
nor are all data experts able to make equally
powerful or recognized claims with big data.
Big data as a new paradigm: A
disciplinary perspective
Sherman (1996, 89) contends that for scientists, "disciplinary self-examination is especially critical in times of fundamental uncertainty and change." In
the face of the data deluge (Kitchin 2014), what
values might we espouse as geographers as we
encounter, join, resist, and transform the big data
revolution?
In search of disciplinary wisdom on this matter,
we find it instructive to consider how physical
geographers have responded to previous shifts in
priorities relating to the study of earth-atmosphere
systems and their interactions with human and non-
human life.
Perhaps the closest analogy in physical geography
occurred in the 1960s and 1970s as physical
geographers championed systems theory as a
universal approach to structure environmental
inquiry (Kennedy and Chorley 1971; Terjung
1976). In this period, debates raged about the merits
of reductionist (law-finding) environmental inquiry
as opposed to traditional descriptive accounts of
particular places and environments. Technological
advances in computational power promised to
replace the traditional modes of geographical
description with mathematical tools oriented
towards the control and prediction of environmen-
tal change through systems science (Kennedy and
Chorley 1971). Commenting on these developments
in 1979, Oxford geographer Barbara Kennedy
observed that in mainstream accounts of systems
science,
the geographer ... is urged to move at once into the
conversion of existing information [about landscapes]
into the mathematics thought to be most appropriate
for control and prediction. The object of interest is to
become a system of symbols; the tools of the new
trade are to be those of information theory and
control engineering. (Kennedy 1979, 550)
Despite her early involvement promoting a sys-
tems approach in physical geography (Kennedy and
Chorley 1971), Kennedy became critical of how
systems analysis had been subsequently framed
and pursued. She perceived an overemphasis on
abstraction and mathematical formalization, noting
that "our subject matter [i.e., landscape] almost
inevitably has a history and that history will frequently
prove very important indeed in determining future
developments and non-developments" (Kennedy
1979, 551). The mathematical abstractions of systems
theory tended to treat landscapes as if they did not
have histories; they were merely mechanistic entities
that could be represented and controlled through
calculation. Instead, Kennedy maintained that the real
world was historical, multi-scaled, and fundamentally
complex. Real world environments were "naughty,"
and they did not behave in line with the abstractions of
human observers. She concluded, "By all means let the mathematical modelling of the naughty world continue apace, but let us not confuse those models with reality" (Kennedy 1979, 558).
Kennedy's cautionary note resonates with our
current position, as internal and external influen-
ces push environmental scientists towards be-
coming data analysts more focused on datascapes
than actual environments. Big data are part of a
wider set of (often ungrounded) assumptions that
more data and more computing prowess
will reveal all and solve all our problems
(Shearmur 2015).
Wyly (2014) calls for geographers to become
critically engaged with the "new quantitative
revolution." In this context, it is crucial that the
context of big datasets be acknowledged:
who collected the data, how were they collected,
and why were data collected, or not, at those
points in time and space. At the same time,
geographers must continue to acknowledge the
existence and implications of "naughty" worlds,
both in principle and practice (Clifford 2001). Big
data may present eloquent and seemingly compre-
hensive accounts of our social and biophysical
environments, but let us not confuse these
representations with reality.
Big data for a "naughty" world
This article has synthesized some definitional
concepts of big data to help physical geographers
and environmental scientists understand what is at
stake as big data become ever more prominent in the
way we study earth-atmosphere systems and their
interaction with human activity. We have sought to
identify and interrogate the ways in which big
data may influence how environmental science is
conducted. Big data will open new and legitimate
scientific questions, but we live in a naughty world
of geographically specific and irreducibly complex
biophysical and human environments. So, how
should we approach big data?
It is worth re-emphasizing, as a starting point,
that there is significant value in generating and
using big datasets. New questions can be asked, new
patterns recognized, and new linkages identified
and tested (Elliott et al. 2016). New technologies
have transformed the temporal and spatial resolu-
tion of datasets and provided coverage in terrain
that was previously inaccessible owing to varying
social and physical constraints. Indeed, big data can
address some of the fundamental problems in
environmental science and geographical enquiry:
"the transcendence of scale, the complexity of geographical phenomena and the origins of complexity" (Clifford 2001, 387). However, there will
always be digital divides, uneven data shadows, and
bias in how technology is used and data reported.
Even if it was possible to measure everything
moving forward, data from the past can never be
recaptured in the same way. The answer to the old
question of "how much data is enough" is continu-
ally changing as it is negotiated through cultural
contexts. The jury is still out on whether it makes
sense to measure everything, how to avoid missing key events, and how to account for the influence of confounding variables.
Big data is a mode of producing data or a quasi-
objective collective "truth" about the environment
by representing the world as datascapes. To become
useful in the form of knowledge or actionable
science, big data must be stored, processed, and
analyzed within the context of the social, economic,
and political frameworks that created it. Critical
physical geography presents a pertinent (sub)disci-
plinary identification, as well as a "conversation
space" which can support reflexive physical geog-
raphers in identifying the human values and
institutions shaping the collection of the data. It is
as important to understand these institutional and
ethical aspects of big data research as it is to
understand the theoretical bias of the lone field
scientist observing a single case study (see Tadaki
2016).
As our collective storehouse and flow of data
continue to expand, there is a need to change the
types of questions we ask about the world and our
approaches to answering them; this requires more
than new tools and observations. Before jumping on
the bandwagon of ever more data acquisition it is
worth pausing to ask whether our new abilities to
measure take us towards our perceived goals (e.g.,
classification, understanding, control, prediction,
management) as scientists, academics, environmen-
tal planners, and decision makers. Arguably, being
able to describe an environment in more layers of
increased temporal and spatial detail doesn't
necessarily translate into new ways of solving the
problem (Church 2013).
Geography has had "long-running conversations and arguments about the role of quantitative methods" (O'Sullivan and Manson 2015, 715).
Whilst it is not fair to assume that computational
methods have invalid epistemologies or are some-
how antithetical to critical research (Wyly 2014),
there is a danger that big data supports a
deterministic view of the world where the pro-
posed solution to residual uncertainty is "more data
and better computers" (Shearmur 2015, 966). It is
sobering to note that, despite the advances in
observation and computational power that have
resulted in an exponential increase in the amount
of meteorological data from observations and
models over the past few decades, the human meteorologist still adds value to the forecast, improving the prediction by 10-25% (Kreinovich and Ouncharoen 2015). This ratio
has not changed significantly with time, money,
improved theoretical understanding, computer
models, or data availability. Clearly, environmental
scientists add value through experience and
wisdom when interpreting the outputs of auto-
mated routines (Pagano et al. 2016). Explanation
may not be required for progress in predictive
performance, but it certainly enhances it and
data cannot replace the value of decision making
and sceptical human understanding (Miller and
Goodchild 2015).
Big data provide a valuable perspective on
human-environment processes and interactions,
but "big" perspectives benefit from standing in
complement to (and conversation with) traditional
data and approaches (Dunkel 2015). It is worth
considering how big data might be used in non-
reductionist ways (Shelton et al. 2015); the forms
this could take, however, remain loosely imagined
and there are not yet any simple or obvious
examples of how this might be achieved. Is big
data destined to be constrained by the limitations
of its roots in positivism? Can big data be
mobilized in ways that enrich (rather than sim-
plify) understanding, that level (rather than further
stratify) the global playing field of science and
knowledge, and that acknowledge (rather than
ignore) the naughtiness of our human and ecologi-
cal communities? Perhaps one important step
towards this goal might include re-centring the
agency of human actors (scientists) in generating,
collecting, and collating big data. By humanizing
big data, we might better account for the choices
underpinning our analyses, which can help us to
more meaningfully understand and distinguish
our representations from the naughty world.
References
Anderson, C. 2008. The end of theory. Wired Magazine 16(7): 26.
Baumann, P., P. Mazzetti, J. Ungar, R. Barbera, D. Barboni, A. Beccati, L. Bigagli, et al. 2016. Big data analytics for earth sciences: The EarthServer approach. International Journal of Digital Earth 9(1): 3-29.
Chariton, A. A., M. Sun, J. Gibson, J. A. Webb, K. M. Y. Leung, C. W. Hickey, and G. C. Hose. 2016. Emergent technologies and analytical approaches for understanding the effects of multiple stressors in aquatic environments. Marine and Freshwater Research 67: 414-428.
Church, M. 2013. Refocusing geomorphology: Field work in four acts. Geomorphology 200: 184-192.
Clifford, N. J. 2001. Editorial: Physical geography: The naughty world revisited. Transactions of the Institute of British Geographers 26(4): 387-389.
——. 2009. Globalization: A Physical Geography perspective. Progress in Physical Geography 33(1): 5-16.
Costello, M. J., and J. Wieczorek. 2014. Best practice for biodiversity data management and publication. Biological Conservation 173: 68-73.
Crang, M. 2015. The promises and perils of a digital geo-humanities. Cultural Geographies 22(2): 351-360.
Death, R. G. 2015. An environmental crisis: Science has failed; let us send in the machines. Wiley Interdisciplinary Reviews: Water 2(6): 595-600.
Dickinson, J. L., B. Zuckerberg, and D. N. Bonter. 2010. Citizen science as an ecological research tool: Challenges and benefits. Annual Review of Ecology, Evolution, and Systematics 41: 149-172.
Dickson, M. E., and G. L. W. Perry. 2016. Identifying the controls on coastal cliff landslides using machine-learning approaches. Environmental Modelling and Software 76: 117-127.
Dunkel, A. 2015. Visualizing the perceived environment using crowdsourced photo geodata. Landscape and Urban Planning 142: 173-186.
Elliott, K. C., K. S. Cheruvelil, G. M. Montgomery, and P. A. Soranno. 2016. Conceptions of good science in our data-rich world. BioScience 66(10): 880-889.
Fleming, L. E., A. Haines, B. Golding, A. Kessel, A. Cichowska, C. E. Sabel, M. H. Depledge, et al. 2014. Data mashups: Potential contribution to decision support on climate change and health. International Journal of Environmental Research and Public Health 11(2): 1725-1746.
Ford, J. D., S. E. Tilleard, L. Berrang-Ford, M. Araos, R. Biesbroek, A. C. Lesnikowski, G. K. MacDonald, et al. 2016. Big data has big potential for applications to climate change adaptation. Proceedings of the National Academy of Sciences USA 113(39): 10729-10732.
Gabrys, J. 2016. Practicing, materialising and contesting environmental data. Big Data & Society. doi: 10.1177/2053951716673391.
Goldberg, D. E., and J. H. Holland. 1988. Genetic algorithms and machine learning. Machine Learning 3(2): 95-99.
Goodchild, M. F., and L. Li. 2012. Assuring the quality of volunteered geographic information. Spatial Statistics 1: 110-120.
Graham, M., and T. Shelton. 2013. Geography and the future of big data, big data and the future of geography. Dialogues in Human Geography 3(3): 255-261.
Gura, T. 2013. Citizen science: Amateur experts. Nature 496(7444): 259-261.
Hampton, S. E., C. A. Strasser, J. T. Tewksbury, W. K. Gram, A. E. Budden, A. L. Batcheller, C. D. Duke, and J. H. Porter. 2013. Big data and the future of ecology. Frontiers in Ecology and the Environment 11: 156-162.
Hey, A. J. G., S. Tansley, and K. M. Tolle. 2009. The fourth paradigm: Data-intensive scientific discovery. Redmond, WA: Microsoft Research.
Hsu, L., R. L. Martin, B. McElroy, K. Litwin-Miller, and W. Kim. 2015. Data management, sharing, and reuse in experimental geomorphology: Challenges, strategies, and scientific opportunities. Geomorphology 244: 180-189.
Kennedy, B. A. 1979. A naughty world. Transactions of the Institute of British Geographers 4(4): 550-558.
Kennedy, B. A., and R. J. Chorley. 1971. Physical geography: A systems approach. London: Prentice-Hall.
Kitchin, R. 2013. Big data and human geography: Opportunities, challenges and risks. Dialogues in Human Geography 3(3): 262-267.
——. 2014. Big Data, new epistemologies and paradigm shifts. Big Data & Society April-June: 1-12.
Kitchin, R., and G. McArdle. 2016. What makes Big Data, Big Data? Exploring the ontological characteristics of 26 datasets. Big Data & Society January-June: 1-10.
Krause, S., J. Lewandowski, C. N. Dahm, and K. Tockner. 2015. Frontiers in real-time ecohydrology: A paradigm shift in understanding complex environmental systems. Ecohydrology 8(4): 529-537.
Kreinovich, V., and R. Ouncharoen. 2015. Fuzzy (and interval) techniques in the age of Big Data: An overview with applications to environmental science, geosciences, engineering, and medicine. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 23(Suppl. 1): 75-89.
Lane, S. N. 2016. Slow science, the geographical expedition and critical physical geography. The Canadian Geographer. doi: 10.1111/cag.12329.
La Salle, J., K. J. Williams, and C. Moritz. 2016. Biodiversity analysis in the digital era. Philosophical Transactions of the Royal Society B 371: 20150337.
Laurance, W. F., F. Achard, S. Peedell, and S. Schmitt. 2016. Big data, big opportunities. Frontiers in Ecology and the Environment 14(7): 347.
Lave, R., M. W. Wilson, E. Barron, C. Biermann, M. Carey, M. Doyle, C. Duvall, et al. 2014. Intervention: Critical physical geography. The Canadian Geographer 58(1): 1-10.
Lehrer, J. 2010. A physicist solves the city. New York Times Magazine, 19 December, MM46.
Levin, N., S. Kark, and D. Crandall. 2015. Where have all the people gone? Enhancing global conservation using night lights and social media. Ecological Applications 25(8): 2153-2167.
Levy, J. 2015. The rise of the data scientist. Business 2 Community: Technology & Innovation. http://www.business2community.com/big-data/rise-data-scientist-01282878#jiv7JBYivPDOAKoX.99.
Li, Y., Y. Zhu, W. Yin, Y. Liu, G. Shi, and Z. Han. 2015. Prediction of high resolution spatial-temporal air pollutant map from Big Data sources. In Big Data computing and communications, ed. Y. Wang, H. Xiong, S. Argamon, X. Y. Li, and J. Z. Li. Basel, Switzerland: Springer International Publishing, 273-282.
Mayer-Schonberger, V., and K. Cukier. 2013. Big Data: A revolution that will change how we live, work and think. London, UK: John Murray.
Miller, H. J., and M. F. Goodchild. 2015. Data-driven geography. GeoJournal 80(4): 449-461.
Moosavi, V., G. Aschwanden, and E. Velasco. 2015. Finding candidate locations for aerosol pollution monitoring at street level using a data-driven methodology. Atmospheric Measurement Techniques 8(9): 3563-3575.
O'Sullivan, D., and S. M. Manson. 2015. Do physicists have geography envy? And what can geographers learn from it? Annals of the Association of American Geographers 105(4): 704-722.
Odoni, N. A., and S. N. Lane. 2010. Knowledge-theoretic models in hydrology. Progress in Physical Geography 34(2): 151-171.
Pagano, T. C., F. Pappenberger, A. W. Wood, M.-H. Ramos, A. Persson, and B. Anderson. 2016. Automation and human expertise in operational river forecasting. WIREs Water. doi: 10.1002/wat2.163.
Peters, D. P. C., K. M. Havstad, J. Cushing, C. Tweedie, O. Fuentes, and N. Villanueva-Rosales. 2014. Harnessing the power of big data: Infusing the scientific method with machine learning to transform ecology. Ecosphere 5(6): 1-15.
Pimm, S. L., S. Alibhai, R. Bergl, A. Dehgan, C. Giri, Z. Jewell, L. Joppa, R. Kays, and S. Loarie. 2015. Emerging technologies to conserve biodiversity. Trends in Ecology & Evolution 30(11): 685-696.
Rhoads, B. L., and C. E. Thorn. 2011. The role and character of theory in geomorphology. In The SAGE Handbook of Geomorphology, ed. K. J. Gregory and A. S. Goudie. London, UK: Sage, 59-77.
Schroeder, R., and L. Taylor. 2015. Big data and Wikipedia research: Social science knowledge across disciplinary divides. Information, Communication & Society 18(9): 1039-1056.
Shearmur, R. 2015. Dazzled by data: Big Data, the census and urban geography. Urban Geography 36(7): 965-968.
Shelton, T., A. Poorthuis, and M. Zook. 2015. Social media and the city: Rethinking urban socio-spatial inequality using user-generated geographic information. Landscape and Urban Planning 142: 198-211.
Sherman, D. J. 1996. Fashion in geomorphology. In The scientific nature of geomorphology: Proceedings of the 27th Binghamton Symposium in Geomorphology, ed. B. L. Rhoads and C. E. Thorn. Chichester, UK: John Wiley & Sons Ltd., 87-114.
Slaymaker, O. 2016. Physical geographers' understanding of the real world. The Canadian Geographer. doi: 10.1111/cag.12334.
Soranno, P. A., E. G. Bissell, K. S. Cheruvelil, S. T. Christel, S. M. Collins, C. E. Fergus, C. T. Filstrup, et al. 2015. Building a multi-scaled geospatial temporal ecology database from disparate data sources: Fostering open science and data reuse. GigaScience 4. doi: 10.1186/s13742-015-0067-4.
Specht, A., S. Guru, L. Houghton, L. Keniger, P. Driver, E. G. Ritchie, K. Lai, and A. Treloar. 2015. Data management challenges in analysis and synthesis in the ecosystem sciences. Science of the Total Environment 534: 144-158.
Steinle, S., S. Reis, and C. E. Sabel. 2013. Quantifying human exposure to air pollution: Moving from static monitoring to spatio-temporally resolved personal exposure assessment. Science of the Total Environment 443: 184-193.
Tadaki, M. 2016. Rethinking the role of critique in physical geography. The Canadian Geographer. doi: 10.1111/cag.12299.
Tadaki, M., G. Brierley, M. Dickson, R. Le Heron, and J. A. Salmond. 2015. Cultivating critical practices in physical geography. The Geographical Journal 181(2): 160-171.
Terjung, W. H. 1976. Climatology for geographers. Annals of the Association of American Geographers 66(2): 199-220.
Verma, A., R. van der Wal, and A. Fischer. 2016. Imagining wildlife: New technologies and animal censuses, maps and museums. Geoforum 75: 75-86.
Viles, H. 2016. Technology and geomorphology: Are improvements in data collection techniques transforming geomorphic science? Geomorphology 270(1): 121-133.
White, R. L., A. E. Sutton, R. Salguero-Gómez, T. C. Bray, H. Campbell, E. Cieraad, N. Geekiyanage, et al. 2015. The next generation of action ecology: Novel approaches towards global ecological research. Ecosphere 6(8): 1-34.
Wyly, E. 2014. The new quantitative revolution. Dialogues in Human Geography 4(1): 26-38.
Ziegler, C. R., J. A. Webb, S. B. Norton, A. S. Pullin, and A. H. Melcher. 2015. Digital repository of associations between environmental variables: A new resource to facilitate knowledge synthesis. Ecological Indicators 53: 61-69.
... As data-intensive research is promoted and lauded, there is an increasing cost to the scientific community in terms of institutional and theoretical consolidation and a cost to society in terms of the material requirements necessary to sustain our endeavors. Will our continuing fixation on data density pay off, or has the pace of data generation outstripped the pace of attendant theoretical development to help us understand the implications of this new information (Salmond et al., 2017)? In other words, what exactly are we looking for, for what purpose, and for whom? ...
... As a community, our tendency has been to be relatively uncritical of these advances in data-acquisition and technology, without considering if they are in fact broadening our understanding of landscape processes and evolution, or are serving our most pressing societal needs. Weighing advances in technology and data against advances in scientific innovation and understanding is a difficult thing to do, since there are no objective metrics of either part of the calculation (Inkpen, 2018; Salmond et al., 2017), and yet we implicitly do so when we justify our research programs. Current justifications for engaging in geomorphological research have tended to revolve around decoding extraterrestrial processes (e.g., Conway et al., 2011; Galofre & Jellinek, 2017; Pelletier et al., 2010; Sharp, 1982; Sweeney et al., 2015) with the implied possibility of sustaining life elsewhere; protecting, mitigating, and reducing the impacts of natural hazards on habitats and infrastructure (e.g., Palucis & Lamb, 2017; Rempel et al., 2016); predicting the long-term and cascading impacts of climate change on resources (e.g., Phillips & Jerolmack, 2016; Slater & Khouakhi, 2019; Wilkinson & McElroy, 2007); and/or identifying the ecological processes and anthropogenic inputs that drive soil productivity, habitat restoration, and land cover change (e.g., Collins et al., 2004; Hancock et al., 2008; Istanbulluoglu & Bras, 2005; James & Marcus, 2006; Montgomery, 2007). ...
... We now work in a paradigm of "ground-truthing," where the field observations provide verification and validation, that is, a "check" on the "truth" that exists in the datascapes we have created. Salmond et al. (2017) point out that the phenomenon of big data can push science away from being theory-driven to an iterative process of hypothesis-testing through the analysis of big datasets. If the field observations do not fit the datascapes, they are at worst discarded, or, at best, the datascapes are trained and rerun to achieve a better statistical fit. ...
Article
Full-text available
At the start of its centennial year, AGU's surface process community revisited G. K. Gilbert's legacy of landscape description and experimental models of surface processes, as well as his embrace of critique and pragmatism in the practice of landscape science. In the 100 years since Gilbert, and especially since the dawn of the 21st century, we have seen an intensified focus on the acquisition of more and more earth observation data and the numerical modeling of landscapes, alongside widespread use of deterministic and predictive practices to find solutions to the social, economic, and environmental challenges of today. What have we gained and lost in this pursuit? Here we lay out some of the challenges for the discipline in an increasingly data-rich and complex world in which earth science is also being called to reorient itself towards more societally relevant roles. We ask the community to ponder the following: Is the discipline serving our scientific and societal goals, or is there a need for the science of landscapes to adopt new frameworks of thinking and to question the deterministic approaches that have dominated our discipline to date, in order to attend to the needs of living in the Anthropocene?
... In the shadow of ecological crisis and swelling global populations, the "digital revolution" is pitched as a technical solution for global hunger, promising greater harvests and returns. This vision echoes a broader trend in popular and industry literature (and some scholarship) to reckon Big Data a panacea to pressing social and environmental problems (Salmond et al. 2017). Yet such boosterism heralds significant structural reconfigurations of labour, value, and control. ...
... The point is they do it, and we can track and measure it with unprecedented fidelity" (ibid). For such proponents, Big Data connotes a positivistic, pseudo-empirical disposition that takes data as an accurate representation of the world, obviating social analysis or critique (see also Salmond et al. 2017). Yet data remains subject to interpretation, acts of inclusion and exclusion (Gregg 2015), and processes of representation that impose ontological and epistemological disjunctures between data and that which it is taken as representative of (Williams 2013; Pink et al. 2018). ...
Article
Full-text available
In the face of looming environmental crises and a swelling global population, Big Data’s acolytes envision a “digital revolution” as a solution for global hunger. Interrogating this promise, we argue that Big Data’s imagined futures articulate the realms of international development and smallholder agriculture in the Global South with an ongoing digital reorganisation of global capitalism—integrating farmers into new informational modes of production, and reshaping the nature of labour and human–environment relations in the process. This reorganisation must be located within a long history of crises and spatio-technical fixes for capital accumulation. More specifically, we situate the prefigurations of Big Data along a trajectory of capitalist technical innovations implicated in the propagation of colonial logics, particularly through the apparatuses of international development—for example, through the technical regimes of the “Green Revolution”. The rhetoric of Big Data and its applications within global food systems both reproduce earlier logics of primitive accumulation and colonial biopolitics, and extend them into new forms of digital imperialism that, we suggest, express incipient mutations in the nature of surplus value itself as it is retooled for the Anthropocene era. Big Data therefore portends novel forms of expropriation that are at once material and immaterial.
... We observe a caution where digital 'narratives', for example, can have the inverse effect, not only making evident the effects of fast-scholarship – the reactive and opportunistic nature of scientific discovery today – but they can also lead scientists to more abstracted and biased versions of knowledge, constructed through big data and artificial intelligence (O'Neil, 2016; and pointed to in Waterstone, this issue), which often have limited connection to place, one of Proctor's (1998) cautions (cf. Brierley et al., 2021; Salmond et al., 2017). There are risks in this kind of communication of good/bad representations of the more-than-human, where spatial mapping generates bounded territories in real life that get bundled with and abstracted to, for example, access to resources, or values of land, of houses, of housing insurance, of life savings. ...
Article
Full-text available
This paper aims to foster an explicit geoethical orientation in physical geography. Using examples from Aotearoa New Zealand, we approach the work of physical geography with a set of ethical coordinates derived from our research, arguing that they allow for greater sensitivity in considering what is more-than-human in our research relationships. Working with these ethical coordinates lays the political groundwork for thinking and doing physical geography differently in the pursuit of less exploitative social and ecological relations. Our proposition offers new potentials for the practice of geography more generally: opportunities for enactive research encounters, those that perform generative change for a decolonised, post-productivist, physical geography. https://doi.org/10.1177/26349825221082168
... We give further examples of wicked problems in Table 1. Since this early policy focus, collective scientific responses to such "wickedness" have focused on developing sophisticated monitoring technologies (e.g., for water management, Sousa et al., 2017; or air quality, Morawska et al., 2018), harnessing big data (Salmond et al., 2017) and scaffolding intricate resource co-governance systems (e.g., Lemos & Agrawal, 2006). While these approaches are valuable, and required to try to make sense of, and "manage" a complex human-inhabited world, they lead back to a universal way of seeing the world and perpetuate a drive for a high level, and often singular, solution. ...
Article
“Wicked problems” are complex to understand and challenging to teach. Our experience of teaching about environmental concerns in Aotearoa New Zealand suggests that how these concepts are taught is more important for student learning than the nature of wicked problems themselves. By offering opportunities for students to co-develop their own situated knowledges about wicked problems, they can conceptualise and tackle them more effectively at their own pace and in their own experiential contexts. Here we identify and discuss approaches to teaching and learning that can be effectively applied to any wicked problem. We demonstrate a hopeful way to teach and learn about unwieldy and overwhelming issues that many of today’s undergraduates will inevitably be expected to confront in the future. This paper provides a framework to engage students in a course, and tools for engendering active participation in situated and tangible learning experiences when teaching wicked problems. As lecturers teaching in a School of Environment in the disciplinary areas of geography, environmental science, science communication, and sustainability, we discuss the value and applications of these ideas across three levels of undergraduate teaching. We identify challenges that we have experienced and show how it is possible to turn these challenges into opportunities.
... The geomorphometric question will shift towards the computational demands for handling such data in terms of modelling and landscape analysis (Inkpen, 2018; Yu et al., 2018; Chen et al., 2016; Viles, 2016; Piégay et al., 2015; Goodchild et al., 2012). The inevitable questions will be focusing on the characteristics that make data "big", both in terms of the methods of analysis used, and the models of explanation (Salmond et al., 2017). ...
Article
In recent years, the wealth of technological development revolutionised our ability to collect data in geosciences. Due to the unprecedented level of detail of these datasets, geomorphologists are facing new challenges, giving more in-depth answers to a broad(er) range of fundamental questions across the full spectrum of the Earth's (and Planetary) processes. This contribution builds on the existing literature of geomorphometry (the science of quantitative land-surface analysis) and feature extraction (translate land surface parameters into extents of geomorphological elements). It provides evidence of critical themes as well as emerging fields of future research in the digital realm, supporting the likely effectiveness of geomorphometry and feature extractions as they are advancing the theoretical, empirical and applied dimension of geomorphology. The review further discusses the role of geomorphometric legacies, and scientific reproducibility, and how they can be implemented, in the hope that this will facilitate action towards improving the transparency, and efficiency of scientific research, and accelerate discoveries in geomorphology. In the current landscape, substantial changes in landforms, ecosystems, land use, hydrological routing, and direct anthropogenic modifications impact systems across the full spectrum of geomorphological processes. Although uncertainties in the precise nature and likelihood of changes exist, geomorphometry and feature extraction can aid exploring process regimes and landscape responses. Taken together, they can revolutionise geomorphology by opening the doors to improved investigations crossing space and time scales, blurring the boundaries between traditional approaches and computer modelling, and facilitating cross-disciplinary research. Ultimately, the exploitation of the available wealth of digital information can help to translate our understanding of geomorphic processes, which is often based on observations of past or current conditions, into the rapidly changing future.
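To make the notion of geomorphometry concrete, the following minimal sketch (illustrative only, and not drawn from the article above; the elevation grid, cell size, and function name are hypothetical) derives two basic land-surface parameters, slope and aspect, from a gridded digital elevation model using finite differences, the kind of quantitative land-surface analysis the review describes.

```python
# Minimal geomorphometry sketch: slope and aspect from a gridded DEM.
# Illustrative only; the elevation values and 5 m cell size are hypothetical.
import numpy as np

def slope_aspect(dem, cell_size):
    """Return slope (degrees) and aspect (degrees clockwise from north),
    assuming row index increases northward and column index eastward."""
    # Finite differences of elevation along the two grid axes
    dz_dn, dz_de = np.gradient(dem, cell_size)   # north, east components
    slope = np.degrees(np.arctan(np.hypot(dz_de, dz_dn)))
    # Aspect points in the downslope direction (negative gradient)
    aspect = np.degrees(np.arctan2(-dz_de, -dz_dn)) % 360.0
    return slope, aspect

# Hypothetical 3 x 3 elevation grid (metres) on a 5 m raster
dem = np.array([[102.0, 101.5, 101.0],
                [101.0, 100.5, 100.0],
                [100.0,  99.5,  99.0]])
slope, aspect = slope_aspect(dem, cell_size=5.0)
print(slope.round(1))
print(aspect.round(0))
```

Real analyses operate on large rasters and must handle projections, edge effects, and nodata cells; the point of the sketch is only that geomorphometric parameters are simple derivatives of the elevation surface, which is why data volume and resolution bear so directly on the computational questions raised above.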
Article
Full-text available
In the face of the "crisis of reproducibility" and the rise of "big data" with its associated issues, modeling needs to be practiced more critically and less automatically. Many modelers are discussing better modeling practices, but to address questions about the transparency, equity, and relevance of modeling, we also need the theoretical grounding of social science and the tools of critical theory. I have therefore synthesized recent work by modelers on better practices for modeling with social science literature (especially feminist science and technology studies) to offer a "modeler’s manifesto": a set of applied practices and framings for critical modeling approaches. Broadly, these practices involve 1) giving greater context to scientific modeling through extended methods sections, appendices, and companion articles, clarifying quantitative and qualitative reasoning and process; 2) greater collaboration in scientific modeling via triangulation with different data sources, gaining feedback from interdisciplinary teams, and viewing uncertainty as openness and invitation for dialogue; and 3) directly engaging with justice and ethics by watching for and mitigating unequal power dynamics in projects, facing the impacts and implications of the work throughout the process rather than only afterwards, and seeking opportunities to collaborate directly with people impacted by the modeling.
Article
Advances in data acquisition and statistical methodology have led to growing use of machine-learning methods to predict geomorphic disturbance events. However, capturing the data required to parameterize these models is challenging because of expense or, more fundamentally, because the phenomenon of interest occurs infrequently. Thus, it is important to understand how the nature of the data used to train predictive models influences their performance. Using a database of cliff failure prediction and associated covariates from Auckland, New Zealand, we assess the performance of seven machine-learning algorithms under different sampling strategies. Three sampling components are investigated: (i) the number of data points used in model training (sample size), (ii) the prevalence of occurrences (presences) in the data, and (iii) random versus spatial sampling strategy. Across the seven algorithms, small sample sizes can produce models that perform relatively well, especially if the prime concern is identifying key predictors rather than quantifying risk or predicting categorical outcome. Our analyses show that for the same effort (i.e., number of samples), sampling around multiple locations provides better predictions than sampling at just one or a few locations. Predictive performance may be further improved by considering issues such as the nature of what absences actually represent and paying careful attention to decisions about hyperparameter tuning, training-testing data splits, and threshold optimization. It is well known that big data can inform complex data-driven modeling, but here we show that careful sampling can facilitate informative event prediction even from small data.
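As a rough illustration of the kind of experiment described in the abstract above (the synthetic data, covariates, and sample sizes below are hypothetical, scikit-learn is assumed to be available, and the study's actual data and algorithms are not reproduced), one might test how a classifier's predictive skill changes as the training sample grows:

```python
# Illustrative sketch: predictive skill versus training sample size for a
# rare-event ("presence/absence") classification problem. Synthetic data
# stands in for real cliff-failure observations and covariates.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic dataset with roughly 10% prevalence of "failure" cases
X, y = make_classification(n_samples=5000, n_features=8,
                           weights=[0.9, 0.1], random_state=0)
X_pool, X_test, y_pool, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

for n in (50, 200, 1000, len(y_pool)):
    # Stratified subsample mimics a smaller field campaign
    if n < len(y_pool):
        X_train, _, y_train, _ = train_test_split(
            X_pool, y_pool, train_size=n, stratify=y_pool, random_state=0)
    else:
        X_train, y_train = X_pool, y_pool
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"training samples = {n:5d}   test AUC = {auc:.2f}")
```

The further issues the abstract raises (random versus spatial sampling, prevalence of presences, threshold optimization, and hyperparameter tuning) would layer additional structure onto this same basic loop rather than change its logic.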
Chapter
Geography research and teaching have a long history in South Africa, and Geographers are well placed to engage with issues affecting South Africa in the twenty-first century, including climate change, resource scarcity, poverty reduction and sustainable development. Although different research topics have been investigated previously, others represent some new areas of departure for research in physical and/or human geography, and opportunities for future research growth in South Africa. This chapter identifies and discusses the significance of some of these new topic areas for South African Geography, and the application of such research to address twenty-first century local to global issues.
Article
Full-text available
Physical Geography has evolved to become a highly productive mainstream natural science, delivering on the metrics required by the accounting systems dominating the neoliberal University. I argue that the result has been: (1) a crisis of over-production (of more articles than we are capable of consuming); (2) a risk of under-production (growing scarcity in our ability to produce the research questions needed to sustain our productivity); and (3) a "disciplinary fix" involving either pursuit of the problem-solving implicit in the neoliberal impact agenda or creative destruction, aligning ourselves less with geography and more with the natural sciences. Using Isabelle Stengers' critique of 21st-century science, I argue for a slowing down in Physical Geography, by changing how we relate to the subjects that we study. I use the ideas of William Bunge to discuss the notion of geographical expedition as a means of achieving slow science, even if "expedition" is a term to be used cautiously. I illustrate these points from one of my own projects to show how slow science may allow creation of those moments that might lead to a more creative and critical Physical Geography centred on the very curiosity that makes being a scientist so interesting. © 2016 Canadian Association of Geographers / L'Association canadienne des géographes.
Article
Full-text available
While there are now an increasing number of studies that critically and rigorously engage with Big Data discourses and practices, these analyses often focus on social media and other forms of online data typically generated about users. This introduction discusses how environmental Big Data is emerging as a parallel area of investigation within studies of Big Data. New practices, technologies, actors and issues are concretising that are distinct and specific to the operations of environmental data. Situating these developments in relation to the seven contributions to this special collection, the introduction outlines significant characteristics of environmental data practices, data materialisations and data contestations. In these contributions, it becomes evident that processes for validating, distributing and acting on environmental data become key sites of materialisation and contestation, where new engagements with environmental politics and citizenship are worked through and realised.
Article
Full-text available
Scientists have been debating for centuries the nature of proper scientific methods. Currently, criticisms being thrown at data-intensive science are reinvigorating these debates. However, many of these criticisms represent long-standing conflicts over the role of hypothesis testing in science and not just a dispute about the amount of data used. Here, we show that an iterative account of scientific methods developed by historians and philosophers of science can help make sense of data-intensive scientific practices and suggest more effective ways to evaluate this research. We use case studies of Darwin's research on evolution by natural selection and modern-day research on macrosystems ecology to illustrate this account of scientific methods and the innovative approaches to scientific evaluation that it encourages. We point out recent changes in the spheres of science funding, publishing, and education that reflect this richer account of scientific practice, and we propose additional reforms.
Article
Full-text available
The capacity to collect and analyze massive amounts of data is transforming research in the natural and social sciences (1). And yet, the climate change adaptation community has largely overlooked these developments. Here, we examine how “big data” can inform adaptation research and decision-making and outline what’s needed from the adaptation community to maximize this opportunity. We contend that careful application of big data could revolutionize our understanding of how to manage the risks of climate change.
Article
Full-text available
This article considers what role critique might have in the environmental sciences, including physical geography. The intellectual traditions of critical realism and critical social science provide foundations for thinking about different modes of critique in environmental science. The critical realist mode of critique facilitates reflection upon the theoretical and methodological choices available to scientists. The critical social science mode of critique (espoused by many critical geographers, for example) emphasizes the interrogation of social power, and seeks to understand how scientific practices reproduce wider social structures and processes. While both forms of critique contribute to understanding the co-production of science and social order, their respective emphases tend to either underexplain or overdetermine scientific practices. There is a need to develop a mode of critique that seeks to link “internal” scientific choices with “external” societal structures, without bracketing either side. Such critique might begin with scientific practices and trace them towards both their “truth value” and “social structure” constitution. It would foster theoretical, methodological, and political reflexivity within the same breath, across a range of sites and contexts. Geographers can contribute to re-envisioning a critical environmental science that is committed to embracing normative as well as theoretical reflexivity and responsibility.
Article
Full-text available
By enabling the creation of networks of electronic sensors and human participants, new technologies have shaped the ways in which conservation-related organisations monitor wildlife. These networks enable the capture of data perceived as necessary to evidence conservation strategies and foster public support. We collected interview and archival data from UK-based conservation organisations with regard to their use of digital technologies for wildlife monitoring. As a conceptual device to examine these efforts, we used Benedict Anderson’s (1991) work on censuses, maps and museums as social instruments that enabled the imagining of communities. Through a critical application of this framework, the technologically-aided acquisition of wildlife data was shown to inform the new ways in which conservation organisations identify and quantify wildlife, conceptualise animal spaces, and curate conservation narratives. In so defining, delineating and displaying the non-human animal world with the backing of organisational authority, new technologies aid in the representational construction of animal censuses, maps and museums. In terms of practice, large amounts of new data can now be gathered and processed more cost-effectively. However, the use of technologies may also be the result of pressures on organisations to legitimise conservation by being seen as innovative and popular. Either way, human participants are relegated to supporting rather than participatory roles. At a more abstract level, the scale of surveillance associated with instrumentation can be read as an exercise of human dominance. Nonetheless, new technologies present conservation organisations with the means necessary for defending wildlife against exploitation.
Article
Full-text available
This paper explores what the virtual biodiversity e-infrastructure will look like as it takes advantage of advances in ‘Big Data’ biodiversity informatics and e-research infrastructure, which allow integration of various taxon-level data types (genome, morphology, distribution and species interactions) within a phylogenetic and environmental framework. By overcoming the data scaling problem in ecology, this integrative framework will provide richer information and fast learning to enable a deeper understanding of biodiversity evolution and dynamics in a rapidly changing world. The Atlas of Living Australia is used as one example of the advantages of progressing towards this future. Living in this future will require the adoption of new ways of integrating scientific knowledge into societal decision making. This article is part of the themed issue ‘From DNA barcodes to biomes’.
Article
Contemporary physical geography is commonly defined as description, analysis, and modelling of the physical, chemical, and biological phenomena at or close to the earth's surface. The sub‐discipline's view of the real world is limited to phenomena that can be observed and/or measured and in this respect differs in no significant way from other physical sciences—but its connection with a geographer's view of the world is tenuous. Changes in natural hazards research are reviewed in order to exemplify the inadequacy of positivistic physical geographers’ perspectives to cope with the style and multiplicity of research questions that are current in that geographical tradition. Roy Bhaskar's version of critical realism gives high priority to philosophical inquiry that aims to understand the nature of reality and provides space for empiricism, analysis of the complex, hierarchical structure of the real world, and questions of value and human behaviour. Openness to recognition of the provisional nature of falsificationism and the limitations of the hypothetico‐deductive framework on the part of physical geographers could revitalize geographical discourse. The search for knowledge of a real world that includes human value systems would seem to require an accompanying concern for the redressing of injustice. La compréhension que les spécialistes de la géographie physique ont du monde réel La géographie physique d'aujourd'hui se définit de manière générale comme la description, l'analyse et la modélisation de phénomènes physiques, chimiques et biologiques sur la surface de la Terre ou tout près d'elle. Si la conception que la sous‐discipline se fait du monde réel se limite à des phénomènes qui peuvent être observés ou mesurés et, à cet égard, elle est comparable en tout point aux autres sciences physiques, son rapport avec la conception du monde par le géographe demeure néanmoins flou. Un tableau de l'évolution des travaux de recherche sur les risques naturels permet d'illustrer à quel point les perspectives positivistes des spécialistes de la géographie physique sont inadaptées pour répondre aux nombreux types de questions de recherche que soulève ce courant géographique. Roy Bhaskar soutient dans sa définition du réalisme critique que la priorité soit accordée à une réflexion philosophique qui s'inscrit dans une optique de comprendre la véritable nature de la réalité tout en laissant une place à l'empirisme, l'analyse de la structure hiérarchique complexe du monde réel et aux questions portant sur les valeurs et sur les comportements humains. Une ouverture de la part des spécialistes de la géographie physique visant la reconnaissance de la nature provisoire du réfutationnisme et des limites du raisonnement hypothético‐déductif pourrait revaloriser le discours géographique. La quête du savoir d'un monde réel qui comprend des systèmes de valeurs humaines exigerait aussi que l'on se préoccupe d'obtenir une réparation de l'injustice.
Article
[Extract] In case you had not yet noticed, we are in the midst of an environmental revolution – one that revolves around escalating improvements in Earth Observation, open data, and the creation of new data platforms that provide remarkable linkages among disparate kinds of information. The power and potential of these new platforms is unprecedented.