On the Art of Classification in Spatial Ecology: Fuzziness as an Alternative for Mapping Uncertainty

Frontiers
Frontiers in Ecology and Evolution
Abstract and Figures

In this piece, we highlight the limitations associated with classification techniques that are based on Boolean logic (i.e., true/false) and that impose discrete boundaries on systems. We propose shifting practices toward techniques that learn from the system under study by adopting soft classification to support uncertainty evaluation, and we explain why this shift is recommended.
OPINION
published: 21 December 2018
doi: 10.3389/fevo.2018.00231
Frontiers in Ecology and Evolution | www.frontiersin.org 1December 2018 | Volume 6 | Article 231
Edited by:
Miles David Lamare,
University of Otago, New Zealand
Reviewed by:
Zhi Huang,
Geoscience Australia, Australia
*Correspondence:
Dario Fiorentino
dario.fiorentino@awi.de
Specialty section:
This article was submitted to
Biogeography and Macroecology,
a section of the journal
Frontiers in Ecology and Evolution
Received: 28 June 2018
Accepted: 10 December 2018
Published: 21 December 2018
Citation:
Fiorentino D, Lecours V and Brey T
(2018) On the Art of Classification in
Spatial Ecology: Fuzziness as an
Alternative for Mapping Uncertainty.
Front. Ecol. Evol. 6:231.
doi: 10.3389/fevo.2018.00231
On the Art of Classification in Spatial
Ecology: Fuzziness as an Alternative
for Mapping Uncertainty
Dario Fiorentino 1*, Vincent Lecours 2 and Thomas Brey 1,3
1 Helmholtz Institute for Functional Marine Biodiversity at the University Oldenburg (HIFMB), Oldenburg, Germany
2 Fisheries and Aquatic Sciences, School of Forest Resources and Conservation, University of Florida, Gainesville, FL, United States
3 Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Bremerhaven, Germany
Keywords: uncertainty, spatial ecology, discrete classification, soft boundaries, mapping transitions
INTRODUCTION
Classifications may be defined as the result of the process by which similar objects are recognized
and categorized through the separation of elements of a system into groups of response (Everitt
et al., 2011). This is done by submitting variables to a classifier that first quantifies the similarity
between samples according to a set of criteria and then regroups (or classifies) samples in order to
maximize within-group similarity and minimize between-group similarity (Everitt et al., 2011).
Classifications have become critical in many disciplines. In spatial ecology, for example,
grouping locations with similar features may help the detection of areas driven by the same
ecological processes and occupied by the same species (Fortin and Dale, 2005; Elith et al., 2006),
which can support conservation actions. In fact, classifications have been used with the aim of
investigating the spatial distribution of target categories such as habitats (Coggan and Diesing,
2011), ecoregions (Fendereski et al., 2014), sediment classes (Hass et al., 2017), or biotopes (Schiele
et al., 2015). Sometimes such classifications were found to act as surrogates for biodiversity in
data-poor regions (e.g., Lucieer and Lucieer, 2009; Huang et al., 2012), as some classes are known
to support higher biodiversity. Many of the traditional classification methods were developed
in order to reduce system complexity (Fortin and Dale, 2005) by imposing discrete boundaries
between elements of a system; it is easier for the human mind to simplify complex systems by
identifying discrete patterns (Eysenck and Keane, 2010), and grouping similar elements together
(Everitt et al., 2011). However, in natural environments, spatial and temporal transitions between
elements of a system are often gradual (e.g., an intertidal flat transitioning from land to sea) (Farina,
2010). Those transitions may display distinct properties from those of the two elements they
separate. Despite the particularities and importance of such transitions, they are often disregarded
in ecological research (Foody, 2002), leading to the adoption of approaches that, by defining sharp
boundaries, may fail to appropriately describe natural patterns and groups of a system. Such
approaches have become the norm, despite the existence of alternatives such as fuzzy logic (Zadeh,
1965) and machine learning (Kuhn and Johnson, 2013) that can offer a more representative
description of those natural transitional zones. In ecology, for instance, machine learning approaches
have gained some traction because of their ability to predict the area-wide distribution of classes in
data-poor conditions (e.g., sparse point information) with relatively high performance and
with no particular assumption in building the relationship between targeted classes and physical
parameters (e.g., Barry and Elith, 2006; Brown et al., 2011; Fernández-Delgado et al., 2014).
In the present contribution, we aim to highlight the limitations associated with classification
techniques that are based on Boolean logic (i.e., true/false) and that impose discrete boundaries to
systems. We propose to shift practices toward techniques that learn from the system under study
by adopting soft classification to support uncertainty evaluation.
Fiorentino et al. Soft Classification and Uncertainty
DISCRETE CLASSIFICATIONS
The increasing availability of tools and software to semi-automatically
perform classifications has reduced the amount
of critical thought put into the exercise. Classification results
are sensitive to a variety of decisions made when establishing
the methodology of a particular application (Lecours et al.,
2017). For instance, results depend on (1) the method used to compare the objects
to be classified (e.g., distance-based method), (2) whether
the method assigns each object to one single class (i.e.,
Boolean approach) or assigns a membership for one or multiple
classes (e.g., fuzzy logic approach), (3) whether the
method uses samples to train the classification (i.e., supervised
or unsupervised approach), and (4) the evaluation methods
(see Foody, 2002; Borcard et al., 2011). Furthermore, a number
of other potential sources of error resulting in potentially
misleading classifications remain: variation in the data collection
methods, the spatial, temporal, and thematic scales (i.e., the way
the data are categorized/identified), the spatial and temporal
stability of the observations and the goodness of model fit (Barry
and Elith, 2006).
One of the most challenging parts of using classifications is to
evaluate how representative of real patterns the classification is.
Due to the cumulative effect of the factors listed above (Rocchini
et al., 2011), classification results may not adequately represent
natural patterns, making the classified patterns artificial and
misleading, for instance when used to assist decision-making
(Lecours et al., 2017; see Fiorentino et al., 2017). Even
when a robust method is developed to reduce the impact of
these factors, the concept of discrete classes may in itself be
misleading; this type of class can provide an incomplete,
oversimplified representation of complex natural patterns, and
thus misrepresent the reality to be described. In spatial ecology,
using discrete classes involves establishing “hard” boundaries
between them, which has been shown to cause misclassification
errors (Sweeney and Evans, 2012; Lecours et al., 2017). For
instance, an object may not always belong to one of the defined
classes, thus being forced into one of those defined classes by the
classifier. When doing so, the interpretation that is often made
is that the object was misclassified or that there was an error
of the algorithm, while in fact, it is a consequence of the often-
erroneous assumption that all objects must belong to a specific
class. A solution that has been proposed to avoid those errors is
to shift practices toward “soft” classifications instead of “discrete”
classifications.
SOFT CLASSIFICATIONS
It has been argued in the literature that soft classification
approaches better represent and describe natural patterns,
including transitional areas (Ries et al., 2004). Soft classification
approaches, which include fuzzy logic (Zadeh, 1965), Bayesian methods,
neural networks, support vector machines, decision trees,
boosting, bagging, generalized linear models, and multivariate
adaptive regression splines, among others (see Fernández-Delgado
et al., 2014 for a comprehensive review on
classification and associated problems), acknowledge that
one object may belong to more than one class. This enables
the recognition of elements of the real world that are
between classes (i.e., do not belong to one specific pre-
defined class), such as transitional zones and fine-scale natural
heterogeneity.
While hard classifiers assign an object to a class following
a Boolean, binary system (true/false, in a class or not), soft
classifiers assign memberships to objects. Memberships are the
estimated probabilities of objects to belong to a class (Everitt
et al., 2011). Each object will have as many membership values
as there are classes. Membership varies along the continuum
ranging from 0 (not a member) to 1 (definitely a member). An
object that has relatively high membership values for more than
one class is thus said to be not clearly associated with one specific
class. This may be an indication that no class adequately describes
this particular object, which could inform and guide further
analyses in order to explain that pattern. Those analyses may for
instance highlight that this data object is an error, or if there are
many objects in the same situation, that a new class needs to be
defined.
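To make the notion of membership concrete, the following minimal sketch computes fuzzy c-means style memberships from the distances between objects and class centres. It is an illustration under stated assumptions, not the procedure of any study cited here: the function name `fuzzy_memberships`, the fuzzifier `m = 2`, the Euclidean distance, and the toy one-variable data are all hypothetical choices.

```python
import numpy as np

def fuzzy_memberships(X, centres, m=2.0, eps=1e-12):
    """Return an (n_objects, n_classes) array of memberships in [0, 1].

    Uses the standard fuzzy c-means membership formula
    u_ik = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1)),
    so the memberships of each object sum to 1.
    """
    X = np.asarray(X, dtype=float)
    centres = np.asarray(centres, dtype=float)
    # Euclidean distance from every object to every class centre
    d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    d = np.maximum(d, eps)  # guard against an object sitting on a centre
    ratio = d[:, :, None] / d[:, None, :]  # ratio[i, k, j] = d_ik / d_ij
    return 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=2)

# Hypothetical data: two class centres and three objects on one variable
centres = np.array([[0.0], [10.0]])
X = np.array([[0.5], [5.0], [9.5]])

U = fuzzy_memberships(X, centres)  # soft result: one row of memberships per object
hard = U.argmax(axis=1)            # hard result: a single label per object
print(np.round(U, 3))
print(hard)
```

The middle object lies halfway between the two centres and receives equal memberships, yet the hard labelling still forces it into one class; this is the information that a Boolean result discards.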
CAN SOFT CLASSIFICATIONS BE USED
TO ESTIMATE UNCERTAINTY?
In remote sensing, traditional pixel-based classifications use
image pixels as objects to be classified. The use of discrete
classifiers in land cover studies often oversimplifies the actual
land cover (Foody, 2000). For instance, forested wetlands, which
in nature mark the transition between forested areas and water
bodies, would most likely be classified as a “forest” land cover
or a “water” land cover by a hard classifier. For the purpose
of this example, we assume that the hard classifier labels such a pixel as “forest.” On
the other hand, a soft classifier might assign to that same pixel a
“forest” membership of 0.65 and a “water” membership of 0.35.
As a result, different users could interpret those classifications
according to the following statements:
A) Based on the hard classification result, the pixel represents
“forest” land cover. In the absence of a ground-truth point data
for that particular pixel, it could be assumed that this result is
100% certain.
B) Based on the soft classification result, the pixel represents a
mixture of “forest” land cover and “water” land cover.
C) Based on the multiple memberships from the soft
classification, the pixel has an associated level of ambiguity
(e.g., it is 35% ambiguous that the pixel represents a “forest”
land cover and 65% ambiguous that it represents a “water”
land cover).
D) Based on the multiple memberships from the soft
classification, it is possible that the pixel belongs to a
distinct class characterized by a mixture of forest and water
(the transition), perhaps “wetland.” After validation or a
re-run of the soft classification with a new class, it could
potentially be possible to say that the pixel represents a
“wetland” land cover with much less ambiguity. While we
argue that soft classifications are more appropriate than hard
classifications, we note that a re-run of the hard classification
with a class corresponding to wetlands would also be more
accurate than the initial hard classification.
TABLE 1 | Methods for synthesizing membership values (p) resulting from soft classifications at each location (i) for a given number of classes (K).
Name: Confusion index (CI)
Formula: $CI_i = \dfrac{p_{\max-1,i}}{p_{\max,i}}$
Interpretation: 0, full dominance of one class; 1, classes have similar probability
Number of classes: 2
References: Burrough et al., 1997; e.g., in Lucieer and Lucieer, 2009
Name: Pielou evenness index (J)
Formula: $J_i = -\dfrac{\sum_{k=1}^{K} p_{k,i} \log p_{k,i}}{\log K}$
Interpretation: 0, full dominance of one class; 1, classes have similar probability
Number of classes: K
References: Pielou, 1975; e.g., in Fiorentino et al., 2017
Name: Red Green Blue (RGB)
Formula: $R_i = 255\, p_{\max,i}$; $G_i = 255\, p_{\max-1,i}$; $B_i = 255\, p_{\max-2,i}$
Interpretation: red, green and blue, full dominance of one class; yellow, violet and cyan, two classes evenly dominate; white, all classes have similar probability
Number of classes: 3
References: Boughen, 2003; e.g., in Hass et al., 2017
Here $p_{\max,i}$, $p_{\max-1,i}$, and $p_{\max-2,i}$ denote the highest, second-highest, and third-highest memberships at location i. We note that the Red Green Blue (RGB) method only provides a visual representation and is not quantitative. It combines the three highest memberships into a composite three-band RGB image, that is, the geographic overlay of
the three bands.
The interpretation in “A” is not fully representative of the
reality as the water component of the pixel is not reported
or acknowledged at all. In “B” a nuance is added to the
interpretation as the potentially mixed nature of the pixel is
acknowledged. In “C,” that nuance is interpreted as a measure of
ambiguity, i.e., it is acknowledged that the classifier could
not distinguish between the two classes. Finally, in “D,” the mixed
nature of the pixel is used to redefine the classification based
on ecological knowledge (e.g., that wetlands can be a mixture
of forest and water from a remote sensor’s perspective), and to
guide further analyses about the nature of the land cover. That
simple example illustrates uncertainty as defined by Zhang and
Goodchild (2002), i.e., as the ambiguity of a classification. Since
membership quantifies the probability of an object to belong to
multiple classes, it can be used as a measure of ambiguity, and
therefore as a measure of uncertainty (Yager, 2016).
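As a worked illustration, the two quantitative indices of Table 1 can be evaluated for the example pixel, using the memberships 0.65 (“forest”) and 0.35 (“water”) given above. The calculation assumes K = 2, natural logarithms, and the usual negative sign in the numerator of the Pielou index; $p_{\max}$ and $p_{\max-1}$ denote the highest and second-highest memberships.

```latex
\begin{align}
CI &= \frac{p_{\max-1}}{p_{\max}} = \frac{0.35}{0.65} \approx 0.54, \\
J  &= -\frac{0.65 \ln 0.65 + 0.35 \ln 0.35}{\ln 2}
    \approx \frac{0.647}{0.693} \approx 0.93 .
\end{align}
```

Both values lie well above 0 (full dominance of one class), which is consistent with reading the pixel as a transition rather than as unambiguous forest.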
Depending on the nature of the analysis, the spatial
representation of membership can be used to display spatial
uncertainty of classifications, or to identify transitions between
existing classes. Traditional fuzzy and model-based approaches
thus acknowledge transitions among classes.
However, such approaches are limited by being
algorithmic, i.e., they do not explicitly account for statistical
properties of the data such as randomness (Warton et al., 2015) and mean-variance
trends (Warton et al., 2012), although resampling
or finite mixture approaches may provide likelihood-based
foundations to the clustering (Pledger and Arnold, 2014). In
fact, methods based on a single classifier offer only one piece
of evidence, which does not provide any confidence interval
of the membership estimation (Huang and Lees, 2004; Liu
et al., 2004). In turn, ensemble modeling approaches overcome
such problems in a consistent environment (Elder IV, 2003).
Despite some criticisms of ensemble modeling approaches, such as
limited interpretability and some tendency to
overfit (see Elder IV, 2003 for a discussion on the topic),
ensemble modeling approaches were shown to outperform other
methods (Elder IV, 2003; Fernández-Delgado et al., 2014). The
confidence interval around the membership values provided by
such models may be used to acknowledge natural transitions
in addition to the error estimates around those membership
values.
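One possible way to obtain an interval around a membership value is to refit a soft classifier on bootstrap resamples of the training data and summarize the spread of its outputs. The sketch below is an assumption-laden illustration rather than any of the cited ensemble methods: the inverse-squared-distance classifier, the function names, the 95% percentile interval, and the simulated training data are all hypothetical.

```python
import numpy as np

def soft_memberships(x, X_train, y_train, n_classes):
    """Memberships of one object from inverse squared distance to class means."""
    centres = np.array([X_train[y_train == k].mean(axis=0)
                        for k in range(n_classes)])
    d2 = np.sum((centres - x) ** 2, axis=1) + 1e-12
    w = 1.0 / d2
    return w / w.sum()  # normalize so memberships sum to 1

def bootstrap_memberships(x, X_train, y_train, n_classes, n_boot=200, seed=0):
    """Mean membership and 2.5-97.5 percentile interval over bootstrap refits."""
    rng = np.random.default_rng(seed)
    n = len(y_train)
    runs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample with replacement
        yb = y_train[idx]
        if len(np.unique(yb)) < n_classes:    # skip resamples missing a class
            continue
        runs.append(soft_memberships(x, X_train[idx], yb, n_classes))
    runs = np.array(runs)
    lower, upper = np.percentile(runs, [2.5, 97.5], axis=0)
    return runs.mean(axis=0), lower, upper

# Simulated training data: two classes in a two-variable feature space
rng = np.random.default_rng(1)
X_train = np.vstack([rng.normal(0.0, 1.0, (30, 2)),
                     rng.normal(4.0, 1.0, (30, 2))])
y_train = np.repeat([0, 1], 30)

x = np.array([2.0, 2.0])  # an object located between the two classes
mean, lo, hi = bootstrap_memberships(x, X_train, y_train, n_classes=2)
print(np.round(mean, 2), np.round(lo, 2), np.round(hi, 2))
```

In this reading, the mean membership describes how strongly the object belongs to each class, while the width of the interval describes how stable that estimate is under resampling of the training data.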
THE ADVANTAGE OF MAPPING
UNCERTAINTY
The concepts of uncertainty and error have often mistakenly
been used interchangeably (Zhang and Goodchild, 2002; Jager
and King, 2004). As a consequence, uncertainty is often perceived
negatively because of its association with the idea of error and
inaccuracy. Uncertainty is inherent to any and all data and
cannot necessarily be removed or minimized to get closer to
the truth the same way errors can (Zhang and Goodchild,
2002; Beale and Lennon, 2012). Uncertainty is part of our
perception of natural patterns (e.g., temporal dynamics, spatial
variability), data representation (e.g., positional uncertainty,
measurement uncertainty, thematic uncertainty), and modeling
(e.g., error introduced by the model; Barry and Elith,
2006).
FIGURE 1 | Example of a learning classifier workflow using fuzzy clustering. The learning phase is linked to data acquisition because fuzziness may highlight data
weaknesses, and thereby areas where new data acquisition is required. Note that the same workflow can be translated to any other method that accommodates uncertainty.
Maps of uncertainty, or maps of ignorance, help to identify
areas where the classification has a stable, consistent, and distinct
solution and can be used further to target the areas where
uncertainty is higher, thus highlighting the need to deepen the
investigation in those areas (Rocchini et al., 2011). It has been
demonstrated that maps of uncertainty enhance decision-making
in conservation contexts by solving issues that often appear when
hard classifications are applied (Regan et al., 2005; Langford et al.,
2009). In fact, maps of uncertainty permit a more realistic and
natural delineation of boundaries between classes (Zhang and
Goodchild, 2002).
Soft classifications and ensemble models allow users’
knowledge of the system to grow by providing an estimation
of uncertainty. When classes cannot provide an appropriate
description of the system under study (for example in cases
of high fuzziness), the investigator can choose to change the
classifier or deepen the investigation to better understand
the patterns in the system and how it translates into the
data representation. In contrast, Boolean (true/false)
approaches only offer a static view, and while measures of
accuracy can be calculated to quantify misclassifications, they
have been shown to sometimes misrepresent the magnitude of
the misclassifications (e.g., Lecours et al., 2016) and often cannot
guide further investigations to better understand the dynamics
at play.
To assist with the interpretation of uncertainty resulting from
the application of a soft classifier, at least three solutions based
on the membership assignments of objects to classes can be used
(Table 1). These solutions can be used to display uncertainty
spatially in a map, which can be interpreted, discussed, and then
used to further enhance the classification in an iterative process
(Figure 1).
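The three summaries listed in Table 1 can be sketched for a small raster of memberships as follows. This is a minimal illustration, assuming a NumPy array whose last axis holds the K memberships of each cell; the function names and the 2 x 2 example raster are hypothetical, and the negative sign in the Pielou index is assumed. The RGB function follows the note under Table 1 (three highest memberships in descending order); assigning one fixed class to each band is an alternative reading that the sketch does not implement.

```python
import numpy as np

def confusion_index(P):
    """Second-highest membership divided by the highest, per cell."""
    s = np.sort(P, axis=-1)
    return s[..., -2] / s[..., -1]

def pielou_evenness(P):
    """Shannon entropy of the memberships divided by log K, per cell."""
    K = P.shape[-1]
    safe = np.where(P > 0, P, 1.0)  # log(1) = 0, so zero memberships add nothing
    return -(P * np.log(safe)).sum(axis=-1) / np.log(K)

def rgb_composite(P):
    """Three highest memberships scaled to 0-255 as R, G, B bands."""
    top3 = np.sort(P, axis=-1)[..., ::-1][..., :3]
    return np.round(255 * top3).astype(np.uint8)

# Hypothetical 2 x 2 raster with K = 3 classes per cell
P = np.array([
    [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0]],
    [[1 / 3, 1 / 3, 1 / 3], [0.65, 0.35, 0.0]],
])

CI = confusion_index(P)
J = pielou_evenness(P)
RGB = rgb_composite(P)
print(np.round(CI, 2))
print(np.round(J, 2))
print(RGB.shape)
```

Each output has the same spatial layout as the input raster, so it can be displayed directly as a map of uncertainty and inspected for areas where further investigation may be warranted.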
CONCLUSION
We acknowledge that classification methods need to be reliable
and appropriate for the intended use, and adequately represent
the natural complexity of the systems under study. However, we
think that the main challenge of classifications is the proper and
meaningful interpretation of the associated uncertainty rather
than the method itself.
The use of soft classifiers to provide the visual and spatial
display of classification uncertainty enhances the value of
classifications. Providing a measure of uncertainty associated
with classes leads to the fulfillment of classifications’ potential,
which goes beyond the simple identification and separation of
classes. Classifications based on concepts of fuzzy logic, model-based
approaches, and ensemble modeling approaches will help
move away from classes with hard, discrete boundaries, yielding
better solutions to accurately represent and better understand
complex systems. Acknowledging uncertainty fosters
learning from the classification process by encouraging
further investigation and hypothesizing about its causes (e.g.,
inappropriate spatial resolution, data quality, inappropriate
number of classes).
Whether the aim of an exercise is to communicate results
or to start the investigation of a system through exploratory
analyses, the spatial display of uncertainty provides directions
on which actions need to be undertaken. While stakeholders
may use maps of uncertainty to find, for instance, the appropriate
conservation measure and therefore to better handle areas where
high uncertainty is displayed, scientists may use the same
information to build new hypotheses and shed light on processes
that underpin such uncertainty.
AUTHOR CONTRIBUTIONS
DF conceived and organized the manuscript. DF, VL, and TB
wrote and reviewed the manuscript.
ACKNOWLEDGMENTS
We would like to thank the reviewer and Dr. Casper Kraan whose
comments helped to improve the manuscript.
REFERENCES
Barry, S., and Elith, J. (2006). Error and uncertainty in habitat models. J. Appl. Ecol.
43, 413–423. doi: 10.1111/j.1365-2664.2006.01136.x
Beale, C. M., and Lennon, J. J. (2012). Incorporating uncertainty in predictive
species distribution modelling. Philos. Trans. R. Soc. Lond. B Biol. Sci. 367,
247–258. doi: 10.1098/rstb.2011.0178
Borcard, D., Gillet, F., and Legendre, P. (2011). Numerical Ecology With R, 1st Edn.
New York, NY: Springer.
Boughen, N. (2003). LightWave 3D 7.5 Lighting, 500th edn. Plano, TX: Wordware
Publishing, Inc.
Brown, C. J., Smith, S. J., Lawton, P., and Anderson, J. T. (2011). Benthic habitat
mapping: a review of progress towards improved understanding of the spatial
ecology of the seafloor using acoustic techniques. Estuar. Coast. Shelf Sci. 92,
502–520. doi: 10.1016/j.ecss.2011.02.007
Burrough, P. A., van Gaans, P. F. M., and Hootsmans, R. (1997). Continuous
classification in soil survey: spatial correlation, confusion and boundaries.
Geoderma 77, 115–135. doi: 10.1016/S0016-7061(97)00018-9
Coggan, R., and Diesing, M. (2011). The seabed habitats of the central English
Channel: a generation on from Holme and Cabioch, how do their interpretations
match-up to modern mapping techniques? Cont. Shelf Res. 31, 132–150.
doi: 10.1016/j.csr.2009.12.002
Elder IV, J. F. (2003). The generalization paradox of ensembles. J. Comput. Graph.
Stat. 12, 853–864. doi: 10.1198/1061860032733
Elith, J., Graham, C. H., Anderson, R. P., Dudík, M., et al. (2006). Novel methods
improve prediction of species’ distributions from occurrence data. Ecography
29, 129–151. doi: 10.1111/j.2006.0906-7590.04596.x
Everitt, B. S., Landau, S., Leese, M., and Stahl, D. (2011). Cluster Analysis, 5th Edn.
London: John Wiley & Sons. Available online at: http://onlinelibrary.wiley.com/
book/10.1002/9780470977811 (Accessed April 24, 2013).
Eysenck, M. W., and Keane, M. T. (2010). Cognitive Psychology: A Student’s
Handbook, 6 Student Edn. New York, NY: Taylor & Francis.
Farina, A. (2010). Ecology, Cognition and Landscape: Linking Natural and Social
Systems, 2009 edn. Dordrecht; New York, NY: Springer.
Fendereski, F., Vogt, M., Payne, M. R., Lachkar, Z., Gruber, N., Salmanmahiny, A.,
et al. (2014). Biogeographic classification of the Caspian Sea. Biogeosciences 11,
6451–6470. doi: 10.5194/bg-11-6451-2014
Fernández-Delgado, M., Cernadas, E., Barro, S., and Amorim, D. (2014). Do
we need hundreds of classifiers to solve real world classification problems? J.
Mach. Learn. Res. 15, 3133–3181. Available online at: http://jmlr.org/papers/
v15/delgado14a.html
Fiorentino, D., Pesch, R., Guenther, C.-P., Gutow, L., Holstein, J., Dannheim,
J., et al. (2017). A ‘fuzzy clustering’ approach to conceptual confusion: how
to classify natural ecological associations. Mar. Ecol. Prog. Ser. 584, 17–30.
doi: 10.3354/meps12354
Foody, G. M. (2000). Mapping land cover from remotely sensed data with a
softened feedforward neural network classification. J. Intell. Robotic Syst. 29,
433–449. doi: 10.1023/A:1008112125526
Foody, G. M. (2002). Status of land cover classification accuracy assessment.
Remote Sens. Environ. 80, 185–201. doi: 10.1016/S0034-4257(01)00295-4
Fortin, M.-J., and Dale, M. R. T. (2005). Spatial Analysis: A Guide for Ecologists.
Cambridge; New York, NY: Cambridge University Press.
Hass, H. C., Mielck, F., Fiorentino, D., Papenmeier, S., Holler, P., and Bartholomä,
A. (2017). Seafloor monitoring west of Helgoland (German Bight, North Sea)
using the acoustic ground discrimination system RoxAnn. Geo Mar. Lett. 37,
125–136. doi: 10.1007/s00367-016-0483-1
Huang, Z., and Lees, B. G. (2004). Combining non-parametric models for
multisource predictive forest mapping. Photogramm. Eng. Remote Sens. 70, 415–425.
doi: 10.14358/PERS.70.4.415
Huang, Z., Nichol, S. L., Siwabessy, J. P. W., Daniell, J., and Brooke, B. P.
(2012). Predictive modelling of seabed sediment parameters using multibeam
acoustic data: a case study on the Carnarvon Shelf, Western Australia.
Int. J. Geograph. Inform. Sci. 26, 283–307. doi: 10.1080/13658816.2011.590139
Jager, H. I., and King, A. W. (2004). Spatial uncertainty and ecological models.
Ecosystems 7, 841–847. doi: 10.1007/s10021-004-0025-y
Kuhn, M., and Johnson, K. (2013). Applied Predictive Modeling. New York, NY;
Heidelberg; Dordrecht; London: Springer.
Langford, W. T., Gordon, A., and Bastin, L. (2009). When do conservation
planning methods deliver? Quantifying the consequences of uncertainty. Ecol.
Inform. 4, 123–135. doi: 10.1016/j.ecoinf.2009.04.002
Lecours, V., Brown, C. J., Devillers, R., Lucieer, V. L., and Edinger,
E. N. (2016). Comparing selections of environmental variables for
ecological studies: a focus on terrain attributes. PLoS ONE 11:e0167128.
doi: 10.1371/journal.pone.0167128
Lecours, V., Devillers, R., Edinger, E. N., Brown, C. J., and Lucieer, V. L.
(2017). Influence of artefacts in marine digital terrain models on habitat maps
and species distribution models: a multiscale assessment. Remote Sens. Ecol.
Conserv. 3, 232–246. doi: 10.1002/rse2.49
Liu, W., Gopal, S., and Woodcock, C. E. (2004). Uncertainty and confidence in land
cover classification using a hybrid classifier approach. Photogramm. Eng. Remote
Sens. 70, 963–971. doi: 10.14358/PERS.70.8.963
Lucieer, V., and Lucieer, A. (2009). Fuzzy clustering for seafloor classification. Mar.
Geol. 264, 230–241. doi: 10.1016/j.margeo.2009.06.006
Pielou, E. C. (1975). Ecological Diversity. New York, NY: Wiley & Sons.
Pledger, S., and Arnold, R. (2014). Multivariate methods using mixtures:
correspondence analysis, scaling and pattern-detection. Comput. Stat. Data
Anal. 71, 241–261. doi: 10.1016/j.csda.2013.05.013
Regan, H. M., Ben-Haim, Y., Langford, B., Wilson, W. G., Lundberg, P., Andelman,
S. J., et al. (2005). Robust decision-making under severe uncertainty for
conservation management. Ecol. Appl. 15, 1471–1477. doi: 10.1890/03-5419
Ries, L., Fletcher, R. J. Jr., Battin, J., and Sisk, T. D. (2004). Ecological responses
to habitat edges: mechanisms, models, and variability explained. Annu. Rev.
Ecol. Evol. Syst. 35, 491–522. doi: 10.1146/annurev.ecolsys.35.112202.130148
Rocchini, D., Hortal, J., Lengyel, S., Lobo, J. M., Jiménez-Valverde, A.,
Ricotta, C., et al. (2011). Accounting for uncertainty when mapping species
distributions: the need for maps of ignorance. Prog. Phys. Geogr. 35, 211–226.
doi: 10.1177/0309133311399491
Schiele, K. S., Darr, A., Zettler, M. L., Friedland, R., Tauber, F., von Weber, M., et al.
(2015). Biotope map of the German Baltic Sea. Mar. Pollut. Bull. 96, 127–135.
doi: 10.1016/j.marpolbul.2015.05.038
Sweeney, S. P., and Evans, T. P. (2012). An edge-oriented approach
to thematic map error assessment. Geocarto Int. 27, 31–56.
doi: 10.1080/10106049.2011.622052
Warton, D. I., Foster, S. D., De’ath, G., Stoklosa, J., and Dunstan, P. K. (2015).
Model-based thinking for community ecology. Plant Ecol. 216, 669–682.
doi: 10.1007/s11258-014-0366-3
Warton, D. I., Wright, S. T., and Wang, Y. (2012). Distance-based multivariate
analyses confound location and dispersion effects. Methods Ecol. Evolut. 3,
89–101. doi: 10.1111/j.2041-210X.2011.00127.x
Yager, R. (2016). Uncertainty modeling using fuzzy measures. Knowl. Based Syst.
92, 1–8. doi: 10.1016/j.knosys.2015.10.001
Zadeh, L. A. (1965). Fuzzy sets. Inform. Control 8, 338–353.
doi: 10.1016/S0019-9958(65)90241-X
Zhang, J., and Goodchild, M. F. (2002). Uncertainty in Geographical Information.
London; New York, NY: Taylor & Francis.
Conflict of Interest Statement: The authors declare that the research was
conducted in the absence of any commercial or financial relationships that could
be construed as a potential conflict of interest.
Copyright © 2018 Fiorentino, Lecours and Brey. This is an open-access article
distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the
original author(s) and the copyright owner(s) are credited and that the original
publication in this journal is cited, in accordance with accepted academic practice.
No use, distribution or reproduction is permitted which does not comply with these
terms.
Frontiers in Ecology and Evolution | www.frontiersin.org 5December 2018 | Volume 6 | Article 231
... To determine the spatial uncertainty underlying a classified map, it may be necessary to generate additional outputs from our HDMs, such as class probabilities. These class probabilities can in turn be used to compute uncertainty metrics such as the confusion index [27] or Pielou's evenness metric [28] recently highlighted by Fiorentino et al. [29]; or Shannon entropy [30] and related measures of dominance [31] as well as other indices used in disciplines such as soil science [32][33][34] and land-cover mapping [35]. It may also be beneficial to compute spatial uncertainty metrics based on multiple models to produce metrics that combine the frequency at which a class is predicted with its probability (e.g., [36]). ...
... Several authors have discussed and applied metrics for spatial uncertainty in classification studies. Drawing on their own work and that reported in the literature, Fiorentino et al. published an opinion paper [29] advocating increased use of soft classifications in spatial ecology. Citing several examples from seabed habitat mapping studies, the authors highlight use of the Burrough's confusion index [27], Pielou's evenness index [28], and a red-green-blue (RGB) representation of a multiband raster comprising the probability of the three most likely classes for each pixel. ...
... Similar to the Pielou evenness index [28] mentioned by Fiorentino et al. [29], the Shannon entropy index has been used by several authors as an uncertainty measure for soil classification maps (e.g., [32,70]), so was favoured in this study. Shannon entropy [30] is a generic method (from which Pielou Evenness is computed) which has recently been adopted for uncertainty quantification in several spatial classification studies broadly similar to our biotope HDMs (e.g., [32]). ...
Article
Full-text available
The use of habitat distribution models (HDMs) has become common in benthic habitat mapping for combining limited seabed observations with full-coverage environmental data to produce classified maps showing predicted habitat distribution for an entire study area. However, relatively few HDMs include oceanographic predictors, or present spatial validity or uncertainty analyses to support the classified predictions. Without reference studies it can be challenging to assess which type of oceanographic model data should be used, or developed, for this purpose. In this study, we compare biotope maps built using predictor variable suites from three different oceanographic models with differing levels of detail on near-bottom conditions. These results are compared with a baseline model without oceanographic predictors. We use associated spatial validity and uncertainty analyses to assess which oceanographic data may be best suited to biotope mapping. Our results show how spatial validity and uncertainty metrics capture differences between HDM outputs which are otherwise not apparent from standard non-spatial accuracy assessments or the classified maps themselves. We conclude that biotope HDMs incorporating high-resolution, preferably bottom-optimised, oceanography data can best minimise spatial uncertainty and maximise spatial validity. Furthermore, our results suggest that incorporating coarser oceanographic data may lead to more uncertainty than omitting such data.
... Though we provided several different statistical measures describing the accuracy of our predictions, covering the gap between reality and the accuracy assessment of our predicted maps remains a challenge [14]. Combining continuous models into thematic maps, as well as using substrate models to train biological models, made techniques such as cross-validation impractical. ...
... Our maps of Hoburgs Bank are intended to be living rather than static products that can be enhanced as soon as, for instance, more or better training data becomes available. However, even with future improvements (e.g., related to data quality, reduction of human error, and more sophisticated models and techniques), we will always live with some degree of uncertainty due to natural variability that our modeling will not be able to incorporate [14,58]. ...
Article
Predefined classification schemes and fixed geographic scales are often used to simplify and cost-effectively map the spatial complexity of nature. These simplifications can however limit the usefulness of the mapping effort for users who need information across a different range of thematic and spatial resolutions. We demonstrate how substrate and biological information from point samples and photos, combined with continuous multibeam data, can be modeled to predictively map percentage cover conforming with multiple existing classification schemes (i.e., HELCOM HUB; Natura 2000), while also providing high-resolution (5 m) maps of individual substrate and biological components across a 1344 km2 offshore bank in the Baltic Sea. Data for substrate and epibenthic organisms were obtained from high-resolution photo mosaics, sediment grab samples, legacy data and expert annotations. Environmental variables included pixel and object based metrics at multiple scales (0.5 m–2 km), which improved the accuracy of models. We found that using Boosted Regression Trees (BRTs) to predict continuous models of substrate and biological components provided additional detail for each component without losing accuracy in the classified maps, compared with a thematic model. Results demonstrate the sensitivity of habitat maps to the effects of spatial and thematic resolution and the importance of high-resolution maps to management applications.
... A fuzzy matrix can be used to compare the species richness and abundance of different areas (Balbuena et al. [3]). Fuzzy logic can be used to identify patterns in the data, which can be used to better understand the diversity of a particular ecosystem (De Cáceres et al. [4]; Fiorentino et al. [5]). Fuzzy logic allows us to think about things in terms of degrees of membership and gradations of similarity (Bagnaro et al. [6]; Bandelj et al. [7]). ...
Article
Fuzzy logic can be used to identify patterns in the data, which can be used to better understand the diversity of a particular ecosystem. This study examines the diversity of ants in different sites of Palayamkottai using the Combined Effect Quantity Dependent Data Matrix (CEQD Matrix). Initial raw data are collected and transformed into an Average Quantity Dependent Data matrix (AQD Matrix) by taking ants' names as rows and point count sites as columns. A defined quantity dependent data matrix (RQD Matrix) is then generated using mean and standard deviation methods. Finally, a Combined Effect Quantity Dependent Data matrix (CEQD Matrix) is produced to show the cumulative effect of all the entries. Python is used to generate the graphs of the RQD Matrix and CEQD Matrix. It was found that the species Tetraponera rufonigra predominantly occupied all the sites, followed by Oecophylla smaragdina. The matrix was also used to predict the effects of anthropogenic disturbances on the ant diversity.
... The inclusion of structural features beyond the DSM has the potential to improve the segmentation in these areas. Additionally, classifications based on fuzzy logic principles would potentially help improve results in these challenging environments as they can better represent natural transitional areas between habitat types; they do not impose discrete boundaries between adjacent habitat types [54]. Fuzzy logic and other soft classification principles also allow objects to belong to more than one class. ...
Article
Monitoring intertidal habitats, such as oyster reefs, salt marshes, and mudflats, is logistically challenging and often cost- and time-intensive. Remote sensing platforms, such as unoccupied aircraft systems (UASs), present an alternative to traditional approaches that can quickly and inexpensively monitor coastal areas. Despite the advantages offered by remote sensing systems, challenges remain concerning the best practices to collect imagery to study these ecosystems. One such challenge is the range of spatial resolutions for imagery that is best suited for intertidal habitat monitoring. Very fine imagery requires more collection and processing times. However, coarser imagery may not capture the fine-scale patterns necessary to understand relevant ecological processes. This study took UAS imagery captured along the Gulf of Mexico coastline in Florida, USA, and resampled the derived orthomosaic and digital surface model to resolutions ranging from 3 to 31 cm, which correspond to the spatial resolutions achievable by other means (e.g., aerial photography and certain commercial satellites). A geographic object-based image analysis (GEOBIA) workflow was then applied to datasets at each resolution to classify mudflats, salt marshes, oyster reefs, and water. The GEOBIA process was conducted within R, making the workflow open-source. Classification accuracies were largely consistent across the resolutions, with overall accuracies ranging from 78% to 82%. The results indicate that for habitat mapping applications, very fine resolutions may not provide information that increases the discriminative power of the classification algorithm. Multiscale classifications were also conducted and produced higher accuracies than single-scale workflows, as well as a measure of uncertainty between classifications.
... Image classification based on fuzzy logic has rarely been adopted in sea- or lake-bottom mapping based on underwater acoustic measurements (e.g., Tęgowski et al. 2018). However, this approach is auspicious in remote-sensing exploration of underwater seabeds or lake bottoms (Fiorentino et al. 2018). Fuzzy logic was successfully adopted in our study for the classification of a multidimensional data set of secondary features. ...
Article
One of the main challenges of underwater archaeology is to develop non‐invasive research of heritage sites in order to enable their further protection for future societies. This study aims to explore, identify and classify archaeological objects in a shallow lake using underwater acoustics. We solved the aforementioned challenges by developing an innovative, object‐based, fuzzy‐logic classification of nine archaeological object categories based on multibeam echosounder bathymetry, 13 secondary features of bathymetry, and 106 underwater diving prospections. We achieved an 86% correlation with ground‐truth samples, and 49% overall accuracy. The unique and repeatable workflow developed in this study can be applied to other case studies of underwater archaeology around the world.
... In areas of high marsh, for instance, oysters may be partially or entirely obscured in UAS imagery. In cases with integrated habitat patches, fuzzy logic and classifications may be more representative of natural transitional zones than traditional techniques that can oversimplify complex systems by imposing discrete boundaries to habitats [46]. The area that demonstrated the most difficulty in differentiating habitats was the eastern border of the scene, which is composed of marsh habitats. ...
Article
Intertidal habitats like oyster reefs and salt marshes provide vital ecosystem services including shoreline erosion control, habitat provision, and water filtration. However, these systems face significant global change as a result of a combination of anthropogenic stressors like coastal development and environmental stressors such as sea-level rise and disease. Traditional intertidal habitat monitoring techniques are cost and time-intensive, thus limiting how frequently resources are mapped in a way that is often insufficient to make informed management decisions. Unoccupied aircraft systems (UASs) have demonstrated the potential to mitigate these costs as they provide a platform to rapidly, safely, and inexpensively collect data in coastal areas. In this study, a UAS was used to survey intertidal habitats along the Gulf of Mexico coastline in Florida, USA. The structure from motion photogrammetry techniques were used to generate an orthomosaic and a digital surface model from the UAS imagery. These products were used in a geographic object-based image analysis (GEOBIA) workflow to classify mudflat, salt marsh, and oyster reef habitats. GEOBIA allows for a more informed classification than traditional techniques by providing textural and geometric context to habitat covers. We developed a ruleset to allow for a repeatable workflow, further decreasing the temporal cost of monitoring. The classification produced an overall accuracy of 79% in classifying habitats in a coastal environment with little spectral and textural separability, indicating that GEOBIA can differentiate intertidal habitats. This method allows for effective monitoring that can inform management and restoration efforts.
... Understanding uncertainty. Understanding how uncertainty informs confidence in the location of bioregions, along with the confidence in the description of physical and biological characteristics within each bioregion (Brown 1998, Fiorentino et al. 2018). Assessments of uncertainty and variance are already standard in many management actions, for example ecosystem-based fisheries management (Koen-Alonso et al. 2019), and are likely to become more important in bioregionalization decision-making, where economic, social and biodiversity values are often traded to meet competing objectives. ...
Article
Bioregions are important tools for understanding and managing natural resources. Bioregions should describe locations where relatively homogeneous assemblages of species occur, enabling managers to better regulate activities that might affect these assemblages. Many existing bioregionalization approaches, which rely on expert-derived, Delphic comparisons or environmental surrogates, do not explicitly include observed biological data in such analyses. We highlight that, for bioregionalizations to be useful and reliable for systems scientists and managers, the bioregionalizations need to be based on biological data; to include an easily understood assessment of uncertainty, preferably in a spatial format matching the bioregions; and to be scientifically transparent and reproducible. Statistical models provide a scientifically robust, transparent, and interpretable approach for ensuring that bioregions are formed on the basis of observed biological and physical data. Using statistically derived bioregions provides a repeatable framework for the spatial representation of biodiversity at multiple spatial scales. This results in better-informed management decisions and biodiversity conservation outcomes.
Article
Identifying physical and ecological boundaries that limit where species can occur is important for predicting how those species will respond to global change. The island of Borneo encompasses a wide range of habitats that support some of the highest richness on Earth, making it an ideal location for investigating ecological mechanisms underlying broad patterns of species distribution. We tested variation in richness and range‐size in relation to edaphic specialization and vegetation zone boundaries using 3060 plant species from 193 families centered around the elevational gradient of Mt Kinabalu, Borneo. Across species, average range‐size increased with elevation, consistent with Rapoport's rule. However, plants associated with ultramafic soil, which is low in nutrient and water availability and often has high concentrations of heavy metals, had larger range‐sizes and greater richness than expected along the elevational gradient, as compared to a null model with randomization of edaphic association. In contrast, non‐ultramafic species had smaller range‐sizes and lower richness than expected. These results suggest that tolerance of resource limitation may be associated with wider range‐sizes, whereas species intolerant of edaphic stress may have narrower range‐sizes, possibly owing to more intense competition in favorable soil types. Using elevation as a predictor of average range‐sizes, we found that piece‐wise models with breakpoints at vegetation zone transitions explained species distributions better than models that did not incorporate ecological boundaries. The greatest relative increases in range‐size with respect to elevation occurred mid‐elevation, within the montane cloud forest vegetation zone. Expansion of average range‐size across an area without physical boundaries may indicate a shift in ecological strategy and importance of biotic versus abiotic stressors. 
Our results indicate that elevational range‐size patterns are structured by ecological constraints such as species' edaphic association, which may limit the ability of species to migrate up or down mountains in response to climate change.
Article
Areas that contain ecologically distinct biological content, called bioregions, are a central component to spatial and ecosystem‐based management. We review and describe a variety of commonly used and newly developed statistical approaches for quantitatively determining bioregions. Statistical approaches to bioregionalization can broadly be classified as two‐stage approaches that either ‘Group First, then Predict’ or ‘Predict First, then Group’, or a newer class of one‐stage approaches that simultaneously analyse biological data with reference to environmental data to generate bioregions. We demonstrate these approaches using a selection of methods applied to simulated data and real data on demersal fish. The methods are assessed against their ability to answer several common scientific or management questions. The true number of simulated bioregions was only identified by both of the one‐stage methods and one two‐stage method. When the number of bioregions was known, many of the methods, but not all, could adequately infer the species, environmental and spatial characteristics of bioregions. One‐stage approaches, however, do so directly via a single model without the need for separate post‐hoc analyses and additionally provide an appropriate characterization of uncertainty. One‐stage approaches provide a comprehensive and consistent method for objectively identifying and characterizing bioregions using both biological and environmental data. Potential avenues of future development in one‐stage methods include incorporating presence‐only and multiple data types as well as considering functional aspects of bioregions.
Article
The concept of the marine ecological community has recently experienced renewed attention, mainly owing to a shift in conservation policies from targeting single and specific objectives (e.g. species) towards more integrated approaches. Despite the value of communities as distinct entities, e.g. for conservation purposes, there is still an ongoing debate on the nature of species associations. They are seen either as communities, cohesive units of non-randomly associated and interacting members, or as assemblages, groups of species that are randomly associated. We investigated such dualism using fuzzy logic applied to a large dataset in the German Bight (southeastern North Sea). Fuzzy logic provides the flexibility needed to describe complex patterns of natural systems. Assigning objects to more than one class, it enables the depiction of transitions, avoiding the rigid division into communities or assemblages. Therefore we identified areas with either structured or random species associations and mapped boundaries between communities or assemblages in this more natural way. We then described the impact of the chosen sampling design on the community identification. Four communities, their core areas and probability of occurrence were identified in the German Bight: AMPHIURA-FILIFORMIS, BATHYPOREIA-TELLINA, GONIADELLA-SPISULA, and PHORONIS. They were assessed by estimating overlap and compactness and supported by analysis of beta-diversity. Overall, 62% of the study area was characterized by high species turnover and instability. These areas are very relevant for conservation issues, but become undetectable when studies choose sampling designs with little information or at small spatial scales.
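The idea of assigning one object to more than one class can be illustrated with the membership formula from fuzzy c-means clustering. This is only a sketch of the general technique: the abstract does not say which fuzzy algorithm the study used, and the function name `fuzzy_memberships`, the fixed class centres, and the fuzzifier `m = 2` are assumptions made for illustration.

```python
import numpy as np

def fuzzy_memberships(points, centers, m=2.0):
    """Fuzzy c-means style memberships of each point in each class.

    points:  (n_points, n_features) array of observations.
    centers: (n_classes, n_features) array of class centres (assumed known here).
    m:       fuzzifier (> 1); larger values give softer memberships.
    Returns an (n_points, n_classes) array whose rows sum to 1.
    """
    x = np.asarray(points, dtype=float)
    c = np.asarray(centers, dtype=float)

    # Euclidean distance from every point to every centre.
    d = np.linalg.norm(x[:, None, :] - c[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # guard against a point sitting exactly on a centre

    # Membership is proportional to d^(-2/(m-1)), normalised per point.
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

# Two hypothetical class centres and three sample locations in feature space.
centers = np.array([[0.0, 0.0], [2.0, 0.0]])
points = np.array([[1.0, 0.0],   # halfway between the centres: a transition
                   [0.0, 0.0],   # on the first centre: a core area
                   [0.5, 0.0]])  # closer to the first centre
u = fuzzy_memberships(points, centers)
```

Points with one dominant membership correspond to core areas, while points with similar memberships in several classes mark transitions between them.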
Article
Remote sensing techniques are currently the main methods providing elevation data used to produce Digital Terrain Models (DTM). Terrain attributes (e.g. slope, orientation, rugosity) derived from DTMs are commonly used as surrogates of species or habitat distribution in ecological studies. While DTMs' errors are known to propagate to terrain attributes, their impact on ecological analyses is however rarely documented. This study assessed the impact of data acquisition artefacts on habitat maps and species distribution models. DTMs of German Bank (off Nova Scotia, Canada) at five different spatial scales were altered to artificially introduce different levels of common data acquisition artefacts. These data were used in 615 unsupervised classifications to map potential habitat types based on biophysical characteristics of the area, and in 615 supervised classifications (MaxEnt) to predict sea scallop distribution across the area. Differences between maps and models built from altered data and reference maps and models were assessed. Roll artefacts decreased map accuracy (up to 14% lower) and artificially increased models' performances. Impacts from other types of artefacts were not consistent, either decreasing or increasing accuracy and performance measures. The spatial distribution of habitats and spatial predictions of sea scallop distributions were always affected by data quality (i.e. artefacts), spatial scale of the data, and the selection of variables used in the classifications. This research demonstrates the importance of these three factors in building a study design, and highlights the need for error quantification protocols that can assist when maps and models are used in decision-making, for instance in conservation and management.
Article
Selecting appropriate environmental variables is a key step in ecology. Terrain attributes (e.g. slope, rugosity) are routinely used as abiotic surrogates of species distribution and to produce habitat maps that can be used in decision-making for conservation or management. Selecting appropriate terrain attributes for ecological studies may be a challenging process that can lead users to select a subjective, potentially sub-optimal combination of attributes for their applications. The objective of this paper is to assess the impacts of subjectively selecting terrain attributes for ecological applications by comparing the performance of different combinations of terrain attributes in the production of habitat maps and species distribution models. Seven different selections of terrain attributes, alone or in combination with other environmental variables, were used to map benthic habitats of German Bank (off Nova Scotia, Canada). 29 maps of potential habitats based on unsupervised classifications of biophysical characteristics of German Bank were produced, and 29 species distribution models of sea scallops were generated using MaxEnt. The performances of the 58 maps were quantified and compared to evaluate the effectiveness of the various combinations of environmental variables. One of the combinations of terrain attributes (recommended in a related study, and including a measure of relative position, slope, two measures of orientation, topographic mean and a measure of rugosity) yielded better results than the other selections for both methodologies, confirming that these attributes together best describe terrain properties. Important differences in performance (up to 47% in accuracy measurement) and spatial outputs (up to 58% in spatial distribution of habitats) highlighted the importance of carefully selecting variables for ecological applications.
This paper demonstrates that making a subjective choice of variables may reduce map accuracy and produce maps that do not adequately represent habitats and species distributions, thus having important implications when these maps are used for decision-making.
Article
Marine habitats of shelf seas are in constant dynamic change and therefore need regular assessment, particularly in areas of special interest. In this study, the single-beam acoustic ground discrimination system RoxAnn served to assess seafloor hardness and roughness, and to combine these parameters into one variable expressed as an RGB (red green blue) color code, followed by k-means fuzzy cluster analysis (FCA). The data were collected at a monitoring site west of the island of Helgoland (German Bight, SE North Sea) in the course of four surveys between September 2011 and November 2014. The study area has complex characteristics varying from outcropping bedrock to sandy and muddy sectors with mostly gradual transitions. The RoxAnn data made it possible to discriminate all seafloor types that were suggested by ground-truth information (seafloor samples, video). The area appears to be quite stable overall; sediment import (including fluid mud) was detected only from the NW. Although hard substrates (boulders, bedrock) are clearly identified, the signal can be modified by inclination and biocover. Manually, six RoxAnn zones were identified; for the FCA, only three classes are suggested. The latter classification based on ‘hard’ boundaries would suffice for stakeholder issues, but the former classification based on ‘soft’ boundaries is preferred to meet state-of-the-art scientific objectives.
Article
Most models of forest type for predictive mapping cannot produce estimates of confidence in the prediction of individual pixels, even where they provide good overall accuracy. A new strategy that combines several models based on different principles not only provides estimates of prediction confidence, but also improves the mapping accuracy. In this study, the theoretical foundation of Artificial Neural Networks, Decision Trees, and Dempster-Shafer’s Evidence Theory are briefly reviewed, compared, and applied to a common data set. Two ways for integrating the results of the three models were then evaluated. One method was to separately harden the probability results of the three models, then combine them to make a single classification. In the second method, the probabilities of the three models for each pixel were simply averaged, then hardened to a single classification. Deferring the hardening to the final stage produced the best results. The 3 percent increase in overall accuracy for the second approach compared with the best individual model is encouraging. More importantly, estimates of prediction confidence were derived, based on a comparison between a combined model and the three models, something that is impossible using a single model.
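The two integration strategies compared in this abstract can be sketched as below. The sketch makes several assumptions: the abstract does not state how the separately hardened results were combined, so a simple majority vote is used as a stand-in, and the function names and the toy probabilities are hypothetical.

```python
import numpy as np

def harden(probs):
    """Convert per-pixel class probabilities to a single class label per pixel."""
    return np.argmax(probs, axis=1)

def harden_then_vote(prob_list):
    """Method 1 (combination rule assumed): harden each model, then majority vote."""
    n_pixels, n_classes = prob_list[0].shape
    votes = np.zeros((n_pixels, n_classes), dtype=int)
    for probs in prob_list:
        votes[np.arange(n_pixels), harden(probs)] += 1
    return np.argmax(votes, axis=1)

def average_then_harden(prob_list):
    """Method 2: average the models' probabilities, then harden once at the end."""
    mean_probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    return harden(mean_probs), mean_probs

# One pixel, two classes, three hypothetical models.
model_a = np.array([[0.90, 0.10]])  # confident in class 0
model_b = np.array([[0.40, 0.60]])  # weakly favours class 1
model_c = np.array([[0.45, 0.55]])  # weakly favours class 1
models = [model_a, model_b, model_c]

voted = harden_then_vote(models)
averaged, mean_probs = average_then_harden(models)
```

In this toy case the vote returns class 1 while the average returns class 0, because early hardening discards how confident each model was. The averaged probabilities also remain available as a per-pixel confidence estimate.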
Book
Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real life examples are used throughout to demonstrate the application of the theory, and figures are used extensively to illustrate graphical techniques. The book is comprehensive yet relatively non-mathematical, focusing on the practical aspects of cluster analysis. Key Features:
• Presents a comprehensive guide to clustering techniques, with focus on the practical aspects of cluster analysis.
• Provides a thorough revision of the fourth edition, including new developments in clustering longitudinal data and examples from bioinformatics and gene studies.
• Updates the chapter on mixture models to include recent developments and presents a new chapter on mixture modeling for structured data.
Practitioners and researchers working in cluster analysis and data analysis will benefit from this book.
Article
The number and variety of statistical techniques for spatial analysis of ecological data are burgeoning and many ecologists are unfamiliar with what is available and how the techniques should be used. This book provides an overview of the wide range of spatial statistics available to analyze ecological data, and provides advice and guidance for graduate students and practicing researchers who are either about to embark on spatial analysis in ecological studies or who have started but need guidance to proceed. © M.-J. Fortin and M.R.T. Dale 2005 and Cambridge University Press 2009.
Article
We consider the representation of information about the value of an uncertain variable using a monotonic set measure, fuzzy measure. We introduce a number of notable measures that are useful for the representation of this kind of information. We look at the formulation of the concept of entropy in this framework. The Choquet integral is introduced as a tool to help obtain expected value like formulations in the case of measure type uncertainty. It is shown how this can be used in decision making with alternatives having uncertain payoffs. A formulation for the variance associated with measure based uncertain information is provided. The issue of the fusion of multiple pieces of measure-represented information is investigated.