Large-scale Detection of Marine Debris in Coastal Areas with Sentinel-2

Marc Rußwurm, Sushen Jilla Venkatesa, Devis Tuia
EPFL ECEO Laboratory, Rue de l’Industrie 17, Sion, 1950, Valais, Switzerland
Detecting and quantifying marine pollution and macro-plastics is an in-
creasingly pressing ecological issue that directly impacts ecology and human
health. Efforts to quantify marine pollution are often conducted with sparse
and expensive beach surveys, which are difficult to conduct on a large scale.
Here, remote sensing can provide reliable estimates of plastic pollution by
regularly monitoring and detecting marine debris in coastal areas. Medium-
resolution satellite data of coastal areas is readily available and can be lever-
aged to detect aggregations of marine debris containing plastic litter. In this
work, we present a detector for marine debris built on a deep segmentation
model that outputs a probability for marine debris at the pixel level. We train
this detector with a combination of annotated datasets of marine debris and
evaluate it on specifically selected test sites where it is highly probable that
plastic pollution is present in the detected marine debris. We demonstrate
quantitatively and qualitatively that a deep learning model trained on this
dataset issued from multiple sources outperforms existing detection models
trained on previous datasets by a large margin. Our experiments show, con-
sistent with the principles of data-centric AI, that this performance is due to
our particular dataset design with extensive sampling of negative examples
and label refinements rather than depending on the particular deep learning
model. We hope to accelerate advances in the large-scale automated detec-
tion of marine debris, which is a step towards quantifying and monitoring
marine litter with remote sensing at global scales, and release the model
weights and training source code.
Keywords: Marine Debris Detection, Plastic Pollution, Sentinel-2
Preprint submitted to ArXiv July 6, 2023
arXiv:2307.02465v1 [cs.CV] 5 Jul 2023
1. Introduction
Marine litter is accumulating at alarming rates, with 19 to 23 million
metric tonnes dispersed in 2016 alone (Borrelle et al., 2020). Plastic artifacts
constitute 75% of marine litter, exceeding 5 trillion objects in numbers (Erik-
sen et al., 2014), and are causing a serious threat to marine ecosystems and
human health. Approximately 80% of marine litter originates from terres-
trial sources (Andrady, 2011). It accumulates in rivers (Van Emmerik et al.,
2019; van Emmerik and Schwarz, 2020) and lakes (Faure et al., 2012) and
eventually enters open oceans. Primary micro-plastics are purposefully man-
ufactured to carry out a specific function, like abrasive particles or powders
for injection molding. Secondary micro-plastics result from fragmentation
of larger objects (Kershaw et al., 2019). In particular, transport in rivers
causes macro-plastics (>2.5 cm diameter) to decompose into meso- (5 mm to
2.5 cm) and micro-plastics (<5 mm diameter) (Kershaw et al., 2019; Hanke
et al., 2013), which then enter the food chain. Micro-plastics have been
found across the entire planet and have been detected in Antarctic penguins
(Bessa et al., 2019), deep-sea sediments (Van Cauwenberghe et al., 2013),
and human stool (Schwabl et al., 2019) and have been shown to affect the
growth of corals (Chapron et al., 2018). A range of economic costs can
also be associated with marine pollution, from clean-up expenses to loss of
tourism revenue (Beaumont et al., 2019). It is clear that monitoring and
mitigating water pollution is a major environmental, social, and economic
challenge, and systematic mapping is needed to both identify pollutants and
measure the success of awareness and clean-up programs. Continuous mon-
itoring and litter quantification are often limited to individual surveys that
are labor-intensive and expensive to conduct regularly (Van Dyck et al.,
2016). These approaches can only cover a comparatively small area, even
when surveyors are supported by aerial UAV imagery, as explored by Wolf
et al. (2020); Goddijn-Murphy et al. (2022); Escobar-Sánchez et al. (2022);
Topouzelis et al. (2019). Effectively, only a few developed countries, such as
the United Kingdom, can afford a systematic monitoring program (Rees and
Pond, 1995). These programs still require support from the local population
in citizen science projects to collect ground data (Hidalgo-Ruz and Thiel,
2015). This level of engagement requires a public sensitivity to the problem,
awareness, and, eventually, the technological means to report pollutants.
Satellite imagery that provides data at reasonable spatial and high tempo-
ral resolution can support this monitoring in large marine areas (Hanke et al.,
2013). Even though it is a pressing issue, remote sensing-enabled monitoring
of marine debris has only relatively recently emerged as a major research
topic, as summarized by the broad reviews of Salgado-Hernanz et al. (2021)
and Topouzelis et al. (2021). Both reviews compared drone, aircraft, and
optical and radar satellite-based acquisition methods. In particular, machine
learning models have been increasingly used for this problem, as summarized
by Politikos et al. (2023), who aggregated a comprehensive list of approaches
and locations where machine learning algorithms have been deployed across
the globe in recent years. For optical sensors, high spatial (<3 m) and
spectral resolutions beyond RGB (400 nm to 2500 nm) were found optimal
for the detection of aggregations of marine debris. Synthetic Aperture Radar
(SAR) can be potentially suitable for detecting sea-slicks (Davaasuren et al.,
2018) that are associated with surfactants and change the surface tension of
the water, which in turn reduces the radar back-scatter. These slicks consist
of microbial bio-films that can be connected with micro-plastics suspended in
the sea-surface microlayer (Salgado-Hernanz et al., 2021). However, a recent
study (Sun et al., 2023) demonstrated that only very high concentrations of
microplastics lead to a sufficiently strong dampening of waves to be detectable
with radar satellites. Similarly to sea slicks, macro-plastics can aggregate in
lines driven by environmental forces, such as wind speed, waves, or coastal
fronts. For instance, windrows are accumulations of surface debris. Their ge-
ometry allows for efficient ship-based collection efforts, which can be highly
effective, as demonstrated by Ruiz et al. (2020). Their collection campaign
lasted 68 working days during the spring and summer of 2018 and gathered
16.2 tons of floating marine litter in the Bay of Biscay. This work
demonstrated that aggregated debris detected and collected on the sea surface
in geographic areas with a high pollution level can be directly attributed to
macro-plastic litter. Marine debris aggregations in windrows are sufficiently
large to be detectable at medium resolutions of 10 m by 10 m achievable by
Sentinel-2 and can effectively serve as a proxy for macro plastic litter in the
oceans (Cózar et al., 2021; Arias et al., 2021). However, further distinguishing
floating objects of natural origins, such as driftwood, or patches of algae and
sargassum, from objects of human origins in large-scale medium-resolution
imagery remains challenging and is an ongoing topic of current research (Hu,
2021, 2022; Ciappa, 2021, 2022). This further fine-grained distinction may re-
quire currently unavailable sensor technology (Salgado-Hernanz et al., 2021)
and is beyond the scope of this work. Instead, we study the effectiveness of
detecting heterogeneous marine objects of both natural and anthropogenic
origins at a large scale with globally available Sentinel-2 imagery. In this work,
we aim to monitor floating marine litter by detecting marine debris as a proxy
at a large scale. To do so, we evaluate our detector in selected areas where it
is likely that marine litter is present in marine debris due to local studies and
reports in the news and social media. This evaluation strategy ensures that
our detector is sensitive to plastic pollution if marine debris is detected. This
work follows the principles of data-centric AI (Whang et al., 2023), where
the methodological innovation is concentrated on carefully designing the
dataset rather than the specificities of the particular deep learning model.
Throughout this work, we will use the term marine litter according to the
United Nations Environment Programme (2009) definition as any persistent,
manufactured, or processed solid material discarded, disposed of, or aban-
doned in the marine and coastal environment. We use marine debris more
broadly as any aggregation of floating materials on the sea surface that may
or may not contain marine litter of anthropogenic origins. The terms “lit-
ter”, “debris” and “plastic” have particular meanings to different groups of
people depending on the scientific or technical context or cultural preference
(Kershaw et al., 2019) and “marine debris” is often, especially in US-English,
used synonymously with “marine litter”. However, we believe a distinction
is necessary for technical reasons in this application: visual inspection of the
current satellite imagery (without on-site knowledge) cannot reliably
distinguish marine litter of human origins from marine debris that may also be
of natural origins. Hence, any work relying on hand annotations of satellite
images cannot resolve this conflict objectively, as on-site knowledge of the
composition of the visible marine debris is only available from dedicated cam-
paigns (Topouzelis et al., 2019, 2020a) that yield few thoroughly analyzed
pixels. In prior work (Mifdal et al., 2021), we used the generic term “float-
ing object”, while others like Booth et al. (2022) chose the term “suspected
plastics”. Both terms have limitations: “floating objects” is too broad, as
it may include ships, while “suspected plastics” is too focused on plastics
over other forms of litter. Our definitions of anthropogenic marine litter and generic
marine debris follow the practices of Kikaki et al. (2022) who annotated sim-
ilar objects termed marine debris in the Marine Debris Archive (MARIDA)
and are used consistently throughout this work.
The rest of the paper is organized as follows: The next section summarizes
related work on detecting marine pollution with remote sensing technology.
Section 3 describes training, validation, and evaluation data used in this
study and details the implementation of the segmentation models in the Ma-
rine Debris Detector. Section 4 presents results compared to related work
and methodologies qualitatively and quantitatively. Further experiments test
the robustness of the Marine Debris Detector with respect to atmospheric
correction and test the transferability to higher-resolution PlanetScope imagery
that can supplement the Sentinel-2 imagery used primarily in this work. The
final Section 5 discusses the results and provides conclusions for future work.
2. Related Work
Detecting marine debris with satellite imagery at high (typically 3 m to
7 m with PlanetScope imagery) and medium resolution (mainly at 10 m with
Sentinel-2) is a rising scientific question in remote sensing research. Initial
advances were made by pixel-wise classifiers using multi-spectral
reflectance in combination with dedicated spectral indices, such as the
Normalized Difference Vegetation Index (NDVI). Themistocleous et al. (2020)
investigated the detection of floating plastic litter from space using Sentinel-2
imagery in Cyprus and proposed a plastic index, defined as the ratio of
near-infrared reflectance to the sum of red and near-infrared reflectance,
similar to NDVI. Similarly, Biermann et al. (2020) proposed a Floating Debris
Index (FDI), which is a modification of the Floating Algae Index (FAI)
(Hu, 2009). They demonstrated the effectiveness of FDI with a naïve Bayes
classifier in a two-dimensional NDVI-FDI
feature space. However, this classifier, originally fitted on hand-selected
training and evaluation data under optimal conditions, was not accurate
enough on unfiltered satellite imagery in practice, as demonstrated by Mif-
dal et al. (2021). Kikaki et al. (2022) achieved the best accuracies with
a pixel-wise random forest classifier that utilized the Sentinel-2 reflectance
bands, a range of spectral indices, and textural features. In Mifdal et al.
(2021), we investigated the suitability of learned spatial features with a
convolutional neural network for binary marine debris detection. While our
results showed general applicability towards detecting marine debris with
deep segmentation models, we identified several limitations and a
sensitivity to a range of false-positive detections that made the model
unsuitable for automated use. Simultaneously, Shah et al. (2021) annotated
RGB PlanetScope imagery with bounding boxes and trained a deep
object detector on the localization of marine debris. Most recently, Gómez
et al. (2022) focused on detecting debris in rivers with Sentinel-2 and tested
several deep segmentation models to understand and predict floating debris
accumulations. Similar to this work, Booth et al. (2022) present a supervised
U-Net classifier named MAP-Mapper, which is trained on the MARIDA
dataset and aims to predict the density of marine debris.
Figure 1: Overview of the datasets used for training, validation, and evaluation in this
work. We focus on quantity and diversity in the training datasets while prioritizing
accurate annotations in validation and evaluation data. The scenes in Accra and Durban likely
contain plastic litter in the visible marine debris and are explicitly used for evaluation.
[Figure 1 labels: columns Training, Validation, Evaluation; axis: likelihood of plastic
debris and quality of annotations; datasets of multiple (annotated) scenes versus single
Sentinel-2 scenes; S2Ships (negatives); MARIDA validation; MARIDA test; Refined
FloatingObjects; Accra, Ghana 2018-10-31; Durban, SA 2019-04-24; Plastic Litter Projects
2021 & 2022, Greece. Training: various objects, high diversity, mixed annotation quality.
Validation: mixed objects (likely of natural origins), high point-wise annotation quality.
Evaluation: high plastic probability, high point-wise annotation quality.]
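The spectral indices discussed in this section can be illustrated with a minimal sketch. NDVI and the plastic index follow the ratios described in the text. The FDI expression is not given in this paper; the form below is the one commonly attributed to Biermann et al. (2020) and should be verified against that publication. The wavelength constants and all function names are assumptions introduced for illustration.

```python
import numpy as np

# Approximate Sentinel-2 central wavelengths in nm (assumed values, verify before use).
LAMBDA_RED, LAMBDA_NIR, LAMBDA_SWIR1 = 664.6, 832.8, 1613.7
EPS = 1e-12  # guards against division by zero over dark pixels


def ndvi(red, nir):
    """Normalized Difference Vegetation Index: (NIR - red) / (NIR + red)."""
    return (nir - red) / (nir + red + EPS)


def plastic_index(red, nir):
    """Plastic index as described in the text: NIR / (red + NIR)."""
    return nir / (red + nir + EPS)


def fdi(red_edge2, nir, swir1):
    """Floating Debris Index (assumed form): NIR minus a baseline that is
    interpolated between the red-edge-2 and SWIR1 reflectances."""
    scale = (LAMBDA_NIR - LAMBDA_RED) / (LAMBDA_SWIR1 - LAMBDA_RED) * 10.0
    baseline = red_edge2 + (swir1 - red_edge2) * scale
    return nir - baseline


# Illustrative reflectances (made up): a water pixel and a pixel with floating material.
water = dict(red=0.020, red_edge2=0.015, nir=0.010, swir1=0.005)
debris = dict(red=0.020, red_edge2=0.030, nir=0.080, swir1=0.030)
fdi_water = fdi(water["red_edge2"], water["nir"], water["swir1"])
fdi_debris = fdi(debris["red_edge2"], debris["nir"], debris["swir1"])
```

All three functions operate element-wise, so they accept scalars as well as NumPy arrays holding entire bands.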
Several public datasets were made available alongside the respective pub-
lications. Both the FloatingObjects dataset (Mifdal et al., 2021) and the
Marine Debris Archive (MARIDA) (Kikaki et al., 2022) contain Sentinel-2
imagery with a substantial number of hand annotations of visually detected
marine debris. They differ mostly in the binary (debris vs
other, i.e., non-debris) and multiclass (types of debris) nature of the anno-
tations. The NASA Marine Debris dataset (Shah et al., 2021) focused on
3-channel RGB PlanetScope imagery with coarse bounding box annotations.
In this paper, we extend the initial work of Mifdal et al. (2021) and train
a deep segmentation model on the combined datasets of FloatingObjects
(Mifdal et al., 2021) and MARIDA (Kikaki et al., 2022). We further use
additional datasets to train our detector, which we detail in the next section.
3. Materials and Methods
Defining and aggregating training data for marine debris detection is chal-
lenging due to the heterogeneous nature of objects, the novelty of the disci-
pline, and the scarcity of available datasets. This section first outlines the
sources, aggregation choices, and design decisions to generate the training,
validation, and evaluation datasets used in this work. Specifically, Section 3.1
focuses on the datasets used for training, while Section 3.2 outlines the vali-
dation and evaluation sets. An overview of the datasets is provided in Fig. 1.
For training datasets, we focused on quantity and aggregated a large dataset
of heterogeneous marine debris and other floating materials alongside nega-
tive examples focused on ships (S2Ships). The quality of this large training
pool is variable, but this also reflects the inherent difficulty of the task. In the
validation and evaluation data, we focus more on the quality and accuracy
of annotations of marine debris. The evaluation scenes were chosen explic-
itly in areas where we were certain, due to manual verification, that plastic
pollution is present among marine debris. After describing the dataset, the
models used are detailed in Sections 3.3 and 3.4, which describe our detector
and the comparison methods, respectively. Accuracy metrics are described
in Section 3.5.
3.1. Training Data
The available annotated data on the detection of marine debris is scarce.
To our knowledge, only two publicly available datasets focusing on Sentinel-2
imagery are available today. The Marine Debris Archive (MARIDA) (Kikaki
et al., 2022) provides multiple labels on polygon-wise hand-annotated Sentinel-
2 images, and the FloatingObjects (Mifdal et al., 2021) provides binary la-
bels (floating objects versus water annotations) in coarse hand-drawn lines
on Sentinel-2 scenes. We further improve the quality of these annotations
by an automated label refinement heuristic defined for this problem. Our
goal is to train a model that can predict marine debris from openly accessible
satellite imagery in different conditions and can therefore
process both top-of-atmosphere and atmospherically corrected
bottom-of-atmosphere data. For atmospheric correction, we further chose to use
products corrected with Sen2COR (Main-Knorn et al., 2017) that are readily
available to download in Google Earth Engine rather than products corrected
with ACOLITE (Vanhellemont and Ruddick, 2016), where the atmospheric
correction would have to be done individually for each raw image scene. To
study the effect of atmospheric correction, we test our models on imagery at
different atmospheric processing levels (see Section 4.2). To avoid confusion
of marine debris with ships, one of the major problems highlighted in Mif-
dal et al. (2021), we also include the S2Ships dataset (Ciocarlan and Stoian,
2021) that provides negative non-debris examples of class other. All three
datasets are detailed in the next subsections.
3.1.1. FloatingObjects
The FloatingObjects dataset originates from our prior work in Mifdal
et al. (2021) and contains 26 different globally distributed Sentinel-2 scenes.
Overall, 3297 floating objects were annotated by lines when visually identified
as marine debris. In this work, we use this dataset exclusively for training,
as a certain level of label noise is present in the annotations. We decided to
exclude four regions (accra_20181031, lagos_20190101, neworleans_20200202,
and venice_20180630) to be re-annotated in the RefinedFloatingObjects validation
dataset described later in Section 3.2.1. The remaining 22 regions were used
for training.
We follow the data sampling strategy of Mifdal et al. (2021) and crop
a small image patch of 128 px by 128 px centered on each line segment of
the available marine debris annotations. To obtain negative examples with-
out any marine debris, we select random points within the Sentinel-2 scenes
and extract equally sized image patches. We also use both processing levels
L1C (top-of-atmosphere) and L2A (bottom-of-atmosphere), where we always
select the L2A image available in the Google Earth Engine Archive (Gore-
lick et al., 2017) and resort to L1C if no atmospherically corrected image
is available. The effect of atmospheric correction on the performance of the
detector is evaluated later in Section 4.2. In all cases, 12 Sentinel-2 bands are
used. These are all the available bands, excluding the haze-band B10, which
the Sen2COR atmospheric correction (Main-Knorn et al., 2017) algorithm
removes automatically.
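The sampling strategy described above can be sketched as follows. This is a minimal illustration and not the released training pipeline: the function names, the zero-padding at scene borders, and the (bands, height, width) array layout are assumptions. Band B10 is dropped from 13-band top-of-atmosphere scenes so that both processing levels provide the same 12 bands.

```python
import numpy as np

# Sentinel-2 L1C band order; B10 is removed so that L1C matches 12-band L2A.
L1C_BANDS = ["B1", "B2", "B3", "B4", "B5", "B6", "B7",
             "B8", "B8A", "B9", "B10", "B11", "B12"]


def drop_b10(scene):
    """Return a 12-band array; 13-band (L1C) input loses B10, other input is unchanged."""
    if scene.shape[0] == len(L1C_BANDS):
        keep = [i for i, band in enumerate(L1C_BANDS) if band != "B10"]
        return scene[keep]
    return scene


def crop_patch(scene, row, col, size=128):
    """Crop a (bands, size, size) patch centered on (row, col).
    The scene is zero-padded so that patches near the border keep their size (assumption)."""
    half = size // 2
    padded = np.pad(scene, ((0, 0), (half, half), (half, half)))
    return padded[:, row:row + size, col:col + size]


def sample_negative_centers(height, width, n, rng):
    """Draw n random (row, col) locations inside the scene for negative patches."""
    return np.stack([rng.integers(0, height, n), rng.integers(0, width, n)], axis=1)


# Synthetic 13-band scene standing in for a downloaded Sentinel-2 image.
rng = np.random.default_rng(0)
scene = drop_b10(rng.random((13, 300, 300)))
positive_patch = crop_patch(scene, row=150, col=150)   # centered on an annotation
negatives = [crop_patch(scene, r, c)
             for r, c in sample_negative_centers(300, 300, n=4, rng=rng)]
```

Random negative centers are not checked against existing annotations in this sketch; a production pipeline would likely want to reject points that fall on annotated debris.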
Label Refinement Module. While the FloatingObjects dataset pro-
vides a large number of labels, the annotated lines do not always accurately
capture the width and geometry of the underlying marine debris. We im-
prove the hand annotations by an automated label refinement module that
generates a mask that reflects more closely the geometry of the debris in the
proximity of the line annotations (Fig. 2). The module inputs a Sentinel-2
scene and the original line annotations mask. In the first stage (left side of
Fig. 2), we buffer the hand-annotated line to obtain a region of potential
marine debris. Then, we calculate the Floating Debris Index (FDI) using the
Sentinel-2 scene and perform a segmentation of the FDI image with an Otsu
threshold (Otsu, 1979). The buffer and segmentation are then combined to
obtain a preliminary area of marine debris in the vicinity of the original an-
notations. In the second stage, we randomly sample potential marine debris
pixels as well as markers for non-debris pixels (class other) in the remaining
parts of the image. These markers are the starting points of a random walk
segmentation algorithm (Grady, 2006), which is a fast algorithm that requires
a few labeled pixels as markers. The markers are assumed to be accurately
annotated, while the pixels between the markers are uncertain and are then
annotated by an underlying anisotropic diffusion process that ensures that
homogeneous areas are assigned to the same class. Crucially, one set of pa-
rameters (homogeneity criterion, buffer size, marker sampling frequency) of
the random walker algorithm leads to one potential debris map. Therefore,
we vary those parameters and average all maps to capture the inherent
ambiguity of the borders of marine debris, as shown in the bottom row
of Fig. 2.
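The two stages of the label refinement module can be sketched with SciPy and scikit-image. This is a simplified illustration under stated assumptions, not the implementation used in this work: the FDI image is taken as given, the buffer is approximated by binary dilation, markers are drawn uniformly at random, and the parameter values in the demo are arbitrary. The function names refine_once and refine_labels are hypothetical.

```python
import numpy as np
from scipy.ndimage import binary_dilation
from skimage.filters import threshold_otsu
from skimage.segmentation import random_walker


def refine_once(fdi, line_mask, buffer_px, beta, n_markers, rng):
    """One refinement pass; returns a binary debris map as floats (1.0 = debris)."""
    # Stage 1: buffer the annotated line and intersect it with an Otsu segmentation of FDI.
    buffered = binary_dilation(line_mask, iterations=buffer_px)
    candidate = buffered & (fdi > threshold_otsu(fdi))

    # Stage 2: sample markers (1 = debris, 2 = other); 0 is left to the random walker.
    markers = np.zeros(fdi.shape, dtype=np.int32)
    for label, region in ((1, candidate), (2, ~candidate)):
        rows, cols = np.nonzero(region)
        if rows.size == 0:
            return candidate.astype(float)  # nothing to seed: fall back to stage 1
        pick = rng.choice(rows.size, size=min(n_markers, rows.size), replace=False)
        markers[rows[pick], cols[pick]] = label

    # Rescale FDI to [0, 1] so that beta behaves comparably across scenes.
    scaled = (fdi - fdi.min()) / (fdi.max() - fdi.min() + 1e-12)
    return (random_walker(scaled, markers, beta=beta) == 1).astype(float)


def refine_labels(fdi, line_mask, settings, seed=0):
    """Average the refined maps over several parameter settings (soft mask in [0, 1])."""
    rng = np.random.default_rng(seed)
    return np.mean([refine_once(fdi, line_mask, rng=rng, **s) for s in settings], axis=0)


# Synthetic demo: a bright 3-pixel-wide stripe annotated by a 1-pixel-wide line.
demo_rng = np.random.default_rng(42)
fdi_demo = demo_rng.normal(0.0, 0.01, (64, 64))
fdi_demo[30:33, 10:50] += 0.1
line_demo = np.zeros((64, 64), dtype=bool)
line_demo[31, 10:50] = True
settings_demo = [dict(buffer_px=b, beta=beta, n_markers=m)
                 for b, beta, m in [(2, 50, 20), (3, 130, 30), (4, 300, 40)]]
soft_mask = refine_labels(fdi_demo, line_demo, settings_demo)
```

Averaging the binary maps yields values between 0 and 1, so pixels on which the parameter settings disagree receive intermediate values, which is one way to express the fuzzy borders of debris patches.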
3.1.2. MARIDA
The Marine Debris Archive (MARIDA) was collected by Kikaki et al.
(2022) for developing and evaluating machine learning algorithms for marine
debris detection. MARIDA contains 63 temporally overlapping Sentinel-2
scenes from 12 distinct regions. In total, 6672 polygons were annotated, of
which 1882 are marine debris and 2447 marine water. The remaining 2343
polygons are annotated in one of 13 further classes with between 24 and
356 annotations each that we do not use in this study. We use MARIDA
as an additional training, validation, and evaluation data source, but consider
only patches annotated as marine debris (positive class) and treat instances
of marine water as negatives. The MARIDA dataset contains Sentinel-2
imagery with 11 bands that has been atmospherically corrected with the
ACOLITE (Vanhellemont and Ruddick, 2018) algorithm. In this work, we
want to apply our detector on 12-band Sentinel-2 imagery that has been
atmospherically corrected with Sen2COR (Main-Knorn et al., 2017), as is
readily available, for instance, in Google Earth Engine (Gorelick et al., 2017).
This avoids reprocessing additional imagery after download and simplifies the
application on new scenes. To harmonize this dataset, we re-downloaded all
Sentinel-2 scenes from Google Earth Engine to retrieve 12-band imagery for
MARIDA compatible with the other datasets. Like FloatingObjects, we use
the atmospherically corrected L2A Sentinel-2 imagery whenever available.
We also excluded one scene near Durban from MARIDA (named
S2_24-4-19_36JUN) to avoid spatial overlap and potential positive biases with our
evaluation scene described later in Section 3.2.
Figure 2: Label Refinement Module for the FloatingObjects dataset. It inputs a Sentinel-2
image and the original hand annotation of the FloatingObjects dataset (left). An
Otsu-threshold segmentation (Otsu, 1979) buffered around the hand labels (center) is used to
sample marker points (shown on the right) for a random walk segmentation algorithm
(Grady, 2006) that results in a refined annotated mask (right). By varying parameters,
we generate different variants of the mask, whose average expresses the uncertainty and
fuzziness on the borders of the debris (second row).
[Figure 2 labels: S2 scene, original mask, buffer, Otsu segmentation, markers, input
image, random walk segmentation, refined mask, repeated refinement with different settings.]
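The reduction of the multi-class MARIDA annotations to the binary problem described above can be sketched as a simple label mapping. The class identifiers and the use of an ignore value for unused classes and unlabeled pixels are assumptions made for illustration and should be checked against the MARIDA documentation.

```python
import numpy as np

# Hypothetical class ids; verify against the MARIDA label definitions.
MARINE_DEBRIS_ID = 1
MARINE_WATER_ID = 7
IGNORE = -1  # unused classes and unlabeled pixels (assumption)


def to_binary_mask(marida_mask):
    """Map a MARIDA class mask to 1 (marine debris), 0 (marine water), IGNORE (rest)."""
    binary = np.full(marida_mask.shape, IGNORE, dtype=np.int8)
    binary[marida_mask == MARINE_WATER_ID] = 0
    binary[marida_mask == MARINE_DEBRIS_ID] = 1
    return binary


# Small example mask containing debris, water, two unused classes, and unlabeled pixels.
example = np.array([[0, 1, 7],
                    [2, 5, 1]])
binary_example = to_binary_mask(example)
```

Pixels carrying the ignore value would then be excluded from the loss and from accuracy computation.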
3.1.3. S2Ships
Ships and their wakes can cause false positive predictions of marine debris,
as reported by Mifdal et al. (2021). We decided to explicitly add images of
ships without any annotated marine debris as negative examples. We use the
S2Ships dataset of Ciocarlan and Stoian (2021), who segmented ships in
Sentinel-2 imagery. In our training pipeline, we retrieve these ship positions,
load an image centered on each ship and show it to our detector during
training with a negative prediction mask indicating the class other.
3.2. Validation and Evaluation Sites
For finding the best neural network design and hyperparameters (i.e.,
validation), as well as for the final independent evaluation, we used datasets
with high-quality annotations. For both sets, we combine the MARIDA
dataset, according to its validation and evaluation partitioning schemes,
with a refined version of the FloatingObjects dataset that we describe in
the next Section 3.2.1. For further qualitative evaluation, we additionally
use imagery from the Plastic Litter Projects 2021 and 2022, detailed further
in Section 3.2.2. For both validation and evaluation datasets, we focus on
using accurate annotations and we select only sites with a high probability
of plastic pollution specifically for final evaluation, as detailed further in the
next sections.
3.2.1. RefinedFloatingObjects
We create a refined version of the FloatingObjects dataset (Section 3.1.1)
with less label noise by re-annotating a subset of FloatingObjects
regions with individual point locations that we are certain are
localized accurately on visible marine debris in the imagery. We conduct
this annotation in Google Earth Engine (GEE) (Gorelick et al., 2017) and
select the subset of regions named lagos_20190101, neworleans_20200202,
venice_20180630, and accra_20181031. We also included two new areas, which
are marmara_20210519 and durban_20190424. By carefully annotating these
areas, we are confident that we captured the precise location of the class
marine debris in these Sentinel-2 scenes. To train a model and to calculate
accuracy scores, we also need examples of the negative class other. These
should capture a diverse set of negatives, like open water, land, coastline,
and ships, that are likely to confuse the model. To obtain them, we iteratively
added negative examples while monitoring the result of a smileCart (Breiman
et al., 1984) classifier implemented online in Google Earth Engine. This
classifier serves as a proxy antagonist to us as labelers, i.e., it highlights
areas that appear like marine debris, which are then checked by annotators. We
explicitly added new negative examples in locations where this proxy classifier
incorrectly predicted marine debris. Hence, we captured meaningful negative
point locations of the other class that were difficult to distinguish from the
annotated marine debris by the smileCart classifier.
At validation and evaluation time, we extract 128 px × 128 px patches
centered on each of these annotated points that are labeled as either marine
debris (positive) or other (negative). We can only be certain about the class
at the precise annotations of the point in the center of each image patch.
Hence, we first segment the entire patch using the semantic segmentation
model but then extract the prediction only at the center pixel corresponding
to the annotated point for accuracy estimation. This selection effectively
simplifies the segmentation problem to a classification problem at the center
of the image patch. It allows us to use standard classification metrics to
measure the accuracy (described in Section 3.5).
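The center-pixel evaluation described above can be sketched as follows. The per-patch probability maps are assumed to come from the segmentation model; the metric set and the default threshold of 0.5 are illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np


def center_pixel_probs(prob_maps):
    """Extract the predicted probability at the center pixel of each (N, H, W) patch."""
    _, height, width = prob_maps.shape
    return prob_maps[:, height // 2, width // 2]


def classification_scores(probs, labels, threshold=0.5):
    """Standard binary classification metrics for point-wise annotations."""
    pred = probs >= threshold
    truth = labels.astype(bool)
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    tn = int(np.sum(~pred & ~truth))
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    return dict(accuracy=(tp + tn) / truth.size, precision=precision, recall=recall, f1=f1)


# Four synthetic patch predictions; only the center pixel (64, 64) carries a label.
prob_maps = np.zeros((4, 128, 128))
prob_maps[:, 64, 64] = [0.9, 0.8, 0.1, 0.7]
prob_maps[2, 0, 0] = 1.0  # off-center predictions do not influence the score
labels = np.array([1, 1, 0, 0])
scores = classification_scores(center_pixel_probs(prob_maps), labels)
```

Because only the annotated center pixel is scored, predictions elsewhere in the patch, where the true class is unknown, neither reward nor penalize the model.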
Among the six regions in this dataset, we use the Sentinel-2 scenes
lagos_20190101, neworleans_20200202, venice_20180630, and marmara_20210519 for
validation, as we are not certain about the composition of the visible marine
debris in these images. For instance, marmara_20210519 likely contains
floating algae (sea snot), as it coincides with reported algae blooms (Kuebler,
2021), which are often present in this area (Hu et al., 2022). We use the
accurate annotations of this generic marine debris in these areas to calibrate
the model hyperparameters, such as the classification threshold, before final
evaluation.
For evaluation, we use the scenes accra_20181031 and durban_20190424,
as these areas very likely contain plastics in the marine debris:
Evaluation Scene Accra, Ghana, 2018-10-31. Beach surveys in
2013 showed that plastic materials made up the majority (63.72%) of
marine debris washed onto the evaluated beaches (Van Dyck et al., 2016).
A recent study (Pinto et al., 2023) estimated the plastic mass
transport in the Odaw river running through Accra into the
sea at between 140 and 380 kilograms per day. Qualitatively, one particular
area in this Sentinel-2 scene, shown in Fig. 3 (top), shows an outwash
of debris from the coast. In this image, the marine debris is visible
in yellow (high floating debris index FDI). We show a high-resolution
background map from Google Satellites for land and shoreline to provide
a reference. Two zoomed-in areas (named 1 and 2 in Fig. 3) show
that coastal erosion is visible alongside aggregations of waste and sewage
outflows. Finally, a Google Street View image (bottom row of Fig. 3)
further confirms this area’s general pollution level. Only a Sentinel-2
image at the top-of-atmosphere processing level (L1C) is available in
Google Earth Engine in Accra.
Evaluation Scene Durban, South Africa, 2019-04-24. This eval-
uation scene was first identified by Biermann et al. (2020), who used
social media and news reports to select areas of plastic pollution. It
covers marine debris that likely contains plastic litter from a flood event
in Durban following heavy rainfall starting on April 18th 2019. This
flood discharged large quantities of debris into the harbor of the Dur-
ban Metropole, as shown in Fig. 4. We acquired one Sentinel-2 image
from April 24th, shown in Fig. 4c, where visible debris originates from
the harbor area (highlighted in gray). The debris in this image likely
contains plastic litter. This image is particularly difficult to predict,
as clouds and haze from the preceding rainfall are still visible in this
scene. The patches of marine debris visible in the FDI representation
are less pronounced than in the Accra scene, which has more clearly
identifiable objects. In this area, both top-of-atmosphere (L1C) and
bottom-of-atmosphere (L2A) Sentinel-2 images are available. We compare
the model performance on both versions in Section 4.2.
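Both scene descriptions rely on the floating debris index (FDI) of Biermann et al. (2020), which subtracts a baseline interpolated between the red edge and short-wave infrared bands from the near-infrared reflectance. A minimal per-pixel sketch is given below. The band assignment (B6, B8, B11), the approximate central wavelengths, and the function name are assumptions made for illustration and are not taken from the code released with this work.

```python
# Approximate Sentinel-2 central wavelengths in nanometres (assumed values).
LAMBDA_RED = 664.6     # B4
LAMBDA_NIR = 832.8     # B8
LAMBDA_SWIR1 = 1613.7  # B11


def floating_debris_index(re2: float, nir: float, swir1: float) -> float:
    """FDI: NIR reflectance minus a baseline interpolated from RE2 and SWIR1.

    re2, nir and swir1 are reflectances of bands B6, B8 and B11 (assumed).
    """
    scale = (LAMBDA_NIR - LAMBDA_RED) / (LAMBDA_SWIR1 - LAMBDA_RED) * 10.0
    nir_baseline = re2 + (swir1 - re2) * scale
    return nir - nir_baseline
```

With this definition, a pixel whose near-infrared reflectance rises above the interpolated baseline, as expected for floating material, receives a positive FDI.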
3.2.2. Plastic Litter Projects
The third evaluation area covers Sentinel-2 data showing explicitly de-
ployed debris targets in the Plastic Litter Projects of 2021 and 2022 (Topouzelis
et al., 2019, 2020b; Papageorgiou et al., 2022) on the island of Lesbos, Greece.
In 2021, one 28 m diameter high-density polyethylene (HDPE) mesh was de-
ployed on June 8th 2021, followed by a 28 m wooden target on June 17th
2021. Both were visible during 22 Sentinel-2 satellite overpasses until 7th of
October 2021. In the Plastic Litter Project 2022, one 5 m × 5 m inflatable
PVC target and two 7 m diameter HDPE meshes were deployed on
June 16th 2022. One HDPE mesh was cleaned regularly, while the other was
subject to natural fouling and algae. The objects were deployed until the 11th
of October 2022 and were visible in 23 Sentinel-2 acquisitions. Additional
smaller 1 m² and 3 m² targets were also deployed throughout the project
phase to study visibility and the material’s decomposition in water but were
too small to be visible in the Sentinel-2 scenes. We use the Sentinel-2 data
of the 2021 campaign to qualitatively test the ability of our detector and
comparison models to detect the deployed targets in the Sentinel-2 imagery.
Figure 3: Evaluation scene in Accra, Ghana. The top row shows an FDI visualization of
the Sentinel-2 image of October 31st 2018, where marine debris is washed into the open
waters. Closer investigations with high-resolution satellite images (center row) show that
coastal erosion is present, and this area is generally polluted with human litter. This is
also confirmed by a Google Street View image shown in the bottom row.
Figure 4: Evaluation scene from Durban, South Africa. (a) Photo: Ash Erasmus. (b) Photo:
Ash Erasmus. (c) Sentinel-2 evaluation scene with debris annotations. Additional imagery
shared by local news and social media (top row) shows the level of plastic pollution on
24th of April 2019. The bottom image shows the corresponding Sentinel-2 scene we
use for evaluation.
3.3. Marine Debris Detector Implementation
This section describes the implementation of the Marine Debris Detector
as a deep segmentation model that takes a 12-channel Sentinel-2 image as input and
estimates, for each pixel, the probability that marine debris is present.
3.3.1. Segmentation Model Architectures
We implemented the Unet (Ronneberger et al., 2015) and Unet++
(Zhou et al., 2018) architectures, as shown in Fig. 5. The Unet segmentation
model of Ronneberger et al. (2015) was developed for medical image
segmentation and is heavily used in remote sensing due to the fine-grained
segmentation masks it can produce. The success of the Unet is strongly
related to its early skip connections, which help maintain the details of the
image in the final map: skip connections enable the propagation of
a high-resolution representation of the input image through the entire network.
This network was used previously by Mifdal et al. (2021) for
marine debris detection.
The Unet++ (Zhou et al., 2018) variant extends the original Unet
by replacing the original encoder with a ResNet (He et al., 2016) with
four blocks (indicated as l_i). ResNets are the de facto standard feature
extractor in computer vision, as they can learn complex representations while
requiring fewer weights than many earlier networks. The decoder consists of
three double-convolutional blocks (indicated with b_i). Each block consists of
two convolution-batchnorm-ReLU transformations. While the original Unet
directly connects the output of each encoder layer with the corresponding
decoder layer of the same resolution, the Unet++ adds additional double-conv
blocks in these skip pathways that are connected densely in the spirit of
DenseNet neural networks (Zhu and Newsam, 2017).
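The dense skip pathways can be made concrete by listing which feature maps feed each intermediate block. The sketch below follows the node notation of Zhou et al. (2018), where X(i, j) is the j-th block at resolution level i and column j = 0 holds the encoder outputs. The four-level depth and the function name are illustrative assumptions and do not claim to reproduce the exact configuration used in this work.

```python
def unetpp_inputs(levels: int = 4) -> dict:
    """Map each skip/decoder node (i, j) to the nodes whose outputs it receives.

    A node (i, j) with j >= 1 concatenates all earlier nodes of the same
    level, (i, 0) .. (i, j - 1), with the upsampled output of the node one
    level below, (i + 1, j - 1).
    """
    inputs = {}
    for j in range(1, levels):
        for i in range(levels - j):
            same_level = [(i, k) for k in range(j)]
            inputs[(i, j)] = same_level + [(i + 1, j - 1)]
    return inputs
```

For four levels, the final node at full resolution, (0, 3), receives (0, 0), (0, 1), (0, 2) and the upsampled (1, 2). A plain Unet would pass only the encoder output (0, 0) and the upsampled deeper feature map to that block.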
[Figure 5 diagram labels: input image (12x128x128); Marine Debris Detector (deep segmentation model); UNet model with double conv blocks; UNet++ model with a resnet encoder (four basic blocks as encoder layers) and dense connections; output debris probability (128x128)]
Figure 5: Schematic of the Marine Debris Detector implementation with an underlying
Unet (Ronneberger et al., 2015) or Unet++ (Zhou et al., 2018) segmentation model.
A 12-channel input image (top-left) is encoded to hidden feature representations at several
levels of resolution (vertical pathways) and decoded to a probability of marine debris
(top-right). Higher-resolution pathways ensure that the resulting segmentation map is
fine-grained, while lower-resolution pathways encode global information on the entire scene.
Unet++ (Zhou et al., 2018) extends the original Unet (Ronneberger et al., 2015) by adding
dense connections in the skip pathways, indicated in blue.
3.3.2. Implementation and Training Details
We train Unet and Unet++ models with a learning rate of 0.01 and
a weight decay of 1 × 10⁻⁶ for 100 epochs. The Unet implementation in this
work has 31 million trainable parameters, while the Unet++ has 26 mil-
lion parameters. Regarding the label refinement module (Section 3.1.1), we
compute multiple refined segmentation masks with different parameters and
choose a buffer size of 0, 1, or 2 pixels, the β-parameter of the random
walker (a penalization coefficient for the walker motion) of 1 or 10, and the
marker density for marine debris of 5%, 25%, 50% or 75% (the density of
other markers is fixed at 5%). Combined with the original mask, this yields
25 different target masks consistent with the hand annotations and the FDI
image but of varying shapes and sizes, as shown in the bottom row of Fig. 2.
During training, we choose one of these target masks randomly, which, in
our opinion, reflects best the undefined borders of the marine debris that we
aim to detect and acts as a form of natural label-data augmentation. During
training, we monitor the area under the ROC curve (AUROC) on the refined
FloatingObjects dataset (Section 3.2.1) and MARIDA validation set. We
store the model weights each time the highest (best) validation AUROC has
been reached. We observe that the model systematically underestimates the
probability of marine debris due to a heavy class imbalance in the training
data. This results in a high precision but low recall when we assign the class
marine debris for probability scores above 0.5. We counteract this imbalance
by calibrating the classification threshold to balance precision and recall on
the validation set.
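The count of 25 target masks follows from the parameter grid described in this paragraph: 3 buffer sizes × 2 values of β × 4 marker densities gives 24 refined masks, plus the original mask. The sketch below enumerates these settings and draws one per training step. The dictionary layout and function names are hypothetical, and uniform sampling is an assumption, since the text only states that one mask is chosen randomly.

```python
import itertools
import random

BUFFER_SIZES = (0, 1, 2)                            # pixels
BETAS = (1, 10)                                     # random-walker penalization
DEBRIS_MARKER_DENSITIES = (0.05, 0.25, 0.50, 0.75)  # other markers fixed at 5%


def refinement_settings():
    """Return the original mask plus every refinement parameter combination."""
    grid = itertools.product(BUFFER_SIZES, BETAS, DEBRIS_MARKER_DENSITIES)
    settings = [{"refined": False}]
    settings += [
        {"refined": True, "buffer": b, "beta": beta, "density": d}
        for b, beta, d in grid
    ]
    return settings


def sample_target(settings, rng):
    """Pick one target-mask variant uniformly at random for a training step."""
    return rng.choice(settings)
```

Calling `sample_target(refinement_settings(), random.Random(0))` returns one of the 25 variants, which would then select the corresponding precomputed mask.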
For the Unet++ model, we trained models from different random seeds
with validation-optimal thresholds of 0.132, 0.0639, and 0.0254 during the
experiments shown in this paper. For the Unet, the thresholds were 0.0895,
0.0712, and 0.0643.
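Threshold calibration of this kind can be sketched as a sweep over candidate thresholds on the validation scores. The selection rule used here, the smallest absolute gap between precision and recall, is an assumption, as the exact criterion is not spelled out in the text. The function names are illustrative.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when scores above the threshold count as debris."""
    tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s <= threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def calibrate_threshold(scores, labels):
    """Return the candidate threshold with the smallest precision-recall gap."""
    best_threshold, best_gap = None, float("inf")
    for t in sorted(set(scores)):
        p, r = precision_recall(scores, labels, t)
        if p == 0.0 and r == 0.0:
            continue  # nothing is predicted as debris at this threshold
        gap = abs(p - r)
        if gap < best_gap:
            best_threshold, best_gap = t, gap
    return best_threshold
```

On validation scores that are systematically low for the debris class, such a sweep returns a threshold well below 0.5, in line with the values reported above.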
Training a Unet++ and a Unet took eight and nine hours, respectively, on an NVIDIA
RTX 3090 graphics card with multi-threaded data loading with 32 workers.
The estimated carbon footprint for one model training run was 2.8 kg CO2e.
3.4. Comparison Methods
We compare models trained within our training framework to approaches
from recent literature: the Unet trained by Mifdal et al. (2021)
on the original FloatingObjects dataset, and a Random Forest classifier, denoted
by rf, trained on the original MARIDA dataset (Kikaki et al., 2022).
For the Unet, we use the pre-trained weights provided for their model. As
for our segmentation models, we determine the best classification
threshold on the validation set to achieve results with balanced precision
and recall, which yields a threshold of 0.039. For the random forest classifier (rf), we
train the random forest on 11 Sentinel-2 bands with 12 output classes, as in
the original paper, and combine the predictions into a binary scheme by
considering marine debris as the positive class and treating all other 11 non-debris
classes as other. In the results section, we denote these two models as Unet
and rf and indicate that they have been trained on the “original data” of
their respective papers.
We also train the random forest on the combined training dataset described
in Section 3.1, which we denote as “trained on our dataset”. For the
random forest, we use the same feature extraction pipeline as described
in Kikaki et al. (2022), which results in 26 features containing the original
spectral bands, spectral indices, and textural features. As the random forest
is a pixel-wise classifier, we treat each pixel separately and create a roughly
balanced training pixel dataset from our image training dataset. We select
five positive pixels (annotated as marine debris) and five negative (other)
pixels from each image. This results in 70 000 training pixels. As for the
other comparison approaches, we tune the classification threshold on
the validation dataset, which yields a threshold of 0.663.
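The per-image pixel sampling for the random forest can be sketched as follows. The function name, the nested-list mask format, and the handling of images with fewer than five pixels of a class are assumptions made for illustration; the text only states that five positive and five negative pixels are selected from each image.

```python
import random


def sample_balanced_pixels(mask, per_class=5, rng=None):
    """Draw up to `per_class` debris and `per_class` other pixels from a mask.

    `mask` is a 2D list of 0/1 labels (1 = marine debris). The result is a
    list of (row, col, label) tuples.
    """
    rng = rng or random.Random()
    positives, negatives = [], []
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            (positives if label == 1 else negatives).append((r, c, label))
    chosen = rng.sample(positives, min(per_class, len(positives)))
    chosen += rng.sample(negatives, min(per_class, len(negatives)))
    return chosen
```

The sampled coordinates would then be used to look up the 26 features of each pixel. Images that contain fewer than five debris pixels contribute fewer positives in this sketch, which is one way a dataset could end up only roughly balanced.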
3.5. Evaluation Metrics
We compare all models trained on “original data” and “our dataset” on
several metrics on the evaluation sets of Durban, Accra, and the MARIDA
test partitions.
The overall accuracy is the ratio of correct classifications to
total samples. It is straightforward to interpret, but susceptible to
class imbalance. Our selected validation and evaluation sets, however,
have a general balance between positive and negative samples.
The f-score is the harmonic mean of precision and recall that, in
contrast to individual precision and recall scores, is more robust to the
choice of the classification threshold.
The area under the receiver operating characteristic curve (auroc) is a metric that
is independent of the classification threshold but easily saturates for
relatively accurate classifiers with values close to 1.
The jaccard index, also known as intersection over union, is commonly
used for object detection and measures the size of the intersection
of two sets (predictions and ground truth) divided by the size of their union.
The kappa statistic compares two classifiers: the model and a randomly
guessing baseline. Values of zero indicate that the tested model
is not better than a random baseline, while positive values indicate
that the tested model outperforms the trivial baseline.
Higher values are better for all metrics, and values of 1 indicate a perfect
prediction.
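The metrics listed in this section can be written down compactly for the binary case. The sketch below uses the standard textbook definitions and is not taken from the evaluation code of this work; the function names and the dictionary keys are illustrative.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F-score, Jaccard index and Cohen's kappa from pixel counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    jaccard = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    # Agreement expected by chance, from the marginal class frequencies.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0
    return {"accuracy": accuracy, "f_score": f_score,
            "jaccard": jaccard, "kappa": kappa}


def auroc(scores, labels):
    """AUROC as the probability that a debris pixel outranks an other pixel."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUROC is quadratic in the number of pixels and is meant for illustration only; on full scenes a rank-based implementation would be preferable.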
4. Results
We first compare the models quantitatively and qualitatively in Section 4.1.
We then run the detector on one entire Sentinel-2 scene (Durban) in Section 4.2
and quantify the false positive predictions on both bottom-of-atmosphere and
top-of-atmosphere Sentinel-2 imagery. In the final experiment (Section 4.3), we
test how a re-trained 4-channel detector can predict marine debris on
higher-resolution PlanetScope imagery, which can complement Sentinel-2 imagery
in practice.
4.1. Numerical Comparisons
Table 1 shows the quantitative results of rf and Unet models trained
on the respective original data in comparison to rf, Unet, and Unet++
trained with our training setting on the combined training dataset and
refinement strategies described in Section 3.1. We see that models trained in
our combined training framework achieve the best accuracy metrics in all
experiments, including those where the label refinement is not used (column “no-ref”).
As expected, the deep learning-based Unet and Unet++ models
outperform the pixel-wise random forest classifier. This is likely due to the
ability of convolutional neural networks to learn spatial patterns within
their convolutional receptive field. Both Unet and Unet++ achieve equal
accuracies within one standard deviation on the MARIDA test set, while the
Unet++ achieves a better accuracy on the Durban and Accra scenes. The
label refinement module also improves the Unet++ performance on the MARIDA
test set and Durban. However, on Accra, the best scores are achieved with a
Unet++ model without refinement module (indicated by “no-ref”). For
the remainder of the paper, we use the Unet++ model in the Marine Debris
Detector, as it has fewer parameters and finds an optimum earlier and more
consistently between random seeds (1σ standard deviation shown) than the
Unet in the training process, as shown in the bottom plot of Table 1.

            trained on original data   our train set
ACCURACY    0.653    0.882             0.680    0.924 ± 0.016    0.930 ± 0.016    0.948 ± 0.008
F-SCORE     0.464    0.871             0.545    0.920 ± 0.018    0.926 ± 0.018    0.948 ± 0.008
AUROC       0.246    0.965             0.899    0.978 ± 0.008    0.981 ± 0.006    0.989 ± 0.005
JACCARD     0.302    0.772             0.374    0.852 ± 0.030    0.862 ± 0.031    0.900 ± 0.014
KAPPA       0.301    0.764             0.357    0.848 ± 0.031    0.859 ± 0.031    0.897 ± 0.017

            trained on original data   our train set
ACCURACY    0.781    0.587             0.811    0.908 ± 0.010    0.934 ± 0.018    0.905 ± 0.011
F-SCORE     0.105    0.497             0.708    0.756 ± 0.032    0.837 ± 0.053    0.776 ± 0.026
AUROC       0.376    0.765             0.862    0.850 ± 0.030    0.914 ± 0.018    0.886 ± 0.053
JACCARD     0.055    0.330             0.548    0.609 ± 0.042    0.722 ± 0.048    0.635 ± 0.034
KAPPA       0.082    0.245             0.569    0.704 ± 0.037    0.797 ± 0.063    0.717 ± 0.031

Marida-test set
            trained on original data   our train set
ACCURACY    0.697    0.838             0.811    0.865 ± 0.006    0.867 ± 0.005    0.851 ± 0.006
F-SCORE     0.288    0.701             0.708    0.741 ± 0.012    0.749 ± 0.009    0.710 ± 0.015
AUROC       0.488    0.764             0.862    0.738 ± 0.012    0.746 ± 0.021    0.733 ± 0.006
JACCARD     0.168    0.539             0.548    0.589 ± 0.015    0.598 ± 0.012    0.551 ± 0.018
KAPPA       0.197    0.593             0.569    0.654 ± 0.016    0.661 ± 0.012    0.615 ± 0.017

[Plot: validation loss during training (axis ticks 20 to 100); curves: unet++ mean, unet++ std, unet mean, unet std]
Table 1: Quantitative comparison of models trained on original data (rf (Kikaki et al.,
2022), Unet (Mifdal et al., 2021)), versus models trained on the training data compiled in
this work. We also test a Unet++ model without label refinement module, indicated by
the “no-ref” suffix in the last column. The bottom plot shows the validation loss during
training of three Unet++ and three Unet models. The Unet++ finds an optimum
earlier and has less variance (shown as 1σ standard deviation) between the models in the
early stages of training.
[Figure 6 column labels: input (12 bands): RGB, FDI; target: label; model predictions: Unet++, no-ref, rf (our training data) and Unet (FlObs-only)]
Figure 6: Qualitative predictions of the three models on images, each covering 2.56 km
by 2.56 km, from the Accra and Durban sets. Our Unet++ produces marine debris
predictions similar to the hand annotations (target/label) with the fewest false positives.
An interactive qualitative comparison is available under https://marcrusswurm.users.
Figure 6 compares models qualitatively on selected patches of 256 px × 256 px,
each covering 2.56 km by 2.56 km. The tiles are from the Accra and Durban
evaluation scenes, where it is highly plausible that plastic pollution is
present in marine debris. We compare the Unet++ model with and without
label refinement, the random forest rf with features of (Kikaki et al., 2022),
trained on our dataset, and the Unet from Mifdal et al. (2021) trained on
the original FloatingObjects (FlObs) dataset only. The first two columns
show RGB and FDI representations of the multi-spectral Sentinel-2 scenes.
The third column shows hand-annotated masks (shown in red). We generally
see the quantitative results mirrored in these qualitative examples, where the
deep learning model trained on our combined training set produces the most
truthful masks of floating marine debris. While none of the models captured
the hand annotations perfectly, the Unet++ produced the visually most
accurate predictions with the fewest false positives across most evaluation
scenes. The Unet++ without label refinement (indicated by “no-ref”) pro-
vides generally thinner predictions than the Unet++ with refinement mod-
ule, which we connect to the refinement module always enlarging the target
mask of marine debris to some degree during training. In Accra-1, Unet++
and Unet (Mifdal et al., 2021) capture the general location of the objects,
while the random forest rf (Kikaki et al., 2022) detected natural waves along
the entire coastline as marine debris. The Unet++ without refinement mod-
ule appears to merge multiple patches of debris here and does not accurately
capture the individual objects. Accra-2 shows several sargassum patches in
between ships. Generally, all models predict these patches well, although
some ships are still confused with marine debris. The Durban scenes are more
challenging and show more atmospheric perturbations through clouds and
haze. The Unet++ predicts the general locations of the annotated marine
debris well until the cloud coverage is too dense, as seen in Durban-3. The
original Unet (Mifdal et al., 2021) predicts a large number of false positives,
which was also stated as a limitation in their original work. The random
forest rf of Kikaki et al. (2022) tends to under-predict the marine debris in
all three Durban scenes and only identifies a few individual floating object
patches in Durban-1.
Finally, we compare different Unet++ models trained on different ini-
tialization seeds, with and without label refinement on images of the Plastic
Litter Projects 2021 (Fig. 7). Most models capture the general location of
the deployed targets on all scenes. However, some models (seed 3; no la-
bel refinement and seed 2 with label refinement) confuse the coastline and
some water areas for marine debris. Seed 1 with label refinement appears to
miss the deployed targets on June 21st and July 1st, similarly to the model
trained on seed 2 with label refinement on July 1st. Similarly to the previous
result, models trained with refined labels predict larger but also less defined
patches compared to models trained without. This experiment demonstrates
the challenges associated with detecting individual objects that span only
a few pixels. However, we would like to highlight that these deployed targets
are not representative of the marine debris seen in open waters, on which the
models have been trained. Debris in open waters typically forms long lines rather
than round shapes, and we believe that the difference in geometrical shape,
rather than spectral appearance, is a major feature that the deep learning
models use for their predictions.
[Figure 7 panel labels: columns RGB, FDI, with label refinement (seeds 1 to 3), no label refinement (seeds 1 to 3); rows June 11th, June 21st, July 1st, July 31st, Aug. 10th]
Figure 7: Classification probabilities for Sentinel-2 scenes of deployed targets during the
Plastic Litter Projects 2021 (Topouzelis et al., 2019). All models assign higher probabilities
to the deployed targets. Still, only a few models detect both targets. Other pixels, such
as coastlines, are sometimes assigned a higher marine debris probability. Models trained
with the label refinement module tend to predict larger patches with less spatial detail.
4.2. Role of Atmospheric Correction
In this experiment, we follow a realistic deployment scenario and predict
the entire Durban scene of 3122 px × 3843 px with the Unet++ model
in overlapping 480 px × 480 px patches. We then consider pixels predicted
with a probability higher than the prediction threshold and treat each local
maximum as a marine debris detection. We set a minimum distance of 3 px
between local maxima to avoid marine debris detections being too close to
each other. Furthermore, we compare predictions of the same model using either
a top-of-atmosphere (TOA) Sentinel-2 scene or a bottom-of-atmosphere
(BOA) atmospherically corrected Sentinel-2 scene to assess the effect of
atmospheric correction on the model predictions.
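The scene-level prediction described in this paragraph involves two steps that can be sketched independently of the model: placing overlapping patches over the scene, and turning the probability map into point detections. The code below is a minimal pure-Python illustration. The overlap of 64 px, the square neighborhood used for the local-maximum test, and both function names are assumptions, as the text specifies only the patch size, the threshold rule, and the minimum distance of 3 px.

```python
def patch_origins(length, patch=480, overlap=64):
    """Start offsets of overlapping patches that cover `length` pixels."""
    if length <= patch:
        return [0]
    stride = patch - overlap
    origins = list(range(0, length - patch, stride))
    origins.append(length - patch)  # last patch is aligned with the border
    return origins


def local_maxima(prob, threshold, min_distance=3):
    """Detections: pixels above `threshold` with no higher pixel nearby.

    A pixel is kept if no pixel within `min_distance` rows and columns has a
    larger probability. Ties go to the first pixel in scan order.
    """
    rows, cols = len(prob), len(prob[0])
    detections = []
    for r in range(rows):
        for c in range(cols):
            p = prob[r][c]
            if p <= threshold:
                continue
            is_peak = True
            for rr in range(max(0, r - min_distance), min(rows, r + min_distance + 1)):
                for cc in range(max(0, c - min_distance), min(cols, c + min_distance + 1)):
                    q = prob[rr][cc]
                    if q > p or (q == p and (rr, cc) < (r, c)):
                        is_peak = False
            if is_peak:
                detections.append((r, c))
    return detections
```

For the 3122 px scene dimension, `patch_origins(3122)` places eight patches, the last of which ends exactly at the scene border. In practice, an array-based peak finder would replace the nested loops.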
We show both images alongside the locations of detections (scatter points)
in Figs. 8a and 8b, respectively. The red scatter points indicate correctly detected
marine debris (debris). Points of other colors indicate false positives from the
classes transparent haze (t.hz.), dense haze (d.hz.), cumulus clouds (clouds),
ships, land, coastline (coast), and water.
Figure 8: Analysis of confusions of detections in atmospherically corrected
bottom-of-atmosphere (BOA) and uncorrected top-of-atmosphere (TOA) Sentinel-2 imagery of
the Durban scene. (a) TOA: top-of-atmosphere Sentinel-2 scene. (b) BOA: bottom-of-atmosphere
Sentinel-2 scene. (c) Number and confusions of detections. Classes evaluated are marine
debris (the correct class) and confusions with transparent haze (t.hz.), dense haze (d.hz.),
cumulus clouds (clouds), ships, land, coastline (coast), and water. In panels (a) and (b),
detections are colored according to the classes of panel (c).
Figure 8c further shows a quantitative summary of the confusion between
classes. We generally see a comparable number of marine debris detections
at both BOA (136 detections) and TOA (164 detections) processing levels.
This shows that the classifier is sensitive to marine debris in both
top-of-atmosphere and bottom-of-atmosphere satellite imagery. However, predictions
based on top-of-atmosphere data had more false positive predictions,
leading to a lower precision. This is especially visible in the transparent haze
(t.hz.) and dense haze (d.hz.) categories, as well as in water, as shown in
the bar plot of Fig. 8c. Overall, and not shown in the figure, 609 objects
were detected as marine debris in the bottom-of-atmosphere (BOA) scene and 1484 objects
in the top-of-atmosphere scene. For comparison, the Unet
trained only on the FloatingObjects dataset (Mifdal et al., 2021) detected
20 830 objects in the BOA scene and 33 665 at TOA processing level, which amounts to
more than one order of magnitude more false positive predictions than
the Unet++ shown in Fig. 8. This further demonstrates the importance
of compiling larger and more precise training datasets with a rich pool
of negative examples that account for objects easily confused with marine
debris. It also demonstrates the current limitations and general difficulty of
detecting marine debris automatically on Sentinel-2 imagery with the current
technology. The extreme imbalance between a very low number of marine debris
pixels (if any) and everything else visible in the Sentinel-2 scene poses a
severe challenge to the automated detection of marine debris. Overall in this
experiment, only 6448 of 11 997 846 pixels were annotated as marine debris,
which represents coverage of only 0.05%. In this circumstance, identifying
fewer than 1000 potential objects in a 31 km by 38 km scene is an achievement and
allows these detections to be validated visually with limited manual effort in practice.
This effort can be further reduced by additional targeted post-processing
that masks clouds, land, and shoreline explicitly, which we consider outside
the scope of this work.
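The figures quoted in this section can be checked with a few lines of arithmetic. The snippet below uses only numbers stated in the text; the variable names are illustrative.

```python
# Scene size and annotated debris pixels of the Durban scene.
scene_pixels = 3122 * 3843
debris_pixels = 6448
coverage_percent = 100 * debris_pixels / scene_pixels

# Total detections reported per processing level: (Unet++, Unet of Mifdal et al.).
detections = {"BOA": (609, 20830), "TOA": (1484, 33665)}
ratios = {level: unet / unetpp for level, (unetpp, unet) in detections.items()}

print(scene_pixels)                # 11997846
print(round(coverage_percent, 2))  # 0.05
print({k: round(v, 1) for k, v in ratios.items()})
```

The ratios come out at roughly 34 for the BOA scene and roughly 23 for the TOA scene, consistent with the statement that the comparison Unet produces more than an order of magnitude more detections.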
4.3. Transferability to PlanetScope Resolution
In this final experiment, we test how well the Unet++ model trained on
Sentinel-2 imagery can predict on PlanetScope without being fine-tuned on
PlanetScope imagery specifically. For this experiment, we had to downsample
the PlanetScope imagery from 3 m to 5 m, as the resolution gap between
the 10 m resolution used for training and the full 3 m PlanetScope imagery was too large. At
the original resolution, the model created artifacts in the predictions, which
disappeared with the PlanetScope imagery downsampled to 5 m. For the Sentinel-2
image, we use the same model with 12 input channels as in the previous
experiments. For the 4-channel PlanetScope imagery, we re-trained the
Unet++ model on the identical Sentinel-2 training data but removed all
spectral bands except B2, B3, B4, and B8 for RGB+NIR. This 4-channel
model achieves a slightly worse validation accuracy (0.01-0.03 in f-score)
than the 12-channel model. This slight decrease in accuracy also indicates
that the four high-resolution 10 m bands are the most informative for marine
debris detection, which is reasonable given the small size of debris and
previous literature (Biermann et al., 2020).
Figure 9: A four-channel RGB+NIR model trained on Sentinel-2 imagery can classify
marine debris in PlanetScope images downsampled to 5 m × 5 m. We showcase two use cases.
(a) Double acquisition of Sentinel-2 and PlanetScope of sargassum patches in Accra
(2018-10-30) with a delay of 4 minutes and 32 seconds; the simultaneous acquisition
of S2 and PS shows the drift direction of the sargassum patches. (b) Daily PlanetScope
imagery fills the observation gaps of Sentinel-2 (every 5 days) for the Plastic
Litter Project (island of Lesbos, Greece), where marine plastic and wooden targets were
deployed in 2022 (Topouzelis et al., 2019); PlanetScope images augment the S2 observations.
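The restriction to the four 10 m bands described in this section can be sketched as a simple channel selection on a channels-first image. The channel order of the 12-band input assumed below is illustrative and may differ from the order used in the released code; the function name is hypothetical.

```python
# Assumed channel order of the 12-band Sentinel-2 input (illustrative only).
S2_BANDS = ["B1", "B2", "B3", "B4", "B5", "B6", "B7", "B8",
            "B8A", "B9", "B11", "B12"]
RGB_NIR = ["B2", "B3", "B4", "B8"]  # blue, green, red, near-infrared


def select_bands(image, band_names=S2_BANDS, keep=RGB_NIR):
    """Return the channels of a channels-first image that are listed in `keep`."""
    indices = [band_names.index(name) for name in keep]
    return [image[i] for i in indices]
```

Applying `select_bands` to every training image yields the 4-channel input for the re-trained model, and the same selection matches the four PlanetScope channels at inference time.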
We consider two use cases in Fig. 9, where PlanetScope imagery comple-
ments Sentinel-2.
First, double acquisitions of Sentinel-2 and PlanetScope during the
same day can be used to determine the debris’s short-term surface drift
direction. Fig. 9a shows one PlanetScope image with a corresponding Sentinel-2
image over Accra, Ghana, on 30th of October 2018, with a time difference of
four minutes and 32 seconds. Both models detected marine debris,
as visible in the probability map.
Second, daily PlanetScope imagery can be used to gap-fill the periods
between Sentinel-2 acquisitions (every five days). This is demonstrated
in Fig. 9b, where the deployed targets from the Plastic Litter
Project 2022 are predicted from Sentinel-2 and PlanetScope imagery
with the Unet++ model. The Sentinel-2 images are available only
on July 16th and 21st. Daily PlanetScope imagery can fill this tempo-
ral gap and enable continuous monitoring of the deployed targets at
a higher spatial-, but lower spectral resolution. We can see that the
4-channel model successfully predicts marine debris for the rectangu-
lar 5 m ×5 m inflatable PVC target deployed during the Plastic Litter
Project. The two circular (7 m diameter) HDPE-mesh targets are not
With these two examples, we emphasize that the Unet++ model
in our Marine Debris Detector trained on Sentinel-2 imagery worked with
PlanetScope images without explicitly having seen annotated PlanetScope
imagery. This highlights the broader applicability of the Unet++ model across
both satellite modalities and the synergy between the PlanetScope and Sentinel-2
satellite constellations for marine debris detection.
5. Discussion
This work presented and evaluated a training strategy including a dataset,
targeted negative sampling and a segmentation model to automatically iden-
tify marine debris of human or natural origins with readily available Sentinel-
2 imagery. Our main contribution is the aggregation and harmonization of all
annotated Sentinel-2 data for marine debris detection available today. We de-
signed a sampling rule to gather a large number of diverse negative examples
and a refinement module to automatically improve hand-annotations present
in current datasets, which yields a combined training dataset in which deep
learning models achieve the best results across different model architectures.
The model performances were compared quantitatively and qualitatively on
evaluation scenes where the visible marine debris in these scenes is highly
likely to contain plastic pollutants. The performance improvements observed
are consistent across datasets and model settings. They highlight the impor-
tance of designing good datasets for the tasks at hand and prove the necessity
to collect, aggregate and further refine globally distributed datasets of marine
debris in future research.
Role of atmospheric correction. Atmospheric correction with Sen2Cor
has proven beneficial in reducing the number of false positive examples and
improving precision. Still, the detector remained sensitive to marine debris
with top-of-atmosphere data as well, which highlights the sensitivity
of the model to marine debris. We believe that reliably detecting marine
debris from available satellite data is within reach with more annotation and
targeted post-processing, such as automatic masking of clouds, land, and
shoreline, which we considered beyond the scope of this work. In this work,
we trained the detector with Sentinel-2 images of both top-of-atmosphere
(L1C-level) and bottom-of-atmosphere (L2A-level with the Sen2Cor algo-
rithm) to ensure that the final model is capable of detecting marine debris
from Sentinel-2 imagery at different processing levels. However, further at-
mospheric correction specific for coastal and aquatic environments, as with
the ACOLITE algorithm (Vanhellemont and Ruddick, 2016), is likely to im-
prove the detection accuracy further.
Marine debris as proxy for marine litter. The detection of marine
debris remains a proxy objective targeted toward the long-term goal of en-
abling continuous monitoring of marine litter including plastics and other
anthropogenic pollutants from medium-resolution satellite data. Here, auto-
matically establishing the link between detected marine debris and marine
pollution is a key question to be addressed in the future. Similar to related
work (Biermann et al., 2020), we analyzed social media (Durban scene) and
in-situ studies (Accra scene) on a case-by-case basis to deduce that marine
plastics are present in marine debris visible in the satellite scenes. Automat-
ing this connection remains a challenge that may require integrating in-situ
knowledge (citizen science, or river monitoring) or a targeted acquisition and
analysis of high-resolution imagery. Studies (Cózar et al., 2021; Ruiz et al.,
2020) have demonstrated that plastics are present in marine debris by on-
site ship-based collection. This establishes that marine debris detection is a
suitable, yet rough, proxy for plastic pollution mapping. Ongoing research
(Hu, 2021; Hu et al., 2022; Ciappa, 2021) in this field demonstrates that
distinguishing anthropogenic marine litter from natural types of debris using
only features is possible, but remains challenging and is largely unsolved to-
day. Our work concentrated on the prior step of automating the detection of
generic marine debris at a large scale largely based on their geometric shape,
which can be seen as a first step preceding the aforementioned litter types
Relevance for Algae and Sargassum Detection. While the evaluation
datasets in our work aimed to measure the detector’s sensitivity to
marine litter, we see that the model is also sensitive to detections of floating
algae patches and sargassum. This sensitivity is inherently connected to the
annotations in the training dataset that were made by visually inspecting
the Floating Debris Index (Biermann et al., 2020) that is derived from the
Floating Algae Index (Hu, 2009). Hence, exploration and modification of
the training framework presented in this work and initialization from model
weights and fine-tuning towards detecting patches of algae and sargassum
would be an interesting follow-up work in an active research field (Wang and
Hu, 2021; Cuevas et al., 2018).
Transfer to other satellite products. The synergy of Sentinel-2 with
daily available PlanetScope (or other high-resolution imagery) is particularly
suitable for further analysis of detected debris and establishing a connection
to marine litter. Large-scale monitoring with commercial high-resolution im-
agery may be infeasible due to the high image acquisition costs. However,
selecting a few images with PlanetScope in locations where a Sentinel-2 detec-
tor has identified potential marine debris appears feasible. We explored this
transferability in Section 4.3 where a model trained on 4-channel Sentinel-2
imagery was still sensitive to marine debris in (downsampled) PlanetScope
data. Targeted model training on annotated PlanetScope data will likely
improve this performance further, which we leave for future work.
Spatial and spectral features. A further direction to be explored is the
heterogeneous composition of objects in marine debris, which varies depend-
ing on circumstances (e.g., Flood event in Durban) or the general pollution
of the area (Accra scene). This heterogeneity in spectral response further
emphasizes the importance and descriptiveness of the shape and geometry
in marine debris, which often form elongated lines due to oceanic processes,
such as windrows and waterfronts. Further, the geometry of objects is also
a suitable descriptor to exclude a variety of negatives, such as ships, clouds,
coastline, and wakes, that can have similar spectral responses (e.g., a high
FDI) to marine debris but are distinguishable from marine debris by
spatial context. In particular, convolutional neural networks are suitable
to learn these patterns in their filter banks if they are trained with large
annotated datasets with a diverse set of negative examples.
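As an illustration of why geometry is informative, the elongation of a detected object can be summarized with a hand-crafted descriptor. The sketch below is hypothetical and is not the mechanism learned by the convolutional network: it takes the pixel coordinates of one object and returns the ratio of its spread along the major and minor axes, computed from the eigenvalues of the coordinate covariance matrix. The function name `elongation` and the example shapes are assumptions.

```python
import math


def elongation(pixels):
    """Ratio of major to minor axis spread for a set of (row, col) pixels.

    Values near 1 indicate a compact blob; large values indicate a
    line-like shape such as a windrow.
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    n = len(pixels)
    mean_r = sum(r for r, _ in pixels) / n
    mean_c = sum(c for _, c in pixels) / n
    var_r = sum((r - mean_r) ** 2 for r, _ in pixels) / n
    var_c = sum((c - mean_c) ** 2 for _, c in pixels) / n
    cov = sum((r - mean_r) * (c - mean_c) for r, c in pixels) / n
    # Closed-form eigenvalues of the 2x2 matrix [[var_r, cov], [cov, var_c]].
    half_trace = (var_r + var_c) / 2.0
    root = math.sqrt(((var_r - var_c) / 2.0) ** 2 + cov ** 2)
    major, minor = half_trace + root, half_trace - root
    # A straight, one-pixel-wide line has no spread along its minor axis.
    if minor <= 1e-12:
        return math.inf
    return math.sqrt(major / minor)


# Hypothetical objects: a compact 3 x 3 blob and a 2 x 10 pixel line.
blob = [(r, c) for r in range(3) for c in range(3)]
line = [(r, c) for r in range(2) for c in range(10)]
print(elongation(blob), elongation(line))
```

A compact object such as a ship or small cloud would score close to 1, whereas an elongated aggregation scores much higher, even when both have a similar spectral response.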
6. Conclusion
Remote sensing combined with current machine learning frameworks has
the potential to become an efficient and reliable tool to monitor large marine
areas (Hanke et al., 2013). Still, the quality of the data used to train detection
models is paramount. We are confident that automated detection of marine
debris with satellite remote sensing imagery will provide a repeatable low-
cost technology to detect and quantify the level of marine pollution on our
planet. Automated detection and quantification will be necessary to inform
clean-up operations and measure the effect of local policy decisions. Identifying
and quantifying pollution hotspots and addressing the drivers and sources are
crucial to creating a cleaner environment for plant, animal, and human life in a
sustainable future. Still, further efforts are needed in data collection and on-
site validation to build models that can reliably estimate the level of marine
pollution from readily available satellite data in a completely automated
way. In this research, we took a step toward automated satellite-based
monitoring of marine pollution by detecting marine debris in coastal waters
and providing model weights and training scripts in a dedicated package.²
We hope this work helps accelerate the progress toward large-scale marine
litter monitoring within the canon of trans-disciplinary machine learning,
remote sensing, and marine science research.
² The source code and data:
References
Andrady, A.L., 2011. Microplastics in the marine environment. Marine
pollution bulletin 62, 1596–1605.
Arias, M., Sumerot, R., Delaney, J., Coulibaly, F., Cozar, A., Aliani, S.,
Suaria, G., Papadopoulou, T., Corradi, P., 2021. Advances on remote
sensing of windrows as proxies for marine litter based on Sentinel-2/MSI
datasets, in: 2021 IEEE International Geoscience and Remote Sensing
Symposium IGARSS, IEEE. pp. 1126–1129.
Beaumont, N.J., Aanesen, M., Austen, M.C., Börger, T., Clark, J.R., Cole,
M., Hooper, T., Lindeque, P.K., Pascoe, C., Wyles, K.J., 2019. Global
ecological, social and economic impacts of marine plastic. Marine pollution
bulletin 142, 189–195.
Bessa, F., Ratcliffe, N., Otero, V., Sobral, P., Marques, J.C., Waluda, C.M.,
Trathan, P.N., Xavier, J.C., 2019. Microplastics in gentoo penguins from
the antarctic region. Scientific reports 9, 1–7.
Biermann, L., Clewley, D., Martinez-Vicente, V., Topouzelis, K., 2020. Find-
ing plastic patches in coastal waters using optical satellite data. Scientific
reports 10, 1–10.
Booth, H., Ma, W., Karakus, O., 2022. High-precision density mapping of
marine debris and floating plastics via satellite imagery. arXiv preprint
arXiv:2210.05468.
Borrelle, S.B., Ringma, J., Law, K.L., Monnahan, C.C., Lebreton, L., McGivern,
A., Murphy, E., Jambeck, J., Leonard, G.H., Hilleary, M.A., Eriksen, M.,
Possingham, H.P., De Frond, H., Gerber, L.R., Polidoro, B., Tahir,
A., Bernard, M., Mallos, N., Barnes, M., Rochman, C.M., 2020. Predicted
growth in plastic waste exceeds efforts to mitigate plastic pollution. Science
369, 1515–1518.
Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J., 1984. Classification
and regression trees. Routledge.
Chapron, L., Peru, E., Engler, A., Ghiglione, J., Meistertzheim, A., Pruski,
A., Purser, A., Vétion, G., Galand, P., Lartaud, F., 2018. Macro- and
microplastics affect cold-water corals growth, feeding and behaviour. Sci-
entific reports 8, 1–8.
Ciappa, A.C., 2021. Marine plastic litter detection offshore hawai’i by
sentinel-2. Marine Pollution Bulletin 168, 112457. URL: https://,
Ciappa, A.C., 2022. Marine litter detection by sentinel-2: A case study in
north adriatic (summer 2020). Remote Sensing 14. URL: https://www., doi:10.3390/rs14102409.
Ciocarlan, A., Stoian, A., 2021. Ship detection in sentinel 2 multi-spectral
images with self-supervised learning. Remote Sensing 13, 4255.
Cuevas, E., Uribe-Martínez, A., Liceaga-Correa, M.d.l.Á., 2018. A satellite
remote-sensing multi-index approach to discriminate pelagic sargassum in
the waters of the yucatan peninsula, mexico. International Journal of
Remote Sensing 39, 3608–3627.
Cózar, A., Aliani, S., Basurko, O.C., Arias, M., Isobe, A., Topouzelis, K.,
Rubio, A., Morales-Caselles, C., 2021. Marine litter windrows: A strategic
target to understand and manage the ocean plastic pollution. Frontiers in
Marine Science 8. doi:10.3389/fmars.2021.571796.
Davaasuren, N., Marino, A., Boardman, C., Alparone, M., Nunziata, F., Ack-
ermann, N., Hajnsek, I., 2018. Detecting microplastics pollution in world
oceans using sar remote sensing, in: 2018 IEEE International Geoscience
and Remote Sensing Symposium IGARSS, pp. 938–941.
van Emmerik, T., Schwarz, A., 2020. Plastic debris in rivers. Wiley Inter-
disciplinary Reviews: Water 7, e1398.
Eriksen, M., Lebreton, L.C., Carson, H.S., Thiel, M., Moore, C.J., Borerro,
J.C., Galgani, F., Ryan, P.G., Reisser, J., 2014. Plastic pollution in the
world’s oceans: more than 5 trillion plastic pieces weighing over 250,000
tons afloat at sea. PloS one 9, e111913.
Escobar-Sánchez, G., Markfort, G., Berghald, M., Ritzenhofen, L., Schernewski,
G., 2022. Aerial and underwater drones for marine litter monitoring
in shallow coastal waters: factors influencing item detection and
cost-efficiency. Environmental monitoring and assessment 194, 1–28.
Faure, F., Corbaz, M., Baecher, H., de Alencastro, L.F., 2012. Pollution due
to plastics and microplastics in lake geneva and in the mediterranean sea.
Archives de Science 65, 157–164.
Goddijn-Murphy, L., Williamson, B.J., McIlvenny, J., Corradi, P., 2022.
Using a uav thermal infrared camera for monitoring floating marine plastic
litter. Remote Sensing 14, 3179.
Gómez, À.S., Scandolo, L., Eisemann, E., 2022. A learning approach for river
debris detection. International Journal of Applied Earth Observation and
Geoinformation 107, 102682.
Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., Moore,
R., 2017. Google earth engine: Planetary-scale geospatial analysis for
everyone. Remote Sensing of Environment 202, 18–27.
Grady, L., 2006. Random walks for image segmentation. IEEE transactions
on pattern analysis and machine intelligence 28, 1768–1783.
Hanke, G., Galgani, F., Werner, S., Oosterbaan, L., Nilsson, P., Fleet, D.,
Kinsey, S., Thompson, R., Palatinus, A., Van Franeker, J., et al., 2013.
Guidance on monitoring of marine litter in european seas: a guidance doc-
ument within the common implementation strategy for the marine strategy
framework directive.
He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image
recognition, in: Proceedings of the IEEE conference on computer vision
and pattern recognition, pp. 770–778.
Hidalgo-Ruz, V., Thiel, M., 2015. The contribution of citizen scientists to
the monitoring of marine litter. Marine Anthropogenic Litter 16, 429–447.
Hu, C., 2009. A novel ocean color index to detect floating algae in the global
oceans. Remote Sensing of Environment 113, 2118–2129.
Hu, C., 2021. Remote detection of marine debris using satel-
lite observations in the visible and near infrared spectral range:
Challenges and potentials. Remote Sensing of Environment 259,
112414. URL:
pii/S0034425721001322, doi:
Hu, C., 2022. Remote detection of marine debris using sentinel-2 imagery: A
cautious note on spectral interpretations. Marine Pollution Bulletin 183,
114082. URL:
pii/S0025326X22007640, doi:
Hu, C., Qi, L., Xie, Y., Zhang, S., Barnes, B.B., 2022. Spectral characteristics
of sea snot reflectance observed from satellites: Implications for remote
sensing of marine debris. Remote Sensing of Environment 269, 112842.
Kershaw, P., Turra, A., Galgani, F., et al., 2019. Guidelines for
the monitoring and assessment of plastic litter and microplas-
tics in the ocean. URL:
Kikaki, K., Kakogeorgiou, I., Mikeli, P., Raitsos, D.E., Karantzalos, K., 2022.
Marida: A benchmark for marine debris detection from sentinel-2 remote
sensing data. PloS one 17, e0262247.
Kuebler, M., 2021. Turkey’s ’sea snot’ is part of a growing environmental
threat. URL:
Main-Knorn, M., Pflug, B., Louis, J., Debaecker, V., Müller-Wilm, U., Gascon,
F., 2017. Sen2cor for sentinel-2, in: Image and Signal Processing for
Remote Sensing XXIII, SPIE. pp. 37–48.
Mifdal, J., Longépé, N., Rußwurm, M., 2021. Towards detecting floating
objects on a global scale with learned spatial features using sentinel 2.
ISPRS Annals of the Photogrammetry, Remote Sensing
and Spatial Information Sciences V-3-2021, 285–293. doi:10.5194/
Otsu, N., 1979. A threshold selection method from gray-level histograms.
IEEE transactions on systems, man, and cybernetics 9, 62–66.
Papageorgiou, D., Topouzelis, K., Suaria, G., Aliani, S., Corradi, P., 2022.
Sentinel-2 detection of floating marine litter targets with partial spectral
unmixing and spectral comparison with other floating materials (plastic
litter project 2021). Under review.
Pinto, R., Barendse, T., van Emmerik, T., van der Ploeg, M., Annor, F.,
Duah, K., Udo, J., Uijlenhoet, R., 2023. Exploring plastic transport dy-
namics in the odaw river, ghana. Frontiers in Environmental Science 11.
Politikos, D.V., Adamopoulou, A., Petasis, G., Galgani, F., 2023. Using
artificial intelligence to support marine macrolitter research: A content
analysis and an online database. Ocean & Coastal Management 233,
106466. URL:
pii/S0964569122004422, doi:
Rees, G., Pond, K., 1995. Marine litter monitoring programmes—a review
of methods with special reference to national surveys. Marine Pollution
Bulletin 30, 103–108.
Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks
for biomedical image segmentation, in: International Conference on Med-
ical Image Computing and Computer-Assisted Intervention, pp. 234–241.
Ruiz, I., Basurko, O.C., Rubio, A., Delpey, M., Granado, I., Declerck, A.,
Mader, J., Cózar, A., 2020. Litter windrows in the south-east coast of the
Bay of Biscay: an ocean process enabling effective active fishing for litter.
Frontiers in marine science 7, 308.
Salgado-Hernanz, P.M., Bauzà, J., Alomar, C., Compa, M., Romero, L.,
Deudero, S., 2021. Assessment of marine litter through remote sensing:
recent approaches and future goals. Marine Pollution Bulletin 168, 112347.
Schwabl, P., Köppel, S., Königshofer, P., Bucsics, T., Trauner, M., Reiberger,
T., Liebmann, B., 2019. Detection of various microplastics in human stool:
a prospective case series. Annals of Internal Medicine 171, 453–457.
Shah, A., Lillianne, T., Manil, M., 2021. Marine debris dataset for object de-
tection in planetscope imagery. URL:
Sun, Y., Bakker, T., Ruf, C., Pan, Y., 2023. Effects of microplastics and
surfactants on surface roughness of water waves. Scientific Reports 13,
Themistocleous, K., Papoutsa, C., Michaelides, S., Hadjimitsis, D., 2020.
Investigating detection of floating plastic litter from space using sentinel-2
imagery. Remote Sensing 12, 2648.
Topouzelis, K., Papageorgiou, D., Karagaitanakis, A., Papakonstantinou, A.,
Arias Ballesteros, M., 2020a. Remote sensing of sea surface artificial float-
ing plastic targets with sentinel-2 and unmanned aerial systems (Plastic
Litter Project 2019). Remote Sensing 12, 2013.
Topouzelis, K., Papageorgiou, D., Karagaitanakis, A., Papakonstantinou, A.,
Arias Ballesteros, M., 2020b. Remote sensing of sea surface artificial float-
ing plastic targets with sentinel-2 and unmanned aerial systems (plastic
litter project 2019). Remote Sensing 12, 2013.
Topouzelis, K., Papageorgiou, D., Suaria, G., Aliani, S., 2021. Float-
ing marine litter detection algorithms and techniques using optical
remote sensing data: A review. Marine Pollution Bulletin 170,
112675. URL:
pii/S0025326X21007098, doi:
Topouzelis, K., Papakonstantinou, A., Garaba, S.P., 2019. Detection of
floating plastics from satellite and unmanned aerial systems (plastic lit-
ter project 2018). International Journal of Applied Earth Observation and
Geoinformation 79, 175–183.
United Nations Environment Programme, 2009. United nations environment
programme - annual report 2009: Seizing the green opportunity. URL:
Van Cauwenberghe, L., Vanreusel, A., Mees, J., Janssen, C.R., 2013. Mi-
croplastic pollution in deep-sea sediments. Environmental Pollution 182,
Van Dyck, I.P., Nunoo, F.K., Lawson, E.T., 2016. An empirical assessment
of marine debris, seawater quality and littering in Ghana. Journal of Geo-
science and Environment Protection 4, 21–36.
Van Emmerik, T., Tramoy, R., Van Calcar, C., Alligant, S., Treilles, R.,
Tassin, B., Gasperi, J., 2019. Seine plastic debris transport tenfolded
during increased river discharge. Frontiers in Marine Science 6, 642.
Vanhellemont, Q., Ruddick, K., 2016. Acolite for sentinel-2: Aquatic appli-
cations of msi imagery, in: Proceedings of the 2016 ESA Living Planet
Symposium, Prague, Czech Republic, pp. 9–13.
Vanhellemont, Q., Ruddick, K., 2018. Atmospheric correction of metre-scale
optical satellite data for inland and coastal water applications. Remote
Sensing of Environment 216, 586–597.
Wang, M., Hu, C., 2021. Satellite remote sensing of pelagic sargassum
macroalgae: The power of high resolution and deep learning. Remote
Sensing of Environment 264, 112631.
Whang, S.E., Roh, Y., Song, H., Lee, J.G., 2023. Data collection and quality
challenges in deep learning: A data-centric ai perspective. The VLDB
Journal, 1–23.
Wolf, M., van den Berg, K., Garaba, S.P., Gnann, N., Sattler, K., Stahl, F.,
Zielinski, O., 2020. Machine learning for aquatic plastic litter detection,
classification and quantification (aplastic-q). Environmental Research Let-
ters 15, 114042.
Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J., 2018. Unet++:
A nested U-Net architecture for medical image segmentation, in: Deep
learning in medical image analysis and multimodal learning for clinical
decision support. Springer, pp. 3–11.
Zhu, Y., Newsam, S., 2017. Densenet for dense flow, in: 2017 IEEE Interna-
tional Conference on Image Processing (ICIP), IEEE. pp. 790–794.