Conference Paper
A DEEP NEURAL NETWORK FOR OIL SPILL SEMANTIC SEGMENTATION IN SAR
IMAGES
Georgios Orfanidis, Konstantinos Ioannidis, Konstantinos Avgerinakis, Stefanos Vrochidis, Ioannis Kompatsiaris
Centre for Research and Technology Hellas (CERTH)-Information Technologies Institute (ITI)
ABSTRACT
Oil spills pose a major threat to oceanic and coastal environments; hence, an automatic detection and continuous monitoring system constitutes an appealing option for minimizing the response time of any relevant operation. Numerous efforts have been conducted towards such solutions by exploiting a variety of sensing systems. Previous studies, including neural networks, have shown that satellite Synthetic Aperture Radar (SAR) can effectively identify oil spills over sea surfaces under any environmental conditions and at any operational time. Moreover, in recent years, deep Convolutional Neural Networks (CNNs) have demonstrated a remarkable ability to surpass previous state-of-the-art performance in a great diversity of fields, including identification tasks. This paper describes the development of an approach that combines the merits of a deep CNN with SAR imagery in order to provide a fully automated oil spill detection system. The deployed CNN was trained using multiple SAR images acquired from the Sentinel-1 satellite provided by ESA, based on EMSA records of maritime pollution events. Experiments on this challenging benchmark dataset demonstrate that the algorithm can accurately identify oil spills, leading to an effective detection solution.
Index Terms— Oil pollution, synthetic aperture radar, convolutional neural networks.
1. INTRODUCTION
Oil spill pollution is tightly connected not only with the ocean ecosystem but also with the growth of maritime commerce and activities. Since early measures in such cases are of major importance, numerous algorithms have been presented to accurately and automatically identify such pollution spots. The vast majority of relevant methods exploit data acquired from satellites equipped with Synthetic Aperture Radar
(SAR) capabilities due to the advantages they offer. More specifically, such satellites can cover large areas of interest without the necessity of deploying extra equipment and vehicles, while SAR imagery is indispensable because it is largely invariant to lighting and weather conditions.

This work was supported by the ROBORDER and EOPEN projects, partially funded by the European Commission under grant agreements No 740593 and No 776019, respectively.
A typical process for oil spill detection can be divided into four separate stages [1]. The first stage includes the detection of dark formations in SAR images, while in the second, features for such formations are extracted. Subsequently, the extracted features are compared with predefined values and a decision-making model follows for labeling each formation. This approach poses several disadvantages, mainly due to the necessity of extracting a number of features, the lack of unanimous agreement over their nature and the lack of research on their effectiveness. In addition, the limitation of providing a single label to each input image poses constraints regarding the identified objects.
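The four-stage process above can be sketched as follows. This is only an illustration: the thresholding rule, the two example features and the `classify` callback are placeholders of our own choosing, not the design of any of the cited systems.

```python
import numpy as np

def classical_oil_spill_pipeline(sar_img, classify):
    """Illustrative sketch of the classical four-stage detection chain."""
    # Stage 1: detect dark formations with a simple intensity threshold
    dark_mask = sar_img < sar_img.mean() - sar_img.std()
    # Stage 2: extract features describing the dark formation
    if dark_mask.any():
        contrast = float(sar_img[~dark_mask].mean() - sar_img[dark_mask].mean())
    else:
        contrast = 0.0
    features = {"area": float(dark_mask.sum()), "mean_contrast": contrast}
    # Stages 3-4: compare features with predefined values and decide a label
    return classify(features)
```

The single label returned per image is exactly the limitation noted above: nothing in this chain localizes or segments the formation itself.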
The most common approaches involve a two-class classification process: one class that includes only oil spills, and a second, more diverse class that corresponds to dark formations from all similar phenomena that resemble oil spills. The latter class can be further divided into subclasses such as current shear, internal waves, algae blooms, shoals, floating vegetation and grease ice [2]. Also, contextual information around the detected instances, such as the existence of similar formations, ship routes etc., may significantly affect the final characterization of these "dark spots".
In contrast to the above approach, the presented work
aims at deploying a deep Convolutional Neural Network
(CNN) in the field of oil spill detection in order to allevi-
ate the corresponding shortcomings. While there have been some efforts dealing with oil spill detection using conventional neural networks [1, 3], they require spatial features to be computed initially. Also, these algorithms were limited to image classification, i.e. labeling the whole image rather than semantically segmenting regions inside it. The problem can be considered a rather complicated task, since it requires specialized knowledge of the various phenomena related to marine environments, while optical means often fail to provide adequate solutions. Thus, the final pollution affirmation involves the mobilization of the relevant national/regional authorities and in situ identification. The proposed approach comprises a promising solution that could accurately discriminate oil spills from similar instances without the prerequisite of extracting additional features, leading to a completely automatic detection system.
2. RELATED WORK
One of the first attempts at oil spill detection relied on the use of visible-spectrum images. Various approaches were proposed, such as utilizing polarized lenses [4] and hyperspectral imaging [5], among others. Although relevant research has shown that there is no clear distinction between oil and water in this spectrum, the field is active and the research is ongoing.
On the contrary, microwave sensors, including radars, are widely utilized for such applications in order to overcome the constraints that optical sensors pose (weather and operation-time dependency). For radar imaging, Synthetic Aperture Radar (SAR) is predominantly used [6], as it has been proven to be largely invariant to light condition changes and cloud/fog occurrences [7].
Capillary waves produce "bright" image regions known as sea clutter, which in the presence of oil spills is suppressed, so that the spills appear as dark formations. This type of signature is not exclusively observed for oil spills but also arises from, among others, wind slicks, wave shadows behind land, algae blooms, land territories etc. [7, 2]. Since the identification of all the above instances requires the definition of separate classes, an acceptable simplification regarding the desired oil spill identification is the reduction of the problem to a two-class problem, i.e. oil spills vs. the rest of the phenomena. In addition, until recently, oil spill detection approaches were based on the initial extraction of features that represent and/or simulate the physics of the oil dispersion. Such features include geometrical, physical or textural distinctive marks [1, 8], based on which the models are trained to identify the oil spills.
Contrary to the aforementioned approaches, the proposed method introduces a new deep CNN which does not require the extraction of any handcrafted features and can semantically annotate multiple regions in SAR images. In addition, the deployed model was properly modified to reduce the total computational cost and thus the operational time.
3. METHODOLOGY
The proposed approach for oil spill detection aims at semantically segmenting the input images and highlighting the included objects, instead of just assigning a single label to the entire representation. The assignment of multiple labels/tags to each image [9] or the extraction of bounding boxes with object detection techniques [10] could be potential alternatives for the presented problem. Nonetheless, since oil spills display a large variety of irregular shapes and can heavily intersect with look-alike objects, semantic segmentation can be considered the most effective solution. The advantage of the approach lies in handling images that can contain multiple objects of different nature, without the prerequisite of splitting the image into multiple patches containing oil spills, look-alikes etc.
Fig. 1. High-level representation of the altered "DeepLab" model.
The model was inspired by "DeepLab", initially proposed in [11], which has been proven to be sufficiently effective in multi-class segmentation. In the referred work, multiple experiments were conducted on a variety of methods and network models, including VGG-16 [12] and ResNet-101 [13], with the latter giving the best performance.
Similar to "DeepLab", the proposed oil spill detector includes a deep convolutional neural network trained for the task of semantic segmentation, together with convolution with upsampled filters, which was originally developed in [14] and utilized in the DCNN context by [15]. To efficiently extract the required dense features and widen the field-of-view of the filters, atrous convolution was utilized, as well as Atrous Spatial Pyramid Pooling (ASPP) to employ parallel filters with various rates. The resulting maps are enlarged with bilinear interpolation to restore their original resolution. A high-level representation is provided in Fig. 1.
More specifically, the proposed model for the required application initially uses a DCNN in a fully convolutional fashion and relies on a ResNet-101 network. The ResNet-101 network was selected due to the high detection rates it achieves in image semantic segmentation tasks. The model was redefined so as to fine-tune it for our case. However, the repeated combination of max-pooling and striding at subsequent layers decreases the final resolution of the extracted feature maps and significantly increases the overall computational cost. Thus, atrous convolution was applied to explicitly control the resolution at which feature responses are extracted within the DCNN. In the context of DCNNs, atrous convolution can be utilized in a chain of layers to effectively compute the network's responses at an arbitrarily high resolution.
For example, for one-dimensional signals, the output y[i] of the atrous convolution of a 1-D input signal x[i] with a filter w[k] of length K is defined as:

y[i] = Σ_{k=1}^{K} x[i + r·k] w[k]    (1)

The rate parameter r corresponds to the stride with which we sample the input signal. Basic convolution is a special case for r = 1.
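Eq. (1) can be checked with a direct NumPy implementation. This is a minimal sketch: the function name and the "valid" boundary handling (output positions where the dilated filter would run past the signal are dropped) are our choices, not specified in the paper.

```python
import numpy as np

def atrous_conv1d(x, w, r=1):
    """1-D atrous (dilated) convolution: y[i] = sum_{k=1..K} x[i + r*k] * w[k]."""
    K = len(w)
    span = r * K                      # distance covered by the dilated filter
    n_out = len(x) - span             # "valid" outputs only
    y = np.zeros(n_out)
    for i in range(n_out):
        for k in range(1, K + 1):     # k runs from 1 to K, as in Eq. (1)
            y[i] += x[i + r * k] * w[k - 1]
    return y
```

With r = 1 the taps are adjacent and the operation reduces to basic (shifted) convolution; larger r spreads the same K taps over a wider receptive field at no extra cost.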
Fig. 2. Atrous Spatial Pyramid Pooling (ASPP).
Despite their ability to represent scale when trained on multi-resolution images, DCNNs' capacity to handle object scale can still be improved for detecting both large and small objects. In applications that involve satellite image processing, where operational heights vary, and oil spill detection, where the size and shape of the objects display extreme diversity, the scale problem can significantly affect the detection results. Therefore, to handle the scale variability, the deployed model adopted an approach based on the R-CNN spatial pyramid pooling method initially proposed in [16], where regions of an arbitrary scale can be efficiently classified by resampling features extracted at a single scale. These features are extracted for each sampling rate and further processed and fused to compute the final result. The deployed ASPP is depicted in Fig. 2. The final processing step increases the feature map resolution to restore the original resolution by applying basic bilinear interpolation, as in the "DeepLab" system. It should be highlighted that the final CRF module of the "DeepLab" system was excluded from the deployed model, since it is mostly used for refining the segmentation results. For oil spill detection, the instances in SAR images display vague optical boundaries; hence, the CRF would not significantly improve the segmented regions and would only add pointless computational overhead.
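The ASPP idea can be sketched on a single-channel feature map: several atrous convolutions with different rates are applied in parallel and their responses fused. The sum fusion, the naive "same"-padded loop implementation and the example rates below are simplifying assumptions for illustration; the real module operates on multi-channel feature maps inside the network.

```python
import numpy as np

def atrous_conv2d(fmap, kernel, rate):
    """2-D atrous convolution, zero-padded so the output size matches the input."""
    K = kernel.shape[0]               # assume a square K x K kernel, K odd
    pad = rate * (K // 2)
    padded = np.pad(fmap, pad)
    out = np.zeros_like(fmap, dtype=float)
    H, W = fmap.shape
    for i in range(H):
        for j in range(W):
            for u in range(K):
                for v in range(K):
                    out[i, j] += padded[i + rate * u, j + rate * v] * kernel[u, v]
    return out

def aspp(fmap, kernel, rates=(6, 12, 18, 24)):
    """Atrous Spatial Pyramid Pooling sketch: parallel dilated convolutions
    over the same feature map, fused here by summation."""
    return sum(atrous_conv2d(fmap, kernel, r) for r in rates)
```

Each branch sees the same features through a different effective field-of-view, which is what lets a single feature scale serve objects of very different sizes.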
4. EXPERIMENTAL RESULTS
4.1. DATASET
One of the main challenges that the research community has to face is the lack of a publicly available dataset for such applications. Previous works [1, 17, 8] confronted this problem by utilizing manually created datasets. Nonetheless, this limits comparison with other related works using common standards. The absence of public benchmark datasets also led us to collect SAR data from the European Space Agency (ESA) database, the Copernicus Open Access Hub1. The downloaded SAR images were acquired by the Sentinel-1 European satellite. The required geographic coordinates and times of the confirmed oil spills were provided by the European Maritime Safety Agency (EMSA), based on the CleanSeaNet service and its records covering the period from 28/09/2015 to 31/10/2017.

1 https://scihub.copernicus.eu/

Intersection-over-union (IoU)
mIoU | Oil spills | Look-alikes | Background
0.6098 | 0.4130 | 0.4564 | 0.9599

Table 1. Segmentation results using mIoU/IoU.
The SAR raw data were preprocessed using fundamental remote sensing algorithms:
1. All potential oil spills were localized.
2. Regions containing the oil spills were cropped from the initial SAR data. All resulting images were rescaled to the same resolution of 1252x609 pixels.
3. A radiometric calibration was applied to project the images onto the same plane.
4. A speckle filtering process followed to mitigate the effects of sensor noise. A basic median filter with a 7x7 mask was applied, since speckle noise in remote sensing is similar to salt-and-pepper noise in image processing.
5. A transformation from dB to actual luminosity values was finally applied.
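Steps 4 and 5 can be sketched directly in NumPy. The edge-padding choice for the median filter and the standard 10^(dB/10) intensity conversion are our assumptions; the paper does not specify either detail.

```python
import numpy as np

def median_filter(img, size=7):
    """Speckle suppression with a size x size median filter (edge-padded)."""
    pad = size // 2
    padded = np.pad(img, pad, mode='edge')
    out = np.empty_like(img, dtype=float)
    H, W = img.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.median(padded[i:i + size, j:j + size])
    return out

def db_to_linear(img_db):
    """Convert backscatter from decibels to linear intensity values."""
    return 10.0 ** (img_db / 10.0)
```

The median is what makes the filter effective against impulse-like speckle: a single bright or dark outlier inside the 7x7 window never reaches the output.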
Numerous SAR images were processed in order to create a convenient database with a sufficient number of images containing confirmed oil spills, look-alikes and other geographical regions. The annotation of the images was based on information provided by EMSA and on human identification (manual annotation). This process produced image masks in which every desired object was marked with a distinct color (two foreground classes plus one background class). The processed images were randomly divided into a training and a testing set comprising 571 and 106 images, respectively. Finally, it must be highlighted that the database is continuously updated and, after the proper confirmations, will be made publicly available to the community in order to provide a common benchmark basis.
4.2. RESULTS
For our experiments, two foreground classes were defined for the classification process, one for oil spills and one for look-alikes, as well as one class for background pixels. The performance of the deployed model was initially measured in terms of pixel intersection-over-union (IoU) averaged across all classes (mIoU). In addition, the resulting IoU for each class is also provided so as to clarify the individual performance of the model and its effectiveness for each class. Table 1 includes the results from the initial experiments.
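The per-class IoU and mIoU reported in Table 1 follow the standard pixel-wise definition, which a short NumPy sketch makes explicit (the function name is ours):

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Pixel intersection-over-union per class, plus the mean over classes (mIoU)."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        ious.append(inter / union if union else float('nan'))
    return ious, np.nanmean(ious)
```

Because IoU penalizes both false positives and false negatives per pixel, a large, easily-segmented background class (IoU 0.9599 in Table 1) can coexist with much lower scores for the small foreground classes.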
Fig. 3. Example of 4 testing images (from top to bottom): SAR images, ground truth masks and resulted detection masks.
For comparison purposes, we additionally utilized accuracy, so that the model could be compared, to some extent, to pure image classification models. Every image pair (ground truth and resulting detection mask) was cropped automatically into a predefined number of overlapping patches. In order to acquire a dataset which complies with the rules of image representation, some constraints were imposed:
1. A minimum number of pixels belonging to either of the two foreground classes should be present in the patch. A threshold equal to 2% was applied, meaning that the number of pixels of the largest foreground class should be at least 2% of the number of background pixels.
2. One of the foreground classes should be dominant in order to label the image patch. The applied threshold allowed at most 50% of the pixels of the non-dominant class in relation to the pixels of the dominant class.
3. Patches violating the above rules were discarded from the accuracy calculation.
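The patch-labeling constraints above can be sketched as a single function. The class ids (0 = background, 1 = oil spill, 2 = look-alike) and the function name are assumptions for illustration.

```python
import numpy as np

# assumed class ids: 0 = background, 1 = oil spill, 2 = look-alike
def label_patch(patch, min_ratio=0.02, max_nondominant=0.5):
    """Assign a single label to a patch mask, or None if it is discarded
    under the two constraints described above."""
    bg = np.sum(patch == 0)
    counts = {c: np.sum(patch == c) for c in (1, 2)}
    dominant = max(counts, key=counts.get)
    other = 3 - dominant
    # Rule 1: the largest foreground class must be at least 2% of background
    if bg == 0 or counts[dominant] < min_ratio * bg:
        return None
    # Rule 2: the non-dominant class must not exceed 50% of the dominant one
    if counts[other] > max_nondominant * counts[dominant]:
        return None
    return dominant
```

Patches that return None are the "recalcitrant" ones excluded from the accuracy figures, so the reported values depend on how many patches survive this filter.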
The results of patch image classification are provided in Table 2. The values depend on the number of patches cropped from each sample; results for two different pairs of horizontal and vertical patch counts are presented. The results in Tables 1 and 2 are somewhat dissimilar because the second metric evaluates a single label per patch rather than every pixel.
Image patch classification precision results
Number of patches: 3,3 | Number of patches: 5,3
Overall | Oil spills | Look-alikes | Overall | Oil spills | Look-alikes
0.8063 | 0.8621 | 0.7588 | 0.8166 | 0.8932 | 0.7540

Table 2. Segmentation results with accuracy.

A comparison with relevant approaches could be considered invalid due to the lack of a common image dataset; nonetheless, some results are compared. The neural network based method in [3] reported 91.6% and 98.3% accuracy for oil spills and look-alikes, respectively, with a much higher number of look-alikes. The method in [1], using a decision tree forest, resulted in 85.0% accuracy as its highest value. Also, relevant methods such as the probabilistic one in [18] reported results equal to 78% for oil spills and 99% for look-alikes. Thus, the initial results of the deep CNN based approach are similar to those of the corresponding state-of-the-art methods, nonetheless without the need of extracting relevant features and with the merit of semantically annotated regions.
5. CONCLUSIONS
In this paper, we introduced a new approach for oil spill detection using SAR images for maritime applications. With the adoption of accurate DCNNs, oil spill detection can be further automated and incorporated into a larger detection pipeline. The extracted results are comparable to state-of-the-art results for general classification problems. For the oil spill detection problem, the use of similar deep learning techniques may further improve the identification of such pollution spots. Potential improvements could also be achieved by utilizing a more advanced dataset, which may include images acquired with improved SAR sensors and larger training sets. Though this is preliminary work, the initial results can be considered promising for the further exploitation of deep learning algorithms in the oil spill detection field.
6. REFERENCES
[1] Konstantinos Topouzelis and Apostolos Psyllos, “Oil
spill feature selection and classification using decision
tree forest on sar image data,” ISPRS journal of pho-
togrammetry and remote sensing, vol. 68, pp. 135–143,
2012.
[2] A Yu Ivanov and Victoria V Zatyagalova, “A GIS approach to mapping oil spills in a marine environment,” International Journal of Remote Sensing, vol. 29, no. 21, pp. 6297–6313, 2008.
[3] Suman Singha, Tim J Bellerby, and Olaf Trieschmann, “Satellite oil spill detection using artificial neural networks,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 6, no. 6, pp. 2355–2363, 2013.
[4] Hui-yan Shen, Pu-cheng Zhou, and Shao-ru Feng,
“Research on multi-angle near infrared spectral-
polarimetric characteristic for polluted water by spilled
oil,” in International Symposium on Photoelectronic
Detection and Imaging 2011: Advances in Infrared
Imaging and Applications. International Society for Op-
tics and Photonics, 2011, vol. 8193, p. 81930M.
[5] Carlos Gonzalez, Sergio Sánchez, Abel Paz, Javier Resano, Daniel Mozos, and Antonio Plaza, “Use of FPGA or GPU-based architectures for remotely sensed hyperspectral image processing,” INTEGRATION, the VLSI Journal, vol. 46, no. 2, pp. 89–103, 2013.
[6] G Aalet Mastin, JJ Manson, JD Bradley, RM Axline, and GL Hover, “A comparative evaluation of SAR and SLAR,” Tech. Rep., Sandia National Labs., Albuquerque, NM (United States), 1993.
[7] Merv Fingas and Carl Brown, “Review of oil spill re-
mote sensing,” Marine pollution bulletin, vol. 83, no. 1,
pp. 9–23, 2014.
[8] Marta Konik and Katarzyna Bradtke, “Object-oriented
approach to oil spill detection using envisat asar im-
ages,” ISPRS Journal of Photogrammetry and Remote
Sensing, vol. 118, pp. 37–52, 2016.
[9] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan, “Show and tell: A neural image caption generator,” in Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on. IEEE, 2015, pp. 3156–3164.
[10] Andrej Karpathy and Li Fei-Fei, “Deep visual-semantic
alignments for generating image descriptions,” in Pro-
ceedings of the IEEE conference on computer vision and
pattern recognition, 2015, pp. 3128–3137.
[11] Liang-Chieh Chen, George Papandreou, Iasonas Kokki-
nos, Kevin Murphy, and Alan L Yuille, “Deeplab:
Semantic image segmentation with deep convolutional
nets, atrous convolution, and fully connected crfs,”
arXiv preprint arXiv:1606.00915, 2016.
[12] Karen Simonyan and Andrew Zisserman, “Very deep
convolutional networks for large-scale image recogni-
tion,” arXiv preprint arXiv:1409.1556, 2014.
[13] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian
Sun, “Deep residual learning for image recognition,” in
Proceedings of the IEEE conference on computer vision
and pattern recognition, 2016, pp. 770–778.
[14] Matthias Holschneider, Richard Kronland-Martinet,
Jean Morlet, and Ph Tchamitchian, “A real-time algo-
rithm for signal analysis with the help of the wavelet
transform,” in Wavelets, pp. 286–297. Springer, 1990.
[15] Alessandro Giusti, Dan C Ciresan, Jonathan Masci, Luca M Gambardella, and Jurgen Schmidhuber, “Fast image scanning with deep max-pooling convolutional neural networks,” in Image Processing (ICIP), 2013 20th IEEE International Conference on. IEEE, 2013, pp. 4034–4038.
[16] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, “Spatial pyramid pooling in deep convolutional networks for visual recognition,” in European Conference on Computer Vision. Springer, 2014, pp. 346–361.
[17] Marco Cococcioni, Linda Corucci, Andrea Masini, and
Fabio Nardelli, “Svme: an ensemble of support vec-
tor machines for detecting oil spills from full resolution
modis images,” Ocean Dynamics, vol. 62, no. 3, pp.
449–467, 2012.
[18] Anne HS Solberg, Camilla Brekke, and Per Ove Husoy, “Oil spill detection in Radarsat and Envisat SAR images,” IEEE Transactions on Geoscience and Remote Sensing, vol. 45, no. 3, pp. 746–755, 2007.
... Since the last few decades, spaceborne synthetic 25 aperture radar (SAR) has been widely used for the detection and classification of oil spills and 26 look-alikes. Oil on a sea surface can generally be seen as a dark stretch in SAR images because it 27 dampens the capillary waves and reduces the backscatter [2]. Nevertheless, dark stretches can 28 also occur as a result of natural phenomena such as low wind areas, algae blooms, grease ice, etc. 29 [1], [3]. ...
... However, the model is also based on and limited 79 to classification of SAR images into two classess i.e. oil spill and look-alikes. The authors in 80 [27] proposed a deep DCNN for semantic segmentation of SAR images into multiple regions 81 of interest. The deployed model was trained on a publicly available oil spill dataset [28]. ...
... of oil spills remains a challenging problem for the research community. Due to 135 the absence of a common benchmark dataset, earlier work on oil spill detection and classification 136[27],[37],[38] utilized different custom datasets corresponding to the specific approaches used137 at the time. Until recently, to the best of our knowledge, there has been no common baseline 138 available in the literature for comparison of different deep learning based semantic segmentation 139 approaches. ...
Preprint
Oil spillage over a sea or ocean’s surface is a threat to marine and coastal ecosystems. Spaceborne synthetic aperture radar (SAR) data has been used efficiently for the detection of oil spills due to its operational capability in all-day all-weather conditions. The problem is often modeled as a semantic segmentation task. The images need to be segmented into multiple regions of interest such as sea surface, oil spill, look-alikes, ships and land. Training of a classifier for this task is particularly challenging since there is an inherent class imbalance. In this work, we train a convolutional neural network (CNN) with multiple feature extractors for pixel-wise classification; and introduce to use a new loss function, namely ‘gradient profile’ (GP) loss, which is in fact the constituent of the more generic Spatial Profile loss proposed for image translation problems. For the purpose of training, testing and performance evaluation, we use a publicly available dataset with selected oil spill events verified by the European Maritime Safety Agency (EMSA). The results obtained show that the proposed CNN trained with a combination of GP, Jaccard and focal loss functions can detect oil spills with an intersection over union (IoU) value of 63.95%. The IoU value for sea surface, look-alikes, ships and land class is 96.00%, 60.87%, 74.61% and 96.80%, respectively. The mean intersection over union (mIoU) value for all the classes is 78.45%, which accounts for a 13% improvement over the state of the art for this dataset. Moreover, we provide extensive ablation on different Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) based hybrid models to demonstrate the effectiveness of adding GP loss as an additional loss function for training. Results show that GP loss significantly improves the mIoU and F1 scores for CNNs as well as ViTs based hybrid models. GP loss turns out to be a promising loss function in the context of deep learning with SAR images.
... However, the model is also based on and limited to classification of SAR images into two classess i.e., oil spill and lookalikes. The authors in [27] proposed a deep DCNN for semantic segmentation of SAR images into multiple regions of interest. The deployed model was trained on a publicly available oil spill dataset [28]. ...
... The detection of oil spills remains a challenging problem for the research community. Due to the absence of a common benchmark dataset, earlier work on oil spill detection and classification [27,39,40] utilized different custom datasets corresponding to the specific approaches used at the time. Until recently, to the best of our knowledge, there has been no common baseline available in the literature for comparison of different deep-learningbased semantic segmentation approaches. ...
Article
Full-text available
Oil spillage over a sea or ocean surface is a threat to marine and coastal ecosystems. Spaceborne synthetic aperture radar (SAR) data have been used efficiently for the detection of oil spills due to their operational capability in all-day all-weather conditions. The problem is often modeled as a semantic segmentation task. The images need to be segmented into multiple regions of interest such as sea surface, oil spill, lookalikes, ships, and land. Training of a classifier for this task is particularly challenging since there is an inherent class imbalance. In this work, we train a convolutional neural network (CNN) with multiple feature extractors for pixel-wise classification and introduce a new loss function, namely, “gradient profile” (GP) loss, which is in fact the constituent of the more generic spatial profile loss proposed for image translation problems. For the purpose of training, testing, and performance evaluation, we use a publicly available dataset with selected oil spill events verified by the European Maritime Safety Agency (EMSA). The results obtained show that the proposed CNN trained with a combination of GP, Jaccard, and focal loss functions can detect oil spills with an intersection over union (IoU) value of 63.95%. The IoU value for sea surface, lookalikes, ships, and land class is 96.00%, 60.87%, 74.61%, and 96.80%, respectively. The mean intersection over union (mIoU) value for all the classes is 78.45%, which accounts for a 13% improvement over the state of the art for this dataset. Moreover, we provide extensive ablation on different convolutional neural networks (CNNs) and vision transformers (ViTs)-based hybrid models to demonstrate the effectiveness of adding GP loss as an additional loss function for training. Results show that GP loss significantly improves the mIoU and F1 scores for CNNs as well as ViTs-based hybrid models. GP loss turns out to be a promising loss function in the context of deep learning with SAR images.
... The first two bars ([0, 20] km) in Figure 16 shows both spill and seep instances, indicating that few slicks near the infrastructure are missed (blue). The R-CNN mask behaves similarly, detecting fewer slicks and missing some slicks farther from the infrastructure ( [30,40] km). The information on the proximity of the infrastructure represents relevant information for identifying the oil slick, notably the oil spill. ...
Article
Full-text available
Ocean surface monitoring, emphasizing oil slick detection, has become essential due to its importance for oil exploration and ecosystem risk prevention. Automation is now mandatory since the manual annotation process of oil by photo-interpreters is time-consuming and cannot process the data collected continuously by the available spaceborne sensors. Studies on automatic detection methods mainly focus on Synthetic Aperture Radar (SAR) data exclusively to detect anthropogenic (spills) or natural (seeps) oil slicks, all using limited datasets. The main goal is to maximize the detection of oil slicks of both natures while being robust to other phenomena that generate false alarms, called “lookalikes”. To this end, this paper presents the automation of offshore oil slick detection on an extensive database of real and recent oil slick monitoring scenarios, including both types of slicks. It relies on slick annotations performed by expert photo-interpreters on Sentinel-1 SAR data over four years and three areas worldwide. In addition, contextual data such as wind estimates and infrastructure positions are included in the database as they are relevant data for oil detection. The contributions of this paper are: (i) A comparative study of deep learning approaches using SAR data. A semantic and instance segmentation analysis via FC-DenseNet and Mask R-CNN, respectively. (ii) A proposal for Fuse-FC-DenseNet, an extension of FC-DenseNet that fuses heterogeneous SAR and wind speed data for enhanced oil slick segmentation. (iii) An improved set of evaluation metrics dedicated to the task that considers contextual information. (iv) A visual explanation of deep learning predictions based on the SHapley Additive exPlanation (SHAP) method adapted to semantic segmentation. The proposed approach yields a detection performance of up to 94% of good detection with a false alarm reduction ranging from 14% to 34% compared to mono-modal models. 
These results provide new solutions to improve the detection of natural and anthropogenic oil slicks by providing tools that allow photo-interpreters to work more efficiently on a wide range of marine surfaces to be monitored worldwide. Such a tool will accelerate the oil slick detection task to keep up with the continuous sensor acquisition. This upstream work will allow us to study its possible integration into an industrial production pipeline. In addition, a prediction explanation is proposed, which can be integrated as a step to identify the appropriate methodology for presenting the predictions to the experts and understanding the obtained predictions and their sensitivity to contextual information. Thus it helps them to optimize their way of working.
... Through their experiments, the relevant authorities have demonstrated that the DCNN model can be used to distinguish oil spills from other instances, providing a valuable tool to manage the upcoming disaster. Orfanidis et al. (2018), also used deep neural networks to provide a fully automated system for oil spill detection and argued that automatic spill detection and the use of a continuous monitoring system could reduce the time to detect oil spills. The CNN network, introduced with SAR imagery from the Sentinel-1 satellite, was trained by ESA and is based on EMSA records of marine pollution incidents. ...
Article
Synthetic Aperture Radar (SAR) imagery of marine areas can be very useful for segmenting oil spills, a common environmental hazard. Oil spill detection in SAR imagery faces several challenges, including speckle noise, heterogeneous backgrounds, blurred edges, and a lack of comprehensive multi-image datasets. ShuffleNet is a deep network used for semantic segmentation of images, but it had never been applied to oil spill segmentation. In this paper, ShuffleNet network blocks are used to detect oil spills in SAR images more effectively than previous methods. Besides the main network block design, six other blocks were evaluated and the most valuable one was selected. The model uses group convolutions, channel shuffling, and atrous convolutions with a minimal number of ReLU layers. The methods are evaluated using the Intersection over Union (IoU) metric; the proposed method improves mIoU by 7.1% over the best results of several previous methods.
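The channel-shuffle operation that defines ShuffleNet-style blocks can be sketched in a few lines of NumPy; this is a minimal illustration of the generic operation, not a reproduction of the block design evaluated in the paper.

```python
import numpy as np

def channel_shuffle(x: np.ndarray, groups: int) -> np.ndarray:
    """Interleave channels across groups so that a following group
    convolution can mix information between groups.

    x has shape (channels, H, W); channels must be divisible by groups.
    """
    c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by group count"
    # (groups, c//groups, H, W) -> swap the two group axes -> flatten back
    return x.reshape(groups, c // groups, h, w).transpose(1, 0, 2, 3).reshape(c, h, w)

x = np.arange(4)[:, None, None] * np.ones((4, 2, 2))  # channels tagged 0..3
y = channel_shuffle(x, groups=2)
print(y[:, 0, 0])  # [0. 2. 1. 3.]
```

After the shuffle, each group seen by the next group convolution contains channels originating from every previous group, which is what keeps the cheap grouped convolutions expressive.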
... However, it considers only two classes: oil spills and look-alikes. The authors in [10] proposed a DCNN for multi-class classification with three classes, viz. background, oil spills, and look-alikes. The authors in [11] used a DCNN based on the DeepLab architecture [9] for multi-class semantic segmentation of SAR images. ...
... CNNs have been widely used to perform segmentation of SAR images to identify oil spills [12][13][14][15][16][17][18][19][20][21][22][23][24][25]. One example of this approach can be found in [26], where a NN, specifically the Multilayer Perceptron (MLP), was applied to SAR images for the first time. ...
Article
Oil spills represent one of the major threats to marine ecosystems. Satellite synthetic-aperture radar (SAR) sensors have been widely used to identify oil spills due to their ability to provide high-resolution images during day and night under all weather conditions. In recent years, artificial intelligence (AI) systems, especially convolutional neural networks (CNNs), have led to many important improvements in performing this task. However, most previous solutions have focused on obtaining the best performance under the assumption that there are no constraints on the hardware resources being used. The memory and power consumption required by those solutions make them unsuitable for remote embedded systems such as nano- and micro-satellites, which usually have very limited hardware capability and strict limits on power consumption. In this paper, we present a CNN architecture for semantically segmenting SAR images into multiple classes, specifically designed to run on such remote embedded systems. Although its accuracy does not represent a step forward compared with previous solutions, the presented CNN has the important advantage of running on remote embedded systems with limited hardware resources while still achieving good performance. Due to its low memory footprint and small size, it is compatible with dedicated hardware accelerators available on the market, and it provides additional significant advantages such as shorter inference times, shorter training times, and avoiding the transmission of irrelevant data.
Our goal is to allow embedded low-power remote devices, such as satellite systems for remote sensing, to run CNNs directly on board, so that the amount of data that needs to be transmitted to and processed on the ground can be substantially reduced, significantly shortening the time needed to identify oil spills from SAR images.
Chapter
Environmental pollution is one of the most crucial problems around the globe, and oil spills have a significant impact on the environment. Oil spill detection at an early stage can save the environment, and marine life in particular greatly benefits from recovery systems. Synthetic aperture radar (SAR) sensors aboard satellites are the primary and most accurate image source for detection systems. In SAR imagery, oil spills appear as black spots, and distinguishing real oil spills from look-alikes is a challenging objective. The research community has taken different approaches to detecting and classifying SAR black spots; however, most use custom datasets, which makes the results unsuitable for comparison. The scenario worsens when only one label is assigned to the whole SAR image. Hence, deep convolutional neural networks (DCNNs) are suggested in the literature as a proficient method: with SAR images, deep learning techniques can efficiently detect oil spills along with other essential classes. This work implements a DCNN architecture to identify oil spills and related classes (i.e., ship, land, look-alike) on a recent, well-defined dataset for oil spill detection, preprocessed from raw SAR images collected from the Sentinel-1 European satellite. The FCN-8s architecture is used to identify oil spills and related classes via semantic segmentation. The main objective of this study is to investigate the most appropriate hyperparameter settings for oil spill detection; the evaluation results show that the proposed DCNN architecture performs best with the Adadelta optimizer.
Article
Marine pollution, especially large-scale oil spills, poses an enormous threat to oceanic life. This paper addresses the concern by offering a comparative study of novel deep learning architectures trained on identical datasets under identical computational conditions. We explore dataset amplification through non-learning image manipulation techniques such as horizontal and vertical flipping along with random rotations. The models are thoroughly assessed using metrics such as mean IoU, F1 score, and percentage accuracy. The paper concludes by showing V-Net's superiority, as it outperforms its fellow implementations with an accuracy of 90.65% and a Dice coefficient of 90.34.
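The non-learning dataset amplification mentioned above can be sketched as follows; this is a minimal NumPy version that assumes the rotations are 90-degree multiples (the abstract does not specify the rotation angles), and it keeps each image and its segmentation mask aligned.

```python
import numpy as np

def amplify(image: np.ndarray, mask: np.ndarray, rng: np.random.Generator):
    """Generate flipped and rotated copies of an (H, W) image together
    with its segmentation mask, applying identical transforms to both."""
    pairs = [(image, mask)]                                            # original
    pairs.append((np.flip(image, axis=1), np.flip(mask, axis=1)))      # horizontal flip
    pairs.append((np.flip(image, axis=0), np.flip(mask, axis=0)))      # vertical flip
    k = int(rng.integers(1, 4))                                        # random 90-degree multiple
    pairs.append((np.rot90(image, k), np.rot90(mask, k)))
    return pairs

img = np.arange(16, dtype=float).reshape(4, 4)
msk = (img > 7).astype(int)
out = amplify(img, msk, np.random.default_rng(0))
print(len(out))  # 4
```

Each original sample thus yields four training pairs, a 4x amplification without any learned parameters.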
Article
The technical aspects of oil spill remote sensing are examined, and the practical uses and drawbacks of each technology are given, with a focus on emerging technology. The use of visible techniques is ubiquitous but limited to certain observational conditions and simple applications. Infrared cameras offer some potential as oil spill sensors but have several limitations. Both techniques, although limited in capability, are widely used because of their increasing affordability. The laser fluorosensor uniquely detects oil on substrates that include shoreline, water, soil, plants, ice, and snow; new commercial units have come out in the last few years. Radar detects calm areas on water, and thus oil on water, because oil reduces capillary waves on a water surface given moderate winds. Radar provides a unique option for wide-area surveillance, day or night, and in rainy or cloudy weather. Satellite-carried radars, with their frequent overpasses and high spatial resolution, make these day-night, all-weather sensors essential for delineating large spills and monitoring ship and platform oil discharges. Most strategic oil spill mapping is now carried out using radar. Slick thickness measurements have been sought for many years; the operative technique at this time is the passive microwave, and new techniques for calibration and verification have made these instruments more reliable.
Article
Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In this paper, we present a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image. The model is trained to maximize the likelihood of the target description sentence given the training image. Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions. Our model is often quite accurate, which we verify both qualitatively and quantitatively. For instance, while the current state-of-the-art BLEU score (the higher the better) on the Pascal dataset is 25, our approach yields 59, to be compared to human performance around 69. We also show BLEU score improvements on Flickr30k, from 55 to 66, and on SBU, from 19 to 27.
Conference Paper
Deep Convolutional Neural Networks (DCNNs) have recently shown state-of-the-art performance in high-level vision tasks, such as image classification and object detection. This work brings together methods from DCNNs and probabilistic graphical models for addressing the task of pixel-level classification (also called "semantic image segmentation"). We show that responses at the final layer of DCNNs are not sufficiently localized for accurate object segmentation. This is due to the very invariance properties that make DCNNs good for high-level tasks. We overcome this poor localization property of deep networks by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF). Qualitatively, our "DeepLab" system is able to localize segment boundaries at a level of accuracy beyond previous methods. Quantitatively, our method sets the new state of the art at the PASCAL VOC-2012 semantic image segmentation task, reaching 71.6% IOU accuracy on the test set. We show how these results can be obtained efficiently: careful network re-purposing and a novel application of the 'hole' algorithm from the wavelet community allow dense computation of neural net responses at 8 frames per second on a modern GPU.
Technical Report
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively.
Technical Report
We present a model that generates free-form natural language descriptions of image regions. Our model leverages datasets of images and their sentence descriptions to learn about the inter-modal correspondences between text and visual data. Our approach is based on a novel combination of Convolutional Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding. We then describe a Recurrent Neural Network architecture that uses the inferred alignments to learn to generate novel descriptions of image regions. We demonstrate the effectiveness of our alignment model with ranking experiments on Flickr8K, Flickr30K and COCO datasets, where we substantially improve on the state of the art. We then show that the sentences created by our generative model outperform retrieval baselines on the three aforementioned datasets and a new dataset of region-level annotations.
Article
In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit. First, we highlight convolution with upsampled filters, or 'atrous convolution', as a powerful tool in dense prediction tasks. Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks. It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation. Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales. ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields of view, thus capturing objects as well as image context at multiple scales. Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but takes a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed "DeepLab" system sets the new state of the art at the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7% mIOU on the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.
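The atrous convolution described above can be illustrated with a minimal 1-D NumPy sketch (not DeepLab's actual implementation): spacing the kernel taps `rate` samples apart enlarges the field of view without adding any weights.

```python
import numpy as np

def atrous_conv1d(signal: np.ndarray, kernel: np.ndarray, rate: int) -> np.ndarray:
    """1-D 'valid' correlation where kernel taps are spaced `rate` apart.

    rate=1 is an ordinary convolution; rate>1 widens the field of view
    from len(kernel) to (len(kernel) - 1) * rate + 1 with no extra weights.
    """
    span = (len(kernel) - 1) * rate + 1          # effective field of view
    out = np.empty(len(signal) - span + 1)
    for i in range(len(out)):
        out[i] = sum(kernel[j] * signal[i + j * rate] for j in range(len(kernel)))
    return out

x = np.arange(8, dtype=float)        # 0..7
k = np.array([1.0, 1.0, 1.0])        # simple 3-tap filter
print(atrous_conv1d(x, k, rate=1))   # taps at i, i+1, i+2
print(atrous_conv1d(x, k, rate=2))   # taps at i, i+2, i+4 -> wider context
```

With rate=2 the same 3 weights cover a span of 5 input samples, which is exactly the mechanism ASPP exploits at several rates in parallel.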
Article
The growing importance of oil spill detection as part of a rapid response system to oil pollution requires the ongoing development of algorithms. The aim of this study was to create a methodology for improving manual classification at the scale of entire water bodies, focusing on its repeatability. This paper took an object-oriented approach to radar image analysis and put particular emphasis on adaptation to the specificity of seas like the Baltic. Pre-processing using optimised filters enhanced the capability of a multilevel hierarchical segmentation to detect spills of different sizes, forms, and homogeneity, which occur as a result of shipping activities. Confirmed spills detected in ENVISAT/ASAR images were used to create a decision-tree procedure that classifies every distinct dark object visible in SAR images into one of four categories, which reflect a growing probability of oil spill presence: look-alikes, dubious spots, blurred spots, and potential oil spills. Our objective was to properly mark known spills on ASAR scenes and to reduce the number of false positives by eliminating (classifying as background or look-alike) as many objects as possible from the vast initial number of objects appearing on full-scale images. A number of aspects were taken into account in the classification process. The method's performance was tested on a group of 26 oil spills recorded by HELCOM: 96.15% of them were successfully identified, and the final target group was narrowed down to about 4% of the dark objects extracted from ASAR images. Although a specialist is still needed to supervise the whole process of oil spill detection, this method gives an initial view that is valuable for further evaluation of the scenes and risk estimation. It may significantly accelerate the pace of manual image analysis and enhance the objectivity of assessments, which are key aspects in operational monitoring systems.