OBPDC 2022
28-30 SEPTEMBER 2022
Xenia Ivashkovych(1), Lisa Landuyt(2), Tanja Van Achteren(3)
(1) Flemish Institute for Technological Research (VITO),
Boeretang 200 | 2400 Mol | Belgium
(2) Flemish Institute for Technological Research (VITO),
Boeretang 200 | 2400 Mol | Belgium
(3) Flemish Institute for Technological Research (VITO),
Boeretang 200 | 2400 Mol | Belgium
As the growing volume of Earth Observation data risks outpacing the capacity to downlink it and store it on the
ground, much effort has been invested in compression algorithms and on-board applications. Both can significantly
decrease the amount of data to be downlinked and stored but, until the publication of a paper earlier this year, no
solution combining both ideas had been proposed. That paper introduces CORSA, a compression algorithm based on deep
learning that also produces meaningful semantic representations usable in downstream AI tasks. The compression and
reconstruction capabilities of the algorithm, as well as the usability of its representations for classification, were
established in the original publication, but the usability of the representations for a downstream task such as
semantic segmentation had yet to be assessed quantitatively. The aim of the present paper is therefore to assess the
usability of CORSA’s representations for water detection applications. The paper compares established methods, such as
the NDWI and a U-Net on multispectral data, with new methods using CORSA representations as a prior, such as a U-Net on
reconstructed multispectral data and neural networks using CORSA’s representations directly as input. Sentinel-2 data
and manual annotations were used during the exercise to ensure quality. The results show that the compression
performed by CORSA does not negatively affect the efficiency of a U-Net. Moreover, we have created new networks that
use CORSA representations directly and exploit their underlying hierarchical structure, and these also perform water
detection successfully. Hence, with this paper, we demonstrate that a versatile and generic algorithm such as CORSA,
efficient for both on-board compression and on-board processing in the realms of classification and semantic
segmentation, is a viable solution to the increasing demands on data downlink and storage.
As the amount and quality of remote sensing observations keep increasing, downlinking and exploiting this vast
amount of data becomes more and more challenging. One of the most obvious answers to this challenge is to reduce the
amount of data downlinked by going beyond lossless compression. The CCSDS reflects this change in focus by introducing
CCSDS 123.0-B-2[1], a new standard covering not only lossless but also near-lossless compression for
hyperspectral and multispectral satellites.
Since then, many research teams have attempted to increase the compression ratio by various means, including machine
learning and deep learning (DL). Diego Valsesia and Enrico Magli have used a convolutional neural network (CNN) on the
ground to decrease the distortion induced by lossy compression upon reconstruction, to great effect at lower
rates[2]. Dimitri Lebedeff’s team have developed an on-board selective spatial/spectral compression for hyperspectral
data which uses a support vector machine (SVM) to target clouded pixels, unexploitable in the Earth Observation
domain, with higher compression rates[3]. A DL-based cloud-detection network has been tested on Φ-Sat-1[4] in order
to transmit only non-clouded pixels down to Earth. All these algorithms, however, have compression as their only
target and do not exploit the full range of possibilities that Artificial Intelligence (AI) offers.
On the other hand, some teams have explored unsupervised deep learning to enable change detection for extreme events
without the need for analysis on the ground[5], but without explicitly addressing compression. In a recent work, Bart
Beusen and his team at VITO developed CORSA[6], a deep-learning algorithm that simultaneously offers competitive data
compression and supports direct downstream AI tasks. CORSA can be understood as both a compression algorithm and a
generic model producing meaningful semantic representations for downstream applications. One of the biggest drawbacks
of DL-based applications is the need for vast amounts of annotated data. Since CORSA already produces meaningful
semantic representations, the labelled dataset necessary to produce a downstream DL-based application can be reduced
tenfold without affecting performance negatively, which results in large savings in development time and labour.
Moreover, since the algorithm performs both compression and generic data pre-processing, CORSA can save space on board
for missions that require both downlinking and on-board processing.
In this paper, we will showcase CORSA’s usability as a pretrained generic model for a specific downstream task: water
detection. First, we will show that the usage of compressed images does not negatively impact the efficiency of the
downstream task. Second, we will show that the representations generated by CORSA themselves are usable in this same
down-stream task.
2.1. Water detection
Water presence, being a key surface parameter from both an ecological and a socio-economic point of view, has been
studied extensively in a vast body of literature. Feyisa et al.[7] identified four categories of common water
classification methods for multispectral imagery: thematic classification, linear unmixing, single-band thresholding
and two-band spectral water indices. The latter is by far the most established, and different indices have been
developed throughout the years[8], [9].
More recently, new DL-based methods have been applied to the problem and have proven even more successful than the
aforementioned water indices. Isikdogan et al.[10] reported an F1 score of 0.93 for their DeepWaterMapV2 approach
applied to Landsat 8 imagery compared to 0.72 using the Modified Normalized Difference Water Index (MNDWI), while
Wieland et al.[11] reported an overall accuracy of 0.99 compared to 0.93 for the Normalized Difference Water Index
(NDWI) on a test dataset of Landsat TM, ETM+, OLI and Sentinel-2 (S2) imagery. Finally, Mateo-Garcia et al.[12]
significantly outperformed the NDWI using both the default and an optimized threshold (Jaccard score of 0.40 and 0.65
respectively) with both a U-Net and a simple convolutional neural network (Jaccard score of 0.72 and 0.71) on
Sentinel-2 imagery resampled to 10 m.
The data available on water detection makes it a suitable problem to assess the usability of the representations produced
by CORSA in a downstream semantic segmentation task.
2.2. CORSA
CORSA is an unsupervised deep learning compression algorithm developed by VITO earlier in 2022. It uses a
multi-level variational auto-encoder architecture to train an encoder and a decoder simultaneously. This training not
only produces a decoder capable of reconstructing the original images with high fidelity but also produces meaningful
representations at the bottleneck.
The three-layer architecture, showcased in Fig. 1, offers a significant boost in reconstruction quality with little
effect on the compression ratio. This gain in performance can be understood by seeing these different levels of
representation not as an arbitrary division but as a conceptual hierarchy of information. The highest level of
representation, corresponding to the smallest spatial map, contains information relevant to the background of an
image, its larger structures. The lowest level of representation, corresponding to the largest spatial map, contains
information more relevant to the foreground of an image, its details and finer structures. The middle level of the
representation contains information lying somewhere in between.
Fig. 1: CORSA architecture
CORSA has achieved a compression ratio of 24.38 for a peak signal-to-noise ratio (PSNR) of 69.80, a structural
similarity index measure (SSIM) of 0.95 and a mean squared error (MSE) of 0.01 on BGRNIR (blue, green, red,
near-infrared) S2 images upon reconstruction. The high compression rate is achieved by quantizing the representations
at the bottleneck, or more precisely by mapping them to a codebook. Once every vector has been mapped, only the
codebook indices are sent back to Earth, meaning the codebook is needed both on board for encoding and on the ground
for decoding. This codebook is trained together with the encoders and decoders in order to retain a high degree of
expressivity for quantized representations. It is these quantized representations that are used in the rest of the
paper as inputs for water detection algorithms.
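The codebook mapping described above can be illustrated with a minimal sketch. It assumes the nearest-neighbour rule (Euclidean distance) that is common in vector-quantized auto-encoders; the array shapes, the codebook size and the function names are hypothetical and are not taken from the CORSA implementation.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector to the index of its nearest codebook entry.

    latents:  (H, W, D) array of encoder outputs (shape is an assumption).
    codebook: (K, D) array of learned code vectors.
    Returns an (H, W) array of integer indices, i.e. what would be downlinked.
    """
    flat = latents.reshape(-1, latents.shape[-1])            # (H*W, D)
    # Squared Euclidean distance between every latent and every code vector.
    dists = ((flat[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1).reshape(latents.shape[:-1])

def dequantize(indices, codebook):
    """Look the indices up again on the ground to recover quantized vectors."""
    return codebook[indices]

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))     # hypothetical: 8 codes of dimension 4
latents = rng.normal(size=(3, 3, 4))   # hypothetical 3x3 spatial map
indices = quantize(latents, codebook)
quantized = dequantize(indices, codebook)
```

Only `indices` needs to be transmitted in this sketch, which is why the same codebook has to be available at both ends of the link.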
In this paper, we have opted for a smaller but high-quality dataset on which multiple benchmarks and networks were
trained. In order to stay consistent with the literature, U-Net on uncompressed images and NDWI served as benchmarks
to measure our different networks against.
3.1. Dataset
As part of a wider project on the digitalisation of Flanders, it is this region, for which multiple high-resolution and
medium-resolution datasets are easily available, that has been chosen to test the expressivity of CORSA representations.
The Flemish Institute for Nature and Forest Research (INBO) manages a dataset of natural closed water bodies, hence
excluding industrial water surfaces and water courses, last updated in 2020 (Leyssen et al., 2020). The Basemap Flanders
or Grootschalig Referentie Bestand (GRB) also comprises water bodies and courses at very high resolution (scale 1/250
to 1/5000). In order to produce a ground truth of the highest possible quality for S2 imagery, we have chosen to rely on
not only the aforementioned static high-resolution datasets, but have also performed manual annotations based on the
INBO and GRB datasets, topographic information, high resolution aerial imagery acquired yearly in winter and 3-yearly
in summer as well as the S2 imagery itself.
The ground truth was generated for a selection of cloud-free acquisitions over seven Areas Of Interest (AOIs), shown
in Fig. 2. For AOIs used for testing (in orange), only timeframes with overlap between aerial imagery and cloud-free
S2 images were selected. For AOIs used for training and validation, ten cloud-free S2 images uniformly spread across
the year were selected. Note that since the ground truth was generated using high-resolution imagery, some water
bodies present in these datasets are invisible in the S2 acquisitions, because they are covered by vegetation, too
shallow or too small. These water bodies do exist in the dataset but are labelled as ‘difficult water’ and were
considered as land during this exercise.
Fig. 2: AOIs selected across Flanders.
In order to match CORSA’s input specifications, 2100 patches of 120x120x4 px were selected for the training and
validation datasets. The maximal overlap between patches was set to 80%, and 90% of the patches were required to
contain water pixels. For the latter, the minimal fraction of water pixels was set at 0.5%.
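The patch selection rules above can be sketched as follows. This is only an illustration under stated assumptions: the sliding-window strategy, the random sampling and the function names are hypothetical, and only the patch size, the 80% overlap limit, the 90% water share and the 0.5% minimal water fraction come from the text.

```python
import numpy as np

PATCH = 120            # patch side in pixels (from the text)
MAX_OVERLAP_PCT = 80   # maximal overlap between neighbouring patches, in percent
MIN_WATER = 0.005      # minimal water fraction for a patch to count as "water"

def candidate_patches(mask):
    """Split candidate patch corners into water and non-water patches.

    mask: 2-D boolean array, True where the ground truth is water.
    """
    # An overlap of at most 80 % of a 120 px patch means a stride of at least 24 px.
    stride = PATCH - PATCH * MAX_OVERLAP_PCT // 100
    water, other = [], []
    rows, cols = mask.shape
    for r in range(0, rows - PATCH + 1, stride):
        for c in range(0, cols - PATCH + 1, stride):
            fraction = mask[r:r + PATCH, c:c + PATCH].mean()
            (water if fraction >= MIN_WATER else other).append((r, c))
    return water, other

def select_patches(water, other, n_total, water_share=0.9, seed=0):
    """Randomly draw n_total patch corners, 90 % of them from water patches."""
    rng = np.random.default_rng(seed)
    n_water = round(n_total * water_share)
    picked_w = rng.choice(len(water), size=n_water, replace=False)
    picked_o = rng.choice(len(other), size=n_total - n_water, replace=False)
    return [water[i] for i in picked_w] + [other[i] for i in picked_o]

# Toy ground truth: a 360x360 scene whose top-left quarter is water.
mask = np.zeros((360, 360), dtype=bool)
mask[:180, :180] = True
water, other = candidate_patches(mask)
selected = select_patches(water, other, n_total=20)
```

In this toy scene, 18 of the 20 selected patches contain at least 0.5% water and the remaining 2 contain none.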
3.2. Network Architectures
In this paper, our goal is to answer two questions: does the compression of the data negatively affect the efficiency
of the downstream DL algorithm, and can the generic representations generated by CORSA be used in downstream DL
applications?
To answer the first question, a U-Net on reconstructed images was benchmarked against a U-Net on uncompressed images
and the NDWI. The U-Net on reconstructed images is, in practice, equivalent to a pipeline consisting of a frozen CORSA
decoder and U-Net which takes CORSA representations as inputs. This network is called ‘Reconstruction + U-Net’.
To answer the second question, two types of networks taking CORSA representations as inputs have been tested. The
first type is called ‘Decoder + U-Net’, as it is equivalent to a pipeline consisting of a trainable CORSA decoder and a U-
Net, as shown in Fig. 3. Both are trained simultaneously as one network, which reflects a naïve approach to crafting a
downstream algorithm with the aforementioned inputs.
The second type of network aims to leverage the underlying hierarchical structure of CORSA representations in addition
to their semantic meaningfulness, as shown in Fig. 4. For this type of network, the architectural possibilities are
numerous, and only two examples, chosen for their simplicity, are showcased in this paper. The structures are still
U-Net-like but contain separate ‘input blocks’ (ensembles of operations and layers) for the different levels of
representation. These input blocks can have differing architectures depending on the level but, even given the same
architecture, are very likely to have differing weights after training, depending on which level of representation
(background, middle-ground, foreground) the application leverages the most. The two networks used in this paper,
respectively called ‘Crow-Net’ and ‘Shrike-Net’, each use the same input block across the three levels; they differ
mainly in the type of input block they use. These two networks aim to show the variety of ways in which the generic
representations produced by CORSA can be leveraged for downstream DL applications. They reflect a more sophisticated
and flexible approach to our problem.
Fig. 3: Architecture schematics for the ‘Decoder + U-Net’ network.
Fig. 4: Architecture schematics blueprint for networks such as ‘Crow-Net’ and ‘Shrike-Net’.
3.3. Training and Testing Procedures
All networks were trained with the same parameters and the same training data to enable later comparisons. Power
scaling was applied to the input data to match the distribution on which CORSA was trained, and no data augmentation
was applied. A decaying learning rate starting at 0.0001, early stopping and a batch size of 128 were used during
training.
For each network, the testing was performed as follows. Each test AOI was cut into overlapping patches. These patches
were predicted by a network and woven back together into the AOI by discarding the padding to minimize border effects.
The predicted image was then binarized with a threshold of 0.5: anything above this threshold was considered water.
The ground truth was then compared to the resulting image by means of three metrics: the Jaccard score (also known as
the Intersection over Union), precision and recall. As for the training, the pixels belonging to the ‘difficult
water’ class in the ground truth were considered as land. The scores were then aggregated using area-based weights.
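The scoring step described above can be sketched as follows. The label codes, the function names, the convention for empty denominators and the exact form of the area-based weighting (a weighted mean with the AOI pixel counts as weights) are assumptions made for illustration; only the 0.5 threshold, the three metrics and the treatment of ‘difficult water’ as land come from the text.

```python
import numpy as np

LAND, WATER, DIFFICULT_WATER = 0, 1, 2   # hypothetical label codes

def water_scores(probabilities, labels, threshold=0.5):
    """Return (Jaccard, precision, recall) for one AOI."""
    pred = probabilities > threshold
    truth = labels == WATER              # 'difficult water' counts as land
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    # Assumed convention: a metric with an empty denominator is set to 1.
    jaccard = tp / (tp + fp + fn) if tp + fp + fn else 1.0
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return np.array([jaccard, precision, recall], dtype=float)

def aggregate(scores_per_aoi, areas):
    """Area-weighted mean of the per-AOI scores (assumed weighting scheme)."""
    return np.average(np.stack(scores_per_aoi), axis=0, weights=areas)

probabilities = np.array([[0.9, 0.2], [0.7, 0.4]])
labels = np.array([[WATER, LAND], [DIFFICULT_WATER, WATER]])
scores = water_scores(probabilities, labels)

# Combine with a second, perfectly predicted AOI three times as large.
combined = aggregate([scores, np.ones(3)], areas=[4, 12])
```

In the toy example one water pixel is found, one ‘difficult water’ pixel is predicted as water (a false positive under the land convention) and one water pixel is missed, which gives a Jaccard score of 1/3 and a precision and recall of 0.5.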
In addition to assessing the proposed networks in this manner, they were benchmarked against two other methods: a U-
Net on the uncompressed S2 BGRNIR imagery and NDWI with the general threshold of 0.
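For reference, the NDWI benchmark can be sketched as below, using the McFeeters definition NDWI = (Green - NIR) / (Green + NIR) and the general threshold of 0. The band order follows the BGRNIR convention used in this paper; the function name, the small constant guarding against division by zero and the reflectance values in the example are illustrative assumptions.

```python
import numpy as np

def ndwi_water_mask(image, threshold=0.0, eps=1e-12):
    """Return a boolean water mask from a (..., 4) BGRNIR reflectance array."""
    green = image[..., 1].astype(np.float64)
    nir = image[..., 3].astype(np.float64)
    ndwi = (green - nir) / (green + nir + eps)   # eps avoids division by zero
    return ndwi > threshold

# Two illustrative pixels: open water (low NIR) and vegetation (high NIR).
image = np.array([[[0.06, 0.08, 0.05, 0.02],
                   [0.04, 0.06, 0.05, 0.40]]])
mask = ndwi_water_mask(image)
```

The first pixel has an NDWI of 0.6 and is classified as water, while the second has a negative NDWI and is classified as land.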
To give the reader a visual idea of each network’s performance, contingency maps for AOI2 on 1 April 2019, the hardest
or second-hardest region to predict depending on the algorithm, are discussed in this section. Fig. 5 shows the
original inputs in BGR.
Fig. 5: S2 BGR uncompressed image
Table 1. Result comparison between NDWI, U-Net on uncompressed images and U-Net on reconstructed images.
Method                    Size (in parameters)    Input type
NDWI                      -                       Original image
U-Net                     353 665                 Original image
Reconstruction + U-Net    1 212 165               CORSA representations
First, we were concerned with whether the CORSA compression algorithm would negatively impact the water detection
exercise. As seen in Table 1, the result of the U-Net is significantly better than that of the NDWI, as expected from
the literature. More importantly, however, the compression appears to have no significant impact on the efficiency
of the network. The larger number of parameters necessary for the ‘Reconstruction + U-Net’ pipeline compared to the
regular U-Net is explained by the addition of the CORSA decoder. The U-Net that operated on uncompressed images and
the U-Net component of the ‘Reconstruction + U-Net’ pipeline are exactly the same network, with 3 levels and 32
starting filters. As further seen in Fig. 6, both networks have similar contingency maps, suggesting that the
compression and decompression of the images leave the information most relevant to this water detection task intact.
Second, we wanted to assess how well the CORSA representations themselves could be used in different networks for
water detection. Table 2 shows not only that algorithms taking CORSA representations as input can perform just as
well as a U-Net on uncompressed images, but also that the algorithms with separate encoding for the different levels
of representation perform significantly better. Adding separate pipelines for each level of representation introduces
extra parameters, however, which must account for part of the gain in accuracy. Here again, each network comprises 3
levels and 32 starting filters for better comparison.
Fig. 6: Contingency maps for ‘U-Net’ (left) and ‘Reconstruction + U-Net’ (right)
Table 2. Result comparison between different downstream DL algorithms.
Method                    Size (in parameters)    Input type
Decoder + U-Net           1 212 165               CORSA representations
                          1 791 165               CORSA representations
                          3 168 345               CORSA representations
These experiments demonstrate that CORSA is useful both for compression and for producing generic, semantically
meaningful representations for downstream DL applications. These results are likely to be reproducible for many other
The first aspect of CORSA that has remained unexplored in this paper, since the focus was on the application of water
detection, is the size of the representations produced by the algorithm. The positive effect of larger
representations on quality has been documented in the original paper, but their effect on downstream applications
remains unexplored. It is possible that a CORSA network producing larger representations would result in a boost in
performance in the task of water detection.
The second aspect of CORSA which might also have an impact on downstream applications is initial scaling. During the
experiments, we hesitated between different types of input scaling, an operation necessary for the inputs of neural
networks. We settled on power scaling, more specifically the same one used for the training of CORSA in the original
paper, because the inputs to the CORSA pipeline needed to be consistent with those used for the training of the
algorithm. Since consistency is also necessary across all networks in the experiment, the same power scaling was
chosen for every network. However, the most characteristic feature of water surfaces is the high energy absorption at
NIR wavelengths and beyond, leading to very low reflectance values. Power scaling leads to a more Gaussian-like
distribution and, as such, reduces the extremity of these values and consequently the difference between water and
other surfaces. Preliminary tests on visible water tend to show that the impact of scaling on the performance of the
downstream water detection algorithm might be significant. CORSA could benefit from having this effect assessed and
quantified.
Fig. 7: Contingency map for ‘Decoder + U-Net' and ‘Shrike-Net’
We have shown that the representations that CORSA produces can be directly used in a downstream water detection task
without any loss in efficiency compared to the same task performed on raw images. With the compression capabilities of
CORSA being established, our observation further demonstrates that CORSA is a versatile framework that can be used
for multiple purposes simultaneously.
[1] The Consultative Committee for Space Data Systems, Image Data Compression, vol. 1, 1 vols. 2017.
[2] D. Valsesia and E. Magli, “High-throughput Onboard Hyperspectral Image Compression with Ground-based CNN
Reconstruction,” IEEE Trans. Geosci. Remote Sensing, vol. 57, no. 12, pp. 9544–9553, Dec. 2019, doi:
[3] D. Lebedeff, M. F. Foulon, R. Camarero, R. Vitulli, and Y. Bobichon, “ON-BOARD CLOUD DETECTION AND
MISSIONS,” p. 9.
[4] G. Giuffrida et al., “The Φ-Sat-1 Mission: The First On-Board Deep Neural Network Demonstrator for Satellite
Earth Observation,” IEEE Trans. Geosci. Remote Sensing, vol. 60, pp. 1–14, 2022, doi:
[5] V. Růžička et al., “Unsupervised Change Detection of Extreme Events Using ML On-Board.” arXiv, Nov. 04,
2021. Accessed: Sep. 22, 2022. [Online]. Available:
[6] B. Beusen, X. Ivashkovych, and T. Van Achteren, “Image compression using vector-quantized auto-encoders with
semantically meaningful feature extraction,” 2022.
[7] G. L. Feyisa, H. Meilby, R. Fensholt, and S. R. Proud, “Automated Water Extraction Index: A new technique for
surface water mapping using Landsat imagery,” Remote Sensing of Environment, vol. 140, pp. 23–35, Jan. 2014,
doi: 10.1016/j.rse.2013.08.029.
[8] S. McFeeters, “Using the Normalized Difference Water Index (NDWI) within a Geographic Information System
to Detect Swimming Pools for Mosquito Abatement: A Practical Approach,” Remote Sensing, vol. 5, no. 7, pp.
3544–3561, 2013, doi: 10.3390/rs5073544.
[9] H. Xu, “Modification of normalised difference water index (NDWI) to enhance open water features in remotely
sensed imagery,” International Journal of Remote Sensing, vol. 27, no. 14, pp. 3025–3033, Jul. 2006, doi:
[10] L. F. Isikdogan, A. Bovik, and P. Passalacqua, “Seeing Through the Clouds With DeepWaterMap,” IEEE Geosci.
Remote Sensing Lett., vol. 17, no. 10, pp. 1662–1666, Oct. 2020, doi: 10.1109/LGRS.2019.2953261.
[11] M. Wieland and S. Martinis, “A Modular Processing Chain for Automated Flood Monitoring from Multi-Spectral
Satellite Data,” Remote Sensing, vol. 11, no. 19, p. 2330, Oct. 2019, doi: 10.3390/rs11192330.
[12] G. Mateo-Garcia et al., “Towards global flood mapping onboard low cost satellites with machine learning,” Sci
Rep, vol. 11, no. 1, p. 7249, Dec. 2021, doi: 10.1038/s41598-021-86650-z.