PRACTICE AND EXPERIENCE IN USING PARALLEL AND SCALABLE MACHINE
LEARNING IN REMOTE SENSING FROM HPC OVER CLOUD TO QUANTUM COMPUTING
Morris Riedel¹,², Gabriele Cavallaro², Jón Atli Benediktsson¹
¹School of Engineering and Natural Sciences, University of Iceland, Iceland
²Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany
ABSTRACT
Using computationally efficient techniques for transforming the massive amount of Remote Sensing (RS) data into scientific understanding is critical for Earth science. The utilization of efficient techniques through innovative computing systems in RS applications has become more widespread in recent years. The continuously increasing use of Deep Learning (DL) as a specific type of Machine Learning (ML) for data-intensive problems (i.e., 'big data') requires powerful computing resources with equally increasing performance. This paper reviews recent advances in High-Performance Computing (HPC), Cloud Computing (CC), and Quantum Computing (QC) applied to RS problems. It thus represents a snapshot of the state-of-the-art in ML in the context of the most recent developments in those computing areas, including our lessons learned over the last years. Our paper also includes some recent challenges and good experiences in using Europe's fastest supercomputer for hyper-spectral and multi-spectral image analysis with state-of-the-art data analysis tools. It offers a thoughtful perspective on the potential and emerging challenges of applying innovative computing paradigms to RS problems.
Index Terms— High performance computing, cloud computing, quantum computing, machine learning, deep learning, parallel and distributed algorithms, remote sensing.
1. INTRODUCTION
Already ten years ago, Lee et al. [1] highlighted in a survey paper for RS that a trend in the design of HPC systems for data-intensive problems is to utilize highly heterogeneous computing resources. Today, the use of HPC systems has become increasingly mainstream, with many machines being heavily used by RS researchers, especially with new forms of multi-spectral or hyper-spectral image analysis techniques through DL. DL requires massive processing power, large quantities of data, and open-source frameworks, and all three of these factors have contributed to the high uptake of this unique ML technique in the RS community. CC emerged as an evolution of Grid computing to make parallel and distributed computing more straightforward to use than traditional, rather complex HPC systems. In this context, RS researchers often take advantage of Apache open-source tools with parallel and distributed algorithms (e.g., map-reduce [2] as a specific form of divide-and-conquer approach) based on Spark [3] or the larger Hadoop ecosystem [4]. Inherent in many ML and DL approaches are optimization techniques, many of which can be solved remarkably quickly by QCs [5], which represent the most innovative type of computing today. Despite being in its infancy, the Quantum Annealer (QA) is a specific form of QC already used by RS researchers [6, 7] to search for solutions to optimization problems.

This work was performed in the Center of Excellence (CoE) Research on AI- and Simulation-Based Engineering at Exascale (RAISE) receiving funding from the EU's Horizon 2020 Research and Innovation Framework Programme H2020-INFRAEDI-2019-1 under grant agreement no. 951733.
In this experience and short review paper, we specifically focus on describing recent advances in the fields of HPC, CC, and QC applied to RS problems, covering innovative computing architectures that utilize multi-core processors, many-core processors, and specialized hardware components such as Tensor Processing Units (TPUs) and QA chips. Relevant examples of using innovative architectures with parallel and scalable data science methods such as ML and DL techniques for RS show the community's uptake, including cutting-edge distributed training algorithms and tools. We also share lessons learned and thus add our own multi-year experience of using several of those innovative systems with ML and DL algorithms on many different RS datasets. Due to page restrictions, this paper does not address the benefits of using such innovative computing systems or Field Programmable Gate Arrays (FPGAs) for on-board airborne or real-time processing on satellite sensor platforms.
The remainder of the paper is structured as follows. Section 2 describes different innovative computing systems and their technological advances, with examples of applying various ML and DL methods to RS datasets. Section 3 concludes the paper with some remarks and anticipates future directions and challenges in applying HPC, CC, or QC to RS problems. The main processing-intensive application areas and challenges for RS are briefly discussed in context.
2. TECHNOLOGICAL ADVANCES DRIVING
INNOVATIVE COMPUTING SYSTEMS
Fig. 1. Technology advances and our identified challenges.
That using multiple processors simultaneously through parallel computing would become standard practice in applying HPC techniques to RS was already predicted ten years ago by Lee et al. [1]. Today, multi-core processors with high single-thread performance are broadly available via supercomputers for science and offer thousands to millions of cores¹ to be used by HPC techniques, as shown in Figure 1. At the time of writing, for example, the Jülich Wizard for European Leadership Science (JUWELS)² system at the Jülich Supercomputing Centre (JSC) in Germany represents the fastest European supercomputer, offering 122,768 CPU cores in its cluster module. While such systems and multi-core processors offer tremendous performance, the particular challenge in exploiting this data analysis performance for ML is that those systems require specific parallel and scalable techniques. In other words, using such HPC systems with RS data effectively requires parallel and scalable algorithm implementations, as opposed to using plain scikit-learn³, R⁴, or other serial algorithms. Parallel ML algorithms for those systems are typically programmed using the Message Passing Interface (MPI) standard and OpenMP, which jointly leverage the power of shared memory and distributed memory via low-latency interconnects (e.g., InfiniBand) and parallel filesystems (e.g., Lustre).
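As a minimal illustration of this programming model, the following sketch shows the typical MPI data-parallel pattern with mpi4py: each rank holds its own shard of pixels, and a collective allreduce yields a global statistic that every rank needs before training. It is an illustrative example only, with synthetic data standing in for an RS image, and is not code from our studies.

```python
# Minimal sketch of the data-parallel HPC pattern described above, using
# mpi4py (Python bindings for the MPI standard). Each rank processes its
# own shard of pixels; a collective allreduce combines partial results.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Hypothetical stand-in for this rank's shard of hyper-spectral pixels:
# n_local samples x n_bands spectral features.
rng = np.random.default_rng(seed=rank)
X_local = rng.standard_normal((100_000, 200))

# Each rank computes a partial statistic (here: per-band sums) on its shard...
local_sum = X_local.sum(axis=0)
local_count = np.array([X_local.shape[0]], dtype=np.int64)

# ...and MPI_Allreduce combines them so every rank holds the global mean,
# e.g., for normalising the data before training a parallel classifier.
global_sum = np.empty_like(local_sum)
global_count = np.empty_like(local_count)
comm.Allreduce(local_sum, global_sum, op=MPI.SUM)
comm.Allreduce(local_count, global_count, op=MPI.SUM)
global_mean = global_sum / global_count[0]

if rank == 0:
    print(f"global mean computed over {global_count[0]} pixels on {size} ranks")
```

Launched with, e.g., `mpiexec -n 4 python band_stats.py`, every process executes the same script on its own shard, and the collective operation hides the distributed-memory communication behind a single call.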
Given our experience and frequent literature surveys, the availability of open-source parallel and scalable machine learning implementations that go beyond Artificial Neural Networks (ANNs) or more recent DL networks is still relatively rare. The reason is the complexity of parallel programming; in addition, using HPC can be a challenge when the amount of data is relatively moderate (i.e., DL is not always successful).
¹ Top500 list of supercomputers, online: https://www.top500.org/
² https://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/JUWELS
³ https://scikit-learn.org/stable/
⁴ https://www.r-project.org/
One example in this context is a more robust classifier such as the parallel and scalable open-source Support Vector Machine (SVM) that we developed and used to speed up the classification of hyper-spectral RS images [8].
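For readers who want a feel for why such parallel implementations matter, the sketch below shows one simple divide-and-conquer route using off-the-shelf tools: independent SVMs trained on data shards in parallel and combined by majority vote. This is emphatically not the parallel SVM of [8]; it is a hypothetical toy on synthetic data that only illustrates the pattern that serial scikit-learn alone cannot scale to.

```python
# Toy illustration of one divide-and-conquer route to parallel SVMs:
# train independent SVMs on data shards in parallel and majority-vote.
# NOT the parallel SVM of [8]; synthetic data stands in for RS pixels.
import numpy as np
from joblib import Parallel, delayed
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical stand-in for labelled hyper-spectral pixels.
X, y = make_classification(n_samples=20_000, n_features=50, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

def fit_shard(X_shard, y_shard):
    # Each shard gets its own independent kernel SVM.
    return SVC(kernel="rbf", C=10.0).fit(X_shard, y_shard)

# Train one SVM per shard on all available cores (shared-memory parallelism).
shards = np.array_split(np.arange(len(X_tr)), 8)
models = Parallel(n_jobs=-1)(
    delayed(fit_shard)(X_tr[idx], y_tr[idx]) for idx in shards
)

# Majority vote over the shard models.
votes = np.stack([m.predict(X_te) for m in models])
y_pred = (votes.mean(axis=0) > 0.5).astype(int)
print("ensemble accuracy:", (y_pred == y_te).mean())
```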
The many-core processor era with accelerators brought many advancements to both the simulation sciences and the data sciences, including many new approaches for analysing RS data with innovative DL techniques. The idea of using numerous simpler processors with hundreds to thousands of independent processor cores enabled a high degree of parallel processing that fits very nicely the demands of DL training, in which lots of matrix-matrix multiplications are performed. Today, hundreds to thousands of accelerators such as Nvidia Graphics Processing Units (GPUs) are used in large-scale HPC systems, offering unprecedented processing power for satellite image processing. For example, Europe's fastest supercomputer JUWELS offers 3744 GPUs of the most recent Nvidia A100 type in its booster module. Over the years, our experience shows clearly that open-source DL packages such as TensorFlow⁵ (now including Keras⁶) or PyTorch⁷ are powerful tools that work very well for large-scale RS data analysis. Still, it can be challenging to have the right versions of Python code matching the versions of tools and libraries available on HPC systems with GPUs, given the fast advancements of DL libraries, accelerators, and HPC systems. But the real challenge in using GPUs is not using one or a couple of GPUs connected by NVLink or NVSwitches, but scaling beyond that to a large-scale HPC node setup using distributed DL training tools such as Horovod⁸ or, more recently, DeepSpeed⁹. Our experience in using our supercomputer JUWELS with Horovod and a cutting-edge RESNET-50 DL network indicates a significant speed-up of training time without losing accuracy [9]. In this initial study, we used 96 GPUs, while in a later study driven by Sedona et al. [10], we achieved an even better speed-up on JUWELS using 128 interconnected GPUs after having gained more experience with Horovod. Using GPUs in the context of Unmanned Aerial Vehicles (UAVs) is shown in [11].

⁵ https://www.tensorflow.org/
⁶ https://keras.io/
⁷ https://pytorch.org/
⁸ https://horovod.ai/
⁹ https://www.deepspeed.ai/
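The following is a minimal sketch of how such distributed data-parallel training looks with Horovod and tf.keras, in the spirit of our RESNET-50 experiments [9, 10]; the synthetic dataset and the hyper-parameters are placeholders rather than the values used in those studies.

```python
# Hedged sketch of multi-GPU data-parallel training with Horovod + tf.keras.
# Synthetic images stand in for an RS dataset; hyper-parameters are placeholders.
import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()  # one process per GPU, launched externally (see usage note below)

# Pin each process to a single local GPU.
gpus = tf.config.experimental.list_physical_devices("GPU")
if gpus:
    tf.config.experimental.set_visible_devices(gpus[hvd.local_rank()], "GPU")

# Placeholder data: in practice each worker reads its own shard, e.g. via
# tf.data with .shard(hvd.size(), hvd.rank()).
images = tf.random.uniform((64, 224, 224, 3))
labels = tf.random.uniform((64,), maxval=10, dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(8)

model = tf.keras.applications.ResNet50(weights=None, classes=10)

# Scale the learning rate by the worker count and wrap the optimizer so
# gradients are averaged across all GPUs with an allreduce at every step.
opt = tf.keras.optimizers.SGD(learning_rate=0.01 * hvd.size(), momentum=0.9)
opt = hvd.DistributedOptimizer(opt)
model.compile(loss="sparse_categorical_crossentropy", optimizer=opt,
              metrics=["accuracy"])

callbacks = [
    # Broadcast initial weights from rank 0 so all workers start identically.
    hvd.callbacks.BroadcastGlobalVariablesCallback(0),
]

model.fit(dataset, epochs=1, callbacks=callbacks,
          verbose=1 if hvd.rank() == 0 else 0)
```

A run across many GPUs is then launched with, e.g., `horovodrun -np 128 python train.py` (or the HPC system's MPI launcher), with one process pinned to each GPU.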
Besides HPC systems, CC vendors also offer multi-core and many-core processing power, which RS researchers have used in recent years with parallel and scalable tools such as Apache Spark [4]. For example, Haut et al. [3] use Spark to develop a cloud implementation of a DL network for non-linear RS data compression known as an AutoEncoder (AE). CC techniques such as Spark pipelines also offer the possibility to work in conjunction with DL techniques, as recently shown by Lunga et al. [12] for RS datasets. Over the years, we have observed that CC makes parallel and distributed computing more straightforward to use than traditional HPC systems, for instance through the very convenient Jupyter¹⁰ toolset that abstracts away the complexities of the underlying computing systems. Unfortunately, our experience revealed that the look and feel of seamlessly using CC services is very different when working with the various vendors such as Amazon Web Services (AWS)¹¹, MS Azure¹², Google Colaboratory¹³, or Google Cloud¹⁴. The use of Jupyter for interactive supercomputing is also becoming more widespread, as shown by Goebbert et al. for HPC systems of the JSC in [13]. Even more challenging in our experience are the high costs of CC services from vendors like AWS, where using Nvidia V100 GPUs (i.e., p3.16xlarge) would cost more than 24 USD per hour. Our RESNET-50 studies mentioned above use 128 GPUs for many hours; hence, we believe that for the time being RS researchers still need to rely on cost-free HPC computational time grants to remain feasible, unless specific cooperations are formed with vendors. Such HPC grants are provided by e-infrastructures such as the Partnership for Advanced Computing in Europe (PRACE)¹⁵ in the EU (which, e.g., includes free-of-charge A100 GPUs in JUWELS) or the Extreme Science and Engineering Discovery Environment (XSEDE)¹⁶ in the US. Our experience further reveals that free CC resources of commercial vendors typically have drawbacks: in the Google Colaboratory example, being assigned varying types of GPUs makes it relatively hard to perform proper speed-up studies, not to mention the missing possibility of interconnecting GPUs for large-scale usage. Other examples of RS approaches using Spark with CC are distributed parallel algorithms for anomaly detection in hyper-spectral images, as shown in [14].

¹⁰ https://jupyter.org/
¹¹ https://aws.amazon.com/
¹² https://azure.microsoft.com/en-us/
¹³ https://colab.research.google.com/
¹⁴ https://cloud.google.com/
¹⁵ https://prace-ri.eu/
¹⁶ https://www.xsede.org/
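To make the Spark pattern concrete, the sketch below distributes an embarrassingly parallel per-tile computation over Spark executors; a band-ratio statistic stands in for DL inference, and the tile list and loading logic are hypothetical placeholders.

```python
# Hedged sketch of the Spark pattern referenced above: distributing an
# embarrassingly parallel RS task over a cluster. Tile IDs and the tile
# loading logic are hypothetical placeholders, not a real pipeline.
import numpy as np
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rs-tile-analysis").getOrCreate()
sc = spark.sparkContext

# Hypothetical list of image tiles; in practice these could be object-store
# URIs of satellite granules, read inside the worker function.
tile_ids = list(range(1000))

def analyse_tile(tile_id):
    # Placeholder for loading a tile and applying a trained model; here we
    # fabricate a 13-band tile and compute an NDVI-like band-ratio statistic
    # (the NIR/red band indices are illustrative only).
    rng = np.random.default_rng(tile_id)
    tile = rng.random((256, 256, 13))
    ratio = (tile[..., 7] - tile[..., 3]) / (tile[..., 7] + tile[..., 3] + 1e-9)
    return tile_id, float(ratio.mean())

# Spark distributes the map over the executors; collect gathers the results
# back to the driver.
results = sc.parallelize(tile_ids, numSlices=64).map(analyse_tile).collect()
print(results[:5])
spark.stop()
```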
QA emerged as a promising and highly innovative computing approach used for simple RS data analysis problems, namely to solve the optimisation problems within ML algorithms [5]. Our experience with quantum SVMs reveals that on QA architectures such as a D-Wave system¹⁷ with 2000 qubits, RS researchers' possibilities are still limited, with only binary classification techniques available and the requirement to sub-sample large datasets and use ensemble methods [7]. Recent experience revealed that the evolution of QA bears a lot of potential, since we are already using D-Wave Leap¹⁸ with the D-Wave Advantage system, which offers more than 5000 qubits and 35,000 couplers, enabling more powerful RS data analysis. Another recent example of using QA for feature extraction and segmentation is shown by Otgonbaatar et al. in [15].

¹⁷ https://www.dwavesys.com/quantum-computing
¹⁸ https://cloud.dwavesys.com/leap/
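As a concrete flavour of QA programming, the following hedged sketch submits a toy two-variable QUBO to a D-Wave annealer through the Ocean SDK, as one would via D-Wave Leap. It is not the SVM formulation of [7], and it assumes Leap access credentials (an API token) are already configured.

```python
# Hedged sketch of submitting a small QUBO to a D-Wave quantum annealer via
# the Ocean SDK. The QUBO is a toy 2-variable problem, not the ensemble SVM
# formulation of [7]; a configured D-Wave Leap API token is assumed.
from dwave.system import DWaveSampler, EmbeddingComposite

# Toy QUBO: minimise x0 + x1 - 2*x0*x1, whose optima are x0 == x1.
Q = {(0, 0): 1.0, (1, 1): 1.0, (0, 1): -2.0}

# EmbeddingComposite maps the logical variables onto the physical qubit
# graph of the annealer (e.g., the topology of the Advantage system).
sampler = EmbeddingComposite(DWaveSampler())
sampleset = sampler.sample_qubo(Q, num_reads=100)

# The lowest-energy sample is the annealer's best solution candidate.
print(sampleset.first.sample, sampleset.first.energy)
```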
We observe a massive increase in complexity in the technological advances and an unprecedented heterogeneity in available HPC, CC, and QC systems that will raise challenges for their seamless use by domain-specific scientists such as those in the RS community. Our studies on recent, highly heterogeneous Modular Supercomputing Architectures (MSAs) reveal considerable complexity in the use of cutting-edge HPC systems by RS researchers, while such systems still offer enormous benefits, as shown by Erlingsson et al. in [16]. Not only is the pace of complex technologies fast (e.g., GPUs from K40s to A100s today); new methods of Artificial Intelligence (AI) with innovative DL approaches or hyper-parameter optimization techniques like Neural Architecture Search (NAS) [17] are also moving forward at an ever-increasing pace. Applying these cutting-edge computing systems with innovative AI technologies to complex RS application research questions requires more inter-disciplinary expertise than ever. One possible solution to deal with this complexity, based on our experience at the JSC, is establishing so-called domain-specific Simulation and Data Laboratories (SimDataLabs)¹⁹ that consist of teams which are multi-disciplinary w.r.t. technologies but focus on one particular area of science (e.g., RS, neurosciences, etc.). Substantial multi-disciplinary efforts are needed in order to enable RS researchers to solve problems with high scientific impact through efficient use of HPC, CC, or QC resources today, and even more so in the future.
To help meet this challenge, the JSC in Germany has successfully operated a wide variety of SimDataLabs as a domain-specific research and support structure since 2004. A similar AI-oriented research and support structure to the SimDataLabs are the newly formed Helmholtz AI²⁰ local units in Germany at the German Aerospace Center and the JSC, which both support RS applications. Based on these successful models, the University of Iceland has recently created the 'SimDataLab Remote Sensing'²¹ with significant expertise to tackle computing challenges in RS, funded by the Icelandic National Competence Center (NCC) under the umbrella of the EuroCC EC project²². Selected activities of this SimDataLab include supporting HPC, CC, and QC users in using parallel codes (e.g., maintaining the above-mentioned parallel SVM code) or scaling their code on multiple GPUs by using tools (e.g., DeepSpeed or Horovod) in conjunction with deep learning packages (e.g., Keras or TensorFlow). Another benefit of SimDataLabs with domain-specific scientists is that datasets become more manageable, leading to reduced storage usage by having large RS datasets centrally managed on computing resources instead of having them downloaded by each user separately. Even though the SimDataLab RS in Iceland primarily supports the Icelandic RS community²³, it also engages in strong international collaborations with many other countries (e.g., China, Germany, Italy, Spain, etc.) interested in overcoming computational challenges in RS.
¹⁹ https://www.fz-juelich.de/ias/jsc/EN/Expertise/SimLab/simlab_node.html
²⁰ https://helmholtz.at/
²¹ https://ihpc.is/community/
²² https://www.eurocc-project.eu/
²³ Icelandic Center for Remote Sensing, online: https://crs.hi.is/
3. CONCLUSIONS
We conclude that RS research entails many highly processing-intensive application areas for HPC, CC, and QC systems. Most notably, many current and future RS applications require real- or near-real-time processing capabilities that those systems offer with significant speed-ups over conventional desktop computers, small workstations, or laptops. Examples include environmental studies, military applications, and tracking and monitoring hazards such as wildland and forest fires, oil spills, and chemical/biological contaminations. Fast processing is also necessary for the distributed training of DL networks and the computationally intensive process of hyper-parameter optimization using innovative approaches such as NAS. Future challenges include the broader use of transfer learning and the intertwined usage of RS approaches with the simulation sciences (i.e., using known physical laws with iterative numerical methods). Recently, the Center of Excellence (CoE) Research on AI- and Simulation-Based Engineering at Exascale (RAISE)²⁴ EU project started to identify such intertwined methodologies by using seismic imaging with remote sensing for oil and gas exploration and well maintenance in conjunction with simulations, which raises high requirements for processing power.

²⁴ Center of Excellence RAISE, online: coe-raise.eu
4. REFERENCES

[1] C. Lee et al., "Recent Developments in High Performance Computing for Remote Sensing: A Review," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 4, no. 3, pp. 508–527, 2011.

[2] Q. Zou et al., "MapReduce Functions to Remote Sensing Distributed Data Processing - Global Vegetation Drought Monitoring as Example," Journal of Software: Practice and Experience, vol. 48, no. 7, pp. 1352–1367, 2018.

[3] J.M. Haut et al., "Cloud Deep Networks for Hyperspectral Image Analysis," IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 12, pp. 1–17, 2019.

[4] I. Chebbi et al., "A Comparison of Big Remote Sensing Data Processing with Hadoop MapReduce and Spark," in 4th International Conference on Advanced Technologies for Signal and Image Processing, 2018, pp. 1–4.

[5] M. Henderson et al., "Methods for Accelerating Geospatial Data Processing Using Quantum Computers," 2020.

[6] R. Ayanzadeh et al., "An Ensemble Approach for Compressive Sensing with Quantum Annealers," in IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2020, to appear.

[7] G. Cavallaro et al., "Approaching Remote Sensing Image Classification with Ensembles of Support Vector Machines on the D-Wave Quantum Annealer," in IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2020, to appear.

[8] G. Cavallaro et al., "On Understanding Big Data Impacts in Remotely Sensed Image Classification Using Support Vector Machine Methods," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 8, no. 10, pp. 4634–4646, 2015.

[9] R. Sedona et al., "Remote Sensing Big Data Classification with High Performance Distributed Deep Learning," MDPI Remote Sensing, vol. 11, no. 24, 2019.

[10] R. Sedona et al., "Scaling up a Multispectral RESNET-50 to 128 GPUs," in IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2020, to appear.

[11] R. Wang et al., "An Effective Image Denoising Method for UAV Images via Improved Generative Adversarial Networks," Sensors, vol. 18, no. 7, 2018.

[12] D. Lunga et al., "Apache Spark Accelerated Deep Learning Inference for Large Scale Satellite Image Analytics," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 13, pp. 1–17, 2020.

[13] J.H. Goebbert et al., "Enabling Interactive Supercomputing at JSC: Lessons Learned," in Proceedings of the International Conference on High Performance Computing, 2019, Lecture Notes in Computer Science, vol. 11203.

[14] Y. Zhang et al., "A Distributed Parallel Algorithm Based on Low-Rank and Sparse Representation for Anomaly Detection in Hyperspectral Images," MDPI Sensors, vol. 18, no. 11, 2018.

[15] S. Otgonbaatar et al., "Quantum Annealing Approach: Feature Extraction and Segmentation of Synthetic Aperture Radar Image," in IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2020, to appear.

[16] E. Erlingsson et al., "Scalable Workflows for Remote Sensing Data Processing with the DEEP-EST Modular Supercomputing Architecture," in IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2019, pp. 5905–5908.

[17] C. Peng et al., "Efficient Convolutional Neural Architecture Search for Remote Sensing Image Scene Classification," IEEE Transactions on Geoscience and Remote Sensing, pp. 1–14, 2020.