Michael Schmitt1, Florence Tupin2, Xiao Xiang Zhu1,3
1Signal Processing in Earth Observation, Technical University of Munich (TUM), Munich, Germany
2Télécom ParisTech, Université Paris-Saclay, Paris, France
3Remote Sensing Technology Institute (IMF), German Aerospace Center (DLR), Wessling, Germany
ABSTRACT

In this paper, we summarize challenges, proposed solutions, and recent trends in the field of SAR-optical remote sensing data fusion. Although matching and coregistration is only a pre-processing step before the actual fusion-by-estimation, we show that it is one of the core challenges in this regard, mainly due to the strongly different geometric and radiometric properties of the two observation types. We then review some of the published fusion methods and discuss future trends of this topic.
Index Terms— synthetic aperture radar (SAR), optical imagery, remote sensing, data fusion
1. INTRODUCTION

Currently, we are living in the "golden era of Earth observation", which is characterized by an abundance of airborne and spaceborne sensors providing a large variety of remote sensing data. Every sensor type possesses different peculiarities, designed for specific tasks. Thus, in a scientific field where the variety of exploited sensors ranges throughout most of the electromagnetic spectrum, which includes both active and passive sensing technologies, comprises resolutions from the micrometer to the kilometer level, and aims at applications from geological deformation monitoring to biomass estimation and urban area reconstruction, sensor data fusion is a crucial topic. Only with multi-sensor data fusion can we ensure the maximal utilization of what is available in the archives or, in the case of rapid mapping situations, of what can be acquired in the shortest possible time.
An important example for the exploitation of complementary information from remote sensing sensors is the joint use of synthetic aperture radar (SAR) and optical data [2]. While SAR imagery measures physical properties of the observed scene and can be acquired independently of weather or daylight conditions, optical imagery measures chemical characteristics and needs both daylight and, if not flown at low altitudes, a cloudless sky. On the other hand, optical data is much easier to interpret by human operators and usually provides more details at similar resolution, whereas SAR data contains not only amplitude but also phase information, which enables a high-precision measurement of three-dimensional topography and deformations thereof.
While SAR-optical data fusion has been investigated for some time now, it has recently gained new drive, mainly caused by two major developments. The first is the growing availability of imagery with very high spatial resolutions, meant to enable a precise mapping of the Earth's surface, especially in urban areas. The second is the implementation of new international space programs, such as ESA's Copernicus, which incorporate various sensor technologies already by design. In this context, there is great potential for a joint exploitation of SAR data provided by the Sentinel-1 satellites and multi-spectral data provided by the Sentinel-2 mission [3, 4].
Given the high relevance of SAR-optical sensor data fusion in the current remote sensing environment, this paper intends to summarize both the challenges faced and recent research trends. Section 2 quickly recapitulates the data fusion taxonomy as applicable in the remote sensing domain, while Section 3 discusses the challenges faced in SAR-optical fusion. Section 4 then summarizes hitherto published solutions, before Section 5 sketches the trends we will face in the near future.
2. DATA FUSION TAXONOMY

Data fusion has been a well-discussed research topic in the remote sensing community, with the first review and discussion articles published more than 15 years ago [5]. As explained in great detail by Hall and Llinas [6], multisensor data fusion can be organized into several levels: object refinement, situation refinement, and threat refinement. Coming from a military background, their theory, however, must be adapted to the remote sensing context. In this regard, mainly the object refinement level is of interest, which itself is structured into the following tasks (Figure 1): data alignment, data/object correlation, attribute estimation, and identity estimation.

Fig. 1. Flowchart of the generic data fusion process.

From a remote sensing point of view, these four steps can be summarized into two core actions. Data alignment and data/object correlation together form what is commonly referred to as matching and coregistration. Their goal is to ensure that measurements are properly connected to each other and to the respective object of interest. The other two tasks, attribute estimation and identity estimation, constitute the actual fusion step, i.e., the combined exploitation of aligned and correlated measurement data using statistical estimation or machine learning methods.
3. CHALLENGES

3.1. Matching and Coregistration
In remote sensing, these two steps aim at the spatial and temporal matching and, if necessary, the coregistration, respectively, of different sensor data showing potentially very different radiometric, geometric, and other properties. When the alignment problem is solved and spatial, temporal, and/or semantic relationships among the individual data sources are established, a reference frame can be defined to which all available data can be transformed. This transformation may often require an additional resampling process, which may be necessary not only for the spatial domain but also for the temporal domain. In the end, the result of matching and potential coregistration is an exact determination of which measurements belong to the same geospatial object and/or were acquired at the same relevant point in time.
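For the spatial domain, the resampling step mentioned above can be sketched as a bilinear interpolation at transformed grid coordinates. The following minimal illustration is ours, not from the paper; the function name and setup are hypothetical:

```python
import numpy as np

def resample_to_grid(img, rows, cols):
    """Bilinear resampling of an image at (possibly fractional) target
    coordinates, e.g. obtained by mapping a reference grid through an
    estimated coregistration transform."""
    r0 = np.clip(np.floor(rows).astype(int), 0, img.shape[0] - 2)
    c0 = np.clip(np.floor(cols).astype(int), 0, img.shape[1] - 2)
    fr, fc = rows - r0, cols - c0  # fractional offsets inside each cell
    return ((1 - fr) * (1 - fc) * img[r0, c0]
            + (1 - fr) * fc * img[r0, c0 + 1]
            + fr * (1 - fc) * img[r0 + 1, c0]
            + fr * fc * img[r0 + 1, c0 + 1])
```

In practice, the coordinate arrays would come from the geometric transform estimated during matching; the interpolation kernel (nearest, bilinear, cubic) is a free design choice.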
Generally, the matching and coregistration of heterogeneous data, such as optical and SAR imagery, constitute a core challenge in remote sensing data fusion. Although this has been a thoroughly studied problem for many years, it is still an open field of research, because massive data amounts require fully automated procedures for data registration, which in turn require a preliminary automated matching of homologous data points. While this is rather simple to achieve for homogeneous sensor data such as mono-sensor optical images [7], it is far more challenging for imagery from different sensors (e.g., [8, 9]). Furthermore, additional challenges arise if matching or coregistration cannot be carried out without external knowledge, such as pre-existing information about the three-dimensional (3-D) nature of the real-world object of interest. If this external knowledge corresponds to the desired entity, which actually is the goal of the whole data fusion process, it will be necessary to closely link the matching and coregistration steps to the actual fusion step and find a solution by jointly optimizing both the matching/coregistration and the estimation objective.
3.2. Fusion by Estimation
In any case, attribute and/or identity estimation are the very core of any data fusion endeavour and are mainly driven by different developments in statistical estimation theory and machine learning, respectively. The combination of information can be done at different levels: pixel level, region level (for instance, given by a segmentation), or object level (for instance, a primitive level driven by shape information). Some of these frameworks are reviewed in the following section.
4. PUBLISHED SOLUTIONS

In this section, we review some of the published solutions (without any claim of exhaustiveness) for data alignment and for low-level to high-level tasks.
4.1. Data Alignment
As noted in Section 3, this step is still a difficult one, and it is not made easier by the new sensor generation. On the one hand, the accuracy of the sensor parameters has improved, leading to very precise geometric information. On the other hand, the very high resolution makes it necessary to take object elevation into account, especially when dealing with urban areas. Beyond the geometric distortions due to different viewing conditions and image synthesis principles, as discussed in [10], radiometric differences are also crucial. Therefore, any matching procedure has to be adapted to the feature characteristics (as is done in [11], where optical edges are matched against SAR lines). More recently, fusion approaches have attempted to circumvent these problems by incorporating prior knowledge in the form of existing 3-D geodata for the simulation of reference data sets [12].
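One way to cope with the radiometric differences between the modalities is to match with a statistical similarity measure rather than plain correlation; mutual information is a standard choice for multimodal registration (this is a generic sketch with names of our choosing, not the method of [11] or [12]):

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Mutual information between two equally sized image patches.
    Unlike plain correlation, it tolerates the non-linear radiometric
    mapping between SAR and optical intensities."""
    hist_2d, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist_2d / hist_2d.sum()                 # joint distribution
    px = pxy.sum(axis=1, keepdims=True)           # marginals
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                                  # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def match_patch(optical, sar_patch, search=8):
    """Exhaustive search for the translation that maximizes mutual
    information of the SAR patch against windows of the optical image."""
    h, w = sar_patch.shape
    best, best_xy = -np.inf, (0, 0)
    for dy in range(search + 1):
        for dx in range(search + 1):
            window = optical[dy:dy + h, dx:dx + w]
            mi = mutual_information(window, sar_patch)
            if mi > best:
                best, best_xy = mi, (dy, dx)
    return best_xy
```

Real pipelines would of course search over a full transform (shift, rotation, scale) and use multi-resolution strategies instead of this brute-force translation search.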
4.2. Joint Classification or Data Improvement
When dealing with aligned data, many low-level fusion tasks can be performed. As already discussed in [2], the fusion of SAR and optical remote sensing data has been an active field of research for many years, where the goal of improved land cover classification [13, 14] is just one driving motivation. Other reasons for data fusion in the SAR-optical context are the sharpening of low-resolution optical images by very high-resolution SAR imagery [15, 16], or conversely, the improvement of SAR amplitude by exploiting an optical image [17].
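A toy pixel-level example of why fusion can improve land cover classification: below, an optical band alone barely separates two synthetic classes, while stacking it with SAR backscatter does. The data and the deliberately simple nearest-centroid classifier are ours, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Synthetic per-pixel features for two land-cover classes: the optical
# band overlaps strongly between the classes, while the SAR backscatter
# separates them well (e.g. smooth vs. rough surfaces).
opt = np.concatenate([rng.normal(0.30, 0.05, n), rng.normal(0.32, 0.05, n)])
sar = np.concatenate([rng.normal(0.20, 0.05, n), rng.normal(0.80, 0.05, n)])
labels = np.concatenate([np.zeros(n, int), np.ones(n, int)])

def nearest_centroid_accuracy(feats, labels):
    """Fit a nearest-centroid rule and report resubstitution accuracy."""
    classes = np.unique(labels)
    centroids = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    d = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[d.argmin(axis=1)]
    return float((pred == labels).mean())

acc_optical = nearest_centroid_accuracy(opt[:, None], labels)
acc_fused = nearest_centroid_accuracy(np.column_stack([opt, sar]), labels)
```

The same stacked-feature idea underlies the classifier-fusion schemes of [13, 14], which replace the nearest-centroid rule with SVMs or decision fusion.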
4.3. Object Level
When dealing with data exhibiting such radiometric and geometric differences, it may be easier to combine information at the object level. This is the case for road network extraction [18, 19] or for building reconstruction with (amplitude, radargrammetric, interferometric, or tomographic) SAR measurements and optical images [20, 21, 22, 23]. The latest developments in this area have extended the mapping of urban areas supported by data fusion even to the global scale [24, 25].
4.4. Fusion and Change
Most fusion schemes are developed under the assumption that the optical and SAR information are in accordance, meaning that no change has occurred between the two acquisitions. An even more challenging task is the detection of changes between two acquisitions from different sensors. This is a topic of highest interest, since it is the situation usually faced in damage assessment, where some existing archived data has to be compared with a newly acquired image, possibly taken by a different acquisition system. The change detection is usually done at the object level and combines all the available clues that can be computed [26, 27]. Machine learning strategies are particularly useful in such difficult situations.
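A minimal sketch of computing object-level change clues of the kind such schemes combine: a log-ratio for SAR amplitudes (suited to multiplicative speckle) and a plain difference for optical reflectance. The function and its names are ours; actual systems feed many more clues into a classifier:

```python
import numpy as np

def change_clues(sar_pre, sar_post, opt_pre, opt_post, labels):
    """Per-object change clues: absolute log-ratio of mean SAR amplitude
    and absolute difference of mean optical reflectance.  A downstream
    classifier would fuse these clues into a change decision."""
    clues = {}
    for obj in np.unique(labels):
        m = labels == obj                      # pixel mask of this object
        log_ratio = abs(float(np.log(sar_post[m].mean() / sar_pre[m].mean())))
        opt_diff = abs(float(opt_post[m].mean() - opt_pre[m].mean()))
        clues[int(obj)] = (log_ratio, opt_diff)
    return clues
```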
5. FUTURE TRENDS

The future of research in the field of SAR-optical data fusion will comprise several interesting directions. We sketch those we find most promising:
Exploitation of "Big Data"
As mentioned in Section 1, with the advent of the Sentinel satellites of the Copernicus program, big data has also arrived in the field of Earth observation, as now everybody can access huge amounts of spaceborne remote sensing data. In our context, this means quasi-unlimited access to SAR and optical imagery acquired by the Sentinel-1/2 missions. One of the future challenges thus will be to exploit these data sources using sophisticated processing methods. This will comprise not only scalable algorithms for information extraction by data fusion, but also the exploitation of cloud-based and parallelized computing approaches.
Deep Learning
In parallel to the availability of large data amounts (and ever-increasing computational power), the application of deep learning approaches will become more and more attractive, also in the field of SAR-optical data fusion. One of the first examples is the automated learning of patch similarity using convolutional neural networks [28].
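At a very high level, such patch-similarity networks use a pseudo-Siamese layout: separate (untied) encoding branches for the SAR and the optical patch feed a comparison of their embeddings. The following untrained, randomly initialized numpy sketch only illustrates that architecture; it is not the model of [28]:

```python
import numpy as np

rng = np.random.default_rng(42)

def encode(patch, filters, proj):
    """One encoding branch: a bank of 3x3 valid convolutions, ReLU,
    global average pooling, then a linear projection to the embedding."""
    h, w = patch.shape
    pooled = np.empty(len(filters))
    for k, f in enumerate(filters):
        conv = np.zeros((h - 2, w - 2))
        for dy in range(3):
            for dx in range(3):
                conv += f[dy, dx] * patch[dy:dy + h - 2, dx:dx + w - 2]
        pooled[k] = np.maximum(conv, 0).mean()    # ReLU + global pooling
    return proj @ pooled

# Untied weights: each modality gets its own branch parameters, which is
# what distinguishes a pseudo-Siamese from a weight-sharing Siamese net.
filters_sar = rng.standard_normal((8, 3, 3))
filters_opt = rng.standard_normal((8, 3, 3))
proj_sar = rng.standard_normal((16, 8))
proj_opt = rng.standard_normal((16, 8))

def similarity(sar_patch, opt_patch):
    """Cosine similarity of the two branch embeddings; training would
    tune the weights so corresponding patches score high."""
    a = encode(sar_patch, filters_sar, proj_sar)
    b = encode(opt_patch, filters_opt, proj_opt)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```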
The Time Variable
With the availability of huge temporal series for both SAR and optical sensors, it becomes necessary to develop fusion approaches that take temporal changes into account [29]. While exploiting satellite image time series (SITS) has been widely investigated for mono-sensor data (either optical [30] or SAR [31]), exploiting a time series combining both sensors is still an open research topic facing numerous difficulties.
6. SUMMARY

In this paper, the challenges in the fusion of SAR and optical remote sensing data, as well as some published solutions, have been discussed. In addition, future trends in this field of multi-sensor data fusion have been sketched.
ACKNOWLEDGEMENTS

This work is partially supported by the Helmholtz Association under the framework of the Young Investigators Group SiPEO (VH-NG-1018) and the German Research Foundation (DFG), grant SCHM 3322/1-1.
7. REFERENCES

[1] M. Schmitt and X. Zhu, "Data fusion and remote sensing – an ever-growing relationship," IEEE Geosci. Remote Sens. Mag., vol. 4, no. 4, pp. 6–23, 2016.
[2] F. Tupin, “Fusion of optical and SAR images,” in
Radar Remote Sensing of Urban Areas, Uwe Soergel,
Ed. Springer, Doetinchem, The Netherlands, 2010.
[3] R. Torres, P. Snoeij, D. Geudtner, D. Bibby, M. David-
son, E. Attema, P. Potin, B. Rommen, N. Floury,
M. Brown, et al., “GMES Sentinel-1 mission,” Remote
Sens. Env., vol. 120, pp. 9–24, 2012.
[4] M. Drusch, U. Del Bello, S. Carlier, O. Colin, V. Fer-
nandez, F. Gascon, B. Hoersch, C. Isola, P. Laberinti,
P. Martimort, et al., “Sentinel-2: ESA’s optical high-
resolution mission for GMES operational services,” Re-
mote Sens. Env., vol. 120, pp. 25–36, 2012.
[5] L. Wald, "Some terms of reference in data fusion," IEEE Trans. Geosci. Remote Sens., vol. 37, no. 3, pp. 1190–1193, 1999.
[6] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proc. IEEE, vol. 85, no. 1, pp. 6–23, 1997.
[7] B. Zitova and J. Flusser, “Image registration methods:
A survey,” Image Vision Comput., vol. 21, no. 11, pp.
977–1000, 2003.
[8] M. Irani and P. Anandan, “Robust multi-sensor image
alignment,” in Proc. ICCV, 1998, pp. 959–966.
[9] Y. Keller and A. Averbuch, "Multisensor image registration via implicit similarity," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 5, pp. 794–801, 2006.
[10] G. Palubinskas, P. Reinartz, and R. Bamler, “Image ac-
quisition geometry analysis for the fusion of optical and
radar remote sensing data,” Int. J. Image Data Fusion,
vol. 1, no. 3, pp. 271–282, 2010.
[11] G. Lehureau, F. Tupin, C. Tison, G. Oller, and D. Pe-
tit, “Registration of metric resolution SAR and optical
images in urban areas,” Proc. EUSAR, 2008.
[12] J. Tao, G. Palubinskas, P. Reinartz, and S. Auer, “Inter-
pretation of sar images in urban areas using simulated
optical and radar images,” in Proc. Joint Urban Remote
Sensing Event, 2011, pp. 41–44.
[13] B. Waske and J. A. Benediktsson, "Fusion of support vector machines for classification of multisensor data," IEEE Trans. Geosci. Remote Sens., vol. 45, no. 12, pp. 3858–3866, 2007.
[14] B. Waske and S. van der Linden, “Classifying multi-
level imagery from SAR and optical sensors by decision
fusion,” IEEE Trans. Geosci. Remote Sens., vol. 46, no.
5, pp. 1457–1466, 2008.
[15] F. Balik Sanli, S. Abdikan, M. T. Esetilli, M. Ustuner,
and F. Sunar, “Fusion of TerraSAR-X and RapidEye
data: A quality analysis,” in Int. Arch. Photogramm.
Remote Sens. Spatial Inf. Sci., 2013, vol. XL-7.
[16] R. Reulke, G. Giaquinto, M. M. Giovenco, and D. Ruess, "Optics and radar image fusion," in Proc. 7th Int. Conf. Sensing Technology, 2013, pp. 686–692.
[17] L. Verdoliva, R. Gaetano, G. Ruello, and G. Poggi, "Optical-driven nonlocal SAR despeckling," IEEE Geosci. Remote Sens. Lett., vol. 12, no. 2, pp. 314–318, 2015.
[18] G. Lisini, P. Gamba, F. Dell’Acqua, and F. Holecz,
“First results on road network extraction and fusion on
optical and SAR images using a multi-scale adaptive ap-
proach,” Int. J. Image Data Fusion, vol. 2, no. 4, pp.
363–375, 2011.
[19] T. Perciano, F. Tupin, R. Hirata, and R. Cesar, “A
two-level Markov random field for road network extrac-
tion and its application with optical, SAR and multi-
temporal data,” International Journal of Remote Sens-
ing, vol. 37, no. 16, pp. 3584–3610, 2016.
[20] V. Poulain, J. Inglada, M. Spigai, J.-Y. Tourneret, and P. Marthon, "High resolution optical and SAR image fusion for building database updating," IEEE Trans. Geosci. Remote Sens., vol. 49, no. 8, pp. 2900–2910, 2011.
[21] J. D. Wegner, R. Hänsch, A. Thiele, and U. Soergel, "Building detection from one orthophoto and high-resolution InSAR data using conditional random fields," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 4, no. 1, pp. 83–91, 2011.
[22] H. Sportouche, F. Tupin, and L. Denis, “Extraction and
three-dimensional reconstruction of isolated buildings
in urban scenes from high-resolution optical and SAR
spaceborne images,” IEEE Trans. Geosci. Remote Sens.,
vol. 49, no. 10, pp. 3932–3946, 2011.
[23] Y. Wang, X. Zhu, B. Zeisl, and M. Pollefeys, "Fusing meter-resolution 4-D InSAR point clouds and optical images for semantic urban infrastructure monitoring," IEEE Trans. Geosci. Remote Sens., vol. 55, no. 1, pp. 1–13, 2017.
[24] P. Gamba, M. Aldrighi, and M. Stasolla, “Robust ex-
traction of urban area extents in HR and VHR SAR im-
ages,” IEEE J. Sel. Topics Appl. Earth Observ. Remote
Sens., vol. 4, no. 1, pp. 27–34, 2011.
[25] P. Gamba and M. Aldrighi, "SAR data classification of urban areas by means of segmentation techniques and ancillary optical data," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 5, no. 4, pp. 1140–1148, 2012.
[26] D. Brunner, G. Lemoine, and L. Bruzzone, “Earthquake
damage assessment of buildings using VHR optical and
SAR imagery,” IEEE Trans. Geosci. Remote Sens., vol.
48, no. 5, pp. 2403–2420, 2010.
[27] T.-L. Wang and Y.-Q. Jin, "Postearthquake building damage assessment using multi-mutual information from pre-event optical image and post-event SAR image," IEEE Geosci. Remote Sens. Lett., vol. 9, no. 3, pp. 452–456, 2012.
[28] L. Mou, M. Schmitt, Y. Wang, and X. Zhu, “A CNN for
the identification of corresponding patches in SAR and
optical imagery of urban scenes,” in Proc. Joint Urban
Remote Sensing Event, 2017.
[29] F. Bovolo and L. Bruzzone, "The time variable in data fusion: a change detection perspective," IEEE Geosci. Remote Sens. Mag., vol. 3, no. 3, pp. 8–26, 2015.
[30] S. Rejifi, F. Chaabane, and F. Tupin, "Expert knowledge-based method for satellite image time series analysis and interpretation," IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 8, no. 5, pp. 2138–2150, 2015.
[31] X. Su, C. Deledalle, F. Tupin, and H. Sun, “NOR-
CAMA: Change analysis in SAR time series by like-
lihood ratio change matrix clustering,” ISPRS Journal
of Photogrammetry and Remote Sensing, vol. 101, pp.
247–261, 2015.
... Related are more complicated spectral-spatial fusion scenarios [17], [18], [19], [20], [21], [22], [23], [24] as well as spatio-temporal fusion techniques [25], [26]. For readers looking for a general overview of remote sensing data fusion, there are several wellcrafted review papers [27], [28], [1], [29], [30] that discuss technical challenges, solutions, applications, and trends. ...
Full-text available
Fusing satellite imagery acquired with different sensors has been a long-standing challenge of Earth observation, particularly across different modalities such as optical and Synthetic Aperture Radar (SAR) images. Here, we explore the joint analysis of imagery from different sensors in the light of representation learning: we propose to learn a joint embedding of multiple satellite sensors within a deep neural network. Our application problem is the monitoring of lake ice on Alpine lakes. To reach the temporal resolution requirement of the Swiss Global Climate Observing System (GCOS) office, we combine three image sources: Sentinel-1 SAR (S1-SAR), Terra MODIS and Suomi-NPP VIIRS. The large gaps between the optical and SAR domains and between the sensor resolutions make this a challenging instance of the sensor fusion problem. Our approach can be classified as a late fusion that is learnt in a data-driven manner. The proposed network architecture has separate encoding branches for each image sensor, which feed into a single latent embedding. I.e., a common feature representation shared by all inputs, such that subsequent processing steps deliver comparable output irrespective of which sort of input image was used. By fusing satellite data, we map lake ice at a temporal resolution of <1.5 days. The network produces spatially explicit lake ice maps with pixel-wise accuracies >91% (respectively, mIoU scores >60%) and generalises well across different lakes and winters. Moreover, it sets a new state-of-the-art for determining the important ice-on and ice-off dates for the target lakes, in many cases meeting the GCOS requirement.
... In the initial stage, the publication of newly involved countries/regions is usually less than that of developed countries/regions, thus leading to an increase in the values of the standard deviation and coefficient of variation year by year. To a certain extent, this result reflects that remote sensing has attracted the attention of scholars in an increasing number of countries/regions (Zhuang et al., 2013;Schmitt et al., 2017;Morales-Barquero et al., 2019). However, it is worth noting that this phenomenon may lead to a decrease in the spatial aggregation of remote sensing studies (Zhuang et al., 2013;Ma et al., 2019b;Jin and Li, 2019;Xu and Yang, 2020). ...
Full-text available
The development of remote sensing technology largely reflects the scientific research level of a country or region. Given that the quantity and quality of research works are important indicators for scientific prowess evaluation, exploratory spatial data analysis and scientometric analysis of remote sensing work published from 2012 to 2021 were performed in this study, utilizing the Web of Sciences database. This study probed the spatial distribution and spatiotemporal evolution at the country/regional level to reveal the spatiotemporal characteristics of knowledge spillover in remote sensing. According to the results, the global spatial distribution of research output in remote sensing presented a significant dispersion; the United States and China were the most active countries. During the study period, Transferring Deep Convolutional Neural Networks for the Scene Classification of High-Resolution Remote Sensing Imagery was one of the most influential studies, both in the field of remote sensing and in the whole scientific community. With respect to the spatial evolution of research output in remote sensing, the gap between continents and the regional imbalance showed a downward trend, while Asia ranked first in the intracontinental disparity and Europe ranked last. For relevant countries/regions and institutions trying to optimize the spatial allocation of scientific and technological resources to narrow regional disparities, this study provides fundamental data and decision-making references.
... • Domain Gap. SAR and optical images reveal different characteristics of observed objects due to their different imaging mechanisms, and thus a large domain gap exists between them (Schmitt et al., 2017;Liu and Lei, 2018). Transferring the complementary information from a SAR image to compensate for the missing information in cloudy regions is non-trivial. ...
The challenge of the cloud removal task can be alleviated with the aid of Synthetic Aperture Radar (SAR) images that can penetrate cloud cover. However, the large domain gap between optical and SAR images as well as the severe speckle noise of SAR images may cause significant interference in SAR-based cloud removal, resulting in performance degeneration. In this paper, we propose a novel global-local fusion based cloud removal (GLF-CR) algorithm to leverage the complementary information embedded in SAR images. Exploiting the power of SAR information to promote cloud removal entails two aspects. The first, global fusion, guides the relationship among all local optical windows to maintain the structure of the recovered region consistent with the remaining cloud-free regions. The second, local fusion, transfers complementary information embedded in the SAR image that corresponds to cloudy areas to generate reliable texture details of the missing regions, and uses dynamic filtering to alleviate the performance degradation caused by speckle noise. Extensive evaluation demonstrates that the proposed algorithm can yield high quality cloud-free images and performs favorably against state-of-the-art cloud removal algorithms.
... Although increased accuracy is obtained with the merged data, combining radar and optical images has certain difficulties. The main handicaps are geometric and radiometric properties that differ in the two data types, which need to pay attention to before applying a fusion algorithm (Schmitt et al. 2017). ...
Classification for land cover mapping is of great importance for accurate analysis and temporal monitoring of natural resources. In this study, the classification process was carried out using four synthetic aperture radar (SAR) and optical satellite images obtained in different seasons at equal intervals within a year. In addition to combining optical and SAR data for classification, single optical and SAR images have been classified separately. Thus, the effect of combining SAR and optical images on classification accuracy was examined. Moreover, the normalized difference vegetation index (NDVI), which is a vegetation index, was added to the image data, and the seasonal effect on accuracy was examined for the region with dense vegetation. In classification, three different object-oriented classification algorithms, support vector machines (SVM), random forest algorithm (RF), and k-nearest neighbors algorithm (kNN), were used. Finally, the number of training samples used for classification was increased, and its effect on accuracy was revealed in the study. The lowest overall classification accuracy was found to be 40.46% with classification using single SAR images, while the highest classification accuracy was found to be 95.12% as a result of the classification of the image obtained by combined SAR and optical satellite images. Furthermore, an additional testing area was considered to validate the method, and consistent results were obtained in that area as well. As a result, monitoring of the natural resources with high accuracy has been discussed, considering the data sources, machine learning methods, and the seasonal effects.
... he complementary nature of optical and SAR images has now drawn increasing attention to fuse the two data sources into one analysis stream [1,2,3]. However for accurate and successful co-analysis, they need to be geometrically coregistered beforehand. ...
Full-text available
To fully explore the complementary information from optical and synthetic aperture radar (SAR) imageries, they need firstly to be co-registered with high accuracy. Due to the vast radiometric and geometric disparity, the problem to match high resolution optical and SAR images is quite challenging. The present deep learning based methods have shown advantages over the traditional approaches, but the performance increment is not significant. In this article, we explore better network framework for high resolution optical and SAR image matching from three aspects. Firstly, we propose an effective multilevel feature fusion method which helps to take advantage of both the low-level fine-grained features for precious feature location and the high-level semantic features for better discriminative ability. Secondly, a feature channel excitation procedure is conducted using a novel multi-frequency channel attention module, which is able to make image features of different types and multiple levels effectively collaborate with each other, and produce image matching features with high diversity. Thirdly, the self-adaptive weighting loss is introduced, with which, each sample is assigned with an adaptive weighting factor, and therefore information buried in all nearby samples can be better exploited. Under a pseudo-Siamese architecture, the proposed optical and SAR image matching network (OSMNet) is trained and tested on a large and diverse high resolution optical and SAR dataset. Extensive experiments demonstrate that each component of the proposed deep framework helps to improve the matching accuracy. Also, the OSMNet shows overwhelming superior to the state-of-the-art handcrafted approaches on imageries of different land-cover types.
Full-text available
Dünyadaki son yılların en büyük tehlikesi, iklim değişikliklerine sebep olan küresel ısınmadır. Küresel ısınmanın yarattığı sonuçlardan birçok doğal kaynak etkilenmektedir. Doğal kaynakların doğru analizi ve zamansal olarak izlenmesi için arazi örtüsü haritalaması için sınıflandırma büyük önem taşımaktadır. Bu çalışmada, bir sene içinde eşit aralıklarla farklı mevsimlerde elde edilen Sentetik Açıklıklı Radar (SAR) ve Optik uydu görüntüleri kullanılarak sınıflandırma işlemi gerçekleştirilmiştir. Sınıflandırma işlemi için optik ve SAR verilerinin birleştirilmesinin yanı sıra, yalnızca optik ve SAR görüntüleri de ayrı olarak sınıflandırma işlemine tabi tutulmuştur. Böylelikle SAR ve optik görüntülerinin birleştirilmesinin sınıflandırma doğruluğuna olan etkisi incelenmiştir. Ayrıca bir bitki indeksi olan Normalize Edilmiş Fark Bitki Örtüsü İndeksi (NDVI, Normalised Difference Vegetation Index) görüntü verilerine eklenmiş olup bitki örtüsünün yoğun bulunduğu bölge için mevsimsel değişimlere bağlı olarak doğruluğa olan etkisi incelenmiştir. Sınıflandırma için nesne tabanlı yaklaşım kullanılmış olup, üç farklı sınıflandırma algoritması kullanılmıştır. Bunlar Destek Vektör Makineleri (DVM), Rastgele Orman Algoritması (RO) ve K-En Yakın Komşuluk (EYK) algoritmasıdır. Son olarak sınıflandırma için kullanılan eğitim örnekleri sayısı arttırılmış ve doğruluğa olan etkisi çalışmada ortaya konulmuştur. En düşük genel sınıflandırma doğruluğu, yalnızca SAR görüntüleri kullanılarak yapılan sınıflandırma ile %40.46 olarak elde edilmiştir. En yüksek sınıflandırma doğruluğu ise, SAR ve optik uydu görüntülerinin birleştirilmesi ile elde edilen görüntünün sınıflandırılması sonucu %95.90 olarak bulunmuştur. Ayrıca yapılan sınıflandırmaları doğrulamak için yeni bir test alanında sınıflandırmalar yapılmıştır. Bulunan test sonuçları, ana sınıflandırma sonuçları ile tutarlı olmuştur. 
Yapılan çalışmada arazi örtüsündeki zamansal değişime bağlı sınıflandırma doğruluğunun kullanılan girdi verileri ile ilişkisi de incelenmiştir. Böylelikle, korunması gereken doğal kaynakların mevsimsel etkileri dikkate alarak yüksek doğruluk ile izlenmesi için ihtiyaç duyulan veri kaynakları ve makine öğrenmesi yöntemleri ortaya konulmuştur.
Full-text available
Self-Supervised learning (SSL) has become the new state of the art in several domain classification and segmentation tasks. One popular category of SSL are distillation networks such as Bootstrap Your Own Latent (BYOL). This work proposes RS-BYOL, which builds on BYOL in the remote sensing (RS) domain where data are non-trivially different from natural RGB images. Since multi-spectral (MS) and synthetic aperture radar (SAR) sensors provide varied spectral and spatial resolution information, we utilise them as an implicit augmentation to learn invariant feature embeddings. In order to learn RS based invariant features with SSL, we trained RS-BYOL in two ways, i.e. single channel feature learning and three channel feature learning. This work explores the usefulness of single channel feature learning from random 10 MS bands of 10m-20 m resolution and VV-VH of SAR bands compared to the common notion of using three or more bands. In our linear probing evaluation, these single channel features reached a 0.92 F1 score on the EuroSAT classification task and 59.6 mIoU on the IEEE Data Fusion Contest (DFC) segmentation task for certain single bands. We also compare our results with ImageNet weights and show that the RS based SSL model outperforms the supervised ImageNet based model. We further explore the usefulness of multi-modal data compared to single modality data, and it is shown that utilising MS and SAR data allows better invariant representations to be learnt than utilising only MS data.
The challenge of the cloud removal task can be alleviated with the aid of Synthetic Aperture Radar (SAR) images, which can penetrate cloud cover. However, the large domain gap between optical and SAR images, as well as the severe speckle noise of SAR images, may cause significant interference in SAR-based cloud removal, resulting in performance degradation. In this paper, we propose a novel global–local fusion based cloud removal (GLF-CR) algorithm to leverage the complementary information embedded in SAR images. Exploiting the power of SAR information to promote cloud removal entails two aspects. The first, global fusion, guides the relationship among all local optical windows to keep the structure of the recovered region consistent with the remaining cloud-free regions. The second, local fusion, transfers complementary information embedded in the SAR image that corresponds to cloudy areas to generate reliable texture details for the missing regions, and uses dynamic filtering to alleviate the performance degradation caused by speckle noise. Extensive evaluation demonstrates that the proposed algorithm can yield high-quality cloud-free images and outperform state-of-the-art cloud removal algorithms with a gain of about 1.7 dB in terms of PSNR on the SEN12MS-CR dataset.
Land cover classification (LCC) is an important application in remote sensing data interpretation and invariably faces large intra-class variance and sample imbalance in remote sensing images. The optical image is obtained by satellites capturing the spectral information of the Earth's surface, while the synthetic aperture radar (SAR) image is produced by the satellite actively transmitting and receiving electromagnetic wave signals reflected from land covers. A single modality (the optical image) might be disturbed by external conditions, especially complex weather. Using heterogeneous SAR and optical images for LCC can reduce the negative impact caused by single-modal data damage, and multi-modal data can also serve as supplementary information to enhance classification accuracy. However, general LCC methods mainly focus on remote sensing data of a single modality without fully considering the multi-modality of land covers. Therefore, we propose a dual-stream deep high-resolution network (DDHRNet) to deeply integrate SAR and optical data at the feature level in every branch. The network can effectively exploit the complementary information in heterogeneous images. It improves classification performance and achieves significant improvements in the classification of clouded images. A multi-modal squeeze-and-excitation (MSE) module is also utilized to fuse the features. Compared with ordinary methods, MSE modules lead to an improvement of about 1% to 5% in overall accuracy (OA), Kappa coefficient, and mean intersection over union (mIoU). In addition, to evaluate our method, we describe in detail the preprocessing of Gaofen-2 (GF2) and Gaofen-3 (GF3) data before they are used in the LCC task. The experiments show that the proposed method performs well compared with other strong segmentation methods and obtains the best performance on heterogeneous images from GF2 and GF3.
The code and datasets are available at:
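A squeeze-and-excitation style fusion step of the kind described above can be sketched as: global-average-pool the concatenated SAR and optical feature maps, pass the pooled vector through a small bottleneck MLP, and reweight the channels. The NumPy sketch below is a hypothetical illustration of that pattern, not the actual DDHRNet MSE module:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mse_fuse(opt_feat, sar_feat, w1, w2):
    """Squeeze-and-excitation style multi-modal fusion sketch:
    concatenate both modalities channel-wise, squeeze by global
    average pooling, excite with a bottleneck MLP, and reweight."""
    x = np.concatenate([opt_feat, sar_feat], axis=0)    # (C, H, W)
    squeeze = x.mean(axis=(1, 2))                       # global average pool
    excite = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0))  # bottleneck MLP
    return x * excite[:, None, None]                    # channel reweighting

rng = np.random.default_rng(0)
opt = rng.normal(size=(4, 8, 8))   # toy optical feature maps
sar = rng.normal(size=(4, 8, 8))   # toy SAR feature maps
c = 8                              # concatenated channel count
w1 = rng.normal(size=(c // 2, c))  # hypothetical bottleneck weights
w2 = rng.normal(size=(c, c // 2))
fused = mse_fuse(opt, sar, w1, w2)
```

The learned channel weights let the network suppress a modality's channels where they are uninformative (e.g. optical channels under cloud), which matches the role the abstract assigns to the MSE module.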
In this paper we propose a convolutional neural network (CNN) that identifies corresponding patches of very high resolution (VHR) optical and SAR imagery of complex urban scenes. Instead of a siamese architecture as conventionally used in CNNs designed for image matching, we resort to a pseudo-siamese configuration with no interconnection between the two streams for SAR and optical imagery. The network is trained with automatically generated training data and does not resort to any hand-crafted features. First evaluations show that the network is able to predict corresponding patches with high accuracy, thus indicating great potential for further development into a generalized multi-sensor matching procedure.
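The pseudo-siamese idea, two streams with no shared weights followed by a fusion head, can be illustrated with dense layers standing in for the convolutional streams. This is a toy sketch of the architecture pattern only, not the trained network:

```python
import numpy as np

def stream(patch, w):
    """Stand-in for one network stream: a single dense layer + ReLU.
    In the real network each stream is a CNN; the point here is that
    the SAR and optical streams do NOT share weights."""
    return np.maximum(w @ patch.ravel(), 0)

rng = np.random.default_rng(1)
opt_patch = rng.normal(size=(8, 8))   # toy optical patch
sar_patch = rng.normal(size=(8, 8))   # toy SAR patch
w_opt = rng.normal(size=(16, 64))     # optical stream weights
w_sar = rng.normal(size=(16, 64))     # independent SAR stream weights

# Fusion head: concatenate the two embeddings and score the match.
embedding = np.concatenate([stream(opt_patch, w_opt),
                            stream(sar_patch, w_sar)])
w_head = rng.normal(size=(1, 32))
match_score = 1.0 / (1.0 + np.exp(-(w_head @ embedding)))  # in (0, 1)
```

Keeping the streams disjoint is the design choice that matters: SAR and optical patches have such different statistics that shared (siamese) weights would force one feature extractor to serve both.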
Characterized by a certain focus on the heavily discussed topic of image fusion in its beginnings, sensor data fusion has played a significant role in the remote sensing research community for a long time. With this article, we aim to provide a short overview of established definitions, targeting a generalized understanding of the topic. In addition, a review of the state of the art of remote sensing data fusion research is given. By bringing together the conventional view expressed in the classical data fusion community and a review of current activities in the field of Earth observation, this article provides a holistic view of generic data fusion concepts and their applicability to the remote sensing domain.
Using synthetic aperture radar (SAR) interferometry to monitor long-term millimeter-level deformation of urban infrastructures, such as individual buildings and bridges, is an emerging and important field in remote sensing. In the state-of-the-art methods, deformation parameters are retrieved and monitored on a pixel basis solely in the SAR image domain. However, the inevitable side-looking imaging geometry of SAR results in undesired occlusion and layover in urban areas, rendering the current methods less competent for semantic-level monitoring of different urban infrastructures. This paper presents a framework for semantic-level deformation monitoring by linking the precise deformation estimates of SAR interferometry and the semantic classification labels of optical images via a 3-D geometric fusion and semantic texturing. The proposed approach provides the first "SARptical" point cloud of an urban area, i.e. the SAR tomography point cloud textured with attributes from optical images. This opens a new perspective on InSAR deformation monitoring. Interesting examples of bridge and railway monitoring are demonstrated.
This paper introduces a method for road network extraction from satellite images. The proposed approach comprises a new fusion method (using data from multiple sources) and a new Markov random field (MRF) defined on connected components, together with a multilevel application (a two-level MRF). Our method allows the detection of roads with different characteristics and reduces the size of the graph model by around 30%. Results for synthetic aperture radar (SAR) images and optical images obtained with the TerraSAR-X and Quickbird sensors, respectively, are presented, demonstrating the improvement brought by the proposed approach. In a second part, an analysis of different types of data fusion combining optical/radar images, radar/radar images and multitemporal SAR (TerraSAR-X and COSMO-SkyMed) images is described. The qualitative and quantitative results show that the fusion approach considerably improves the results of road network extraction.
We propose a new synthetic aperture radar (SAR) despeckling technique based on nonlocal filtering and driven by a coregistered optical image. A preliminary homogeneous versus heterogeneous classification of the image is used to decide where the optical guide can be safely used, thus preventing any distortion of the SAR geometry. Even in regions where the use of optical data is enabled, despeckling is carried out exclusively in the SAR domain, and the optical guide is used only to improve the predictor selection in nonlocal filtering and, hence, in the estimation process. Experiments on real-world imagery confirm the potential of the proposed approach.
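The key idea above, computing nonlocal averaging weights from the optical guide while the average itself is taken over SAR intensities only, can be sketched pixel-wise. This is a simplified single-channel illustration of that principle, not the published filter:

```python
import numpy as np

def guided_nonlocal_mean(sar, optical, i, j, search=2, h=0.1):
    """Nonlocal estimate of SAR pixel (i, j): the averaging weights
    come from similarity in the coregistered optical guide, but the
    weighted average is taken over SAR intensities only, so the
    estimation stays entirely in the SAR domain."""
    weights, values = [], []
    for di in range(-search, search + 1):
        for dj in range(-search, search + 1):
            y, x = i + di, j + dj
            if 0 <= y < sar.shape[0] and 0 <= x < sar.shape[1]:
                d = (optical[i, j] - optical[y, x]) ** 2
                weights.append(np.exp(-d / h))  # guide-driven weight
                values.append(sar[y, x])        # SAR-domain value
    w = np.array(weights)
    return float(np.dot(w, values) / w.sum())

speckled = np.ones((5, 5))  # toy SAR intensity image
guide = np.zeros((5, 5))    # flat optical guide -> uniform weights
est = guided_nonlocal_mean(speckled, guide, 2, 2)
```

In the real method the optical guide is additionally switched off in regions classified as heterogeneous, precisely to avoid imprinting optical structure onto the SAR geometry.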
Optical and radar images cover two distinct aspects of satellite image analysis: optical images provide more semantic information, derived e.g. from multispectral data, while radar is more versatile owing to its independence from cloud cover and daylight. Because of these different sensing principles, one can expect that fusing the two data types improves the overall information derived from the observed scene. We define the information content (IC) of a set of (multispectral) images as optimally derivable from the data if the spatial and spectral resolution is adequate to the task to be solved. Furthermore, the information is masked by typical sensor smear and noise; thus the information that can be derived from remote sensing imagery depends on the system performance, or image quality (IQ). When radar data are additionally used (e.g. for classification), often no significant improvement is visible in the result. We therefore expect a drastic improvement of IC in the fused image if the IQ of the two data sets is comparable; this can also be analyzed in terms of the IQ of the fused data. The main purpose of this contribution is to derive a single number representing IQ, such as the National Image Interpretability Rating Scale (NIIRS). The chosen fusion algorithm was Principal Component Analysis (PCA), applied and validated on different areas in Germany.
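PCA-based fusion in the classic component-substitution sense can be sketched as follows: project the multispectral bands onto their principal components, substitute the first component with the histogram-matched radar band, and back-project. This is the assumed standard formulation, not necessarily the exact variant used in the paper:

```python
import numpy as np

def pca_fusion(ms, sub):
    """Component-substitution fusion sketch. `ms` is (bands, pixels),
    `sub` is the substitute (e.g. radar) band as (pixels,)."""
    mean = ms.mean(axis=1, keepdims=True)
    centered = ms - mean
    cov = centered @ centered.T / (ms.shape[1] - 1)
    _, vecs = np.linalg.eigh(cov)
    vecs = vecs[:, ::-1]               # eigenvectors, largest variance first
    pcs = vecs.T @ centered            # principal components
    # Match the substitute band to PC1's mean and variance before swapping.
    sub = (sub - sub.mean()) / (sub.std() + 1e-12) * pcs[0].std() + pcs[0].mean()
    pcs[0] = sub
    return vecs @ pcs + mean           # back-project to band space

rng = np.random.default_rng(2)
ms = rng.normal(size=(3, 100))   # 3 toy spectral bands, 100 pixels
radar = rng.normal(size=100)     # toy coregistered radar band
fused = pca_fusion(ms, radar)
```

Because the first principal component carries most of the scene's spatial variance, replacing it injects the radar band's structure while the remaining components preserve the spectral character of the optical data.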
This paper presents an overview of the image fusion concept in the context of multitemporal remote sensing image processing. In the remote sensing literature, multitemporal image analysis mainly deals with the detection of changes and land-cover transitions, so the paper presents and analyses the most relevant literature contributions on these topics. From the perspective of change detection and the detection of land-cover transitions, multitemporal image analysis techniques can be divided into two main groups: i) those based on the fusion of the multitemporal information at the feature level, and ii) those based on the fusion of the multitemporal information at the decision level. The former mainly exploit multitemporal image comparison techniques, which aim at highlighting the presence or absence of changes by generating change indices; these indices are then analyzed by unsupervised algorithms to extract the change information. The latter rely mainly on classification and include both supervised and semi-, partially- or un-supervised methods. The paper focuses on both standard (and widely used) methods and techniques proposed in the recent literature. The analysis considers images acquired by optical and SAR systems at medium, high and very high spatial resolution.
For many remote-sensing applications, there is usually a gap between automatic analysis techniques and direct expert interpretation. This semantic gap is all the more critical as the amount and diversity of satellite data increase. In this context, an important challenge is the integration of expert knowledge into automatic satellite image time series (SITS) analysis to improve the reliability and precision of the results. In this paper, we propose an original expert knowledge-based SITS analysis technique for land-cover monitoring and region dynamics assessment. In particular, we are interested in extracting region temporal evolutions similar to a given scenario proposed by the user, which can be useful in many applications such as urbanization and forest region monitoring. As a first step, through the formalization and exploitation of the expert semantic information, we construct a multitemporal knowledge base describing the remote-sensing scene ontology. Then, the temporal evolution of each region in the SITS is modeled by means of graph theory. Finally, given a user scenario, the most similar region temporal evolution is recognized using the marginalized graph kernel (MGK) similarity measure.
This paper presents a likelihood ratio test based method of change detection and classification for synthetic aperture radar (SAR) time series, namely NORmalized Cut on chAnge criterion MAtrix (NORCAMA). The method involves three steps: (1) a multi-temporal pre-denoising step over the whole image series to reduce the effect of speckle noise; (2) likelihood ratio test based change criteria between two images, using both the original noisy images and the denoised images; (3) change classification by a normalized cut based clustering-and-recognizing method on the change criterion matrix (CCM). Experiments on both synthetic and real SAR image series demonstrate the effectiveness of the proposed framework.
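For a gamma-distributed pair of L-look intensities, the generalized likelihood ratio test for equality of the two mean intensities reduces to (4·i1·i2 / (i1 + i2)²)^L, which equals 1 when the intensities agree. A sketch of the resulting pairwise change criterion (a standard derivation illustrating step (2) above, not the full NORCAMA pipeline):

```python
import numpy as np

def change_criterion(i1, i2, looks=1):
    """Likelihood-ratio change criterion for a pair of L-look SAR
    intensities under a gamma speckle model: minus the log of the
    generalized likelihood ratio. It is 0 for identical intensities
    and grows with the dissimilarity between the two values."""
    i1 = np.asarray(i1, dtype=np.float64)
    i2 = np.asarray(i2, dtype=np.float64)
    glr = (4.0 * i1 * i2) / (i1 + i2) ** 2   # in (0, 1]
    return -looks * np.log(glr)

same = change_criterion(1.0, 1.0)      # identical pixels: criterion 0
changed = change_criterion(1.0, 16.0)  # strong change: large criterion
```

Evaluating this criterion for every image pair in the series yields the change criterion matrix on which the normalized-cut clustering then operates.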