... Conventional data collection methods like field surveys and expert assessments are expensive and time-consuming, especially when cloud coverage obscures critical information [10]. Spectral channels, while rich in information, complicate image processing under adverse environmental conditions such as cloud cover and seasonal crop variations [11]. ...
Innovation management plays a pivotal role in harnessing advanced technologies to drive progress across diverse fields. In this context, integrating deep learning models within remote sensing technologies presents transformative potential for monitoring, change detection, analysis, and decision-making in fields such as agriculture, urban planning, and environmental studies. This study examines the role of sophisticated deep learning approaches in analyzing high-resolution satellite imagery to improve the detection of agricultural greenhouses. Using MMSegmentation (DeepLabv3Plus) with multispectral data at 0.7-meter resolution, the research addresses the limitations of traditional methods by substantially enhancing detection accuracy and efficiency. To address data scarcity and increase model robustness, advanced data augmentation techniques, such as rotations, scaling, and flipping, expand dataset diversity, fostering adaptability and performance under diverse conditions. The study also assesses the impact of environmental factors, including seasonal variations and weather, on model performance. Suggested improvements include expanding the dataset to encompass a wider variety of greenhouse types and conditions, incorporating higher-resolution or hyperspectral imagery for finer details, and building multi-temporal datasets to capture dynamic environmental changes. The findings underscore the importance of advanced innovation management in enhancing remote sensing applications, offering actionable insights for agricultural management in Albania and similar regions. This research contributes to the broader field of innovation management by showcasing how deep learning can revolutionize practical applications in agriculture.
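For illustration, the kind of augmentation pipeline described above (rotations, scaling, and flipping applied jointly to image and mask) could be sketched as follows. This is a minimal, hypothetical example using the Albumentations library rather than the study's actual MMSegmentation configuration; the patch size, probabilities, and toy arrays are assumptions.

```python
# Minimal augmentation sketch for a segmentation dataset: rotations, scaling,
# and flips applied identically to the image and its greenhouse mask.
# Crop size, probabilities, and the toy arrays are placeholders.
import numpy as np
import albumentations as A

augment = A.Compose([
    A.RandomRotate90(p=0.5),                   # random 90-degree rotations
    A.RandomScale(scale_limit=0.2, p=0.5),     # +/-20% random scaling
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.RandomCrop(height=512, width=512, p=1.0),
])

image = np.random.randint(0, 255, (768, 768, 4), dtype=np.uint8)  # toy multispectral patch
mask = np.random.randint(0, 2, (768, 768), dtype=np.uint8)        # toy greenhouse mask
out = augment(image=image, mask=mask)          # same geometry applied to both
aug_image, aug_mask = out["image"], out["mask"]
```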
... Accordingly, LSP metrics are usually detected by either thresholding the temporal VIs curve or by exploiting its time derivatives [5]. One main limitation is the presence of clouds, which can severely obscure the observed scene, causing the loss of several acquisitions [6]. ...
A novel framework to estimate crop phenology with remote sensing measurements is conceived in this study. We introduce the parametrized Grid-Based Filter (pGBF), which uses the plant age of the crops as a parameter to model the phenology dynamics. Accordingly, a non-stationary transition matrix is defined to characterize the transitions between discrete phenological stages on a numerical scale. At the update step, the combination of the observation and the predicted parametrized evolution is exploited to jointly estimate both phenology and plant age. The method is applied on dense time series of dual-pol VV-VH Sentinel-1 (S1) radar images to estimate the phenological stages of rice (defined in the BBCH scale) in Sevilla, Spain, during four annual seasons, from 2017 to 2020. Phenology is accurately estimated when the evolution model is trained with actual ground data. Regarding the plant age, when more than four S1 images are considered, its estimates reach stable and accurate values (with a maximum error of 5 days), with agreement close to unity. A comparison with the previous GBF shows that it is outperformed by the proposed estimation strategy, confirming the effectiveness of the proposed dynamical framework in obtaining improved and realistic phenology estimates according to the particular time epoch in the crop's life. Finally, when the plant age is unknown, which is the case in most real monitoring scenarios, our method performs significantly better, indicating that knowledge of the sowing date is no longer needed for remote sensing-based phenology estimation.
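The core idea of a grid-based filter with an age-parametrized transition model can be sketched in a few lines. The following toy example is not the paper's trained pGBF: the stage set, transition probabilities, observation likelihood, and revisit interval are all illustrative assumptions.

```python
# Toy grid-based (discrete Bayes) filter over phenological stages.
# The transition matrix depends on plant age, echoing the parametrized idea;
# stage set, transition shape, and likelihood here are illustrative only.
import numpy as np

n_stages = 10                                   # coarse BBCH-like stages 0..9

def transition_matrix(age_days):
    """Probability of advancing one stage grows with plant age (toy model)."""
    p_adv = min(0.9, 0.05 + 0.01 * age_days)
    T = np.zeros((n_stages, n_stages))
    for s in range(n_stages):
        T[s, s] = 1.0 - p_adv
        T[s, min(s + 1, n_stages - 1)] += p_adv
    return T

def likelihood(observation, sigma=1.5):
    """Gaussian likelihood of each stage given a noisy stage-like observation."""
    stages = np.arange(n_stages)
    lik = np.exp(-0.5 * ((stages - observation) / sigma) ** 2)
    return lik / lik.sum()

belief = np.full(n_stages, 1.0 / n_stages)      # uniform prior over stages
age = 0.0
for obs in [0.4, 1.2, 2.3, 3.1, 4.8, 6.0]:      # synthetic radar-derived observations
    age += 12                                    # e.g., 12-day revisit between images
    belief = transition_matrix(age).T @ belief   # predict step
    belief *= likelihood(obs)                    # update step
    belief /= belief.sum()
    print(f"age={age:.0f} d, MAP stage={belief.argmax()}")
```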
Satellite image processing often enhances feature perception and visibility for exact feature identification, and such enhancement improves the quality and clarity of remote sensing data. Because they are acquired from long distances, these images often contain weather-related noise and distortion that must be corrected. This paper addresses these challenges with a Deep Learning (DL) satellite image denoising technique. An Adaptive Genghis Khan Shark Algorithm (AGKSA) and a Convolutional Denoised Autoencoder (CDAE) are used to improve satellite images in this study. The CDAE settings are optimized using the AGKSA while preserving important image attributes. In this hybrid model, the AGKSA is used to optimize CDAE performance under various noise conditions. Experimental results showed that the proposed model outperforms traditional image denoising approaches in satellite imagery feature extraction and visual quality. The CDAE-AGKSA works well in highly precise image processing applications such as environmental monitoring and land use classification.
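A minimal convolutional denoising autoencoder of the kind described above might look like the following PyTorch sketch. Layer widths, noise level, and the single training step are placeholders; the AGKSA metaheuristic that tunes the CDAE in the paper is not reproduced here.

```python
# Minimal convolutional denoising autoencoder (CDAE) sketch in PyTorch.
# Layer widths and training details are placeholder assumptions; in the paper
# these hyperparameters are tuned by the AGKSA metaheuristic (not shown here).
import torch
import torch.nn as nn

class CDAE(nn.Module):
    def __init__(self, channels=3, base=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, base, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(base * 2, base, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(base, channels, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = CDAE()
clean = torch.rand(4, 3, 64, 64)                     # toy clean patches
noisy = (clean + 0.1 * torch.randn_like(clean)).clamp(0, 1)
loss = nn.functional.mse_loss(model(noisy), clean)   # reconstruct clean from noisy
loss.backward()
```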
The Landsat mission has captured images of the Earth’s surface for over 50 years, and the data have enabled researchers to investigate a vast array of different change phenomena using machine learning models. Landsat-based monitoring research has been influential in geography, forestry, hydrology, ecology, agriculture, geology, and public health. When monitoring Earth's surface change using Landsat data and machine learning, it is essential to consider the implications of the size of the study area, specifics of the machine learning model, and image temporal density. We found that there are two general approaches to Landsat change detection analysis with machine learning: post-classification comparison and sequential imagery stack approaches. The two approaches have different advantages, and the design of an appropriate type of Landsat change detection analysis depends on the task at hand and the available computing resources. This review provides an overview of different approaches used to apply machine learning to Landsat change detection analysis, outlines a framework for understanding the relevant considerations, and discusses recent developments such as generative artificial intelligence, explainable machine learning, and ethical analysis considerations.
Crop yield forecasting plays a significant role in addressing growing concerns about food security and guiding decision-making for policymakers and farmers. When deep learning is employed, understanding the learning and decision-making processes of the models, as well as their interaction with the input data, is crucial for establishing trust in the models and gaining insight into their reliability. In this study, we focus on the task of crop yield prediction, specifically for soybean, wheat, and rapeseed crops in Argentina, Uruguay, and Germany. Our goal is to develop and explain predictive models for these crops, using a large dataset of satellite images, additional data modalities, and crop yield maps. We employ a long short-term memory network and investigate the impact of using different temporal samplings of the satellite data and the benefit of adding more relevant modalities. For model explainability, we utilize feature attribution methods to quantify input feature contributions, identify critical growth stages, analyze yield variability at the field level, and explain less accurate predictions. The modeling results show an improvement when adding more modalities or using all available instances of satellite data. The explainability results reveal distinct feature importance patterns for each crop and region. We further found that the most influential growth stages on the prediction are dependent on the temporal sampling of the input data. We demonstrated how these critical growth stages, which hold significant agronomic value, closely align with the existing literature in agronomy and crop development biology.
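A bare-bones version of such an LSTM regressor over a multi-modal time series could be sketched as follows; the sequence length, feature count, and hidden size are assumptions, not the study's configuration.

```python
# Minimal LSTM yield-regression sketch in PyTorch. Time steps, input features
# (e.g., satellite bands plus weather/soil modalities), and hidden size are
# illustrative placeholders rather than the study's settings.
import torch
import torch.nn as nn

class YieldLSTM(nn.Module):
    def __init__(self, n_features=16, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)           # scalar yield per sample

    def forward(self, x):                           # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1]).squeeze(-1)    # use the last time step

model = YieldLSTM()
x = torch.rand(8, 20, 16)        # 8 fields, 20 acquisitions, 16 features each
y = torch.rand(8)                # toy reference yields
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
```

Feature attribution methods (e.g., integrated gradients) would then be applied to the trained model to rank time steps and input modalities, as described in the abstract above.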
Clouds in remote sensing optical images often obscure essential information. They may lead to occlusion or distortion of ground features, thereby affecting the subsequent analysis and extraction of target information. Therefore, the removal of clouds in optical images is a critical task in various applications. SAR-optical image fusion has achieved encouraging performance in the reconstruction of cloud-covered information. Such methods, however, are extremely time-consuming and computationally intensive, making them difficult to apply in practice. This letter proposes a novel Feature Pyramid Network (FPNet) that effectively reconstructs the missing optical information. FPNet enables the extraction and fusion of multi-scale features from the SAR image and the cloudy optical image, leveraging the power of convolutional neural networks by merging feature maps from different scales. It learns useful features efficiently because it downsamples the input images while preserving important information, thus reducing the computational workload. Experiments are conducted on a benchmark global SEN12MS-CR dataset and a regional South Sudan dataset. Results are compared with those of state-of-the-art methods such as DSen2-CR and GLF-CR. The experimental results demonstrate that FPNet accomplishes superior performance in terms of accuracy and visual effects. Both the inference and training speeds of FPNet are fast. Specifically, it runs at 96 FPS and requires less than four hours to train a single epoch using SEN12MS-CR on two 2080ti GPUs. Therefore, it is suitable for application across various study areas.
Climate warming-driven temporal shifts in phenology are widely recognised as the foremost footprint of global environmental change. In this regard, concerted research efforts are being made worldwide to monitor and assess the plant phenological responses to climate warming across species, ecosystems and seasons. Here, we present a global synthesis of the recent scientific literature to assess the progress made in this area of research. To achieve this, we conducted a systematic review following the PRISMA protocol, which involved rigorous screening of 9476 studies on the topic, of which 215 studies were finally selected for data extraction. The results revealed that woody species, natural ecosystems and plant phenological responses in the spring season have been predominantly studied, with herbaceous species, agricultural ecosystems and other seasons grossly understudied. The majority of the studies reported phenological advancement (i.e., preponement) in spring, followed by advancement also in summer but delay in autumn. Methodology-wise, nearly two-thirds of the studies employed a direct observational approach, followed by herbarium-based and experimental approaches, with the latter covering the least temporal depth. We found a steady increase in research on the topic over the last decade, with a sharp increase since 2014. The global country-wide scientific output map highlights the huge geographical gaps in this area of research, particularly in the biodiversity-rich tropical regions of the developing world. Based on the findings of this global synthesis, we identify the current knowledge gaps and suggest future directions for this emerging area of research in an increasingly warming world.
The worldwide variation in vegetation height is fundamental to the global carbon cycle and central to the functioning of ecosystems and their biodiversity. Geospatially explicit and, ideally, highly resolved information is required to manage terrestrial ecosystems, mitigate climate change and prevent biodiversity loss. Here we present a comprehensive global canopy height map at 10 m ground sampling distance for the year 2020. We have developed a probabilistic deep learning model that fuses sparse height data from the Global Ecosystem Dynamics Investigation (GEDI) space-borne LiDAR mission with dense optical satellite images from Sentinel-2. This model retrieves canopy-top height from Sentinel-2 images anywhere on Earth and quantifies the uncertainty in these estimates. Our approach improves the retrieval of tall canopies with typically high carbon stocks. According to our map, only 5% of the global landmass is covered by trees taller than 30 m. Further, we find that only 34% of these tall canopies are located within protected areas. Thus, the approach can serve ongoing efforts in forest conservation and has the potential to foster advances in climate, carbon and biodiversity modelling.
In an industrial environment, measuring and reconstructing metal objects using computer vision methods can be affected by surface highlight reflections, leading to inaccurate results. In this article, we propose a novel network with broad applicability for removing highlight reflections based on dynamic highlight masks, which is suitable for industrial metal highlight images where a baseline image with no highlight reflections is not available. First, we use a pretrained model to learn the highlight features of the metal surface, and then construct an adaptive dynamic highlight soft mask that allows texture features in highlight regions to be preserved while removing highlights. Second, we adjust the shape and shift of convolution operations using an adaptive highlight mask to better match the metal surface structure and avoid unnatural transitions at the edges of highlights in partially convolutional networks. We conducted quantitative evaluations on synthetic and real-world highlight datasets to demonstrate the effectiveness of our method. Compared with the state-of-the-art methods, our method shows improvements of over 0.4 in terms of correlation and Bhattacharyya distance of the histogram curves for synthesized highlight datasets. In terms of nonhighlight region invariance of real industrial highlight datasets, our method shows improvements of over 8, 0.1, and 23 points in terms of peak signal-to-noise ratio, structural similarity, and mean squared error, respectively.
Because optical wavelengths cannot penetrate clouds, optical images often suffer from cloud contamination, which causes missing information and limits subsequent agricultural applications, among others. Synthetic aperture radar (SAR) is able to provide surface information at all times and in all weather. Therefore, translating SAR, or fusing SAR and optical images, to obtain cloud-free optical-like images is an ideal way to solve the cloud contamination issue. In this paper, we investigate the existing literature and provide two kinds of taxonomies, one based on the type of input and the other on the method used. We also analyze the advantages and disadvantages of using different data as input. In the last section, we discuss the limitations of the current methods and propose several possible directions for future studies in this field.
We propose a new architecture based on a fully connected feed-forward Artificial Neural Network (ANN) model to estimate surface soil moisture from satellite images on a large alluvial fan of the Kosi River in the Himalayan Foreland. We have extracted nine different features from Sentinel-1 (dual-polarised radar backscatter), Sentinel-2 (red and near-infrared bands), and Shuttle Radar Topographic Mission (digital elevation model) satellite products by leveraging the linear data fusion and graphical indicators. We performed a feature importance analysis by using the regression ensemble tree approach and also feature sensitivity to evaluate the impact of each feature on the response variable. For training and assessing the model performance, we conducted two field campaigns on the Kosi Fan in December 11–19, 2019 and March 01–06, 2022. We used a calibrated TDR probe to measure surface soil moisture at 224 different locations distributed throughout the fan surface. We used input features to train, validate, and test the performance of the feed-forward ANN model in a 60:10:30 ratio, respectively. We compared the performance of ANN model with ten different machine learning algorithms [i.e., Generalised Regression Neural Network (GRNN), Radial Basis Network (RBN), Exact RBN (ERBN), Gaussian Process Regression (GPR), Support Vector Regression (SVR), Random Forest (RF), Boosting Ensemble Learning (Boosting EL), Recurrent Neural Network (RNN), Binary Decision Tree (BDT), and Automated Machine Learning (AutoML)]. We observed that the ANN model accurately predicts the soil moisture and outperforms all the benchmark algorithms with correlation coefficient (R = 0.80), Root Mean Square Error (RMSE = 0.040 m3/m3), and bias = 0.004 m3/m3. Finally, for an unbiased and robust conclusion, we performed spatial distribution analysis by creating thirty different sets of training-validation-testing datasets. We observed that the performance remains consistent in all thirty scenarios. The outcomes of this study will foster new and existing applications of soil moisture.
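As a rough illustration of the workflow (nine features, a feed-forward network, and a 60:10:30 split), a scikit-learn sketch is given below; the toy feature matrix stands in for the Sentinel-1/Sentinel-2/SRTM-derived features, and the network size is an arbitrary choice.

```python
# Feed-forward ANN sketch for soil-moisture regression with a 60:10:30 split.
# The nine features and network size are placeholders standing in for the
# Sentinel-1/Sentinel-2/SRTM-derived features described above.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.random((224, 9))                       # 224 samples, 9 features (toy data)
y = rng.random(224) * 0.4                      # volumetric soil moisture (toy)

# 60% train, 10% validation, 30% test
X_train, X_rest, y_train, y_rest = train_test_split(X, y, train_size=0.6, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, train_size=0.25, random_state=0)

model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
model.fit(X_train, y_train)
print("validation R^2:", model.score(X_val, y_val))
print("test RMSE:", np.sqrt(np.mean((model.predict(X_test) - y_test) ** 2)))
```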
Models for bathymetry retrieval from multispectral images have not considered the errors caused by tidal fluctuation. A rigorous bathymetric model that accounts for the variation in the tide height time series, incorporating the tide height calculation and the instantaneous tide height correction at the epoch of the satellite flight into the bathymetric retrieval model, is proposed in this paper. The model was applied to Weizhou Island, located in Guangxi Province, China, and its accuracy was verified with four check lines and seven checkpoints. A scene from the Landsat 8 satellite was used as experimental data. The reference (“true”) water depth data collected by a RESON SeaBat 7125 multibeam instrument were used for comparison analysis. When the satellite-derived bathymetry is compared, the maximum absolute error, mean absolute error, and RMSE decreased by 54%, 45%, and 30%, respectively, relative to the traditional model over the entire test field. The accuracy of the water depths retrieved by our model increased by 30% and 56% when validated using the four check lines and seven checkpoints, respectively. Therefore, it can be concluded that the model proposed in this paper can effectively improve the accuracy of bathymetry retrieved from Landsat 8 images.
Building a more resilient food system for sustainable development and reducing uncertainty in global food markets both require concurrent, near-real-time, and reliable crop information for decision-making. Satellite-driven crop monitoring has become a main method to derive crop information at local, regional, and global scales by revealing the spatial and temporal dimensions of crop growth status and production. However, there is a lack of quantitative, objective, and robust methods to ensure the reliability of crop information, which reduces the applicability of crop monitoring and leads to uncertain and undesirable consequences. In this paper, we review recent progress in crop monitoring and identify the challenges and opportunities in future efforts. We find that satellite-derived metrics do not fully capture determinants of crop production and do not quantitatively interpret crop growth status; the latter can be advanced by integrating effective satellite-derived metrics and new onboard sensors. We have identified that in situ data accessibility and the negative effects of knowledge-based analyses are two essential issues in crop monitoring that reduce the applicability of crop monitoring information for decisions on food security. Crowdsourcing is one solution to overcome the restrictions of ground truth data accessibility. We argue that user participation in the complete process of crop monitoring could improve the reliability of crop information. Allowing users to obtain crop information from multiple sources could prevent unconscious biases. Finally, there is a need to avoid conflicts of interest in publishing publicly available crop information.
The learning-based methods for single image super-resolution (SISR) can reconstruct realistic details, but they suffer severe performance degradation for low-light images because they ignore the negative effects of illumination, and they even produce overexposure for unevenly illuminated images. In this paper, we pioneer an anti-illumination approach toward SISR named Light-guided and Cross-fusion U-Net (LCUN), which can simultaneously improve the texture details and lighting of low-resolution images. In our design, we develop a U-Net for SISR (SRU) to reconstruct super-resolution (SR) images from coarse to fine, effectively suppressing noise and absorbing illuminance information. In particular, the proposed Intensity Estimation Unit (IEU) generates the light intensity map and innovatively guides SRU to adaptively brighten inconsistent illumination. Further, aiming at efficiently utilizing key features and avoiding light interference, an Advanced Fusion Block (AFB) is developed to cross-fuse low-resolution features, reconstructed features and illuminance features in pairs. Moreover, SRU introduces a gate mechanism to dynamically adjust its composition, overcoming the limitations of fixed-scale SR. LCUN is compared with retrained SISR methods and combined SISR methods on low-light and uneven-light images. Extensive experiments demonstrate that LCUN advances the state-of-the-art SISR methods in terms of objective metrics and visual effects, and it can reconstruct relatively clear textures and cope with complex lighting.
Cloud cover remains a significant limitation to a broad range of applications relying on optical remote sensing imagery, including crop identification/yield prediction, climate monitoring, and land cover classification. A common approach to cloud removal treats the problem as an inpainting task and imputes optical data in the cloud-affected regions employing either mosaicing historical data or making use of sensing modalities not impacted by cloud obstructions, such as SAR. Recently, deep learning approaches have been explored in these applications; however, the majority of reported solutions rely on external learning practices, i.e., models trained on fixed datasets. Although these models perform well within the context of a particular dataset, a significant risk of spatial and temporal overfitting exists when applied in different locations or at different times. Here, cloud removal was implemented within an internal learning regime through an inpainting technique based on the deep image prior. The approach was evaluated on both a synthetic dataset with an exact ground truth, as well as real samples. The ability to inpaint the cloud-affected regions for varying weather conditions across a whole year with no prior training was demonstrated, and the performance of the approach was characterised.
Mapping the within-field variability of crop status is of great importance in precision agriculture, which seeks to balance agronomic inputs with spatial crop demands. Satellite imagery and the delineation of management zones based on remote sensing play a key role. However, satellite imagery is dependent on a cloud-free view, which is especially challenging in temperate regions such as Northern Europe. This disadvantage can be overcome with unmanned aerial vehicles (UAV), which provide an alternative to satellites. An investigation was conducted to establish whether UAV imagery can generate similar crop heterogeneity maps to satellites (Sentinel 2) and the extent to which crop heterogeneity and management zones can be reproduced by repeated data collection within short time intervals. Three winter wheat fields were monitored during the growing season. Two vegetation indices (NDVI and MSAVI2) based on red and near-infrared (NIR) reflectance were calculated to delineate fields into five management zones based on NDVI raster maps using quintiles. The Pearson correlation coefficient, the Nash-Sutcliffe agreement coefficient and the smallest real difference coefficient (SRD), also called the reproducibility coefficient, were used to evaluate the reproducibility. NDVI and MSAVI2 gave similar results, but NDVI was a slightly better descriptor of crop heterogeneity after canopy closure, and NDVI was therefore used for the remainder of the study. The results showed that substitution of satellite data with UAV data resulted in an average reclassification of 10 m by 10 m management zones corresponding to 58% of the total field area. Reclassification means that management pixels were classified differently according to the origin of the images. Repeated satellite and UAV imagery resulted in 39% and 47% reclassification, respectively. The results showed that reproducing remote sensing data with different sensor systems added more measurement error than was the case with repeated measurements using the same sensor system. In this study, the SRD averaged 2.5 management zones, which means that differences of up to 2.5 management zones were within the measurement error. This paper discusses the practical aspects of these findings and clarifies that the reclassification of management zones depends on the heterogeneity of the studied fields. Therefore, the achieved results may not be generalized, but the presented methodology can be used in future studies.
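The index calculations and quintile-based zoning described above are straightforward to reproduce; a minimal sketch with toy reflectance grids is shown below (band sources and zone count follow the abstract, everything else is a placeholder).

```python
# Sketch: compute NDVI and MSAVI2 from red/NIR reflectance rasters and split a
# field into five management zones using NDVI quintiles. Input arrays are toy
# reflectance grids; real data would come from Sentinel-2 or UAV bands.
import numpy as np

rng = np.random.default_rng(1)
red = rng.uniform(0.02, 0.15, (100, 100))    # red reflectance (toy)
nir = rng.uniform(0.20, 0.60, (100, 100))    # near-infrared reflectance (toy)

ndvi = (nir - red) / (nir + red)
msavi2 = (2 * nir + 1 - np.sqrt((2 * nir + 1) ** 2 - 8 * (nir - red))) / 2

# Five zones from NDVI quintiles (zone 1 = lowest vigour, zone 5 = highest)
edges = np.quantile(ndvi, [0.2, 0.4, 0.6, 0.8])
zones = np.digitize(ndvi, edges) + 1
print(np.bincount(zones.ravel())[1:])        # pixels per management zone
```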
Optical remote sensing imagery is at the core of many Earth observation activities. The regular, consistent and global-scale nature of the satellite data is exploited in many applications, such as cropland monitoring, climate change assessment, land-cover and land-use classification, and disaster assessment. However, one main problem severely affects the temporal and spatial availability of surface observations, namely cloud cover. The task of removing clouds from optical images has been subject of studies since decades. The advent of the Big Data era in satellite remote sensing opens new possibilities for tackling the problem using powerful data-driven deep learning methods.
In this paper, a deep residual neural network architecture is designed to remove clouds from multispectral Sentinel-2 imagery. SAR-optical data fusion is used to exploit the synergistic properties of the two imaging systems to guide the image reconstruction. Additionally, a novel cloud-adaptive loss is proposed to maximize the retainment of original information. The network is trained and tested on a globally sampled dataset comprising real cloudy and cloud-free images. The proposed setup allows even optically thick clouds to be removed by reconstructing an optical representation of the underlying land surface structure.
The existence of clouds is one of the main factors contributing to missing information in optical remote sensing images, restricting their further applications for Earth observation, so how to reconstruct the missing information caused by clouds is of great concern. Inspired by image-to-image translation work based on convolutional neural network models and the idea of heterogeneous information fusion, we propose a novel cloud removal method in this paper. The approach can be roughly divided into two steps: in the first step, a specially designed convolutional neural network (CNN) translates the synthetic aperture radar (SAR) images into simulated optical images in an object-to-object manner; in the second step, the simulated optical image, together with the SAR image and the optical image corrupted by clouds, is fused to reconstruct the corrupted area by a generative adversarial network (GAN) with a particular loss function. Between the first step and the second step, the contrast and luminance of the simulated optical image are randomly altered to make the model more robust. Two simulation experiments and one real-data experiment are conducted to confirm the effectiveness of the proposed method on Sentinel 1/2, GF 2/3 and airborne SAR/optical data. The results demonstrate that the proposed method outperforms state-of-the-art algorithms that also employ SAR images as auxiliary data.
The availability of curated large-scale training data is a crucial factor for the development of well-generalizing deep learning methods for the extraction of geoinformation from multi-sensor remote sensing imagery. While quite some datasets have already been published by the community, most of them suffer from rather strong limitations, e.g. regarding spatial coverage, diversity or simply number of available samples. Exploiting the freely available data acquired by the Sentinel satellites of the Copernicus program implemented by the European Space Agency, as well as the cloud computing facilities of Google Earth Engine, we provide a dataset consisting of 180,662 triplets of dual-pol synthetic aperture radar (SAR) image patches, multi-spectral Sentinel-2 image patches, and MODIS land cover maps. With all patches being fully georeferenced at a 10 m ground sampling distance and covering all inhabited continents during all meteorological seasons, we expect the dataset to support the community in developing sophisticated deep learning-based approaches for common tasks such as scene classification or semantic segmentation for land cover mapping.
Four time-lapse cameras (Bushnell Nature View HD Camera; Bushnell, Overland Park, KS, USA) were installed in a soybean field to track the response of soybean plants to changing weather conditions. The purpose was to confirm whether visible spectroscopy can provide useful data for tracking the condition of crops and, if so, whether game and trail time-lapse cameras can serve as reliable crop sensing and monitoring devices. Using the installed cameras, images were taken at 30-min intervals between July 22 and August 1, 2015. Using the RGBExcel software application developed in-house, image data from the R (red), G (green), and B (blue) bands were exported to Microsoft Excel for further processing and analysis. Daytime adjusted green-red index data for the plant, based on the R and G data, were plotted against the time of image acquisition and also regressed against selected weather parameters. The former showed a rise-and-fall trend with daily peaks around 13:00, while the latter showed a decreasing order of correlation with weather variables as follows: log of solar radiation > log of soil surface temperature > log of air temperature > log of soil temperature at 50-mm depth > log of relative humidity. Despite some low correlations, the potential for using game and trail cameras with time-lapse capability to track changes in crop vegetation response under varying conditions is established. The resulting data can be used to develop models that can aid precision agriculture applications. This can be further explored in future studies.
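A simple normalized green-red index computed from the R and G channels can serve as a stand-in for the adjusted green-red index used in the study (whose exact adjustment is not reproduced here); a toy sketch follows.

```python
# Sketch: a normalized green-red index from RGB time-lapse frames, as a simple
# stand-in for the study's adjusted green-red index. Frames, region of interest,
# and values are placeholders.
import numpy as np

def green_red_index(rgb):
    """rgb: (H, W, 3) uint8 frame from a time-lapse camera."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    return (g - r) / (g + r + 1e-9)             # avoid division by zero

frames = [np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8) for _ in range(4)]
series = [green_red_index(f)[100:300, 200:400].mean() for f in frames]  # plant region of interest
print(series)                                    # one value per acquisition time
```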
Reliable identification of clouds is necessary for any type of optical remote sensing image analysis, especially in operational and fully automatic setups. One of the most elaborated and widespread algorithms, namely Fmask, was initially developed for the Landsat suite of satellites. Despite their similarity, application to Sentinel-2 imagery is currently hampered by the unavailability of a thermal band, and although results can be improved when taking the cirrus band into account, Sentinel-2 cloud detections are unsatisfactory in two points. (1) Low altitude clouds can be undetectable in the cirrus band, and (2) bright land surfaces – especially built-up structures – are often misclassified as clouds when only considering spectral information. In this paper, we present the Cloud Displacement Index (CDI), which makes use of the three highly correlated near infrared bands that are observed with different view angles. Hence, elevated objects like clouds are observed under a parallax and can be reliably separated from bright ground objects. We compare CDI with the currently used cloud probabilities, and propose how to integrate this new functionality into the Fmask algorithm. We validate the approach using test images over metropolitan areas covering a wide variety of global environments and climates, indicating the successful separation of clouds and built-up structures (overall accuracy 95%, i.e. an improvement in overall accuracy of 0.29–0.39 compared to the previous Fmask versions over the 20 test sites), and hence a full compensation for a missing thermal band.
In this paper, we propose a method for cloud removal from visible-light RGB satellite images by extending conditional Generative Adversarial Networks (cGANs) from RGB images to multispectral images. Satellite images have been widely utilized for various purposes, such as natural environment monitoring (pollution, forests or rivers), transportation improvement and prompt emergency response to disasters. However, the obscurity caused by clouds makes it unstable to monitor the situation on the ground with a visible-light camera. Images captured at longer wavelengths are introduced to reduce the effects of clouds. Synthetic Aperture Radar (SAR) is one such example that improves visibility even when clouds exist. On the other hand, the spatial resolution decreases as the wavelength increases. Furthermore, images captured at long wavelengths differ considerably from those captured in visible light in terms of their appearance. Therefore, we propose a network that can remove clouds and generate visible-light images from the multispectral images taken as inputs. This is achieved by extending the input channels of cGANs to be compatible with multispectral images. The networks are trained to output images that are close to the ground truth, using images synthesized with clouds over the ground truth as inputs. In the available dataset, the proportion of images of forest or sea is very high, which would introduce bias if the training dataset were uniformly sampled from the original dataset. Thus, we utilize t-Distributed Stochastic Neighbor Embedding (t-SNE) to mitigate this bias in the training dataset. Finally, we confirm the feasibility of the proposed network on a dataset of four-band images, which include three visible-light bands and one near-infrared (NIR) band.
Due to serious cloud contamination in optical satellite images, it is hard to acquire continuous cloud-free satellite observations, which limits the potential utilization of the available images and further data extraction and analysis. Thus, information reconstruction in cloud-contaminated images and the reprocessing of continuous cloud-free images are urgently needed for global change science. Many previous studies use one cloud-free reference image or multitemporal reference images to restore a target cloud-contaminated image; this paper instead develops a novel spatially and temporally weighted regression (STWR) model for cloud removal to produce continuous cloud-free Landsat images. The proposed method makes full use of cloud-free information from the input Landsat scenes and employs a STWR model to optimally integrate complementary information from invariant similar pixels. Moreover, a prior modification term is added to minimize the biases derived from the spatially-weighted-regression-based prediction for each reference image. The results of experimental tests with both simulated and actual Landsat series data show that the proposed STWR can yield visually and quantitatively plausible recovery results. Compared with other cloud removal methods, our method produces lower biases and more robust efficacy. This approach provides a complete framework for continuous cloud removal and has the potential to be used for other optical images and to be applied to the reprocessing of cloud-free remote sensing products.
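The flavour of a spatially and temporally weighted regression can be conveyed with a toy single-pixel example: cloud-free similar pixels from a reference scene predict the target-scene value, with weights that decay with spatial distance and acquisition gap. The kernels and bandwidths below are illustrative assumptions, not the paper's STWR formulation.

```python
# Toy sketch of a spatially and temporally weighted regression for one cloudy
# pixel. Weight kernels, bandwidths, and data are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(2)
n = 200
ref_vals = rng.uniform(0.1, 0.5, n)                        # reference-image values of similar pixels
tgt_vals = 1.1 * ref_vals + 0.02 + rng.normal(0, 0.01, n)  # same pixels in the target image
dist = rng.uniform(0, 300, n)                              # distance (m) to the cloudy pixel
dt = 16.0                                                  # days between reference and target scenes

w = np.exp(-(dist / 100.0) ** 2) * np.exp(-dt / 32.0)      # combined spatial-temporal weight
A = np.column_stack([ref_vals, np.ones(n)])
coef, *_ = np.linalg.lstsq(A * w[:, None], tgt_vals * w, rcond=None)  # weighted least squares

ref_at_cloudy_pixel = 0.33                                 # reference value under the cloud
prediction = coef[0] * ref_at_cloudy_pixel + coef[1]
print(prediction)
```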
Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.
The Global Energy and Water Cycle Experiment (GEWEX) Radiation Panel initiated the GEWEX Cloud Assessment in 2005 to compare available, global, long-term cloud data products with the International Satellite Cloud Climatology Project (ISCCP). The GEWEX Cloud Assessment database included cloud properties retrieved from different satellite sensor measurements, taken at various local times and over various time periods. The relevant passive satellite sensors measured radiation scattered or emitted by Earth's surface and by its atmosphere, including clouds. Specific spectral domains were exploited for particular retrieval methods to maximize the sensitivity to the presence of clouds and to determine key cloud properties. ISCCP also emphasized temporal resolution over spectral resolution to resolve the diurnal cycle of clouds in the GEWEX cloud climate record.
Geometry-based point cloud compression (G-PCC) is a state-of-the-art point cloud compression standard. While G-PCC achieves excellent performance, its reliance on the predicting transform leads to a significant dependence problem, which can easily result in distortion accumulation. This not only increases bitrate consumption but also degrades reconstruction quality. To address these challenges, we propose a dependence-based coarse-to-fine approach for distortion accumulation in G-PCC attribute compression. Our method consists of three modules: level-based adaptive quantization, point-based adaptive quantization, and Wiener filter-based refinement level quality enhancement. The level-based adaptive quantization module addresses the interlevel-of-detail (LOD) dependence problem, while the point-based adaptive quantization module tackles the interpoint dependence problem. On the other hand, the Wiener filter-based refinement level quality enhancement module enhances the reconstruction quality of each point based on the dependence order among LODs. Extensive experimental results demonstrate the effectiveness of the proposed method. Notably, when the proposed method was implemented in the latest G-PCC test model (TMC13v23.0), a Bjøntegaard delta rate of 4.9%, 12.7%, and 14.0% was achieved for the Luma, Chroma Cb, and Chroma Cr components, respectively.
In this article, we propose a method based on time coding metasurface (TCM) to modulate the characteristics of radar targets in complex high-resolution range profiles (HRRPs). Unlike the existing works where metasurfaces are typically attached to target surfaces, in our work, the metasurface and controlled target are separable, making modulation more flexible. The proposed method involves controlling the equivalent scattering phase of distance offset false targets within the HRRP, which are generated through TCM by adjusting the modulation parameters. The scattering characteristics of the false target and the controlled target are combined within the same distance unit to create a superimposed scattering feature vector. By taking into account the influence of interference effect, we are able to modulate the scattering characteristics of radar targets within the complex HRRP. The proposed method was validated through theoretical analysis, simulation experiments, and actual experiments conducted in various scenarios. The experimental results demonstrate that the false targets generated by the external TCM can be used to enhance or weaken the scattering intensity of the original target in HRRP, thereby changing the overall image characteristics and posing challenges to target recognition. This method can achieve the modulation of radar target image characteristics under noncoverage conditions, expanding the application scenarios of metasurfaces in the field of radar electronic countermeasures (ECM).
In recent years, point clouds have become increasingly popular for representing three-dimensional (3D) visual objects and scenes. To efficiently store and transmit point clouds, compression methods have been developed, but they often result in a degradation of quality. To reduce color distortion in point clouds, we propose a graph-based quality enhancement network (GQE-Net) that uses geometry information as an auxiliary input and graph convolution blocks to extract local features efficiently. Specifically, we use a parallel-serial graph attention module with a multi-head graph attention mechanism to focus on important points or features and help them fuse together. Additionally, we design a feature refinement module that takes into account the normals and geometry distance between points. To work within the limitations of GPU memory capacity, the distorted point cloud is divided into overlap-allowed 3D patches, which are sent to GQE-Net for quality enhancement. To account for differences in data distribution among different color components, three models are trained for the three color components. Experimental results show that our method achieves state-of-the-art performance. For example, when implementing GQE-Net on a recent test model of the geometry-based point cloud compression (G-PCC) standard, 0.43 dB, 0.25 dB and 0.36 dB
Bjøntegaard delta (BD) peak-signal-to-noise ratio (PSNR) gains, corresponding to 14.0%, 9.3% and 14.5% BD-rate savings, were achieved on dense point clouds for the Y, Cb, and Cr components, respectively. The source code of our method is available at https://github.com/xjr998/GQE-Net.
Learning-based multi-view stereo (MVS) has so far centered on 3D convolution on cost volumes. Due to the high computation and memory consumption of 3D CNNs, the resolution of the output depth is often considerably limited. Different from most existing works dedicated to adaptive refinement of cost volumes, we opt to directly optimize the depth value along each camera ray, mimicking the range (depth) finding of a laser scanner. This reduces the MVS problem to ray-based depth optimization, which is much more light-weight than full cost volume optimization. In particular, we propose RayMVSNet, which learns sequential prediction of a 1D implicit field along each camera ray with the zero-crossing point indicating scene depth. This sequential modeling, conducted based on transformer features, essentially learns the epipolar line search in traditional multi-view stereo. We devise multi-task learning for better optimization convergence and depth accuracy. We found that the monotonicity property of the SDFs along each ray greatly benefits the depth estimation. Our method ranks top on both the DTU and the Tanks & Temples datasets over all previous learning-based methods, achieving an overall reconstruction score of 0.33 mm on DTU and an F-score of 59.48% on Tanks & Temples. It is able to produce high-quality depth estimation and point cloud reconstruction in challenging scenarios such as objects/scenes with non-textured surfaces, severe occlusion, and highly varying depth ranges. Further, we propose RayMVSNet++ to enhance contextual feature aggregation for each ray by designing an attentional gating unit to select semantically relevant neighboring rays within the local frustum around that ray. This improves the performance on datasets with more challenging examples (e.g. low-quality images caused by poor lighting conditions or motion blur). RayMVSNet++ achieves state-of-the-art performance on the ScanNet dataset. In particular, it attains an AbsRel of 0.058 m and produces accurate results on the two subsets of textureless regions and large depth variation.
Precise crop mapping is crucial for guiding agricultural production, forecasting crop yield, and ensuring food security. Integrating optical and synthetic aperture radar (SAR) satellite image time series (SITS) for crop classification is an essential and challenging task in remote sensing. Previously published studies generally employ a dual-branch network to learn optical and SAR features independently, while ignoring the complementarity and correlation between the two modalities. In this article, we propose a novel method to learn optical and SAR features for crop classification through cross-modal contrastive learning. Specifically, we develop an updated dual-branch network with partial weight-sharing of the two branches to reduce model complexity. Furthermore, we enforce the network to map features of different modalities from the same class to nearby locations in a latent space, while samples from distinct classes are kept far apart, thereby learning discriminative and modality-invariant features. We conducted a comprehensive evaluation of the proposed method on a large-scale crop classification dataset. Experimental results show that our method consistently outperforms traditional supervised learning approaches, regardless of whether the training samples are adequate or not. Our findings demonstrate that unifying the representations of optical and SAR image time series enables the network to learn more competitive features and suppress inference noise.
The challenge of the cloud removal task can be alleviated with the aid of Synthetic Aperture Radar (SAR) images, which can penetrate cloud cover. However, the large domain gap between optical and SAR images, as well as the severe speckle noise of SAR images, may cause significant interference in SAR-based cloud removal, resulting in performance degradation. In this paper, we propose a novel global–local fusion based cloud removal (GLF-CR) algorithm to leverage the complementary information embedded in SAR images. Exploiting the power of SAR information to promote cloud removal entails two aspects. The first, global fusion, guides the relationship among all local optical windows to keep the structure of the recovered region consistent with the remaining cloud-free regions. The second, local fusion, transfers complementary information embedded in the SAR image that corresponds to cloudy areas to generate reliable texture details of the missing regions, and uses dynamic filtering to alleviate the performance degradation caused by speckle noise. Extensive evaluation demonstrates that the proposed algorithm can yield high-quality cloud-free images and outperform state-of-the-art cloud removal algorithms with a gain of about 1.7 dB in terms of PSNR on the SEN12MS-CR dataset.
In the past decades, remote sensing (RS) data fusion has always been an active research field. A large number of algorithms and models have been developed. Generative adversarial networks (GANs), as an important branch of deep learning, show promising performance in a variety of RS image fusion tasks. This review provides an introduction to GANs for RS data fusion. We briefly review the frequently used architectures and characteristics of GANs in data fusion and comprehensively discuss how to use GANs to realize fusion for homogeneous RS, heterogeneous RS, and RS and ground observation (GO) data. We also analyze some typical applications of GAN-based RS image fusion. This review provides insight into how to make GANs adapt to different types of fusion tasks and summarizes the advantages and disadvantages of GAN-based RS data fusion. Finally, we discuss promising future research directions and predict their trends.
In rate-distortion optimization, the encoder settings are determined by maximizing a reconstruction quality measure subject to a constraint on the bitrate. One of the main challenges of this approach is to define a quality measure that can be computed with low computational cost and which correlates well with the perceptual quality. While several quality measures that fulfil these two criteria have been developed for images and videos, no such one exists for point clouds. We address this limitation for the video-based point cloud compression (V-PCC) standard by proposing a linear perceptual quality model whose variables are the V-PCC geometry and color quantization step sizes and whose coefficients can easily be computed from two features extracted from the original point cloud. Subjective quality tests with 400 compressed point clouds show that the proposed model correlates well with the mean opinion score, outperforming state-of-the-art full reference objective measures in terms of Spearman rank-order and Pearson linear correlation coefficient. Moreover, we show that for the same target bitrate, rate-distortion optimization based on the proposed model offers higher perceptual quality than rate-distortion optimization based on exhaustive search with a point-to-point objective quality metric. Our datasets are publicly available at
https://github.com/qdushl/Waterloo-Point-Cloud-Database-2.0.
The scan-line corrector (SLC) of the Landsat 7 ETM+ failed permanently in 2003, resulting in about 22% unscanned gap pixels in the SLC-off images, affecting greatly the utility of the ETM+ data. To address this issue, we propose a spatial-spectral radial basis function (SSRBF)-based interpolation method to fill gaps in SLC-off images. Different from the conventional spatial-only radial basis function (RBF) that has been widely used in other domains, SSRBF also integrates a spectral RBF to increase the accuracy of gap filling. Concurrently, global linear histogram matching is applied to alleviate the impact of potentially large differences between the known and SLC-off images in feature space, which is demonstrated mathematically in this article. SSRBF fully exploits information in the data themselves and is user-friendly. The experimental results on five groups of data sets covering different heterogeneous regions show that the proposed SSRBF method is an effective solution to gap filling, and it can produce more accurate results than six popular benchmark methods.
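The spatial component of such RBF-based gap filling can be sketched with SciPy's RBFInterpolator; the spectral RBF term and the global histogram matching of the SSRBF method are not reproduced in this toy example.

```python
# Sketch: spatial RBF interpolation of unscanned (gap) pixels in a single band.
# Only the plain spatial component is illustrated; data, gap layout, and kernel
# choice are placeholders.
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(3)
band = rng.uniform(0.1, 0.4, (60, 60))          # toy reflectance band
gap = np.zeros_like(band, dtype=bool)
gap[:, 25:30] = True                            # simulated SLC-off style stripe

rows, cols = np.indices(band.shape)
known = ~gap
interp = RBFInterpolator(
    np.column_stack([rows[known], cols[known]]),  # coordinates of valid pixels
    band[known],
    kernel="thin_plate_spline",
    neighbors=50,                                  # local neighbourhood keeps it tractable
)
band_filled = band.copy()
band_filled[gap] = interp(np.column_stack([rows[gap], cols[gap]]))
```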
Missing data reconstruction is a classical yet challenging problem in remote sensing images. Most current methods based on traditional convolutional neural network require supplementary data and can only handle one specific task. To address these limitations, we propose a novel generative adversarial network-based missing data reconstruction method in this letter, which is capable of various reconstruction tasks given only single source data as input. Two auxiliary patch-based discriminators are deployed to impose additional constraints on the local and global regions, respectively. In order to better fit the nature of remote sensing images, we introduce special convolutions and attention mechanism in a two-stage generator, thereby benefiting the tradeoff between accuracy and efficiency. Combining with perceptual and multiscale adversarial losses, the proposed model can produce coherent structure with better details. Qualitative and quantitative experiments demonstrate the uncompromising performance of the proposed model against multisource methods in generating visually plausible reconstruction results. Moreover, further exploration shows a promising way for the proposed model to utilize spatio-spectral-temporal information. The codes and models are available at
https://github.com/Oliiveralien/Inpainting-on-RSI.
The use of time series analysis with moderate resolution satellite imagery is increasingly common, particularly since the advent of freely available Landsat data. Dense time series analysis is providing new information on the timing of landscape changes, as well as improving the quality and accuracy of information being derived from remote sensing. Perhaps most importantly, time series analysis is expanding the kinds of land surface change that can be monitored using remote sensing. In particular, more subtle changes in ecosystem health and condition and related to land use dynamics are being monitored. The result is a paradigm shift away from change detection, typically using two points in time, to monitoring, or an attempt to track change continuously in time. This trend holds many benefits, including the promise of near real-time monitoring. Anticipated future trends include more use of multiple sensors in monitoring activities, increased focus on the temporal accuracy of results, applications over larger areas and operational usage of time series analysis.
Because of sensor malfunction and poor atmospheric conditions, there is usually a great deal of missing information in optical remote sensing data, which reduces the usage rate and hinders the follow-up interpretation. In the past decades, missing information reconstruction of remote sensing data has become an active research field, and a large number of algorithms have been developed. However, to the best of our knowledge, there has not, to date, been a study aimed at explaining and summarizing the current situation, which is therefore our motivation for this review. This paper provides an introduction to the principles and theories of missing information reconstruction of remote sensing data. We classify the established and emerging algorithms into four main categories, followed by a comprehensive comparison of them from both experimental and theoretical perspectives. This paper also predicts promising future research directions.
Filling dead pixels or removing uninteresting objects is often desired in the applications of remotely sensed images. In this paper, an effective image inpainting technology is presented to solve this task, based on multichannel nonlocal total variation. The proposed approach takes advantage of a nonlocal method, which has a superior performance in dealing with textured images and reconstructing large-scale areas. Furthermore, it makes use of the multichannel data of remotely sensed images to achieve spectral coherence for the reconstruction result. To optimize the proposed variation model, a Bregmanized-operator-splitting algorithm is employed. The proposed inpainting algorithm was tested on simulated and real images. The experimental results verify the efficacy of this algorithm.
The Earth Observing System of the National Aeronautics and Space Administration pays a great deal of attention to the long-term global observations of the land surface, biosphere, atmosphere, and oceans. Specifically, the Moderate Resolution Imaging Spectroradiometer (MODIS) instrument on board the twin satellites Terra and Aqua plays a vital role in the mission. Unfortunately, around 70% of the detectors in Aqua MODIS band 6 have malfunctioned or failed. Consequently, many of the derivatives related to band 6, such as the normalized difference snow index, suffer from the adverse impact of dead or noisy pixels. In this letter, the missing or noisy information in Aqua MODIS band 6 is successfully completed using a robust multilinear regression (M-estimator) based on the spectral relations between working detectors in band 6 and all the other spectra. The experimental results indicate that the proposed robust M-estimator multiregression (RMEMR) algorithm can effectively complete the large areas of missing information while retaining the edges and textures, compared to the state-of-the-art methods.
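Robust regression of a damaged band against the working bands, in the spirit of the approach above, can be sketched with a generic M-estimator such as scikit-learn's HuberRegressor; the exact estimator and spectral relations of the RMEMR algorithm are not reproduced.

```python
# Sketch: robust (M-estimator) regression predicting a damaged band from the
# other spectral bands. HuberRegressor is a generic stand-in; all data are toy.
import numpy as np
from sklearn.linear_model import HuberRegressor

rng = np.random.default_rng(4)
n_pixels = 5000
other_bands = rng.uniform(0, 1, (n_pixels, 6))                  # working bands (toy)
band6 = other_bands @ np.array([0.3, 0.2, 0.1, 0.15, 0.1, 0.05]) + rng.normal(0, 0.01, n_pixels)
band6[::50] += 0.5                                              # a few noisy/outlier detectors

ok = np.ones(n_pixels, dtype=bool)
ok[1000:1500] = False                                           # pretend these pixels are dead

model = HuberRegressor().fit(other_bands[ok], band6[ok])        # fit on working pixels only
band6_reconstructed = band6.copy()
band6_reconstructed[~ok] = model.predict(other_bands[~ok])      # fill the dead pixels
```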
Since the scan line corrector (SLC) of the Landsat Enhanced Thematic Mapper Plus (ETM+) sensor failed permanently in 2003, about 22% of the pixels in an SLC-off image are not scanned. To improve the usability of the ETM+ SLC-off data, we propose an integrated method to recover the missing pixels. The majority of the degraded pixels are filled using multi-temporal images as reference information by building a regression model between the corresponding pixels. When the auxiliary multi-temporal data cannot completely recover the missing pixels, a non-reference regularization algorithm is used to fill the remaining pixels. To assess the efficacy of the proposed method, simulated and actual SLC-off ETM+ images were tested. The quantitative evaluations suggest that the proposed method can predict the missing values very accurately. The method performs especially well at edges, and is able to preserve the shape of ground features. According to the assessment results of the land-cover classification and NDVI, the recovered data are also suitable for use in further remote sensing applications.
A normalized difference vegetation index (NDVI) cloud index (NCI) was derived from Pathfinder Advanced Very High Resolution Radiometer (AVHRR) daily NDVI data and compared with observed cloud amounts and a sunshine duration-cloud index (SCI) over an area of diverse land cover. Ground observations from 120 meteorological stations were significantly related to the daily NCI and the SCI, with R2 values of 0.41 and 0.50, respectively. The daily NCI and interpolated cloud indices derived from ground observations over the 776 900 km2 study area were compared. The correlation coefficient between the NCI and the observed cloud amount was less than 0.6 for less than 20% of the area. The correlation coefficient between the NCI and the observed sunshine duration index was less than 0.6 for less than 10% of the area and less than 0.7 for 41% of the area. There were strong correlations for high elevations in summer, and correlations for low elevations in winter were weaker. A frozen soil surface or snow cover degrades the NDVI relationship to clouds. The NCI and observed cloud indices had high correlation coefficients in areas with diverse land uses, suggesting that the NCI may be useful in estimating cloudiness over a large region.
Using appropriate techniques to fill the data gaps in SLC-off ETM+ imagery may enable more scientific use of the data. The local linear histogram-matching technique chosen by USGS has limitations if the scenes being combined exhibit high temporal variability and radical differences in target radiance due, for example, to the presence of clouds. This study proposes using an alternative interpolation method, the kriging geostatistical technique, for filling the data gaps. The case study shows that the ordinary kriging techniques may provide a powerful tool for interpolating the missing pixels in the SLC-off ETM+ imagery. While the standardized ordinary cokriging has been shown to be particularly useful when samples of the variable to be predicted are sparse and samples of a second, related variable are plentiful, the case study demonstrates that it provides little improvement in interpolating the data gap in the SLC-off imagery.
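As a simple illustration of ordinary kriging for gap filling, the following sketch assumes the PyKrige package; the variogram model and sample layout are arbitrary and not tuned for ETM+ SLC-off data.

```python
# Sketch: ordinary kriging of gap pixels with PyKrige (assumed installed).
# Coordinates, values, and the variogram model are illustrative placeholders.
import numpy as np
from pykrige.ok import OrdinaryKriging

rng = np.random.default_rng(5)
x = rng.uniform(0, 100, 300)                   # known-pixel coordinates (toy)
y = rng.uniform(0, 100, 300)
z = 0.2 + 0.001 * x + 0.002 * y + rng.normal(0, 0.01, 300)   # known reflectance values

ok = OrdinaryKriging(x, y, z, variogram_model="spherical")
gap_x = np.array([45.0, 46.0, 47.0])           # coordinates of unscanned pixels
gap_y = np.array([50.0, 50.0, 50.0])
z_hat, z_var = ok.execute("points", gap_x, gap_y)  # predictions and kriging variance
print(z_hat)
```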