International Journal of Remote Sensing

Published by Taylor & Francis
Online ISSN: 1366-5901
Print ISSN: 0143-1161
Aims and scope

IJRS publishes research focusing on the remote sensing of the atmosphere, biosphere, cryosphere and terrestrial earth, and on remote sensing with drones.

Recent publications
Atmospheric pollution affects air quality and can pose a serious risk to public health. Traditional PM2.5 monitoring is limited by the uneven distribution of ground stations, which makes it difficult to obtain spatially continuous and accurate PM2.5 concentration information. Using aerosol optical depth (AOD) retrieved from satellite remote sensing to study the temporal and spatial variation of PM2.5 makes it possible to predict PM2.5 concentrations accurately over a wide area and provides a basis for atmospheric pollution prevention and control. Considering the spatio-temporal correlations among the data, this study combines AOD data with geographic location, temporal, meteorological, and elevation data to construct an Optimized XGBoost (O-XGBoost) model with spatial and temporal characteristics, which is used to estimate PM2.5 concentrations in the Guanzhong region from 2019 to 2021 and to analyse their spatial and temporal distribution. The results show that, compared with the random forest (RF) and XGBoost models, the O-XGBoost model has higher estimation accuracy, with R², RMSE, and MAE of 0.873, 11.460 μg·m⁻³, and 8.061 μg·m⁻³, respectively. Estimating PM2.5 concentrations with the O-XGBoost model compensates for the uneven distribution of ground monitoring stations and improves the estimation accuracy.
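The three accuracy measures quoted in this abstract can be illustrated with a minimal sketch. It assumes R² is the coefficient of determination computed as 1 − SS_res/SS_tot (the abstract does not say which definition the authors used), and the sample values are made up for illustration.

```python
import math

def regression_metrics(observed, predicted):
    """Return (r2, rmse, mae) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n              # mean absolute error
    ss_res = sum(e * e for e in errors)                # residual sum of squares
    rmse = math.sqrt(ss_res / n)                       # root mean square error
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot                         # assumed definition of R²
    return r2, rmse, mae

# Hypothetical station PM2.5 values and model estimates, in μg·m⁻³.
obs = [35.0, 50.0, 80.0, 20.0]
pred = [30.0, 55.0, 70.0, 25.0]
r2, rmse, mae = regression_metrics(obs, pred)
```

RMSE and MAE carry the units of the estimated quantity, which is why the abstract reports them in μg·m⁻³ while R² is dimensionless.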
Electric shorting induced by tall vegetation is one of the major hazards affecting power transmission lines extending through rural regions and rough terrain for tens of kilometres. This raises the need for an accurate, reliable, and cost-effective approach for continuous monitoring of canopy heights. This paper proposes and evaluates two deep convolutional neural network (CNN) variants based on Seg-Net and Res-Net architectures, characterized by their small number of trainable weights (nearly 800,000) while maintaining high estimation accuracy. The proposed models utilize the freely available data from Sentinel-2 and a digital surface model to estimate forest canopy heights with high accuracy and a spatial resolution of 10 metres. Various factors affect canopy height estimation, including topography signature, dataset diversity, input layers, and model structure. The proposed models are applied separately to two powerline regions located in the northern and southern parts of Thailand. The application results show that the proposed Encoder-Decoder CNN Seg-Net model presents an average mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²) of 1.38 m, 1.85 m, and 0.87, respectively, and is nearly 4.8 times faster than the CNN Res-Net model in conversion. These results prove the proposed model's capability of estimating and monitoring canopy heights with high accuracy and fine spatial resolution.
Extracting mask information of buildings and water areas from high-resolution remote sensing images is beneficial to the monitoring and management of urban development. However, because of different acquisition times, geographical locations and acquisition angles, water areas and buildings return different spectral information. Existing semantic segmentation methods do not pay enough attention to channel information, and the feature information extracted by downsampling is relatively abstract, which easily causes the loss of some details in high-resolution images of complex scenes and leads to the misjudgement of buildings and water areas. To solve these problems, a feature enhancement network (FENet) for high-resolution remote sensing image segmentation of buildings and water areas is proposed. By paying more attention to channel feature information, the probability of misjudging buildings and water areas can be reduced and their edge contour information can be enhanced. The self-attention feature module proposed in this paper encodes the context information and transmits it to the local features, and the channel feature enhancement module establishes the relationship between channels to reduce the loss of channel feature information. The feature fusion module fuses feature information of different spatial scales and outputs more detailed prediction images. Comparative experiments show that this model is superior to existing classical semantic segmentation models. The proposed method achieves a 2% improvement over PSPNet on the MIoU indicator, and the final MIoU reaches 82.85% on the land cover dataset. This study demonstrates the advantages of our proposed method in land cover classification and detection.
Drought disasters significantly threaten the stability of terrestrial ecosystems in arid and semi-arid regions. Although the impact of drought on vegetation has been extensively researched, there is still a lack of understanding regarding the ecological impacts of drought on the heterogeneous vegetation in northern China’s arid and semi-arid regions. To investigate the spatial patterns of vegetation and differences in response to drought climate, the Normalized Difference Vegetation Index (NDVI) and Standardized Precipitation Evapotranspiration Index (SPEI) were used to determine vegetation characteristics and the seasonal sensitivity of vegetation types to drought response, as well as to identify the factors driving the sensitivity differences. In the study area, 83.1% of the vegetation exhibited a significant positive correlation with the drought index. The maximum correlation coefficient between vegetation and drought was 0.62, demonstrating that the majority of vegetation was influenced by drought. It was observed that different vegetation types responded differently to drought at varying spatial and temporal scales. The main drivers of vegetation response to drought in arid and semi-arid zones were quantified by using the random forest algorithm. The results indicated that climate and soil factors are the primary limiting factors. These findings enhance our comprehension of arid and semi-arid areas, the effect of drought on different plant species, and the factors that affect vegetation’s response to drought.
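The two quantities at the core of this analysis, NDVI and its correlation with a drought index, can be sketched as follows. NDVI uses the standard definition (NIR − Red) / (NIR + Red). The Pearson coefficient is an assumption, since the abstract does not state which correlation measure was used, and the function names are illustrative.

```python
import math

def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel's reflectances."""
    total = nir + red
    if total == 0:
        return float("nan")  # undefined when both bands are zero
    return (nir - red) / total

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical reflectances for a single vegetated pixel.
pixel_ndvi = ndvi(nir=0.5, red=0.1)

# Hypothetical per-pixel time series: SPEI values and NDVI values.
spei_series = [-1.0, 0.0, 1.0]
ndvi_series = [0.2, 0.4, 0.6]
r = pearson_r(spei_series, ndvi_series)
```

A positive coefficient for a pixel means greener vegetation in wetter periods, which is the relationship the abstract reports for 83.1% of the vegetated area.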
We have constructed a Bayesian neural network capable of retrieving tropospheric temperature profiles from rotational Raman-scatter measurements of nitrogen and oxygen and applied it to measurements taken by the RAman Lidar for Meteorological Observations (RALMO) in Payerne, Switzerland. We give a detailed description of using a Bayesian method to retrieve temperature profiles including estimates of the uncertainty due to the network weights and the statistical uncertainty of the measurements. We trained our model using lidar measurements under different atmospheric conditions, and we tested our model using measurements not used for training the network. The computed temperature profiles extend over the altitude range of 0.7 km to 6 km. The mean bias estimate of our temperatures relative to the MeteoSwiss standard processing algorithm does not exceed 0.05 K at altitudes below 4.5 km, and does not exceed 0.08 K in an altitude range of 4.5 km to 6 km. This shows that the temperature profiles estimated by the neural network are in excellent agreement with the standard algorithm. The method is robust and is able to estimate the temperature profiles with high accuracy for both clear and cloudy conditions. Moreover, the trained model can provide the statistical and model uncertainties of the estimated temperature profiles. Thus, the present study is a proof of concept that the trained NNs are able to generate temperature profiles along with a full-budget uncertainty. We present case studies showcasing the Bayesian neural network estimations for day and night measurements, as well as in clear and cloudy conditions. We have concluded that the proposed Bayesian neural network is an appropriate method for the statistical retrieval of temperature profiles.
Satellite images are widely used for change detection in various applications, such as agricultural and forest cover monitoring and management, urban planning, and disaster management, which require prior information about the timeline of changes. However, it is very challenging to identify the precise timeline of changes and to acquire corresponding labelled reference data. Considering this, Convolutional Auto-Encoder (CAE)-based approaches have been proposed for unsupervised change detection from bi-temporal remote sensing images. However, these types of approaches require prior knowledge of the exact moment the change occurs in order to select the bi-temporal images, one pre-change and one post-change. This work introduces the use of satellite image time-series (SITS) data in a Joint 3D CAE-based unsupervised approach to detect annual changes. Our study focuses on a mining area where the changes mostly occur due to tailings, waste and ore deposits, mining pits, mining residue, and so on. Here, we have explored Synthetic Aperture Radar (SAR) (i.e. Sentinel-1) data along with optical (i.e. Sentinel-2) data, since optical data alone was not able to detect all the changes. The recall of detected change areas during 2016–2018 improved by about 16% with the combination of Sentinel-1 and Sentinel-2 data compared with the use of Sentinel-2 images alone.
Convolutional neural networks (CNNs) extract semantic features from images by stacking convolutional operators, which easily causes semantic information loss and leads to hollow and edge inaccuracies in building extraction. Therefore, a features self-attention U-block network (FSAU-Net) is proposed. The network focuses on the target feature self-attention in the coding stage, and features self-attention (FSA) distinguishes buildings from nonbuilding by weighting the extracted features themselves; we introduce spatial attention (SA) in the decoder stage to focus on the spatial locations of features, and SA generates spatial location features through the spatial relationship among the features to highlight the building information area. A jump connection is used to fuse the shallow features generated in the decoder stage with the deep features generated in the encoder stage to reduce the building information loss. We validate the superiority of the method FSAU-Net on the WHU and Inria datasets with 0.3 m resolution and Massachusetts with 1.0 m resolution, experimentally showing IoU of 91.73%, 80.73% and 78.46% and precision of 93.60%, 90.71% and 86.37%, respectively. In addition, we also set up ablation experiments by adding an FSA module, Squeeze-and-Excitation (SE) module and Efficient Channel Attention (ECA) module to UNet and ResNet101, where UNet+FSA improves the IoU values by 3.15%, 2.72% and 1.77% compared to UNet, UNet+SE and UNet+ECA, respectively, and ResNet101+FSA improves the IoU values by 2.06%, 1.17% and 0.9% compared to ResNet101, ResNet101+SE and ResNet101+ECA, respectively, demonstrating the superiority of our proposed FSA module. FSAU-Net improves the IoU values by 3.18%, 2.75% and 1.80% compared to those of UNet, UNet+SE and UNet+ECA, respectively. FSAU-Net has 2.11%, 1.22%, and 0.95% IoU improvements over the IoU values of ResNet101, ResNet101+SE and ResNet101+ECA, respectively, demonstrating the superiority of our proposed FSAU-Net model. 
The TensorFlow implementation is available at
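The IoU and precision figures reported for building extraction can be illustrated with a minimal sketch for binary masks. It assumes masks are nested lists of 0/1 values with 1 marking building pixels. The function name and the toy masks are hypothetical and are not taken from the authors' implementation.

```python
def iou_and_precision(pred_mask, true_mask):
    """Return (IoU, precision) of the positive class for two binary masks."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1      # building predicted and present
            elif p and not t:
                fp += 1      # building predicted but absent
            elif t and not p:
                fn += 1      # building missed
    union = tp + fp + fn
    iou = tp / union if union else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return iou, precision

# Toy 2x3 masks, for illustration only.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
iou, precision = iou_and_precision(predicted, reference)
```

IoU penalizes both false positives and missed pixels, whereas precision ignores missed pixels, which is why the abstract reports precision values higher than the corresponding IoU values.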
Distribution of selected PEP725 stations by country across Europe.
Bland-Altman plot showing, for each pair of BBCH and remotely sensed metrics, the average (x-axis) plotted against the difference (y-axis). The blue line indicates the mean difference, with a confidence region for the mean of the differences.
Redundancy Analysis (RDA) biplot: distribution of the biophysical explanatory variables as vectors (blue arrows) and the phenological metrics difference (Δ) as response variables (black points). The first two RDA factors (F1 and F2) explain almost 90% of the total variance of the response variables.
Results of paired Wilcoxon test with reference to BBCH SOS values.
2023) On the temporal mismatch between in-situ and satellite-derived spring phenology of
Forest phenology plays a key role in the global terrestrial ecosystem influencing a range of ecosystem processes such as the annual carbon uptake period, and many food webs and changes in their timing and progression. The timing of the start of the phenology season has been successfully determined at a range of scales, from the individual tree by in situ observations to landscape and continental scales by using remotely sensed vegetation indices (VIs). The spatial resolution of satellites is much coarser than traditional methods, creating a gap between space-borne and actual field observations, which brings limitations to phenological research at the ecosystem level. Several unconsidered methodological and observational-related limitations may lead to misinterpretation of the timing of the satellite-derived signals. The aim of this study is therefore to clarify the meaning of a set of spring phenology metrics derived from Moderate Resolution Imaging Spectroradiometer (MODIS) Enhanced Vegetation Index (EVI) time series in beech forests distributed across Europe with respect to PEP725 in situ observations, from 2003 to 2020. To this aim, we (i) tested the differences between remotely sensed and in situ start-of-season (SOS) metrics and (ii) quantified the influence of latitude, elevation, temperature, and precipitation on such differences. Results demonstrated that there is a clear temporal gradient among the different SOS metrics, all of them occurring prior to the in situ observations. Furthermore, latitude and temperatures proved to be the main factors guiding the differences between remotely sensed and in situ SOS metrics. Evidence from this study may help in recognizing the actual meaning of what we see by means of remotely sensed phenology metrics. In this perspective, field observations are crucial in understanding phenology events and provide a reference base. Satellite data, on the other hand, complement field observations by filling in gaps in spatial and temporal coverage, thus enhancing the overall understanding.
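The figure captions for this study mention a Bland-Altman comparison between in situ (BBCH) and remotely sensed start-of-season metrics. A minimal sketch of the underlying statistics follows. It assumes the conventional 95% limits of agreement (bias ± 1.96 standard deviations of the differences), which the source does not confirm, and the day-of-year values are made up.

```python
import math
import statistics

def bland_altman(method_a, method_b):
    """Return per-pair means and differences, the bias, and the limits of agreement."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2.0 for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)        # mean difference (central line of the plot)
    sd = statistics.stdev(diffs)         # sample standard deviation of the differences
    lower = bias - 1.96 * sd             # assumed 95% limits of agreement
    upper = bias + 1.96 * sd
    return means, diffs, bias, lower, upper

# Hypothetical start-of-season dates (day of year) at four stations.
satellite_sos = [100.0, 105.0, 110.0, 95.0]
in_situ_sos = [110.0, 112.0, 118.0, 108.0]
means, diffs, bias, lower, upper = bland_altman(satellite_sos, in_situ_sos)
```

A negative bias in this toy example means the satellite-derived dates fall earlier than the field observations, the direction of mismatch the abstract describes.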
Synthetic aperture radar (SAR) image change detection is a key technique for such essential applications as flood disaster assessment and forest fire detection. In SAR image change detection, clustering algorithms are the most widely applied methods, but they consider only grey-level features and are therefore more susceptible to speckle noise. Deep learning models are difficult to train by supervised learning because of the lack of labels. To alleviate these challenges, a parallel dual-branch SAR image change detection network based on clustering and segmentation (Clustering-Segmentation Network) is proposed in this paper. In the clustering branch, the clustering-based change detection results are obtained by fuzzy c-means (FCM) clustering. In the segmentation branch, the Graph-Based Image Segmentation algorithm is used for pre-segmentation. These results are used as labels to train the neural network for segmentation-based change detection. After the two branches are fused, a double sparse dictionary (DSD) discrimination algorithm is proposed to extract the neighbourhood features for the final discrimination and obtain the final results. By fusing the dual-branch results, the influence of speckle noise on change detection can be suppressed while maintaining a high accuracy. We show that the Clustering-Segmentation Network exhibits better results than existing algorithms on several datasets. The accuracy and kappa coefficients are improved by 0.83% and 3.45% respectively, thereby proving the effectiveness of our proposed method.
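The accuracy and kappa coefficient used to score change maps can be sketched from a binary confusion matrix. This assumes Cohen's kappa in its standard form, and the pixel counts are hypothetical.

```python
def accuracy_and_kappa(tp, fp, fn, tn):
    """Overall accuracy and Cohen's kappa for a change / no-change confusion matrix."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n                                   # overall accuracy
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical pixel counts: changed pixels are the positive class.
accuracy, kappa = accuracy_and_kappa(tp=40, fp=10, fn=5, tn=45)
```

Because changed pixels are usually a small minority, kappa tends to respond more strongly than overall accuracy to improvements in detecting them, which is consistent with the larger kappa gain the abstract reports.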
Accurate monitoring of soil moisture and the development of timely interventions are important to reduce the social and economic losses caused by drought. Compared to short-wave infrared (SWIR) and thermal infrared (TIR) bands, near-infrared (NIR) and visible bands are available on almost all optical satellites. Drought monitoring using NIR and visible bands is therefore more relevant for optical satellites. Among the visible bands, the red band is often used in combination with the NIR band for drought monitoring due to its sensitivity to vegetation. However, current drought indexes based on the NIR and red bands suffer from insufficient accuracy or tedious calculations when applied to areas of high vegetation. In this study, the ratio drought index (RDI) was developed after constructing a new feature space by examining the spectral properties of soil and vegetation at different water levels in the NIR and red bands. The accuracy of soil moisture inversion under two cover types, bare soil and vegetation, was evaluated using in situ data from Tai’an City, Shandong Province. The perpendicular drought index (PDI) and modified perpendicular drought index (MPDI) were also used for comparison with the RDI. The results showed that the RDI correlation coefficients (R²) of 0.653 and 0.641 were better than those of the MPDI (0.616 and 0.594) and the PDI (0.602 and 0.546) for soil moisture measurements under vegetation and bare soil cover. The RDI attenuates the effect of vegetation on soil moisture inversion, as its root mean square error (RMSE) in vegetated areas is lower than that of the PDI and MPDI. The RDI calculation can be used as a theoretical guide for large-scale soil moisture estimation because it is fast, accurate and does not require additional quantitative remote sensing inversion factors.
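The abstract does not give the RDI formula, so it is not reproduced here. As background, the PDI used as a baseline is commonly written as PDI = (R_red + M·R_nir) / sqrt(M² + 1), where M is the slope of the soil line in NIR-red space. The sketch below assumes that common form, and the reflectances and slope are made up.

```python
import math

def perpendicular_drought_index(red, nir, soil_line_slope):
    """PDI in its commonly cited form; larger values indicate drier surfaces."""
    m = soil_line_slope  # slope M of the soil line in the NIR-red feature space
    return (red + m * nir) / math.sqrt(m * m + 1.0)

# Hypothetical pixel reflectances and soil line slope.
pdi = perpendicular_drought_index(red=0.2, nir=0.3, soil_line_slope=1.0)
```

Geometrically, this is the distance from the pixel to the line through the origin that is perpendicular to the soil line.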
The Geostationary Ocean Colour Imager (GOCI) provides ocean colour products for monitoring ecosystem dynamics and assessing spatiotemporal changes. However, producing reliable satellite-based product estimates remains challenging in optically complex coastal waters, and bio-optical algorithms must be assessed. To date, assessments of satellite products have been based on clear, natural waters. In this study, we assume stable waters that express little to no diurnal variability due to biological or physical processes in coastal water. Then, a suitable bio-optical algorithm with the lowest uncertainty is selected. The Tassan (MS) algorithm proposed in 2016 provided appropriate representations of the temporal variation in chlorophyll. The uncertainties of chlorophyll a estimated at the Mooring site located at the Yangtze River mouth and at the Ieodo site located off the coast of the Tsushima Strait were less than 14.5%, and the maximum uncertainty at the Socheongcho site, which is close to the Korean Peninsula, was 28.7%.
Due to the instability of sensors and other factors, hyperspectral images (HSIs) are inevitably polluted by various types of mixed noise. To explore a better denoising method based on the existing research, combining the denoising advantages of tensor-ring (TR) decomposition and tensor robust principal component analysis (TRPCA), a mixed-noise removal method for HSIs is proposed in this paper. First, TRPCA maintains the tensor structure of the image itself, accurately recovers the low-rank part and the sparse part from their sum and separates the sparse noise in the form of sparse tensors. Then, TR decomposition is introduced to denoise the low-rank tensors. To verify the effectiveness and superiority of this method, experiments are carried out on two simulated data sets and two real data sets. Compared with the traditional denoising methods and several existing improved denoising methods from both visual and quantitative aspects, the proposed TRPCA-TR method provides better denoising results.
Ocean waves are the richest small-scale texture on the sea surface, from which valuable information can be retrieved. In general, synthetic aperture radar (SAR) images of surface waves will inevitably affect some follow-up oceanographic applications due to the intricate motion of the waves. However, existing suppression methods based on image geometric properties or on deep learning have limitations. Owing to the lack of clean images and the uncertainty of the statistical characteristics of seawater, these methods suppress ocean surface waves while harming other target information. To address this issue, we propose an unsupervised wave suppression algorithm based on sub-aperture SAR images named SAR-Noise2Noise (SAR-N2N). First, SAR-N2N exploits the geometric and time-varying relationship of the sub-aperture images to suppress the wave target selectively through sub-aperture image cancellation. Second, through unsupervised learning, only noisy images are used to suppress ocean surface waves, avoiding heavy dependence on assumptions about the wave distribution. Finally, the proposed regularizer is used as an additional loss to form an end-to-end training process. We explain our approach from a theoretical perspective and further validate it through extensive experiments, including synthetic experiments with various ocean wave distributions in simulated SAR images and real-world images. Training on collocations over 9000 SAR images, we demonstrate on test data from simulated images that SAR-N2N can improve peak signal-to-noise ratio (PSNR) up to 30 dB compared with the classical approach. Furthermore, it improves the equivalent number of looks (ENL) value of real-world SAR images by a factor of 8. All results and the methods are novel in terms of the accuracy achieved, combining the classical approach with deep learning techniques.
We conclude that SAR-N2N has the potential to make useful contributions to several oceanographic applications as well as to the near real time (NRT) processing of multiparametric sea states.
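The two quality measures cited for SAR-N2N can be sketched under their usual definitions: PSNR = 10·log10(peak² / MSE) against a clean reference, and ENL = mean² / variance over a homogeneous intensity region. The abstract does not state the exact conventions the authors used (for example the peak value, or intensity versus amplitude data), so those choices are assumptions, as are the sample values.

```python
import math

def psnr_db(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak * peak / mse)

def equivalent_number_of_looks(region):
    """ENL of a homogeneous region: squared mean over (population) variance."""
    n = len(region)
    mean = sum(region) / n
    variance = sum((v - mean) ** 2 for v in region) / n
    return mean * mean / variance

# Hypothetical flattened image patches, scaled to [0, 1].
clean = [0.0, 0.5, 1.0, 0.5]
noisy = [0.1, 0.4, 0.9, 0.6]
psnr = psnr_db(clean, noisy)

# Hypothetical intensities from a visually uniform sea patch.
enl = equivalent_number_of_looks([4.0, 6.0, 5.0, 5.0])
```

PSNR needs a clean reference and so applies to the simulated images, while ENL needs only a homogeneous region, which is presumably why it is the measure quoted for the real-world images.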
Globally, the rapid loss of mangrove forests often creates long-lasting environmental damage along coastlines, mudflats, and river banks. All-weather, physical monitoring of such areas is almost impossible because of the inaccessibility of swampy areas and a hostile substrate. Consequently, conventional field surveys are rarely available to monitor human encroachment in coastal areas like the Mumbai mangrove region of India. In this context, polarimetric synthetic aperture radar (PolSAR) remote sensing becomes a potential candidate for mangrove conservation and management. Since the Mumbai mangrove region extends over a large area, large-scale classification can reveal the continuous encroachment caused by human settlements or activities across this metropolitan coastal city. For this, traditional algorithms need substantial adaptation to be applied to large-scale data analysis, aiming at both high decision quality and time efficiency. Here, we introduce a shallow learning model, a MapReduce-based Multi-Layer Perceptron (MLP) algorithm, to classify the hybrid compact polarimetric (CP) and fully polarimetric (FP) feature space. Even though an automated shallow learning model is easily scalable, it requires a distinctive shallow learnable feature set for better land cover classification. To this end, this paper investigates the efficacy of the derived feature space compared to direct polarimetric measurements of sensors and shows that the shallow learnable feature set is more effective with both CP and FP observations. The relevance of the proposed distributed MLP model is also justified by comparison with the distributed extreme learning machine (DELM) algorithm, and a practically implementable scaled-MLP shallow learning algorithm is provided.
Ultimately, this paper comes up with a better polarimetric signature of land types for both the CP and FP datasets, which can be used in an alternative manner as per data availability for a multi-sensor data analysis.
Detecting the field maturity moment for maize (Zea mays L.) crop represents a relevant point to estimate its optimal harvest time. Knowing the optimal harvest time (defined by grain moisture content) at the end of the crop season is a major concern for maize farmers, as it could lead to substantial economic losses if not harvested on time. For this crop, optimal harvest time usually occurs 3–4 weeks after field maturity, depending on weather conditions. Therefore, this study focused on the interferometric coherence time-series analysis at the end of the maize crop season, to indirectly estimate the field maturity. For such purpose, a coherence object-based change detection method using Sentinel-1 SAR images was developed aiming to estimate the potential field maturity time. These estimations were assessed using an independent data set of field maturity dates obtained through field inspection and crop growth modelling. The technique was tested over 52 fields in the northwest region of Kansas, United States, with a detection rate of 80%, and a field maturity estimation error of 10 days (assessed with the root mean square error). The proposed method constitutes a promising approach to estimating the maize field maturity in near-real time, determining the field harvest readiness, and developing a decision support tool to assist farmers in prioritizing the allocation of fields at harvest time.
In coal mine roadways, the reconstruction model obtained by 3D laser scanning is interfered with by internal point clouds from dust, spray, auxiliary transportation vehicles, personnel flow, and so on, which seriously affects deformation monitoring based on it. However, most current denoising methods barely consider the working conditions of the roadway. Thus, this paper develops an automatic method for removing interference points in coal mine roadways with high efficiency and accuracy. It is realized through the establishment and transformation of a local coordinate system, construction of a voxel grid, bidirectional projection and central axis extraction, and noise removal based on a stepping bounding box. The method performs well under different levels of random noise and obstacle point clouds. In experiments carried out in an underground roadway, the proposed approach takes 10 s to remove the interference points with an accuracy of 96%, which verifies the feasibility and effectiveness of the method.
In Portugal, almonds are a very important crop, due to their nutritional properties. In the northeastern part of the country, the almond sector has endured over time, with strong cultural traditions and key economic significance. In these areas, several cultivars are used. In effect, the presence of various almond cultivars implies differentiated management in irrigation, disease control, pruning system, and harvest planning. Therefore, cultivar classification is essential over large agricultural areas. Over the last decades, remote-sensing data have led to important breakthroughs in the classification of different cultivars for several crops. Nonetheless, for almonds, studies are incipient. Thus, this study aims to fill this knowledge gap and explore the classification of almond cultivars in an almond orchard. High-resolution multispectral data were acquired by an unmanned aerial vehicle (UAV). Vegetation indices (VIs) and tree structural parameters were subsequently estimated. To obtain an accurate cultivar identification, four machine learning classifiers, namely K-nearest neighbour (kNN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost), were applied and optimized through fine-tuning. The accuracy of the machine learning classifiers was analysed. SVM and RF performed best with OAs of 76% and 74% using VIs and spectral bands (GREEN, GRVI, GN, REN, ClRE). Adding the canopy height model (CHM) improved performance, with RF and XGBoost having OAs of 88% and 84%. kNN performed worst with an OA of 73% using only VIs and spectral bands, 80% with VIs, spectral bands and CHM, and 93% with VIs, CHM, and tree crown area (TCA). The best performance was achieved by RF and XGBoost with OAs of 99% using VIs, CHM, and TCA. These results demonstrate the importance of the feature selection process.
Moreover, this study reveals the feasibility of remote-sensing data and machine learning classifiers in the classification of almond cultivars.
Analysing the suitability of water colour products is important for the reliability of subsequent applications and for later improvements of the algorithms. We used three types of water bodies: Type 1 (Southern Ocean and Qinghai Lake), clean oceanic water bodies dominated by chlorophyll and clean inland water bodies dominated by chlorophyll and coloured dissolved organic matter; Type 2 (Taihu Lake and Chagan Lake), turbid inland water bodies dominated by suspended matter and chlorophyll; and Type 3 (Xiaoxingkai Lake and Daxingkai Lake), turbid inland water bodies dominated by suspended matter. The accuracies of two Chl a concentration products from the synchronous Sentinel-3 OLCI Neural Network (NN) and ocean colour 4 band ratio (OC4Me) algorithms were verified using two metrics combined with in situ data, to compare the applicability of the Chl a concentration algorithms to the three types of water bodies. The optimal Chl a concentration algorithm was selected to analyse the spatial and temporal variation of monthly Chl a concentrations. Based on the analysis: (1) the NN algorithm had better applicability to Type 1 water bodies; the NN algorithm had better applicability to Type 2 water bodies; and both algorithms had poor applicability to Type 3 water bodies. (2) Limitations: the Sentinel-3 OLCI NN algorithm yielded overall low derived values for waters with Chl a concentrations higher than 25 mg m⁻³, and the OC4Me Chl a concentration product applies only to Chl a concentrations in Type 2 water bodies during the summer. (3) When the Sentinel-3 OLCI Chl a concentration product was compared with the moderate-resolution imaging spectroradiometer (MODIS) and visible infrared imaging radiometer (VIIRS) Chl a concentration products, the distribution trends of the Chl a concentrations were consistent. Furthermore, the Sentinel-3 OLCI Chl a concentration product had the advantage in spatial resolution.
This study describes the on-orbit absolute radiometric calibration of the Ocean Colour Monitor2 (OCM2) onboard the Oceansat-2 satellite through a vicarious calibration experiment performed at the Great Rann of Kutch calibration site in Gujarat, India, in February 2022. To achieve accurate and consistent calibration for the OCM2 channels, a reflectance-based calibration method was used, which relies on synchronous in-situ measurements of surface reflectance and atmospheric parameters at the time of satellite overpass. In this exercise, the 6SV (Second Simulation of a Satellite Signal in the Solar Spectrum Vector) radiative transfer (RT) model was used to simulate the Top Of Atmosphere (TOA) spectral radiance for the OCM2 channels. The on-orbit radiometric performance/changes were derived by comparing the 6SV simulated TOA radiance with that of the OCM2 Level 1B (L1B) data product. The results indicate that the average gain for band 1 to band 8 was in the range from 0.88 to 1.23 and the relative error from 1.58% to 18.79% for the OCM2 sensor. However, the Root-Mean-Square-Error (RMSE) between the OCM2 measured and the 6SV simulated TOA radiance data ranges from 0.17 µW cm⁻² sr⁻¹ nm⁻¹ at 443 nm to 2.04 µW cm⁻² sr⁻¹ nm⁻¹ at 620 nm. Furthermore, we analysed in detail the various uncertainties in this approach emanating from surface reflectance, atmospheric conditions (ozone, water vapour and aerosol optical depth), the aerosol-type assumption in the RT model, BRDF, and the inherent accuracy of the 6SV RT model. The overall uncertainty estimated using the reflectance-based calibration method was within 6%.
Rapid and accurate assessments of soil salinity information surrounding saline lakes are crucial for agricultural development and ecological security in arid regions. The Support Vector Machine (SVM) algorithm is currently utilized to derive the relationship between environmental covariates and soil salinity to perform remote sensing inversion of regional soil salinity; however, there is still potential for improvement in the existing SVM algorithm. Therefore, this study aims to improve the remote sensing-based soil salinity content (SSC) extraction from Landsat 8, DEM and HJ-1A CCD satellite data using the Cuckoo Search Algorithms-Support Vector Machines (CS-SVM) model. In addition, correlation and principal component analyses were conducted to determine the principal components of the environmental covariates. The results show that the differential transformation effectively separates land and water, which helps to reduce the noise in the raw remote sensing image. The analysis of soil and vegetation factors shows that the cumulative variance contribution of the first three principal components was 99.69% for the raw remote sensing image, while that of the first two principal components was 88.01% and 85.28% for the first- and second-order differential transformation remote sensing images, respectively. Interestingly, L-S2 is the only factor correlated with SSC in the third-order differential transformation remote sensing image, with an R value of 0.325. Among the topographic factors, the slope direction and plane curvature had negative correlations with SSC, with R values of −0.521 and −0.325, respectively. Finally, the SSC inversion model was developed using the first-order differential transformation remote sensing images, and it has high accuracy and good stability (R² = 0.68 and RMSE = 3.80 g⁻¹). The cuckoo algorithm is helpful for determining the best support vector machine parameters and offers new perspectives in improving the reliability of remote sensing-based soil salinity inversion in arid regions.
As ship target detection technology has high application value in military and civil fields, it is significant to research ship detection in SAR images. Aiming at the complex and diverse backgrounds, significant differences in ship sizes, and real-time detection problems in the ship target detection task of SAR remote sensing images, a lightweight ship detection network based on the YOLOx-Tiny model is proposed. Firstly, a multi-scale ship feature extraction module is proposed, composed of a parallel multi-branch structure connected by a standard convolution layer, asymmetric convolution layer, and dilatation convolution layer with different expansion rates in turn. It makes better use of local features and global features and effectively improves the detection accuracy of multi-scale ship targets; Secondly, to ensure detection performance and eliminate background interference, we propose a whole SAR remote sensing image detection strategy based on an adaptive threshold, which effectively suppresses false alarms caused by background and improves detection speed. The experimental results on two different SAR ship datasets, SSDD and HRSID, show that, compared with several advanced methods, the effectiveness and superiority of the method in this paper are verified, and excellent results are shown in the detection of the whole SAR remote-sensing image. It can provide effective theoretical and technical support for ship detection on platforms with limited computing resources and has good application prospects.
Object tracking plays an important role in computer vision. In recent years, hyperspectral object tracking has gained increasing attention because the material information contained in the large number of spectral bands of hyperspectral images (HSIs) is critical in distinguishing the target from the background. However, owing to the high-dimensional characteristics of HSIs and complex real-world scenarios, hyperspectral object tracking remains a challenging task. In this paper, we propose a domain transfer and difference-aware band weighting (DT-DBW) tracker for hyperspectral object tracking. Firstly, a domain transfer module is designed to adjust the feature distribution of HSIs, so that the deep learning object tracker can be effectively applied to hyperspectral videos. To further improve the performance and accuracy of the tracker, a difference-aware band weighting module is implemented to exploit the spectral difference features between the target and the background to generate individual band weights for the hyperspectral videos. Through the band weighting operation, the spectral response value of HSIs is recalibrated to enhance the value of spectral information and suppress the background spectral information. Experimental results on hyperspectral datasets demonstrate that the Area-Under-Curve (AUC) and tracking speed of the DT-DBW tracker reach 0.647 and 48.6 FPS, outperforming existing hyperspectral object trackers.
The noise and significant spectral differences lead to severe spectral and spatial information distortions in the fusion result of SAR and optical images. We propose a fusion method based on phase congruency information and an improved, simplified pulse-coupled neural network (PC-SPCNN). The PC-SPCNN method builds the basic fusion framework based on the generalized intensity-hue-saturation transform (GIHS) and nonsubsampled contourlet transform (NSCT). When fusing low-frequency coefficients, a fusion method that couples phase congruency and gain injection is adopted to reduce the spectral distortion caused by nonlinear radiometric differences between images. Meanwhile, an improved, simplified pulse-coupled neural network model is used to fuse the high-frequency coefficients of SAR and optical images. Three groups of multi-source, multi-scale, and multi-scene remote sensing images are used to verify the feasibility of PC-SPCNN and compared with existing fusion algorithms. The results indicate that the PC-SPCNN is superior to existing algorithms in both visual effect and objective evaluation and has better fusion performance.
The temporal stability and spatial dependence of pixel-specific red-NIR soil line coefficients were studied using analyses of geostatistics and covariance. We used time series of MODIS 8-day composite reflectance data from 2000 to 2019 of bare soil surfaces near Kabul, Afghanistan. The goal was to explore the feasibility of reducing the soil background signal in red-NIR vegetation indices (VIs). The results were that 1) Red-NIR pixel-specific soil line coefficients can be obtained using multi-temporal remote sensing imagery; 2) pixel-specific Red-NIR soil line coefficients derived from multi-temporal remote sensing imagery exhibit clear spatial dependency related to the soil line coefficients of soils sampled in the field, and 3) Red-NIR pixel-specific soil line coefficients were stable for years, but showed instability for periods longer than 16 years. These results imply that A) some changes in soil surface characteristics can be deduced using long-term observations, B) pixel-specific Red-NIR soil lines at any location can be estimated by kriging interpolation. Consequently, soil line coefficients derived from multi-temporal satellite observations can be used to derive information about vegetation, such as the germination stages pixel-by-pixel or above-ground biomass in sparsely vegetated lands. The derived values should be valid over wide areas, minimizing soil background effects.
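A pixel-specific soil line is conventionally written as NIR = a·Red + b, with the coefficients estimated from bare-soil observations of that pixel over time. The sketch below fits such a line by ordinary least squares for one pixel's time series. The fitting method and the function name are assumptions, since the abstract does not state how the coefficients were derived.

```python
def fit_soil_line(red, nir):
    """Least-squares fit of nir = slope * red + intercept for one pixel's time series.

    red, nir: sequences of bare-soil reflectances for the same dates.
    Returns (slope, intercept).
    """
    n = len(red)
    if n < 2 or n != len(nir):
        raise ValueError("need at least two paired observations")
    mean_red = sum(red) / n
    mean_nir = sum(nir) / n
    # Sum of squared deviations in red, and red/NIR cross-deviations.
    sxx = sum((r - mean_red) ** 2 for r in red)
    sxy = sum((r - mean_red) * (v - mean_nir) for r, v in zip(red, nir))
    if sxx == 0:
        raise ValueError("red reflectance has no variance; slope is undefined")
    slope = sxy / sxx
    intercept = mean_nir - slope * mean_red
    return slope, intercept


# Illustrative values only, not data from the study.
slope, intercept = fit_soil_line([0.10, 0.20, 0.30], [0.17, 0.29, 0.41])
```

Repeating the fit over sliding multi-year windows would be one way to examine how stable the coefficients are over time.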
Crop type mapping visualizes the spatial distribution pattern and proportion of planting areas of different crop types, which is the basis for subsequent agricultural applications. Although optical remote sensing has been widely used to monitor crop dynamics, data are not always available due to cloud and other atmospheric effects on optical sensors. Satellite microwave systems such as Synthetic Aperture Radar (SAR) have all-time and all-weather advantages in monitoring ground and crop conditions, so combining optical imagery and SAR imagery for crop type classification is of great significance. Our study proposes seven feature combination schemes based on multi-temporal spectral and texture features of Sentinel-2 (S2) and radar backscattering features of Sentinel-1 (S1), evaluates the influence of different data sources and features on classification accuracy, obtains the optimal classification strategy, and analyses the contribution of different features to the classification result, with the aim of providing a new technical approach for the fine identification of crops from multi-source remote-sensing data. Results show that the crop classification accuracy of combined multi-time series spectral, texture, and radar features is higher than that of combining two types of features. The feature subset selected from multi-period spectral, texture, and radar features gives the best classification result, with the overall accuracy (OA) and kappa coefficient reaching 96.40% and 0.93, respectively. The study provides a methodological reference for future research on precise crop extraction from remote sensing data at larger scales.
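For readers unfamiliar with the two reported metrics, the following Python sketch computes overall accuracy and Cohen's kappa from a confusion matrix. The matrix values are illustrative and are not taken from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa.

    confusion: square list of lists, confusion[i][j] = number of samples
    of reference class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Illustrative two-class example.
oa, kappa = overall_accuracy_and_kappa([[45, 5], [10, 40]])
```

Kappa discounts the agreement expected by chance, which is why it is lower than OA when class proportions make chance agreement likely.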
High-resolution remote sensing images (HR-RSIs) have a strong dependency between geospatial objects and background. Considering the complex spatial structure and multiscale objects in HR-RSIs, how to fully mine spatial information directly determines the quality of semantic segmentation. In this paper, we focus on the Spatial-specific Transformer with involution for semantic segmentation of HR-RSIs. First, we integrate the spatial-specific involution branch with self-attention branch to form a Spatial-specific Transformer backbone to produce multilevel features with global and spatial information without additional parameters. Then, we introduce multiscale feature representation with large window attention into Swin Transformer to capture multiscale contextual information. Finally, we add a geospatial feature supplement branch in the semantic segmentation decoder to mitigate the loss of semantic information caused by down-sampling multiscale features of geospatial objects. Experimental results demonstrate that our method can achieve a competitive semantic segmentation performance of 87.61% and 80.08% mIoU on Potsdam and Vaihingen datasets, respectively.
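The mIoU figure reported above is the mean of per-class intersection-over-union scores. A minimal Python sketch of that computation from a pixel-level confusion matrix follows. Skipping classes that never occur is an assumption about how undefined scores are handled, and the example numbers are illustrative.

```python
def mean_iou(confusion):
    """Mean intersection-over-union across classes.

    confusion[i][j] = number of pixels of reference class i predicted as class j.
    Classes absent from both reference and prediction are skipped (assumption).
    """
    n_classes = len(confusion)
    ious = []
    for c in range(n_classes):
        true_pos = confusion[c][c]
        false_neg = sum(confusion[c]) - true_pos
        false_pos = sum(confusion[i][c] for i in range(n_classes)) - true_pos
        denom = true_pos + false_pos + false_neg
        if denom > 0:
            ious.append(true_pos / denom)
    if not ious:
        raise ValueError("no class has any pixels")
    return sum(ious) / len(ious)


# Illustrative two-class example.
score = mean_iou([[45, 5], [10, 40]])
```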
Band selection (BS) is a feature selection method that aims to reduce the computational complexity of processing hyperspectral images (HSIs). However, many BS methods are designed for image classification, target detection, and anomaly detection. Furthermore, the existing BS methods ignore the spatial structure of HSIs. To solve the above problems, we propose a dynamic programming-based BS method for hyperspectral unmixing. In this paper, we use the convex geometric structure of the HSI in band space to project it into a subspace to obtain depth spectral features, and then construct a dynamic programming model to select representative bands. To verify the effectiveness of the proposed method, experiments are conducted on three widely used datasets and compared with three popular BS methods. The experimental results show that the proposed method has satisfactory performance in different evaluation indexes (including signal to reconstruction error (SRE) and root mean square error (RMSE)) and three quantitative evaluations (average information entropy (AIE), average correlation coefficient (ACC) and average relative entropy (ARE)).
The global extent of the amount of burned area seems to have changed substantially in the last two decades. Discussions regarding the main force behind the current trends have dominated research in recent years, with several studies attributing the global decline in wildfires to socio-economic and land-use changes. This review discusses the uncertainties and limitations of remotely sensed data used to determine global trends in burned areas and changes in their potential drivers. In particular, we quantify changes in the amount of burned area and cropland area and illustrate the lack of consistency in the direction and magnitude of the trend in cropland land cover type specifically within sub-Saharan Africa, the region where data show a strong trend in the amount of burned area. We state the limitations of remote-sensed fire and land cover products. We end by demonstrating that based on the currently available data and research methods applied in the literature, it is not possible to unequivocally determine that cropland expansion is the primary driver of the decline in fire activity.
During the last decades, Synthetic Aperture Radar Automatic Target Recognition (SAR ATR) has been widely used in military services and civil studies. Because SAR images are sensitive to the imaging azimuth, using a multi-aspect SAR image sequence is more practical than using a single-aspect SAR image to achieve superior classification accuracy in an actual SAR ATR task. At present, multi-aspect SAR ATR models mostly utilize Recurrent Neural Networks (RNN), which depend on the sequence between samples and so suffer from a lack of information. In practice, a huge training set is needed to train a deep learning network accurately, but acquiring multi-aspect SAR images is costly. Therefore, in this article, a new multi-aspect SAR ATR model based on self-attention is proposed. The proposed self-attention model is utilized to calculate the internal correlation between the original SAR images. At the same time, to improve the anti-noise capability of the proposed model and decrease the dependency on a huge amount of training data, a Convolutional Auto Encoder (CAE) is designed and utilized in the feature extraction section of the proposed model. Moreover, unlike existing traditional methods, in this work we use both the amplitude and phase information of SAR images in the training process of the proposed model. It should be noted that all of the parameters of the proposed network are extended to the complex domain. In addition, a gradient-based complex backpropagation algorithm is used for training the network. Finally, experiments are conducted on two MSTAR datasets (MSTAR-SOC and EOC). The experimental results show that the proposed model not only achieves a high recognition rate when training samples are sufficient but also obtains an acceptable rate with small training sets in different complex situations. In addition, the simulation results demonstrate that by using the encoder of the CAE, the whole configuration of the proposed model achieves anti-noise capability, which is a valuable benefit in any practical SAR ATR task.
Deep learning has achieved promising results for hyperspectral image (HSI) classification in recent years due to its hierarchical structure and automatic feature extraction ability from raw data. The HSI has continuous spectral information, allowing for the precise identification of materials by capturing minute spectral differences. Convolutional neural networks (CNNs) have proven to be effective feature extractors for HSI classification. However, inherent network limitations prevent them from adequately mining and representing the sequence attributes of spectral signatures and learning critical and valuable features from both spectral and spatial dimensions simultaneously. This paper proposes a deep learning-based framework called a novel dual attention-based multiscale-multilevel ConvLSTM3D (DAMCL) to address these challenges. In this work, our contribution is threefold; firstly, a dual attention mechanism is proposed, effectively learning critical and valuable features from spectral and spatial dimensions. Secondly, multiscale ConvLSTM3D blocks can learn the discriminative features alongside handling long-range dependencies of spectral data. Thirdly, these features are combined by a multilevel feature fusion approach to maximize the impact of features learned at different levels. To assess the performance of the proposed method, extensive experiments are carried out on five different benchmark datasets containing complex and challenging land cover classes. The results confirm that the proposed method outperforms state-of-the-art techniques with a small number of training samples in terms of overall accuracy (OA), average accuracy (AA), and Kappa (k). The overall accuracy of 98.88%, 99.42%, 99.20%, 95.37%, and 92.57% is achieved over the Indian Pines, Salinas Valley, University of Pavia, Houston 2013, and Houston 2018 datasets, respectively.
In existing rooftop extraction methods, either too few or too many features in high spatial resolution remote-sensing image (HSRRSI) are used, reducing the rooftop extraction accuracy. Accordingly, a rooftop extraction method for HSRRSI based on sparse representation (SR) is proposed in this work. The optimal segmentation parameters are first determined by the ratio of mean difference to neighbours to standard deviation index method and maximum area method. Thereafter, the optimal feature subset of HSRRSI is constructed on the basis of the L1 regularization SR model to remove redundant features. Finally, a random forest classifier is used to extract rooftops based on the optimal feature subset. Results show that the overall accuracy of the two study areas in Zhanggong District are 0.91776 and 0.88313, respectively. This study can help in effectively extracting rooftops from HSRRSI, which is of great significance in urban planning, population statistics and economic forecasting.
Recently, deep learning has been successfully applied to hyperspectral image classification, and some convolutional neural network (CNN)-based models have already achieved attractive classification results. Since hyperspectral data form a spectral-spatial cube that can generally be considered as sequential data along the spectral dimension, CNN models perform poorly on such sequential data. Unlike CNNs, which are mainly concerned with modelling local relationships in images, the transformer has been shown to be a powerful structure for characterizing sequential data. In the SA (self-attention) module of ViT, each token is updated by aggregating all tokens' features based on the self-attention graph. Through this, tokens can exchange information sufficiently with each other, which provides a powerful representation capability. However, as the layers become deeper, the transformer model suffers from network degradation. Therefore, in order to improve the layer-to-layer information exchange and alleviate the network degradation problem, we propose a Weighted Residual Self-attention Graph-based Transformer (RSAGformer) model for hyperspectral image classification with respect to the self-attention mechanism. It effectively solves the network degradation problem of the deep transformer model by fusing the self-attention information between adjacent layers and extracts the information of the data effectively. Extensive experimental evaluation on six public hyperspectral datasets shows that the RSAGformer yields competitive classification results.
Beacon towers are an important infrastructure responsible for transmitting military information in ancient times. However, many of the beacons have disappeared due to natural erosion and man-made vandalism. Historical U2 aerial images provide heritage and geographical information over the last century to research the beacon tower system. However, the fragmented distribution of beacons and the greyscale colouring of the aerial images make it difficult to manually identify small-sized beacons in a wide range of aerials. This study introduced deep learning to automatically detect beacons in U2 images. Three improvements were added to the standard Fully Convolutional One-Stage Object Detection (FCOS) network: 1) The structure of the Feature Pyramid Network (FPN) was adjusted to enhance the small objects feature at lower layers; 2) The standard convolutional kernel in backbone network was replaced with DCNv2 to account for irregular towers; 3) NMS was replaced with Soft-NMS to improve the accuracy of the detection box prediction. Our results demonstrate that more than 60% average precision (AP) can be obtained using our improved FCOS. After testing, the results showed that the three-part methodology can automatically detect most beacons in historical U2 aerial images, reduce the manual miss rate, and improve efficiency. The results of the test were the first to successfully identify destroyed beacons, recreate the beacon route, and retrace the beacon siting strategy. Our method helps to speed up the efficiency of heritage excavation in historical aerial images, and it may provide a convenient means of processing in other architectural heritage restoration studies.
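The third improvement replaces hard non-maximum suppression with Soft-NMS, which decays the scores of overlapping boxes instead of discarding them outright. The sketch below shows the Gaussian variant of Soft-NMS in plain Python. The abstract does not say which variant or parameters the authors used, so the Gaussian decay, `sigma` and `score_threshold` are assumptions.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS. Returns a list of (box, score) in selection order."""
    remaining = list(zip(scores, boxes))
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best = max(range(len(remaining)), key=lambda i: remaining[i][0])
        best_score, best_box = remaining.pop(best)
        kept.append((best_box, best_score))
        # Decay the others by overlap with the selected box; drop negligible ones.
        decayed = []
        for score, box in remaining:
            new_score = score * math.exp(-(iou(best_box, box) ** 2) / sigma)
            if new_score >= score_threshold:
                decayed.append((new_score, box))
        remaining = decayed
    return kept


# Illustrative boxes: two overlapping detections and one separate detection.
detections = soft_nms(
    [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)],
    [0.9, 0.8, 0.7],
)
```

In this example the overlapping second box is kept with a reduced score rather than removed, which is the behaviour that distinguishes Soft-NMS from standard NMS.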
In recent years, Graph Neural Networks (GNN) have begun to receive extensive attention from researchers. Subsequently, ViG was proposed and its performance in learning irregular feature information in non-Euclidean data space was astonishing. Inspired by the success of ViG, we propose a GNN-based multi-scale fusion network model (GCNCD) to extract graph-level features for remote sensing building change detection (CD). GCNCD builds bitemporal images into a graph structure. It then learns richer features by aggregating the features (edge information) of neighbour vertices in the graph. To alleviate the over-smoothing problem caused by multi-layer graph convolution, the FNN module is used to improve the network’s ability to transform features and reduce the loss of spatial structure information. Compared with the traditional single-type feature fusion module, in the decoder, we perform feature fusion on adjacent-scale features and all scale features, respectively. It helps to promote information mobility and reduce spatial information loss. Our extensive experiments demonstrate the positive effects of graph convolution and fusion module in the field of remote sensing building change detection.
Change detection (CD) in high-resolution remote sensing images (RSIs) can be regarded as a binary visual recognition problem. Metric learning (ML) is a reliable method to determine pixel class attributes based on a discriminative distance function between learnable image features. However, most related works learn discriminative distance functions on single-scale feature pairs, which suffer from slow convergence and poor local optima, partly because the loss function employs only large-scale feature samples and does not interact with features at other scales. Furthermore, more effective features are a prerequisite for improving the performance of ML-based CD. Hence, we propose a novel hierarchical metric learning network (HMLNet) with dual attention for CD in RSIs, where the key point is that hierarchical metric learning is performed in an ensemble manner to improve detection accuracy and accelerate model convergence. Specifically, based on the features extracted from the encoder-decoder backbone, we construct a feature pyramid to handle the complex details of objects at various scales in RSIs, and then perform metric learning between the paired pyramid features at the same scale. In addition, a dual-attention module is proposed to enhance the internal consistency of changed objects and effectively obtain more detailed information by acting on multi-scale pyramid features. Extensive experiments on two public RSI CD datasets demonstrate that the proposed HMLNet can accurately locate changed objects and consistently outperforms state-of-the-art CD competitors.
Parcels are the basic unit of crop planting and management. Therefore, parcelwise farmland data are the fundamental basis for precision agriculture applications, and the extraction of parcels from high-resolution remote sensing images is of great importance. Deep learning-based edge detection methods have achieved superior performance, but these methods output edge intensity maps with pixel values from 0 to 255 in raster format. Vectorization, which transforms the rasterized data into vectors, is an important post-processing procedure. In this process, segmentation and thinning are two key steps for deriving the one-pixel-wide binary edge; however, the traditional method suffers from deviation from the actual edge and from unclosed edges. To address these problems, based on the hypothesis that the larger the edge intensity value is, the greater the likelihood that the pixel is on or near the boundary, we developed a multilevel segmentation method for agricultural parcel extraction from a semantic boundary. The method prioritizes pixels with high intensity to ensure that the extracted boundaries adhere closely to the actual boundaries, and uses pixels with low intensity to connect the unclosed boundaries, thus simultaneously improving the fidelity and completeness of boundaries. We selected images acquired in Hangzhou Bay and Denmark to test our method, and the results demonstrate that our method can accurately extract agricultural parcels. Compared with the single threshold segmentation method, our method shows higher boundary fidelity and completeness. Compared with the state-of-the-art method, our method achieves competitive performance in traditional metrics and performs better in edge preservation and in one-to-one correspondence between extracted and actual parcels.
It is proposed to obtain information on atmospheric refractivity structure by measuring the angle of arrival (AoA) of radio signals routinely broadcast by commercial aircraft. The angle of arrival would be measured at hill-top sites using a simple two-element interferometer. Knowledge of the aircraft’s location (information conveniently contained within the broadcasts) and the AoA will enable the bending angle of the signals to be calculated. As measurable bending will only occur at grazing incidence, sources of signals either very close to the radio horizon, or at a similar height to the interferometer, are essential. The routine navigational data broadcasts from civil aircraft represent the ideal source. In areas of high air traffic density such as the UK, ~10⁵-10⁶ bending angle measurements may be possible each day. Numerical weather prediction models routinely assimilate bending angles retrieved from GNSS radio occultation data, so it is anticipated that assimilation methods could be developed that are able to make good use of this new source of bending angle data. Sensitivity tests were performed to estimate the resolution of humidity retrievals assuming a target AoA accuracy of 0.01°. Simulated annealing was used to demonstrate the ability to retrieve relative humidity and mixing ratio vertical profiles using AoA measurements. It is shown that for observed AoA measurements with an accuracy of 0.01° it should be possible to retrieve relative humidity and mixing ratio vertical profiles with an accuracy of ~5% and ~0.5 g/kg respectively. An AoA accuracy of 0.01° should be achievable using hardware costing ~€10k; however, further hardware development is still required.
The Gravity Recovery and Climate Experiment (GRACE) satellite mission has been instrumental in estimating large-scale groundwater storage changes across the globe. GRACE observations include significant errors, so pre-processing is normally required before the data are used. In particular, the terrestrial water storage anomalies (TWSA) are usually filtered to reduce the effects of measurement errors and then rescaled to reduce the unintended impacts of the filtering. The scaling is typically selected to maximize the Nash-Sutcliffe Efficiency (NSE) between the rescaled filtered TWSA and the original TWSA from large-scale hydrologic models that represent an incomplete water budget. The objectives of this study are (1) to evaluate the use of NSE in the current GRACE rescaling methodology, (2) to develop an improved methodology that incorporates a complete regional water budget, and (3) to examine the impacts of the rescaling methodology on regional assessments of groundwater depletion. To evaluate the use of NSE as a performance metric, we compare it to an analytical solution that restores the relative variability between the rescaled filtered and original GRACE TWSA series. The relative variability approach produces more reliable estimates when compared with TWSA estimates from global positioning systems (GPS) for the Sacramento and San Joaquin River basins in California. Rescaling with the complete regional water budget results in a larger scale factor than the scale factor from the large-scale hydrologic model outputs, and the new TWSA results are more consistent with those from GPS. The larger scale factor also suggests that regional groundwater depletion is more severe than previously estimated.
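The two ways of choosing a scale factor can be contrasted in a short Python sketch. For a single multiplicative factor k, maximizing NSE between k·filtered and the original series is equivalent to a least-squares fit through the origin, giving k = Σ(f·o)/Σ(f²). The variability-based factor is written here as the ratio of standard deviations, which is an assumption about the form of the analytical solution mentioned in the abstract. The function names and the sample values are hypothetical.

```python
import statistics


def nse_scale_factor(filtered, original):
    """Scale factor k that maximizes NSE between k * filtered and original.

    With a single multiplicative factor, this is the least-squares solution.
    """
    if len(filtered) != len(original) or not filtered:
        raise ValueError("inputs must be non-empty and of equal length")
    denom = sum(f * f for f in filtered)
    if denom == 0:
        raise ValueError("filtered series is all zeros")
    return sum(f * o for f, o in zip(filtered, original)) / denom


def variability_scale_factor(filtered, original):
    """Assumed form: k that makes the rescaled series match the original's spread."""
    spread = statistics.pstdev(filtered)
    if spread == 0:
        raise ValueError("filtered series has no variability")
    return statistics.pstdev(original) / spread


# Illustrative zero-mean anomaly series, not GRACE data.
original = [1.0, -1.0, 1.0, -1.0]
filtered = [0.5, -0.5, 0.25, -0.25]
k_nse = nse_scale_factor(filtered, original)
k_var = variability_scale_factor(filtered, original)
```

For zero-mean series the least-squares factor equals the variability ratio multiplied by the correlation between the two series, so it can never exceed the variability-based factor. In that setting an NSE-based factor restores less amplitude whenever the filtered and original series are imperfectly correlated.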
Deep learning has achieved impressive success in computer vision, especially in remote sensing. It is well known that different deep models are able to extract different kinds of features from remote sensing images. For example, convolutional neural networks (CNN) can extract neighbourhood spatial features in the short-range region, graph convolutional networks (GCN) can extract structural features in the middle- and long-range region, and the encoder-decoder (ED) can obtain reconstruction features from an image. However, it is challenging to design a model that combines these different models to extract fused features in a hyperspectral image classification task. To this end, this paper proposes a three-branch attention deep model (TADM) for the classification of hyperspectral images. The model can be divided into three branches: graph convolutional neural network, convolutional neural network, and deep encoder-decoder. These three branches first extract structural features, spatial-spectral joint features and reconstructed encoded features from hyperspectral images, respectively. Then, a cross-fusion strategy and an attention mechanism are employed to automatically learn the fusion parameters and complete the feature fusion. Finally, the hybrid features are fed into a standard classifier for pixel-level classification. Extensive experiments on two real-world hyperspectral datasets (Houston and Trento) demonstrate the effectiveness and superiority of the proposed method. Compared with other baseline classification methods, such as FuNet-C and Two-Branch CNN(H), the proposed method achieves the highest classification accuracy. Specifically, overall classification accuracies of 93.25% and 95.84% were obtained on the Houston and Trento data, respectively.
Due to background clutter in synthetic aperture radar (SAR) images, the detection of dense ship targets suffers from a low detection rate, high false alarm rate, and high missed detection rate. To address this issue, an FSM-DFF-YOLOv5+Confluence algorithm is proposed in this paper for the detection of near-shore ship targets in SAR images with complex backgrounds. First, based on the YOLOv5 target detection algorithm, two improvements are made: feature refinement and multi-feature fusion. In the feature extraction network, deformable convolutional neural networks are adopted to change the position of the sampling points of the convolution, improving the feature extraction capability and the detection rate of ship targets in SAR images with complex backgrounds. In the multi-feature fusion network, cascading and parallel pyramids are used to realize feature fusion at different levels, and the receptive field of feature extraction is expanded by using dilated convolution to enhance the adaptability of the network to near-shore multi-scale ship targets with complex backgrounds and reduce the false alarm rate of ship target detection in SAR images with complex environments. In this way, the DFF-YOLOv5 near-shore ship target detection algorithm is established. Meanwhile, to address the problem of missed detection in near-shore dense ship target detection, this paper adds rectangular convolution kernels to the convolution of the feature extraction network to better extract the features of dense ship targets in SAR images with complex backgrounds. In addition, the Confluence algorithm is used instead of non-maximum suppression in the prediction stage. Experiments on the constructed complex-background near-shore ship detection dataset indicate that the average accuracy of the FSM-DFF-YOLOv5+Confluence detection algorithm reaches 88.96%, and the recall rate reaches 88.80%.
Owing to differences in sensor types, resolutions, and imaging conditions, matching heterogeneous remote sensing images often yields unsatisfactory results, such as low accuracy, few matched pairs, and poorly distributed matches, which makes precise registration between heterogeneous images difficult. To mitigate this, we propose a reliable matching algorithm for heterogeneous remote sensing images that considers the spatial distribution of the matched features. First, a feature-based matching algorithm such as the scale-invariant feature transform (SIFT) or speeded-up robust features (SURF) is used to obtain an initial set of matched pairs and a set of candidate features. Then, based on the stability of the spatial distribution of locally correct matches, the distance and angular proximity between matched features and their neighbours are calculated to assess the accuracy of the matched pairs and remove incorrect ones. Finally, the random sample consensus (RANSAC) algorithm is used to fit the transformation model between the images, and the final matched feature selection algorithm and automatic transformation error algorithm are used to check candidate features and increase the number of matched pairs. Experiments on heterogeneous multiscale and multitemporal optical remote sensing images demonstrate that the proposed algorithm outperforms commonly used algorithms, including SIFT, RANSAC, locality preserving matching, learning a two-class classifier for mismatch removal, and linear adaptive filtering. In particular, the proposed algorithm achieves excellent results even when the precision of the initial matched pairs is low.
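The RANSAC step named in the abstract above can be illustrated with a minimal sketch. It assumes a 2D affine transformation model and a fixed residual threshold in pixels, neither of which is stated in the abstract, and the function names (`fit_affine`, `apply_affine`, `ransac_affine`) are hypothetical. The sketch does not reproduce the authors' neighbourhood-based filtering or their candidate re-detection steps.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine transform mapping src points onto dst points."""
    # Homogeneous source coordinates: each row is [x, y, 1].
    A = np.hstack([src, np.ones((len(src), 1))])
    # Solve A @ M = dst for the 3x2 parameter matrix M.
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M

def apply_affine(M, pts):
    """Apply a 3x2 affine parameter matrix to an (n, 2) array of points."""
    return np.hstack([pts, np.ones((len(pts), 1))]) @ M

def ransac_affine(src, dst, n_iters=500, threshold=2.0, seed=0):
    """Fit an affine model with RANSAC (assumed model; threshold in pixels)."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # Three matched pairs are the minimal sample for an affine model.
        sample = rng.choice(len(src), size=3, replace=False)
        M = fit_affine(src[sample], dst[sample])
        residuals = np.linalg.norm(apply_affine(M, src) - dst, axis=1)
        inliers = residuals < threshold
        # Keep the hypothesis supported by the most matched pairs.
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("RANSAC found fewer than 3 inliers")
    # Refit on all inliers of the best hypothesis.
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers
```

Given two `(n, 2)` arrays of matched coordinates, `ransac_affine(src, dst)` returns the fitted parameters and a boolean mask marking which pairs were kept as inliers.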
The speckle noise found in synthetic aperture radar (SAR) images severely affects the efficiency of image interpretation, retrieval and other applications. Thus, effective methods for despeckling SAR images are required. Traditional SAR despeckling methods fail to balance the strength of speckle filtering against the retention of texture details. Deep learning based SAR image despeckling methods have shown the potential to achieve this balance. Therefore, this study proposes a self-attention multi-scale convolution neural network (SAMSCNN) method for SAR image despeckling. The advantage of the SAMSCNN method is that it combines multi-scale feature extraction with a channel attention mechanism applied to the multi-scale fused features. In the SAMSCNN method, multi-scale features are extracted from SAR images through convolution layers of different depths. These features are concatenated, and an attention mechanism then assigns different weights to features of different scales, yielding weighted multi-scale fused features. Finally, the despeckled SAR image is generated through global residual noise reduction and image structure fine-tuning. The despeckling experiments in this study involved a variety of scenes using simulated and real data. The performance of the proposed model was analysed using quantitative and qualitative evaluation methods and compared with the probabilistic patch-based (PPB), SAR block-matching 3-D (SAR-BM3D) and SAR-CNN methods. The experimental results show that the proposed method improves the objective indexes and offers clear advantages in visual quality over these classical methods. The proposed method can provide key technical support for the practical application of SAR images.
Change detection in remote sensing images is important in various application fields. In recent years, great progress has been made in change detection methods for multiple types of ground objects, but the extracted features still have limited discriminative power, which results in unclear boundaries and leaves room to improve accuracy. To address these issues, we first use a high-resolution network (HRNet) to generate high-resolution representations and add new data augmentation methods to improve its accuracy. Secondly, we fuse CSWin, a Transformer-based model, with HRNet to improve the model’s performance. To enhance the model’s ability to perceive ground objects at different scales, we design a feature fusion network suitable for multi-class semantic segmentation, named A-FPN, which is introduced between the CSWin backbone network and the semantic segmentation network. The experimental results show that the fusion method improves the accuracy to 89.31% on the SECOND dataset, significantly reduces false detections, and recognizes object edges more clearly. It also achieves good results on the three evaluation indicators of precision, recall, and F1-score.
Three-dimensional (3D) point cloud registration is a critical topic in 3D computer vision and remote sensing. Several algorithms based on deep learning techniques have recently tried to deal with indoor partial-to-partial point cloud registration by searching for correspondences between input point clouds. However, existing correspondence-based methods are vulnerable to noise and do not adequately exploit geometric information to extract features, resulting in incorrect correspondences. In this work, we develop a novel network using correspondence confidence and overlap scores to address these challenges. Specifically, we first introduce a feature interaction module that combines spatial structure information to encode a unique geometric embedding, greatly enhancing feature perception. Furthermore, we design a corresponding point matching module, which includes a two-stage point filtering strategy. This method effectively improves the ability to distinguish inliers from outliers and accurately remove spurious matches, thus allowing the network to focus more on the accurate correspondences of overlapping regions. Extensive experiments on different benchmark datasets indicate that our network shows superior performance in indoor point cloud registration, especially in low-overlap registration, with significant improvement over state-of-the-art (SOTA) methods. Our code can be found at
With the continuous improvement of radar resolution, maritime moving targets exhibit distributed characteristics, occupying multiple resolution cells in the Synthetic Aperture Radar (SAR) image domain. However, current constant false alarm rate (CFAR) detection algorithms rarely consider the impact of distributed characteristics on target detection performance, so the corresponding performance evaluation methods are not applicable to this problem. In this paper, a multi-station fusion detection method for maritime moving targets (MMT) is presented based on a three-dimensional (3D) sliding window. Firstly, the multi-station echoes in the Cartesian-Doppler frequency rate (DFR) domain are obtained under a single-transmitting and multiple-receiving configuration, and the 3D sliding window is designed to achieve optimal matching for a specified target with any moving direction based on its prior information, i.e. radar cross section (RCS), radar resolution and target size. Then, the target cells within the sliding window are processed directly by means of the M/N criterion, which avoids the loss of detection performance caused by artificially constructed extended detection statistics. Finally, a novel quantitative evaluation method for high-resolution radar is designed by mining the relationship between target detection performance and the number of occupied cells, which could greatly improve the detection probability of distributed targets under a constant false alarm rate. The proposed algorithm does not require complex extended detection statistics, and thus provides a feasible way to achieve robust detection and performance evaluation for high-resolution radar. Simulation results verify the effectiveness of the method.
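The M/N criterion mentioned in the abstract above can be sketched as a binary integration rule: a target is declared when at least M of the N cells in the window exceed a threshold. The sketch below assumes independent cells with a common per-cell exceedance probability, which the abstract does not state, and the function names are hypothetical. It is not the authors' 3D sliding-window detector.

```python
from math import comb

def m_of_n_detect(cell_values, threshold, m):
    """Declare a detection when at least m cells exceed the threshold."""
    hits = sum(1 for value in cell_values if value > threshold)
    return hits >= m

def m_of_n_probability(p_cell, m, n):
    """Probability that at least m of n independent cells exceed the threshold.

    p_cell is the per-cell exceedance probability: use the per-cell false
    alarm probability to get the window-level false alarm probability, or the
    per-cell detection probability to get the window-level detection probability.
    """
    return sum(
        comb(n, k) * p_cell**k * (1.0 - p_cell) ** (n - k)
        for k in range(m, n + 1)
    )
```

Under these assumptions, requiring more hits (larger M) lowers the window-level false alarm probability for a fixed per-cell threshold, which is the trade-off an M/N rule exposes when a target occupies several cells.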
Image semantic segmentation methods based on convolutional neural networks rely on supervised learning with labels, and their performance often drops significantly when applied to unlabelled datasets from different sources. Domain adaptation methods can reduce the inconsistency in feature distribution between the unlabelled target domain data used for testing and the labelled source domain data used for training, thus improving segmentation performance and widening practical applicability. However, in remote sensing image processing, if the source and target domains have different spatial resolutions and this difference is not addressed, the performance of the transferred model suffers. In this paper, we propose a bidirectional semantic segmentation method based on super-resolution and domain adaption (BSSM-SRDA), which is suited to transferring a semantic segmentation model from low-resolution source domain data to high-resolution target domain data. BSSM-SRDA mainly consists of three parts: a shared feature extraction network; a super-resolution image translation module, which incorporates a super-resolution approach to reduce the spatial resolution and visual style differences between the two domains; and a domain-adaptive semantic segmentation module, which uses an adversarial domain adaptation approach to reduce differences at the output level. In addition, we design a new bidirectional self-supervised learning algorithm for BSSM-SRDA that facilitates mutually beneficial learning between the super-resolution image translation module and the domain-adaptive semantic segmentation module. Experiments on two remote sensing image datasets demonstrate the superiority of the proposed method over other state-of-the-art methods, with mIoU improvements of 2.5% and 3.2%, respectively. Code:
Automated 3D reconstruction based on satellite images has become a research hotspot at the intersection of photogrammetry and computer vision. Given the inherent advantage of satellite images in global coverage, 3D results derived from them will play a key role in understanding global 3D information and in monitoring national geography and urban construction. Researchers have devoted substantial effort to developing state-of-the-art 3D reconstruction methods for two-view and multi-view satellite images. However, obtaining complete and accurate 3D results from satellite images remains challenging because of differences in viewing angle and exposure between images and building occlusions in urban scenes. In this paper, we carry out theoretical analyses and experimental evaluations of popular 3D reconstruction methods for satellite images, proceeding from two views to multiple views: (1) Advanced dense matching methods for satellite images are reviewed theoretically and evaluated experimentally. (2) State-of-the-art 3D reconstruction methods based on two-view satellite images are analysed in detail and experimentally evaluated with two-view WorldView-3 satellite images. (3) Popular fusion methods for multi-view digital surface models (DSMs) are analysed theoretically and assessed on multi-view WorldView-3 satellite images. This review will be helpful for researchers dedicated to enhancing the accuracy and completeness of 3D reconstruction from urban satellite images.
This study presents a global, hourly surface soil moisture estimation procedure based on precipitation and temperature data. Information on soil composition further helps to define the local characteristics of soil moisture development. An advanced antecedent precipitation index (API) is used to generate a global soil moisture product of high temporal resolution, with the Global Precipitation Measurement (GPM) mission's Integrated Multi-Satellite Retrievals for GPM (IMERG) as the main driver. The resulting global GPM API data set is compared against in situ measurements from the International Soil Moisture Network (ISMN) and is also evaluated against the soil moisture data set from the European Space Agency’s Climate Change Initiative (ESA CCI SM). The study shows that, with empirically derived dampening factors, the GPM API achieves a mean ubRMSD of 4.68 Vol% and a bias of 0.88 Vol% across the in situ stations used, which span different climates and vegetation zones. The data set clearly represents the local soil moisture schemes with seasonal variations. Compared with ESA CCI SM, the GPM API performs better at the measurement sites in terms of bias, correlation and error values. Across most regions, the data set is negatively biased relative to ESA CCI SM, but it better matches the mean soil moisture at ISMN stations. Overall, the GPM API delivers a very promising global, hourly surface soil moisture product at 0.1° × 0.1° spatial resolution.
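For readers unfamiliar with the two quantities named in the abstract above, the following sketch shows the classic antecedent precipitation index recursion and the unbiased root-mean-square difference (ubRMSD). It assumes a constant decay factor `k`, whereas the study uses empirically derived dampening factors whose form is not given here, so this is a simplified illustration rather than the authors' advanced API. The function names and the default `k=0.95` are hypothetical.

```python
import numpy as np

def antecedent_precipitation_index(precip, k=0.95, api0=0.0):
    """Classic API recursion: API_t = k * API_(t-1) + P_t.

    precip: precipitation per time step (e.g. hourly totals)
    k:      constant decay factor in (0, 1), assumed for illustration
    api0:   index value before the first time step
    """
    api = np.empty(len(precip), dtype=float)
    previous = api0
    for t, p in enumerate(precip):
        previous = k * previous + p
        api[t] = previous
    return api

def ubrmsd(estimate, reference):
    """RMSD computed after removing each series' own mean (bias-free)."""
    e = np.asarray(estimate, dtype=float)
    r = np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(((e - e.mean()) - (r - r.mean())) ** 2)))
```

Because the means are removed first, `ubrmsd` is insensitive to a constant offset between the two series, which is why the abstract reports bias as a separate figure.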
The long-term trend of dew formation over the Beijing-Tianjin-Hebei (BTH) region of China and its spatio-temporal characteristics are still poorly understood. We examined dew and its climatic controls in BTH from 2008 to 2021 using model estimates driven by atmospheric data from the China Meteorological Administration Land Data Assimilation System (CLDAS). The relevance of dew to drought was also examined. Over the 14 years, dew amount decreased from northwest to southeast across BTH. The average monthly accumulative dew amount ranged from 0.56 mm to 15.87 mm. The annual dew amount was highest in the Bashang area and lowest in the Plain area. Precipitation (P), air temperature (Ta) and relative humidity (RH) were all positively correlated with dew over the three ecoregions, with RH showing the most significant correlation. BTH showed a drying trend over the study period, and dew amounts were higher over dry areas. Our study not only provides a novel understanding of dew characteristics in BTH but may also inform water resources management in the context of intensifying climate change.
Due to the unique climatic characteristics and vegetation features of tropical regions, the coefficient of determination (R²) of gross primary productivity (GPP) estimates from remote sensing models in these regions is generally lower than 0.3. Therefore, for cloudy and rainy tropical regions, the influence of clouds on remote sensing images needs to be considered in GPP estimation. This paper develops a corrected vegetation photosynthesis model (VPM) for estimating GPP under cloudy conditions. It corrects two model parameters, Wscalar and the Enhanced Vegetation Index (EVI), which are derived from remote sensing images and are therefore strongly affected by clouds. First, the water stress factor Wscalar is replaced by the evaporative fraction (EF). Secondly, exploiting the good correlation between near-surface temperature and EVI, a conversion coefficient between the two is fitted to reconstruct cloud-contaminated EVI values. Both corrections improved the estimation accuracy of the VPM: compared with four years of observations at the GPP site, the EVI correction gave the larger improvement, raising R² by 0.22 relative to the uncorrected model, while the Wscalar correction raised R² by 0.11. To verify the proposed method, in-situ observations from the Xishuangbanna flux site from 2007 to 2010 were used. The results showed that the proposed method effectively improved the accuracy of GPP estimation by the VPM. The improvement was most pronounced in 2007, which was strongly influenced by clouds, with R² increasing from 0.2 to 0.82. Overall, the RMSE (gC·m⁻²·8 day⁻¹) decreased from 15, 14.4, 18.1 and 14.2 to 8.07, 6.56, 10.33 and 11.44, respectively.
Therefore, the proposed method can be used to estimate the GPP of tropical seasonal rain forests in Xishuangbanna.
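The two corrections described in the abstract above can be sketched as follows. The sketch assumes the commonly cited VPM form, GPP = ε0 × Tscalar × Wscalar × Pscalar × FPAR × PAR with FPAR approximated by EVI, and it assumes a simple linear fit between near-surface temperature and EVI. Neither the exact model form nor the fitting procedure is given in the abstract, and all function and parameter names are hypothetical.

```python
import numpy as np

def fit_evi_from_temperature(temp_clear, evi_clear):
    """Fit EVI = slope * T + intercept on cloud-free samples (assumed linear)."""
    slope, intercept = np.polyfit(temp_clear, evi_clear, deg=1)
    return slope, intercept

def reconstruct_evi(evi, temp, cloudy, slope, intercept):
    """Replace cloud-contaminated EVI values with temperature-based estimates."""
    evi = np.asarray(evi, dtype=float).copy()
    temp = np.asarray(temp, dtype=float)
    cloudy = np.asarray(cloudy, dtype=bool)
    evi[cloudy] = slope * temp[cloudy] + intercept
    return evi

def vpm_gpp(par, evi, t_scalar, p_scalar, water_factor, eps0):
    """GPP under the assumed VPM form.

    water_factor is Wscalar in the standard model; passing the evaporative
    fraction (EF) instead mirrors the substitution described in the abstract.
    """
    eps_g = eps0 * t_scalar * water_factor * p_scalar  # light-use efficiency
    fpar = evi  # assumption: FPAR approximated directly by EVI
    return eps_g * fpar * par
```

In use, the fit would be made on cloud-free observations only, the cloudy EVI values replaced with `reconstruct_evi`, and the result passed to `vpm_gpp` together with EF as `water_factor`.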
Journal metrics
Acceptance rate
3.531 (2021)
Journal Impact Factor™
4.5 (2021)
Top-cited authors
Hanqiu Xu
  • Fuzhou University
J. R. G. Townshend
  • University of Maryland, College Park
Josep Penuelas
  • Spanish National Research Council (CSIC)-Centre for Ecological Research and Forestry Applications (CREAF)
Iolanda Filella
  • CREAF Centre for Ecological Research and Forestry Applications
Compton J Tucker