Figure - available via license: CC BY
Source publication
Remote sensing image change detection (CD) is done to identify desired significant changes between bitemporal images. Given two co-registered images taken at different times, the illumination variations and misregistration errors overwhelm the real object changes. Exploring the relationships among different spatial–temporal pixels may improve the p...
Citations
... Google Earth can provide virtual visualizations of the Earth. Google Earth not only shows the Earth visually but also supports data exploration, data collection, validation, data integration, and simulation, and it is easy to use [7]. Conversely, Google Earth has many limitations, such as inconsistent image quality, a limited capability for making quantitative measurements, a lack of analytical functionality, and the inability to support precise global spatial simulations [8]. ...
The international community views Indonesia as rich in natural tourism, like a tourist village. One of the tourism potentials that can be developed as part of a tourist village is Pabelan Sabodam, located in Magelang Regency, Central Java Province. The Pabelan River in Sabodam has the potential to serve the community as a tourist destination in addition to being a water system infrastructure for agricultural land. However, the current problem is that there is no supporting road access to this weir. Therefore, this research will model the design of the access road to the Pabelan Sabodam to support the acceleration of the tourism development of the Pabelan Sabodam. The modeling process begins with collecting data from Google Earth and processing it through the Global Mapper application. Road modeling is carried out using the Civil 3D application, and then the visualization process approaches the actual conditions using the SketchUp application. The modeling results show horizontal and vertical alignment, road, and cross sections. In addition, the estimated volume of excavation and embankment required from the Civil 3D application is also obtained.
... Moreover, the features of inter-class separability and intra-class separability can be obtained from the learning of semantic relations to make the network more robust. STANet, introduced by Chen et al. [15] in "A Spatial-Temporal Attention-Based Method and a New Dataset for Remote Sensing Image Change Detection", uses two kinds of self-attention modules: the basic space-time attention module (BAM) and the pyramidal space-time attention module (PAM). BAM learns the spatio-temporal dependence between any two locations (the attention weights) and computes the response at each location as a weighted sum of the features at all locations in space-time. ...
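The weighted-sum behaviour attributed to BAM above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names are hypothetical, the learned query/key/value projections are replaced by the identity, and the scaled dot-product softmax is an assumed choice of attention weight.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def space_time_attention(feats_t1, feats_t2):
    """Toy space-time attention (hypothetical helper, identity projections).

    feats_t1, feats_t2: lists of per-location feature vectors, one list per date.
    Every location attends to all locations of BOTH dates.
    """
    tokens = feats_t1 + feats_t2              # 2N space-time locations
    dim = len(tokens[0])
    out = []
    for q in tokens:
        # Attention weight between this location and every space-time location.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        # Response = weighted sum of the features at all locations.
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(dim)])
    n = len(feats_t1)
    return out[:n], out[n:]

if __name__ == "__main__":
    t1 = [[1.0, 0.0], [0.0, 1.0]]
    t2 = [[1.0, 0.0], [0.0, 1.0]]
    print(space_time_attention(t1, t2))
```

Because the weights at each location sum to one, every response is a convex combination of the input features; a real module would add learned projections on top of this.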
Remote sensing image change detection can effectively show the change information of land surface features such as roads and buildings at different times, which plays an indispensable role in application fields such as updating building information and analyzing urban evolution. At present, multispectral remote sensing images contain more and more information, which brings new development opportunities to remote sensing image change detection. However, this information is difficult to use effectively in change detection. Therefore, a change-detection method of multispectral remote sensing images based on a Siamese neural network is proposed. The features of dual-temporal remote sensing images were extracted based on the ResNet-18 network. In order to capture the semantic information of different scales and improve the information perception and expression ability of the algorithm for the input image features, an attention module network structure is designed to further enhance the extracted feature maps. Facing the problem of false alarms in change detection, an adaptive threshold comparison loss function is designed to make the threshold more sensitive to the remote sensing images in the data set and improve the robustness of the algorithm model. Moreover, the threshold segmentation method of the measurement module is used to determine the change area to obtain a better change-detection map domain. Finally, our experimental tests show that the proposed method achieves excellent performance on the multispectral OSCD detection data sets.
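The threshold segmentation step mentioned in this abstract can be pictured as follows. This is only a sketch under assumptions: the Euclidean distance, the fixed `threshold` value, and the function name `change_map` are illustrative, and the adaptive threshold described in the abstract is not reproduced here.

```python
import math

def change_map(feats_t1, feats_t2, threshold=1.0):
    """Mark a pixel as changed (1) when its bi-temporal feature distance
    exceeds `threshold` (an assumed, fixed value), else unchanged (0).

    feats_t1, feats_t2: H x W grids (lists of rows) of feature vectors.
    """
    return [
        [1 if math.dist(f1, f2) > threshold else 0 for f1, f2 in zip(row1, row2)]
        for row1, row2 in zip(feats_t1, feats_t2)
    ]

if __name__ == "__main__":
    before = [[[0.0, 0.0], [1.0, 1.0]]]
    after = [[[0.0, 0.5], [3.0, 1.0]]]
    print(change_map(before, after))
```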
... A. Experimental Setups 1) Dataset: We conduct experiments on the LEVIR-CD Dataset [24]. This is a large-scale benchmark dataset for CD in HR RSIs. ...
Vision Foundation Models (VFMs) such as the Segment Anything Model (SAM) allow zero-shot or interactive segmentation of visual contents, thus they are quickly applied in a variety of visual scenes. However, their direct use in many Remote Sensing (RS) applications is often unsatisfactory due to the special imaging characteristics of RS images. In this work, we aim to utilize the strong visual recognition capabilities of VFMs to improve the change detection of high-resolution Remote Sensing Images (RSIs). We employ the visual encoder of FastSAM, an efficient variant of the SAM, to extract visual representations in RS scenes. To adapt FastSAM to focus on some specific ground objects in the RS scenes, we propose a convolutional adaptor to aggregate the task-oriented change information. Moreover, to utilize the semantic representations that are inherent to SAM features, we introduce a task-agnostic semantic learning branch to model the semantic latent in bi-temporal RSIs. The resulting method, SAMCD, obtains superior accuracy compared to the SOTA methods and exhibits a sample-efficient learning ability that is comparable to semi-supervised CD methods. To the best of our knowledge, this is the first work that adapts VFMs for the CD of HR RSIs.
... The motivation for applying similarity optimization in CD is to learn a feature space where the pairs of feature vectors representing changed regions exhibit substantial dissimilarity, while pairs representing unchanged regions demonstrate proximity to each other [48]. By explicitly optimizing the distance between bi-temporal features extracted from the Siamese CNN, the efficacy of feature representation is enhanced, thus facilitating the extraction of time-series correlations. ...
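One common way to express this objective is a contrastive loss over per-pixel feature pairs. The sketch below assumes the standard margin-based form with Euclidean distance; the `margin` value and the function name are illustrative and are not taken from the cited works.

```python
import math

def contrastive_loss(feats_t1, feats_t2, changed, margin=2.0):
    """Margin-based contrastive loss averaged over pixels (assumed form).

    feats_t1, feats_t2: lists of feature vectors from the two dates.
    changed: list of 0/1 labels, 1 where the pixel really changed.
    """
    total = 0.0
    for f1, f2, y in zip(feats_t1, feats_t2, changed):
        d = math.dist(f1, f2)
        if y:
            # Changed pair: penalize only if the features are closer than the margin.
            total += max(0.0, margin - d) ** 2
        else:
            # Unchanged pair: pull the features together.
            total += d ** 2
    return total / len(changed)

if __name__ == "__main__":
    print(contrastive_loss([[0.0, 0.0]], [[0.0, 1.0]], [1]))
```

Minimizing this quantity drives unchanged pairs toward zero distance and changed pairs beyond the margin, which matches the proximity and dissimilarity goals described above.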
... Building upon the contrastive loss, Zhang et al. proposed a triplet loss to capture the semantic correlation among features extracted from Siamese CNN [50], consequently improving the interclass separability and intraclass consistency of features. As continuous down-sampling operations result in the loss of spatial details in input images, Chen and Shi [48] further enhanced feature representation by integrating features of different scales and introducing self-attention mechanisms to exploit spatial and temporal relations. Furthermore, Shi et al. applied deep supervision and soft attention mechanisms to further improve the feature representation in CD [51]. ...
... The details of these datasets are described as follows. 1) LEVIR-CD dataset: The LEVIR-CD dataset is a large-scale building CD dataset that covers 20 different cities in the US [48]. The dataset includes images of 0.5 m spatial resolution with more than 30,000 changed buildings. ...
Change detection (CD) is a fundamental and important task for monitoring the land surface dynamics in the earth observation field. Existing deep learning-based CD methods typically extract bi-temporal image features using a weight-sharing Siamese encoder network and identify change regions using a decoder network. These CD methods, however, still perform far from satisfactorily as we observe that 1) deep encoder layers focus on irrelevant background regions and 2) the models' confidence in the change regions is inconsistent at different decoder stages. The first problem is because deep encoder layers cannot effectively learn from imbalanced change categories using the sole output supervision, while the second problem is attributed to the lack of explicit semantic consistency preservation. To address these issues, we design a novel similarity-aware attention flow network (SAAN). SAAN incorporates a similarity-guided attention flow module with deeply supervised similarity optimization to achieve effective change detection. Specifically, we counter the first issue by explicitly guiding deep encoder layers to discover semantic relations from bi-temporal input images using deeply supervised similarity optimization. The extracted features are optimized to be semantically similar in the unchanged regions and dissimilar in the changing regions. The second drawback can be alleviated by the proposed similarity-guided attention flow module, which incorporates similarity-guided attention modules and attention flow mechanisms to guide the model to focus on discriminative channels and regions. We evaluated the effectiveness and generalization ability of the proposed method by conducting experiments on a wide range of CD tasks. The experimental results demonstrate that our method achieves excellent performance on several CD tasks, with discriminative features and semantic consistency preserved.
... Zhang et al. [42] proposed a deep Siamese semantic network change detection method, improved the loss function, and used the triplet of piecewise functions to strengthen the robustness of the model. Chen and Shi [43] proposed a spatio-temporal attention neural network based on a connected body, dividing the image into sub-regions of multiple scales and introducing the self-attention mechanism in them, thus capturing spatio-temporal dependencies of various scales. Song et al. [44] proposed a change detection network based on the U-shaped structure, which extracts and learns the similar feature information, different feature information, and global feature information of dual-temporal remote sensing images through multiple branches. ...
... To more comprehensively validate the effectiveness of our proposed SAFNet model, we evaluate the model's performance on three different remote sensing image change detection datasets, BICD [50], CDD [51] and LEVIR-CD [43]. ...
Recently, deep learning-based change detection methods for bitemporal remote sensing images have achieved promising results based on fully convolutional neural networks. However, due to the inherent characteristics of convolutional neural networks, if the previous block fails to correctly segment the entire target, erroneous predictions might accumulate in the subsequent blocks, leading to incomplete change detection results in terms of structure. To address this issue, we propose a bitemporal remote sensing image change detection network based on a Siamese-attention feedback architecture, referred to as SAFNet. First, we propose a global semantic module (GSM) on the encoder network, aiming to generate a low-resolution semantic change map to capture the changed objects. Second, we introduce a temporal interaction module (TIM), which is built through each encoding and decoding block, using the feature feedback between two temporal blocks to enhance the network’s perception ability of the entire changed target. Finally, we propose two auxiliary modules—the change feature extraction module (CFEM) and the feature refinement module (FRM)—which are further used to learn the fine boundaries of the changed target. The deep model we propose produced satisfying results in dual-temporal remote sensing image change detection. Extensive experiments on two remote sensing image change detection datasets demonstrate that the SAFNet algorithm exhibits state-of-the-art performance.
... (1) Learning, vision, and remote sensing change detection dataset (LEVIR-CD) [66] is a publicly available change detection resource for large buildings. It comprises 637 pairs of high-resolution (0.5 m) remote sensing images, each measuring 1024×1024 pixels. ...
Among the current mainstream change detection networks, the transformer is deficient in capturing accurate low-level details, while the convolutional neural network (CNN) is limited in its capacity to understand global information and establish long-range spatial relationships. Meanwhile, neither of the widely used early fusion and late fusion frameworks learns complete change features well. Therefore, based on swin transformer V2 (Swin V2) and VGG16, we propose an end-to-end compounded dense network SwinV2DNet to inherit the advantages of both transformer and CNN and overcome the shortcomings of existing networks in feature learning. Firstly, it captures the change relationship features through the densely connected Swin V2 backbone, and provides the low-level pre-changed and post-changed features through a CNN branch. Based on these three change features, we accomplish accurate change detection results. Secondly, combining transformer and CNN, we propose a mixed feature pyramid (MFP) which provides inter-layer interaction information and intra-layer multi-scale information for complete feature learning. MFP is a plug-and-play module which is experimentally proven to be effective in other change detection networks as well. Furthermore, we impose a self-supervision strategy to guide a new CNN branch, which solves the untrainable problem of the CNN branch and provides the semantic change information for the features of the encoder. State-of-the-art (SOTA) change detection scores and fine-grained change maps were obtained compared with other advanced methods on four commonly used public remote sensing datasets. The code is available at https://github.com/DalongZ/SwinV2DNet.
... In the first stage, two different scales of features (contextual and detail information) were inputted into the self-attention module separately, and the weights were then adjusted using the self-attention mechanism. In order to reduce computational complexity and improve calculation speed, as shown in Figure 4, this study applied average pooling to reduce the spatial resolution of the input features [47]. Then, the feature $X \in \mathbb{R}^{C \times H \times W}$ was processed through three 1 × 1 convolutions to obtain the three feature vectors Q (queries), K (keys), and V (values). ...
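To see why pooling before self-attention saves computation, the following sketch implements average pooling and a 1 × 1 convolution on nested lists and counts the entries of the resulting attention map. The helper names, the pooling factor `k`, and the 64 × 64 feature size are assumptions for illustration; the cited study's actual settings are not given in the excerpt.

```python
def avg_pool(channel, k):
    """Average-pool one H x W channel (list of rows) with a k x k window, stride k."""
    h, w = len(channel), len(channel[0])
    return [
        [sum(channel[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
         for j in range(0, w - k + 1, k)]
        for i in range(0, h - k + 1, k)
    ]

def conv1x1(feature, weight):
    """1 x 1 convolution: feature is C x H x W, weight is C_out x C.
    Each output pixel is a linear mix of the input channels at that pixel."""
    h, w = len(feature[0]), len(feature[0][0])
    return [
        [[sum(wc * feature[c][i][j] for c, wc in enumerate(row)) for j in range(w)]
         for i in range(h)]
        for row in weight
    ]

def attention_map_entries(h, w, k=1):
    """Number of query-key pairs after pooling an h x w map by a factor k."""
    tokens = (h // k) * (w // k)
    return tokens * tokens

if __name__ == "__main__":
    print(attention_map_entries(64, 64), attention_map_entries(64, 64, k=2))
```

Since the attention map grows with the square of the number of locations, pooling by a factor `k` in each dimension shrinks it by a factor of `k ** 4` (16 for `k = 2`). Q, K, and V would each come from a separate `conv1x1` call with its own weight matrix.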
Semantic change detection (SCD) is a challenging task in remote sensing, which aims to locate and identify changes between the bi-temporal images, providing detailed “from-to” change information. This information is valuable for various remote sensing applications. Recent studies have shown that multi-task networks, with dual segmentation branches and a single change branch, are effective in SCD tasks. However, these networks primarily focus on extracting contextual information and ignore spatial details, resulting in the missed or false detection of small targets and inaccurate boundaries. To address the limitations of the aforementioned methods, this paper proposes a spatial-temporal semantic perception network (STSP-Net) for SCD. It effectively utilizes spatial detail information through the detail-aware path (DAP) and generates spatial-temporal semantic-perception features by combining deep contextual features. Meanwhile, the network enhances the representation of semantic features in spatial and temporal dimensions by leveraging a spatial attention fusion module (SAFM) and a temporal refinement detection module (TRDM). This augmentation results in improved sensitivity to details and adaptive performance balancing between semantic segmentation (SS) and change detection (CD). In addition, by incorporating the invariant consistency loss function (ICLoss), the proposed method constrains the consistency of land cover (LC) categories in invariant regions, thereby improving the accuracy and robustness of SCD. The comparative experimental results on three SCD datasets demonstrate the superiority of the proposed method in SCD. It outperforms other methods in various evaluation metrics, with SeK improvements of 2.84%, 1.63%, and 0.78%, respectively.
... We train our network on LEVIR-CD [41] and DSIFN-CD [42]. The image resolution is set to 256 × 256 to reduce computational load per batch. ...
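One common way to obtain 256 × 256 inputs from larger scenes is non-overlapping tiling; whether the cited work crops or resizes is not stated in the excerpt, so the sketch below is only an assumption, and `tile_origins` is a hypothetical helper.

```python
def tile_origins(height, width, tile=256):
    """Top-left (row, col) corners of non-overlapping tile x tile patches.
    Partial tiles at the border are dropped."""
    return [
        (r, c)
        for r in range(0, height - tile + 1, tile)
        for c in range(0, width - tile + 1, tile)
    ]

if __name__ == "__main__":
    origins = tile_origins(1024, 1024)
    print(len(origins), origins[0], origins[-1])
```

Under this assumption a 1024 × 1024 image yields 16 patches, so the 637 LEVIR-CD image pairs mentioned elsewhere on this page would give 637 × 16 = 10,192 patch pairs before any train/validation/test split.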
The transformer plays a crucial role in building change detection (BCD) systems, which are important for observing urban development and post-disaster assessment. However, existing technologies often lack the ability to simultaneously attend to object features in bitemporal images and are not sensitive to changes in small target buildings. To address these issues, we propose SOAT-UNet, a novel transformer-based Siamese network with a multi-head over-attention block for CD tasks. Leveraging token-based space, our model extracts long-range contextual relationships and improves feature extraction for small targets. Inspired by human behavior, we generate queries (Q) from two image sets and calculate keys (K) and values (V) from another set, prioritizing regions likely to change. Experimental results demonstrate that our SOAT-UNet achieves superior CD performance compared to previous models on two existing datasets.
... While their experiment resulted in a 79.6 % accuracy in detecting changed pixels with promising anti-noise capabilities, its computational accuracy is lower than most studies deploying similar techniques. Another study used the publicly available LEVIR-CD dataset and compared it with other public datasets (Chen & Shi, 2020). With a slightly higher computational overhead, their proposed change detection (CD) pipeline achieved an improved accuracy of 87.3 %. ...
Forest Change Detection (FCD) is a critical component of natural resource monitoring and conservation strategies, enabling informed decision-making. Various methods utilizing the power of artificial intelligence (AI) have been developed for detecting and categorizing changes in forest cover using remote sensing (RS) data. One prominent AI-powered approach is the U-Net, a deep learning (DL) architecture famous for its segmentation proficiency. However, the standard U-Net architecture fails to effectively capture intricate spatial dependencies and long-range contextual information present in remote sensing imagery. To address this research gap, we introduce an attention-residual-based novel DL model which leverages the U-Net architecture and Sentinel-2 satellite images to map alterations in forest vegetation cover in the tropical region. Our novel model enhances the U-Net architecture by seamlessly integrating the strengths of the U-Net, harnessing attention mechanisms strategically to amplify crucial features, and leveraging cutting-edge residual connections to facilitate the smooth flow of information and gradient propagation. These meticulous design choices enabled the precise feature extraction, resulting in improved computational performance of the proposed method compared to the Standard U-Net, Deeplabv3+, Deep Res-U-Net, and Attention U-Net. The classification results demonstrate the enhanced efficiency of our model, achieving a Mean Intersection over Union (MIoU) of 0.9330 on our test dataset. This performance surpasses the Attention U-Net (0.9146), Standard U-Net (0.9029), Deeplabv3+ (0.9247), and Deep Res-U-Net (0.9282). 
The comparative analysis of ground truth reproductions unveiled the superior detection capabilities of our model in accurately identifying forest and non-forest polygons, surpassing both the standard U-Net, and the U-Net augmented with attention mechanism, along with other state-of-the-art techniques, thereby highlighting its enhanced efficacy. The model’s broad applicability can support forest managers and ecologists in rapidly evaluating the long-term ramifications of infrastructure initiatives, such as roads, on tropical forests, including those in Brunei.
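For reference, Mean Intersection over Union can be computed from flattened label maps as in the sketch below. It follows the usual definition (per-class intersection divided by union, averaged over the classes that occur); the function name and the two-class default are assumptions, and this is not the evaluation code used by the authors.

```python
def mean_iou(pred, truth, num_classes=2):
    """Mean IoU over classes present in `pred` or `truth`.

    pred, truth: flat sequences of integer class labels of equal length.
    """
    ious = []
    for cls in range(num_classes):
        inter = sum(p == cls and t == cls for p, t in zip(pred, truth))
        union = sum(p == cls or t == cls for p, t in zip(pred, truth))
        if union:  # skip classes that appear in neither map
            ious.append(inter / union)
    return sum(ious) / len(ious)

if __name__ == "__main__":
    # Forest = 1, non-forest = 0 (toy labels, not data from the study).
    print(mean_iou([1, 1, 0, 0], [1, 0, 0, 0]))
```

In the toy example the non-forest class scores 2/3 and the forest class 1/2, so the mean is 7/12.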
... Although the above networks improve the completeness of the prediction results through the attention mechanism, they do not consider the spatial and temporal correlation between dual-temporal remote sensing images. Chen et al. [22] and Zhang et al. [23] both use the attention mechanism to capture the spatial and temporal relationships between dual-temporal images at different scales to generate better feature representations, which can effectively mitigate pseudo-changes caused by illumination differences and similar factors. ...
Change detection is a crucial undertaking in the field of remote sensing. Current change detection methods tend to emphasize modelling difference features, ignoring the alignment error of dual-temporal images and the spatio-temporal relationship between them, which weakens the discriminative ability of the features and makes it difficult to distinguish the real change region. To address these problems, this paper proposes a remote sensing image change detection method based on a cross mixing attention network. The method employs a feature alignment module to obtain corrected dual-temporal features and improve the classification of boundary pixels in the target region. The cross mixing attention module better exploits the spatio-temporal relationship of the dual-temporal images to obtain attention maps at different scales, which guide the up-sampling and enhance the detection performance for target areas at different scales. The proposed network demonstrates promising performance, as evidenced by extensive experimental results on both the LEVIR-CD dataset and the SYSU-CD dataset.