Article

A Deep Unfolded Prior-Aided RPCA Network for Cloud Removal

Abstract

Clouds, together with their shadows, usually occlude ground-cover features in optical remote sensing images. This hinders the utilization of these images for a range of applications such as Earth observation, land-cover classification and urban planning. In this work, we propose a deep unfolded and prior-aided robust principal component analysis (DUPA-RPCA) network for removing clouds and recovering ground-cover information in multi-temporal satellite images. We model these cloud-contaminated images as a sum of low-rank and sparse components and then unfold an iterative RPCA algorithm that has been designed for reweighted ℓ1 minimization. As a result, the activation function in DUPA-RPCA adapts to every input at each layer of the network. Our experimental results on both Landsat and Sentinel images indicate that our method gives better accuracy and efficiency when compared with existing state-of-the-art methods.
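The full text is not available here, so the following NumPy sketch only illustrates the generic iteration that such a network unfolds: alternating low-rank and sparse updates in which the soft-threshold applied to the sparse (cloud) component is reweighted entrywise, so the effective activation adapts to each input. All names, step sizes, and the specific reweighting rule are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def reweighted_soft(X, tau, eps=1e-3):
    """Soft-threshold with entrywise weights 1/(|x|+eps), as in reweighted l1."""
    w = 1.0 / (np.abs(X) + eps)            # large entries get a smaller threshold
    return np.sign(X) * np.maximum(np.abs(X) - tau * w, 0.0)

def rpca_reweighted(D, tau_l=1.0, tau_s=0.1, n_iter=50):
    """D (pixels x acquisitions): multi-temporal stack; L = ground, S = clouds."""
    L, S = np.zeros_like(D), np.zeros_like(D)
    for _ in range(n_iter):
        L = svt(D - S, tau_l)              # low-rank ground-cover component
        S = reweighted_soft(D - L, tau_s)  # sparse cloud/shadow component
    return L, S
```

In an unfolded network, each pass of this loop becomes one layer with its own learned thresholds.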

... This is in contrast to the correlation between foreground frames, which may vary significantly across applications. Consequently, we use the masks of the sparse component, given by the computationally efficient GoDec algorithm [13], to reweight the thresholds of the proximal operator [14]. To exploit the high background correlation, we add constraints that ensure continuity in both the spatial and temporal dimensions of the background, which, per the RPCA model, is low-rank. ...
... where Tr(·) denotes the trace of a matrix, ⊙ denotes the Hadamard product, A_s and A_t are the spatial and temporal Laplacian matrices, respectively, and γ_1 and γ_2 are the regularization parameters which balance the spatial and temporal prior information, respectively. In addition, we also reweight the elements of S, as done in [14], with a prior matrix W = Φ(ρ, Ŵ), where Φ(·) is the sigmoid function with a gain of ρ and Ŵ is the background mask obtained by inverting the foreground mask from the GoDec algorithm. GoDec is significantly more efficient than ADMM [16] because it avoids the time-consuming SVD operations used to estimate the low-rank component. ...
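A minimal sketch of the prior reweighting just described: the background mask Ŵ is passed through a sigmoid with gain ρ, and the resulting weights scale the soft-threshold applied to S. The centering of the sigmoid and the way W multiplies the threshold are assumptions for illustration.

```python
import numpy as np

def prior_weights(W_hat, rho=5.0):
    """W = sigmoid(rho * (W_hat - 0.5)): near 1 on background, near 0 on foreground."""
    return 1.0 / (1.0 + np.exp(-rho * (W_hat - 0.5)))

def weighted_soft(X, tau, W):
    """Background entries get a larger threshold, suppressing them in the sparse S."""
    return np.sign(X) * np.maximum(np.abs(X) - tau * W, 0.0)
```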
... This design leads to a distinct proximal operator for every input at each iteration. Using (14), the iterative steps of L and S, after some algebraic manipulation, turn out to be the update equations (15) and (16). We unroll (15) and (16) into the multi-layer neural network of DUST-RPCA by replacing the operations of the measurement matrices with convolution kernels as follows. ...
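The excerpt's equations are not reproduced here, but the following PyTorch sketch shows the general shape of such an unfolded layer: the linear measurement-matrix operations become learnable convolutions and the sparse update keeps a learned threshold. The layer below is a hypothetical stand-in (practical designs often also retain a singular-value-thresholding step in the low-rank branch, omitted here).

```python
import torch
import torch.nn as nn

class UnfoldedLayer(nn.Module):
    """One unrolled RPCA-style iteration with convolutions in place of matrices."""
    def __init__(self, channels=1):
        super().__init__()
        self.conv_l = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.conv_s = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.tau = nn.Parameter(torch.tensor(0.1))   # learned threshold

    @staticmethod
    def soft(x, tau):
        return torch.sign(x) * torch.relu(torch.abs(x) - tau)

    def forward(self, D, L, S):
        L = self.conv_l(torch.cat([D - S, L], dim=1))                       # low-rank branch
        S = self.soft(self.conv_s(torch.cat([D - L, S], dim=1)), self.tau)  # sparse branch
        return L, S
```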
Preprint
Low-rank and sparse decomposition based methods find their use in many applications involving background modeling such as clutter suppression and object tracking. While Robust Principal Component Analysis (RPCA) has achieved great success in performing this task, it can take hundreds of iterations to converge and its performance decreases in the presence of phenomena such as occlusion, jitter and fast motion. The recently proposed deep unfolded networks, on the other hand, have demonstrated better accuracy and improved convergence over both their iterative counterparts and other neural network architectures. In this work, we propose a novel deep unfolded spatiotemporal RPCA (DUST-RPCA) network, which explicitly takes advantage of the spatial and temporal continuity in the low-rank component. Our experimental results on the moving MNIST dataset indicate that DUST-RPCA gives better accuracy when compared with existing state-of-the-art deep unfolded RPCA networks.
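The spatial and temporal continuity priors mentioned above are commonly encoded as graph-Laplacian penalties on the low-rank component (cf. the Tr(·) terms in the excerpt quoted earlier). The chain-graph construction below is an assumed, illustrative form of such penalties.

```python
import numpy as np

def chain_laplacian(n):
    """Laplacian of a path graph: neighboring pixels/frames are connected."""
    W = np.zeros((n, n))
    i = np.arange(n - 1)
    W[i, i + 1] = W[i + 1, i] = 1.0
    return np.diag(W.sum(axis=1)) - W

def smoothness_penalty(L, gamma1, gamma2):
    """gamma1 * spatial + gamma2 * temporal continuity of the low-rank part L."""
    A_s = chain_laplacian(L.shape[0])   # rows: flattened pixels
    A_t = chain_laplacian(L.shape[1])   # columns: acquisition times
    return gamma1 * np.trace(L.T @ A_s @ L) + gamma2 * np.trace(L @ A_t @ L.T)
```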
... A similar idea has also been evaluated in the context of video synthesis applications [44]. Additionally, RPCA-DUNs have also achieved impressive results in various applications, including but not limited to foreground-background separation [35], SAR imaging [3], radar interference mitigation [30], and cloud removal [16]. Specifically, methods like CORONA [32] and L+S-Net [15] have achieved significant improvement in tasks such as ultrasound clutter suppression and dynamic MRI reconstruction. ...
... Substituting (15) and (16) into (14), we can easily verify the correctness of both equations. At this point, we have established a well-posed system of three linear equations, i.e., (12), (15), and (16). ...
Preprint
Low-rank regularization-based deep unrolling networks have achieved remarkable success in various inverse imaging problems (IIPs). However, the singular value decomposition (SVD) is non-differentiable when duplicated singular values occur, leading to severe numerical instability during training. In this paper, we propose a differentiable SVD based on the Moore-Penrose pseudoinverse to address this issue. To the best of our knowledge, this is the first work to provide a comprehensive analysis of the differentiability of the trivial SVD. Specifically, we show that the non-differentiability of SVD is essentially due to an underdetermined system of linear equations arising in the derivation process. We utilize the Moore-Penrose pseudoinverse to solve the system, thereby proposing a differentiable SVD. A numerical stability analysis in the context of IIPs is provided. Experimental results in color image compressed sensing and dynamic MRI reconstruction show that our proposed differentiable SVD can effectively address the numerical instability issue while ensuring computational precision. Code is available at https://github.com/yhao-z/SVD-inv.
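A tiny PyTorch demonstration of the failure mode this paper addresses: differentiating through the SVD of a matrix with duplicated singular values (here the identity), where the standard backward pass divides by differences of squared singular values and produces non-finite gradients.

```python
import torch

X = torch.eye(3, requires_grad=True)     # singular values are 1, 1, 1 (duplicated)
U, s, Vh = torch.linalg.svd(X)
loss = (U @ torch.diag(s) @ Vh).sum()    # any function routed through U and Vh
loss.backward()
print(torch.isfinite(X.grad).all())      # typically tensor(False): gradients blow up
```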
... According to the relevant literature [1], ground-cover features in remote sensing images are frequently obscured by clouds and shadows, depending on the geographic environment and weather conditions, with cloud cover reaching about 55% over land and 72% over the ocean. By restoring the scene information under cloud occlusion, particularly thick cloud occlusion, with an appropriate cloud removal method, the availability of remote sensing image data can be significantly increased [2][3][4][5]. ...
Article
Full-text available
Remote sensing images are very vulnerable to cloud interference during the imaging process. Cloud occlusion, especially thick cloud occlusion, significantly reduces the imaging quality of remote sensing images, which in turn affects a variety of subsequent tasks that use them, since the images miss ground information under thick clouds. To address this problem, a thick cloud removal method based on a temporal global–local structure is proposed. The method includes two stages: a global multi-temporal feature fusion (GMFF) stage and a local single-temporal information restoration (LSIR) stage. It uses fused global multi-temporal features to restore the information occluded by thick clouds in local single-temporal images. A global–local structure is created in both stages, fusing the global feature-capture ability of the Transformer with the local feature-extraction ability of the CNN, with the goal of effectively retaining the detailed information of the remote sensing images. Finally, a local feature extraction (LFE) module and a global–local feature extraction (GLFE) module are designed according to the global–local characteristics, with different module details in the two stages. Experimental results indicate that the proposed method performs significantly better than the compared methods on the established dataset for multi-temporal thick cloud removal. In the four scenes, compared to the best competing method, CMSN, the peak signal-to-noise ratio (PSNR) improved by 2.675, 5.2255, and 4.9823 dB in the first, second, and third temporal images, respectively, an average improvement of 9.65%. The correlation coefficient (CC) improved by 0.016, 0.0658, and 0.0145 in the three temporal images, respectively, an average improvement of 3.35%. Structural similarity (SSIM) and root mean square error (RMSE) improved by 0.33% and 34.29%, respectively. Consequently, in the field of multi-temporal cloud removal, the proposed method enhances the utilization of multi-temporal information and achieves better thick cloud restoration.
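For reference, the two headline metrics quoted above have standard definitions; a minimal sketch follows, assuming images scaled to [0, 1]. SSIM and RMSE are analogous and omitted for brevity.

```python
import numpy as np

def psnr(x, y, peak=1.0):
    """Peak signal-to-noise ratio in dB between reconstruction x and reference y."""
    mse = np.mean((x - y) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def cc(x, y):
    """Pearson correlation coefficient between reconstruction and reference."""
    x, y = x.ravel() - x.mean(), y.ravel() - y.mean()
    return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))
```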
Preprint
Full-text available
This paper presents two deep unfolding neural networks for the simultaneous tasks of background subtraction and foreground detection in video. Unlike conventional neural networks based on deep feature extraction, we incorporate domain-knowledge models by considering a masked variation of the robust principal component analysis (RPCA) problem. With this approach, we separate video clips into low-rank and sparse components, respectively corresponding to the backgrounds and the foreground masks indicating the presence of moving objects. Our models, coined ROMAN-S and ROMAN-R, map the iterations of two alternating direction method of multipliers (ADMM) schemes to trainable convolutional layers, and the proximal operators are mapped to non-linear activation functions with trainable thresholds. This approach leads to lightweight networks with enhanced interpretability that can be trained on few data. In ROMAN-S, the correlation in time of successive binary masks is controlled with a side-information scheme based on L1-L1 minimization. ROMAN-R enhances the foreground detection by learning a dictionary of atoms to represent the moving foreground in a high-dimensional feature space and by using reweighted-L1-L1 minimization. Experiments are conducted on both synthetic and real video datasets, and comparisons are made with existing deep unfolding RPCA neural networks, which do not use a mask formulation for the foreground. The models are also compared to a U-Net baseline. Results show that our proposed models outperform other deep unfolding models, as well as the untrained optimization algorithms. ROMAN-R, in particular, is competitive with the U-Net baseline for foreground detection, with the additional advantage of providing video backgrounds and requiring substantially fewer training parameters and smaller training sets.
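A minimal sketch of the mapping from proximal operators to activation functions with trainable thresholds that these models use; the per-channel parameterization below is an illustrative assumption, not the authors' exact layer.

```python
import torch
import torch.nn as nn

class SoftThreshold(nn.Module):
    """Proximal operator of lam*||.||_1 used as an activation; lam is learned."""
    def __init__(self, channels):
        super().__init__()
        self.lam = nn.Parameter(0.1 * torch.ones(1, channels, 1, 1))

    def forward(self, x):
        # shrink towards zero by a learned, always-positive threshold
        return torch.sign(x) * torch.relu(torch.abs(x) - torch.abs(self.lam))
```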
Article
Full-text available
The papers in this special issue introduce the reader to the theory, algorithms, and applications of principal component analysis (PCA) and its many extensions. The aim of PCA is to reduce the dimensionality of multivariate data while preserving as much of the relevant information as possible. It is often the first step in various types of exploratory data analysis, predictive modeling, and classification and clustering tasks, and finds applications in biomedical imaging, computer vision, process fault detection, recommendation systems’ design, and many more domains.
Article
Full-text available
Robust PCA (RPCA) via decomposition into low-rank plus sparse matrices offers a powerful framework for a large variety of applications such as image processing, video processing and 3D computer vision. Indeed, these applications typically require detecting sparse outliers in observed imagery data that can be approximated by a low-rank matrix. Moreover, experiments show that RPCA with additional spatial and/or temporal constraints often outperforms the state-of-the-art algorithms in these applications. Thus, the aim of this paper is to survey the applications of RPCA in computer vision. In the first part of this paper, we review representative image processing applications as follows: (1) low-level imaging such as image recovery and denoising, image composition, image colorization, image alignment and rectification, multi-focus image fusion and face recognition; (2) medical imaging like dynamic Magnetic Resonance Imaging (MRI) for acceleration of data acquisition, background suppression and learning of inter-frame motion fields; and (3) imaging for 3D computer vision with additional depth information, as in Structure from Motion (SfM) and 3D motion recovery. In the second part, we present the applications of RPCA in video processing, which utilize additional spatial and temporal information compared to image processing. Specifically, we investigate video denoising and restoration, hyperspectral video and background/foreground separation. Finally, we provide perspectives on possible future research directions and algorithmic frameworks that are suitable for these applications.
Conference Paper
Full-text available
Clouds cover about 70% of the Earth's surface and play a dominant role in the energy and water cycle of our planet. Only satellite observations provide a continuous survey of the state of the atmosphere over the entire globe and across the wide range of spatial and temporal scales that comprise weather and climate variability. Satellite cloud data records now exceed more than 25 years; however, climatologies compiled from different satellite datasets can exhibit systematic biases. Questions therefore arise as to the accuracy and limitations of the various sensors. The Global Energy and Water cycle Experiment (GEWEX) Cloud Assessment, initiated in 2005 by the GEWEX Radiation Panel, provides the first coordinated intercomparison of publicly available, global cloud products (gridded, monthly statistics) retrieved from measurements of multi-spectral imagers (some with multi-angle view and polarization capabilities), IR sounders and lidar. Cloud properties under study include cloud amount, cloud height (in terms of pressure, temperature or altitude), cloud radiative properties (optical depth or emissivity), cloud thermodynamic phase and bulk microphysical properties (effective particle size and water path). Differences in average cloud properties, especially in the amount of high-level clouds, are mostly explained by the inherent instrument measurement capability for detecting and/or identifying optically thin cirrus, especially when overlying low-level clouds. The study of long-term variations with these datasets requires consideration of many factors. The monthly, gridded database presented here facilitates further assessments, climate studies, and the evaluation of climate models.
Article
Full-text available
Identification of clouds, cloud shadows and snow in optical images is often a necessary step toward their use. Recently a new program (named Fmask) designed to accomplish these tasks was introduced for use with images from Landsats 4–7 (Zhu & Woodcock, 2012). This paper presents the following: (1) improvements in the Fmask algorithm for Landsats 4–7; (2) a new version for use with Landsat 8 that takes advantage of the new cirrus band; and (3) a prototype algorithm for Sentinel 2 images. Though Sentinel 2 images do not have a thermal band to help with cloud detection, the new cirrus band is found to be useful for detecting clouds, especially thin cirrus clouds. By adding a new cirrus cloud probability and removing the steps that use the thermal band, the Sentinel 2 scenario achieves significantly better results than the Landsats 4–7 scenario for all 7 images tested. For Landsat 8, almost all the Fmask algorithm components are the same as for Landsats 4–7, except that a new cirrus cloud probability is calculated using the new cirrus band, which improves detection of thin cirrus clouds. Landsat 8 results are better than the Sentinel 2 scenario, with 6 out of 7 test images showing higher accuracies.
Conference Paper
Full-text available
Low-rank and sparse structures have been profoundly studied in matrix completion and compressed sensing. In this paper, we develop "Go Decomposition" (GoDec) to efficiently and robustly estimate the low-rank part L and the sparse part S of a matrix X = L + S + G with noise G. GoDec alternately assigns the low-rank approximation of X − S to L and the sparse approximation of X − L to S. The algorithm can be significantly accelerated by bilateral random projections (BRP). We also propose GoDec for matrix completion as an important variant. We prove that the objective value ∥X − L − S∥_F² converges to a local minimum, while L and S linearly converge to local optima. Theoretically, we analyze the influence of L, S and G on the asymptotic/convergence speeds in order to discover the robustness of GoDec. Empirical studies suggest the efficiency, robustness and effectiveness of GoDec compared with representative matrix decomposition and completion tools, e.g., Robust PCA and OptSpace.
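A compact NumPy sketch of the GoDec alternation described above; for clarity the rank-r projection uses a truncated SVD, whereas the paper accelerates this step with bilateral random projections.

```python
import numpy as np

def godec(X, rank, card, n_iter=30):
    """Approximate X = L + S + G with rank(L) <= rank and ||S||_0 <= card."""
    S = np.zeros_like(X)
    for _ in range(n_iter):
        # low-rank approximation of X - S (paper: BRP instead of a full SVD)
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank]
        # sparse approximation of X - L: keep the `card` largest-magnitude entries
        R = X - L
        S = np.zeros_like(X)
        keep = np.argsort(np.abs(R), axis=None)[-card:]
        S.flat[keep] = R.flat[keep]
    return L, S
```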
Article
Deep neural networks provide unprecedented performance gains in many real-world problems in signal and image processing. Despite these gains, the future development and practical deployment of deep networks are hindered by their black-box nature, i.e., a lack of interpretability and the need for very large training sets. An emerging technique called algorithm unrolling, or unfolding, offers promise in eliminating these issues by providing a concrete and systematic connection between iterative algorithms that are widely used in signal processing and deep neural networks. Unrolling methods were first proposed to develop fast neural network approximations for sparse coding. More recently, this direction has attracted enormous attention, and it is rapidly growing in both theoretic investigations and practical applications. The increasing popularity of unrolled deep networks is due, in part, to their potential in developing efficient, high-performance (yet interpretable) network architectures from reasonably sized training sets.
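As noted above, unrolling was first proposed for sparse coding (LISTA). The following minimal PyTorch sketch shows the idea: each ISTA iteration x ← soft(W_e y + W_d x, θ) becomes a layer whose matrices and thresholds are learned from data.

```python
import torch
import torch.nn as nn

class LISTA(nn.Module):
    """Unrolled ISTA for sparse coding: y (batch, m) -> sparse code x (batch, n)."""
    def __init__(self, m, n, n_layers=5):
        super().__init__()
        self.We = nn.Linear(m, n, bias=False)   # encodes the measurement y
        self.Wd = nn.Linear(n, n, bias=False)   # propagates the current code
        self.theta = nn.Parameter(0.1 * torch.ones(n_layers, n))
        self.n_layers = n_layers

    def forward(self, y):
        x = torch.zeros(y.shape[0], self.Wd.in_features, device=y.device)
        for k in range(self.n_layers):
            z = self.We(y) + self.Wd(x)
            x = torch.sign(z) * torch.relu(torch.abs(z) - self.theta[k])
        return x
```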
Article
Cloud removal is a ubiquitous and important task in remote sensing image processing, which aims at restoring the ground regions shadowed by clouds. It is challenging to remove the clouds for a single satellite image due to the difficulty of distinguishing clouds from white objects on the ground and filling the irregular missing regions with visual consistency. In this article, we propose a novel two-stage cloud removal method. The first stage is cloud segmentation, i.e., extracting the clouds and removing the thin clouds directly using U-Net. The second stage is image restoration, i.e., removing the thick cloud and recovering the corresponding irregular missing regions using a generative adversarial network (GAN). We evaluate the proposed scheme on both synthetic images and real satellite images (over 20,000 × 20,000 pixels). On synthetic images with cloud coverage less than 40%, the proposed scheme achieves improvements of 0.049–0.078 in Structural SIMilarity (SSIM) and 3.8–6.2 dB in peak signal-to-noise ratio (PSNR), while the ℓ1-norm error reduces by 49%–78%, compared with a state-of-the-art deep learning method, Pix2Pix. On real satellite images, we demonstrate the consistent visual results of the proposed scheme.
Article
Contrast enhanced ultrasound is a radiation-free imaging modality which uses encapsulated gas microbubbles for improved visualization of the vascular bed deep within the tissue. It has recently been used to enable imaging with unprecedented subwavelength spatial resolution by relying on super-resolution techniques. A typical preprocessing step in super-resolution ultrasound is to separate the microbubble signal from the cluttering tissue signal. This step has a crucial impact on the final image quality. Here, we propose a new approach to clutter removal based on robust principal component analysis (PCA) and deep learning. We begin by modeling the acquired contrast enhanced ultrasound signal as a combination of low-rank and sparse components. This model is used in robust PCA and was previously suggested in the context of ultrasound Doppler processing and dynamic magnetic resonance imaging. We then illustrate that an iterative algorithm based on this model exhibits improved separation of the microbubble signal from the tissue signal over commonly practiced methods. Next, we apply the concept of deep unfolding to suggest a deep network architecture tailored to our clutter filtering problem, which exhibits improved convergence speed and accuracy with respect to its iterative counterpart. We compare the performance of the suggested deep network on both simulations and in-vivo rat brain scans with a commonly practiced deep-network architecture and with the fast iterative shrinkage algorithm. We show that our architecture exhibits better image quality and contrast.
Article
Clouds and accompanying shadows, which exist in optical remote sensing images with high probability, can degrade or even completely occlude certain ground-cover information in images, limiting their applicability for Earth observation, change detection, or land-cover classification. In this paper, we aim to deal with cloud contamination problems with the objective of generating cloud-removed remote sensing images. Inspired by low-rank representation together with sparsity constraints, we propose a coarse-to-fine framework for cloud removal in remote sensing image sequences. Leveraging a group-sparsity constraint, we first decompose the observed cloudy image sequence of the same area into a low-rank component, group-sparse outliers, and sparse noise, corresponding to cloud-free land covers, clouds (and accompanying shadows), and noise, respectively. Subsequently, a discriminative robust principal component analysis (RPCA) algorithm is utilized to assign aggressive penalizing weights to the initially detected cloud pixels to facilitate cloud removal and scene restoration. Moreover, we incorporate geometric transformations into the low-rank model to address misalignment of the image sequence. Significantly, and unlike conventional cloud-removal methods, neither cloud-free reference image(s) nor additional operations of cloud and shadow detection are required in our method. Extensive experiments on both simulated data and real data demonstrate that our method works effectively, outperforming many state-of-the-art approaches.
Article
Filling missing information or removing special objects is often required in the applications of high spatial resolution images. A novel single-image reconstruction method is presented in this letter to solve this task, without the use of any complementary data. Firstly, the spatial pattern of the image is obtained by the statistics of similar patch offsets in the known regions, which provide reliable information for reconstructing the image. The missing regions are then filled by combining a series of shifted pixels via global optimization. The proposed method was tested on a cloudy image for cloud removal and on a public image for military object concealment. The experimental results show that the proposed method can produce visually convincing and coherent reconstructed images, and the accuracy of the reconstruction is better than the existing non-complementation methods.
Conference Paper
Cloud cover impacts the quality of optical remote sensing images. Generally, temporal methods and inpainting methods are used to remove the clouds. The temporal methods reconstruct cloudy areas from a series of multi-temporal images and thus rely on the assumption that the landscape does not change over a period of time. The inpainting methods fill the areas with image patches from the image itself; lacking prior information about the cloudy areas, these methods are limited in reconstruction accuracy, especially when clouds lie on the boundary between two types of landscape. We propose a new inpainting-based method that takes SAR (Synthetic Aperture Radar) images as prior structural information for the contaminated areas. By using information from two kinds of images acquired at the same time, the proposed method also avoids the inaccuracy caused by land changes in temporal methods. This idea has been demonstrated by experiments carried out on Thematic Mapper data and Sentinel-1A data. In terms of RMSE (Root Mean Square Error), the proposed method is evaluated and compared with several other cloud removal algorithms.
Article
We consider algorithms and recovery guarantees for the analysis sparse model where the signal is sparse with respect to a highly coherent frame. We first consider the use of the monotone version of the fast iterative shrinkage-thresholding algorithm (MFISTA) to solve the analysis sparse recovery problem. Since the proximal operator in MFISTA does not have a closed-form solution for the analysis model, it cannot be applied directly. Instead, we examine two alternatives based on smoothing and decomposition transformations that relax the original sparse recovery problem, and then implement MFISTA on the relaxed formulation. We refer to these two methods as smoothing-based MFISTA and decomposition-based MFISTA. We analyze the convergence of both algorithms, and establish that smoothing-based MFISTA converges more rapidly when applied to general nonsmooth optimization problems. We then derive a performance bound on the reconstruction error using these algorithms. The bound proves that our methods can recover a sparse signal in terms of a redundant tight frame when the measurement matrix satisfies a properly adapted restricted isometry property. Extensive numerical examples demonstrate the performance of our algorithms and show that smoothing-based MFISTA converges faster than the decomposition-based alternative in real applications, such as CT image reconstruction.
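To illustrate the smoothing idea generically: a nonsmooth analysis term like ||Wx||_1 can be replaced by its Moreau-envelope (Huber) smoothing, which has a simple gradient, so fast gradient schemes apply. This is a standard construction shown for illustration, not the paper's exact smoothed objective.

```python
import numpy as np

def huber(z, mu):
    """Smooth surrogate of |z| with smoothing parameter mu."""
    return np.where(np.abs(z) <= mu, z ** 2 / (2 * mu), np.abs(z) - mu / 2)

def huber_grad(z, mu):
    """Gradient of the surrogate: linear near zero, saturates at +/-1."""
    return np.clip(z / mu, -1.0, 1.0)

def smoothed_analysis_grad(x, W, mu):
    """Gradient of sum(huber(W @ x, mu)) with respect to x."""
    return W.T @ huber_grad(W @ x, mu)
```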
Article
Partial cloud cover is a severe problem in optical remote sensing images. The problem can mostly be overcome by mosaicking the cloud-free areas of multi-temporal images. In this paper, multidisciplinary methods are proposed to generate cloud-free mosaic images from multi-temporal SPOT images in three steps. First, the original images are enhanced in both brightness and chromaticity. Second, the linear spectral unmixing (LSU) method is used to extract all cloud-cover regions. Then, we choose the base image that has the least thin-cloud cover and divide it into grid zones. We find the thin-cloud and cloud-shadow zones in the eight neighbors of the thick-cloud zones based on the relative locations and the sun elevation angle. Finally, the cloud and cloud-shadow zones of the base image are replaced by the cloud-free zones at the same locations in the other images. Between each pair of zones from the base image and the replacing image, we create a transition zone. The multiscale wavelet-based fusion method is then used to fuse the pixels in the zones to generate cloud-free satellite images. With this complete approach, high-quality fused results are produced from source images of varying brightness.
Article
This paper introduces a novel algorithm to approximate the matrix with minimum nuclear norm among all matrices obeying a set of convex constraints. This problem may be understood as the convex relaxation of a rank minimization problem, and arises in many important applications, as in the task of recovering a large matrix from a small subset of its entries (the famous Netflix problem). Off-the-shelf algorithms such as interior point methods are not directly amenable to large problems of this kind with over a million unknown entries. This paper develops a simple first-order and easy-to-implement algorithm that is extremely efficient at addressing problems in which the optimal solution has low rank. The algorithm is iterative and produces a sequence of matrices {X^k, Y^k}; at each step it mainly performs a soft-thresholding operation on the singular values of the matrix Y^k. The soft-thresholding is applied to a sparse matrix, and the rank of the iterates X^k is empirically nondecreasing. Both these facts allow the algorithm to make use of very minimal storage space and keep the computational cost of each iteration low. On the theoretical side, we provide a convergence analysis showing that the sequence of iterates converges. On the practical side, we provide numerical examples in which 1,000 × 1,000 matrices are recovered in less than a minute on a modest desktop computer. We also demonstrate that our approach is amenable to very large scale problems by recovering matrices of rank about 10 with nearly a billion unknowns from just about 0.4% of their sampled entries. Our methods are connected with the recent literature on linearized Bregman iterations for ℓ1 minimization, and we develop a framework in which one can understand these algorithms in terms of well-known Lagrange multiplier algorithms. Keywords: nuclear norm minimization, matrix completion, singular value thresholding, Lagrange dual function.
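A minimal NumPy sketch of the SVT iteration for matrix completion described above: shrink the singular values of Y^k, then take an Uzawa/gradient step on the observed entries. The τ and δ values below are illustrative defaults.

```python
import numpy as np

def shrink_singular_values(Y, tau):
    """Soft-threshold the singular values of Y (prox of the nuclear norm)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def svt_complete(M, mask, tau=5.0, delta=1.2, n_iter=200):
    """Recover a low-rank matrix from the entries M[mask] (mask is boolean)."""
    Y = np.zeros_like(M)
    for _ in range(n_iter):
        X = shrink_singular_values(Y, tau)   # X^k from Y^{k-1}
        Y = Y + delta * mask * (M - X)       # Uzawa step on observed entries only
    return X
```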
Article
We consider the class of iterative shrinkage-thresholding algorithms (ISTA) for solving linear inverse problems arising in signal/image processing. This class of methods, which can be viewed as an extension of the classical gradient algorithm, is attractive due to its simplicity and thus is adequate for solving large-scale problems even with dense matrix data. However, such methods are also known to converge quite slowly. In this paper we present a new fast iterative shrinkage-thresholding algorithm (FISTA) which preserves the computational simplicity of ISTA but with a global rate of convergence which is proven to be significantly better, both theoretically and practically. Initial promising numerical results for wavelet-based image deblurring demonstrate the capabilities of FISTA which is shown to be faster than ISTA by several orders of magnitude.
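A minimal NumPy sketch of FISTA for the ℓ1-regularized least-squares problem min_x 0.5·||Ax − b||² + λ·||x||₁, showing the momentum step responsible for the improved O(1/k²) rate over ISTA.

```python
import numpy as np

def fista(A, b, lam, n_iter=100):
    """Solve min 0.5*||Ax-b||^2 + lam*||x||_1 via FISTA."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = z = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(n_iter):
        g = z - A.T @ (A @ z - b) / L      # gradient step at the momentum point
        x_new = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)   # momentum extrapolation
        x, t = x_new, t_new
    return x
```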
Article
This paper is about a curious phenomenon. Suppose we have a data matrix, which is the superposition of a low-rank component and a sparse component. Can we recover each component individually? We prove that under some suitable assumptions, it is possible to recover both the low-rank and the sparse components exactly by solving a very convenient convex program called Principal Component Pursuit; among all feasible decompositions, simply minimize a weighted combination of the nuclear norm and of the L1 norm. This suggests the possibility of a principled approach to robust principal component analysis since our methodology and results assert that one can recover the principal components of a data matrix even though a positive fraction of its entries are arbitrarily corrupted. This extends to the situation where a fraction of the entries are missing as well. We discuss an algorithm for solving this optimization problem, and present applications in the area of video surveillance, where our methodology allows for the detection of objects in a cluttered background, and in the area of face recognition, where it offers a principled way of removing shadows and specularities in images of faces.
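A compact NumPy sketch of Principal Component Pursuit solved with a common inexact augmented-Lagrangian scheme; λ = 1/√max(m, n) follows the paper, while the μ choice and the fixed iteration count are typical defaults rather than prescriptions from the text.

```python
import numpy as np

def pcp(M, n_iter=100):
    """Principal Component Pursuit: min ||L||_* + lam*||S||_1  s.t.  L + S = M."""
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))
    mu = m * n / (4 * np.abs(M).sum())
    S = np.zeros_like(M)
    Y = np.zeros_like(M)   # dual variable
    for _ in range(n_iter):
        # nuclear-norm prox: singular value thresholding
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = U @ np.diag(np.maximum(s - 1 / mu, 0.0)) @ Vt
        # l1 prox: entrywise soft-thresholding
        R = M - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        Y = Y + mu * (M - L - S)   # dual update
    return L, S
```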
Learned robust PCA: A scalable deep unfolding approach for high-dimensional outlier detection
  • Hanqin Cai
  • J Liu
  • W Yin