Article

iPiano-Net: Nonconvex optimization inspired multi-scale reconstruction network for compressed sensing

Abstract

Compressed sensing (CS) aims to precisely reconstruct the original signal from under-sampled measurements, which is a typical ill-posed problem. Solving such a problem is challenging and generally needs to incorporate suitable priors about the underlying signals. Traditionally, these priors are hand-crafted, and the corresponding approaches generally have limitations in expressive capacity. In this paper, a nonconvex optimization inspired multi-scale reconstruction network is developed for block-based CS, abbreviated as iPiano-Net, by unfolding the classic iPiano algorithm. In iPiano-Net, a block-wise inertial gradient descent interleaves with an image-level network-induced proximal mapping to exploit the local block and global content information alternately. Therein, network-induced proximal operators can be adaptively learned in each module, which can efficiently characterize image priors and improve the modeling capacity of iPiano-Net. Such learned image-level priors can suppress blocky artifacts and noise/corruption while preserving the global information. Different from existing discriminative CS reconstruction models trained with specific measurement ratios, an effective single model is learned to handle CS reconstruction at several measurement ratios, even unseen ones. Experimental results demonstrate that the proposed approach is substantially superior to previous CS methods in terms of Peak Signal to Noise Ratio (PSNR) and visual quality, especially at low measurement ratios. Meanwhile, it is robust to noise while maintaining comparable execution speed.
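
For readers who want a concrete picture of the alternation described above, here is a minimal schematic of one unfolded stage: a block-wise inertial gradient step on the data-fidelity term followed by an image-level learned proximal mapping. It is a sketch only; the sampling operator is assumed block-diagonal over 33×33 blocks, and the names alpha, beta and prox_net are illustrative rather than taken from the paper.

```python
import numpy as np

def ipiano_stage(x, x_prev, y_blocks, Phi, alpha, beta, prox_net, block=33):
    """One schematic iPiano-style stage: a block-wise inertial gradient step on
    the data-fidelity term, then an image-level learned proximal mapping.

    x, x_prev : current and previous image estimates, shape (H, W)
    y_blocks  : per-block measurements, shape (n_blocks, M)
    Phi       : block sampling matrix, shape (M, block*block)
    prox_net  : callable acting on the whole image (stands in for the CNN prior)
    """
    H, W = x.shape
    grad = np.zeros_like(x)
    idx = 0
    for i in range(0, H, block):
        for j in range(0, W, block):
            xb = x[i:i + block, j:j + block].reshape(-1)      # vectorize block
            res = Phi @ xb - y_blocks[idx]                    # block residual
            grad[i:i + block, j:j + block] = (Phi.T @ res).reshape(block, block)
            idx += 1
    r = x - alpha * grad + beta * (x - x_prev)   # inertial (heavy-ball) step
    return prox_net(r)                           # image-level learned prior
```

In the actual network, prox_net would be the learned multi-scale sub-network; in this sketch any image-to-image callable, e.g. an off-the-shelf denoiser, can be plugged in.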

... Deep unfolding networks (DUNs) have been proposed to solve different image inverse tasks, such as denoising [33], [34], deblurring [35], [36], and demosaicking [37]. A DUN has friendly interpretability on training data pairs {(y_j, x_j)}_{j=1}^{N_a} and is usually formulated for CS reconstruction as a bi-level optimization problem. DUNs on CS usually integrate some effective convolutional neural network (CNN) denoisers into some optimization methods, including the half quadratic splitting (HQS) algorithm [38], [39], the proximal gradient descent (PGD) algorithm [28], [40], [31], [41], [42], and the inertial proximal algorithm for nonconvex optimization (iPiano) [43]. Different optimization methods usually lead to different optimization-inspired DUNs. ...
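
The bi-level problem itself does not appear in the excerpt above; for orientation, a common generic form used in the DUN literature (not necessarily the exact equation of the citing paper) is

\[
\min_{\Theta} \sum_{j=1}^{N_a} \mathcal{L}\big(\hat{x}_j(\Theta), x_j\big)
\quad \text{s.t.} \quad
\hat{x}_j(\Theta) = \arg\min_{x} \tfrac{1}{2}\|\Phi x - y_j\|_2^2 + \lambda \mathcal{R}_{\Theta}(x),
\]

where the inner problem is approximated by a fixed number of unfolded stages whose parameters Θ are trained end to end.
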
... Each image block of size 33×33 is sampled and reconstructed independently for the first 400 epochs, and for the last ten epochs we adopt larger image blocks of size 99×99 as inputs to further finetune the model. To alleviate blocking artifacts, we first unfold the blocks of size 99×99 into overlapping blocks of size 33×33 during the sampling process Φx, and then fold the blocks of size 33×33 back into the larger blocks during the initialization Φ^⊤y [43]. We also unfold the whole image with this approach during testing. ...
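
A rough sketch of the unfold/fold procedure described above, simplified to non-overlapping blocks (the cited pipeline uses overlapping 33×33 blocks inside 99×99 patches and averages when folding back); the function names are illustrative.

```python
import numpy as np

def sample_blockwise(img, Phi, block=33):
    """Unfold an image into block x block patches, measure each with Phi,
    and return the stacked per-block measurements."""
    H, W = img.shape
    meas = []
    for i in range(0, H, block):
        for j in range(0, W, block):
            xb = img[i:i + block, j:j + block].reshape(-1)
            meas.append(Phi @ xb)                      # y = Phi x per block
    return np.stack(meas)

def init_from_measurements(meas, Phi, shape, block=33):
    """Fold the per-block linear initializations Phi^T y back into an image."""
    H, W = shape
    out = np.zeros(shape)
    idx = 0
    for i in range(0, H, block):
        for j in range(0, W, block):
            out[i:i + block, j:j + block] = (Phi.T @ meas[idx]).reshape(block, block)
            idx += 1
    return out
```

Here Phi would be an M×1089 matrix (1089 = 33²); for a 10% measurement ratio, M ≈ 109.
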
... 3) Sensitivity to Noise: In real applications the imaging model may be affected by noise, so we add experiments with various levels of Gaussian noise on the Set11 dataset [25] to demonstrate robustness in Table X. We finetune ISTA-Net+ [28], iPiano-Net [43], DP-DUN, and DPC-DUN with random noise levels in [0,10], namely ISTA-Net+*, iPiano-Net*, DP-DUN*, and DPC-DUN*, respectively. As shown in Table X, our proposed models outperform the compared methods at all noise levels. ...
Preprint
Full-text available
Deep unfolding network (DUN) that unfolds the optimization algorithm into a deep neural network has achieved great success in compressive sensing (CS) due to its good interpretability and high performance. Each stage in DUN corresponds to one iteration in optimization. At the test time, all the sampling images generally need to be processed by all stages, which comes at a price of computation burden and is also unnecessary for the images whose contents are easier to restore. In this paper, we focus on CS reconstruction and propose a novel Dynamic Path-Controllable Deep Unfolding Network (DPC-DUN). DPC-DUN with our designed path-controllable selector can dynamically select a rapid and appropriate route for each image and is slimmable by regulating different performance-complexity tradeoffs. Extensive experiments show that our DPC-DUN is highly flexible and can provide excellent performance and dynamic adjustment to get a suitable tradeoff, thus addressing the main requirements to become appealing in practice. Codes are available at https://github.com/songjiechong/DPC-DUN.
... Deep proximal unrolling networks for CS and compressive sensing MRI (CS-MRI) tasks usually integrate the effective CNN denoisers into some optimization methods including half quadratic splitting (HQS) (Zhang et al., 2017;Aggarwal et al., 2018), alternating minimization (AM) (Schlemper et al., 2017;Sun et al., 2018;Zheng et al., 2019), iterative shrinkage-thresholding algorithm (ISTA) (Zhang & Ghanem, 2018;Gilton et al., 2019;Zhang et al., 2020b;You et al., 2021b), approximate message passing (AMP) (Zhang et al., 2020a;Zhou et al., 2021), alternating direction method of multipliers (ADMM) (Yang et al., 2018) and inertial proximal algorithm for nonconvex optimization (iPiano) (Su & Lian, 2020). Different optimization methods usually lead to different optimization-inspired networks with various unrolling frameworks. ...
... Each image block of size 33 × 33 is sampled and reconstructed independently for the first 400 epochs, and for the last ten epochs, we adopt larger image blocks of size 132 × 132 as inputs to further fine-tune the model. The process of finetuning the models is shown in Fig. 4, which is the same as (Su & Lian, 2020). Specifically, to alleviate blocking artifacts, we first unfold the blocks of size 132 × 132 into non-overlapping blocks of size 33 × 33 as the input of GDM and then fold the blocks of size 33 × 33 into larger blocks as the input of the network in our experiments. ...
... We compare our proposed MAPUN with nine recent representative CS reconstruction methods, including two black box networks and seven deep proximal unrolling networks, namely ReconNet (Kulkarni et al., 2016), DPA-Net, IRCNN (Zhang et al., 2017), ISTA-Net+ (Zhang & Ghanem, 2018), DPDNN, GDN (Gilton et al., 2019), MAC-Net, iPiano-Net (Su & Lian, 2020) and COAST (You et al., 2021b). The average PSNR(dB)/SSIM reconstruction performances on the Set11 (Zhang & Ghanem, 2018) and CBSD68 datasets with respect to five CS ratios are summarized in Table 5. ...
Article
Full-text available
Mapping a truncated optimization method into a deep neural network, deep proximal unrolling network has attracted attention in compressive sensing due to its good interpretability and high performance. Each stage in such networks corresponds to one iteration in optimization. By understanding the network from the perspective of the human brain’s memory processing, we find there exist two categories of memory transmission: intra-stage and inter-stage. For intra-stage, existing methods increase the number of parameters to maximize the information flow. For inter-stage, there are also two methods. One is to transfer the information between adjacent stages, which can be regarded as short-term memory that is usually lost seriously. The other is a mechanism to ensure that the previous stages affect the current stage, which has not been explicitly studied. In this paper, a novel deep proximal unrolling network with persistent memory is proposed, dubbed deep Memory-Augmented Proximal Unrolling Network (MAPUN). We design a memory-augmented proximal mapping module that ensures maximum information flow for intra- and inter-stage. Specifically, we present a self-modulated block that can adaptively develop feature modulation for intra-stage and introduce two types of memory augmentation mechanisms for inter-stage, namely High-throughput Short-term Memory (HSM) and Cross-stage Long-term Memory (CLM). HSM is exploited to allow the network to transmit multi-channel short-term memory, which greatly reduces information loss between adjacent stages. CLM is utilized to develop the dependency of deep information across cascading stages, which greatly enhances network representation capability. Extensive experiments show that our MAPUN outperforms existing state-of-the-art methods.
... DUNs on CS and compressive sensing MRI (CS-MRI) usually integrate some effective convolutional neural network (CNN) denoisers into some optimization methods including half quadratic splitting (HQS) [54], alternating direction method of multipliers (ADMM) [37] and the inertial proximal algorithm for nonconvex optimization (iPiano) [30]. Different optimization methods usually lead to different optimization-inspired DUNs. ...
... Each image block of size 33 × 33 is sampled and reconstructed independently for the first 400 epochs, and for the last ten epochs, we adopt larger image blocks of size 99×99 as inputs to further fine-tune the model. To alleviate blocking artifacts, we first unfold the blocks of size 99 × 99 into overlapping blocks of size 33×33 during the sampling process Φx and then fold the blocks of size 33 × 33 back into larger blocks during the initialization Φ^⊤y [30]. We also unfold the whole image with this approach; Figure 3 shows the results when the ratio is 25% on the Set11 dataset. ...
... The average PSNR reconstruction performances on the Set11, CBSD68 and Urban100 datasets with respect to five CS ratios are summarized in Table 3. The models of ReconNet [16], ISTA-Net+ [40], DPA-Net [32], IRCNN [46], MAC-Net [2] and iPiano-Net [30] are trained with the same methods and training datasets as the corresponding works, and DPDNN [5] and GDN [8] utilize the same training dataset as our method because their original works do not include a CS reconstruction task. One can observe that our MADUN outperforms all the other competing methods in PSNR and SSIM across all the cases. ...
Preprint
Full-text available
Mapping a truncated optimization method into a deep neural network, deep unfolding network (DUN) has attracted growing attention in compressive sensing (CS) due to its good interpretability and high performance. Each stage in DUNs corresponds to one iteration in optimization. By understanding DUNs from the perspective of the human brain's memory processing, we find there exist two issues in existing DUNs. One is that the information between every two adjacent stages, which can be regarded as short-term memory, is usually lost seriously. The other is that there is no explicit mechanism to ensure that the previous stages affect the current stage, which means memory is easily forgotten. To solve these issues, in this paper, a novel DUN with persistent memory for CS is proposed, dubbed Memory-Augmented Deep Unfolding Network (MADUN). We design a memory-augmented proximal mapping module (MAPMM) by combining two types of memory augmentation mechanisms, namely High-throughput Short-term Memory (HSM) and Cross-stage Long-term Memory (CLM). HSM is exploited to allow DUNs to transmit multi-channel short-term memory, which greatly reduces information loss between adjacent stages. CLM is utilized to develop the dependency of deep information across cascading stages, which greatly enhances network representation capability. Extensive CS experiments on natural and MR images show that with the strong ability to maintain and balance information our MADUN outperforms existing state-of-the-art methods by a large margin. The source code is available at https://github.com/jianzhangcs/MADUN/.
... Deep unfolding models denote a series of models constructed by mapping iterative algorithms with unfixed numbers of steps onto deep neural networks with fixed numbers of steps [26], [8], [19], [9]. Many non-linear iterative algorithms have been unfolded, such as ISTA [27], [8], AMP [26], [9], half-quadratic splitting (HQS) [19], the alternating direction method of multipliers (ADMM) [28], [29] and the iPiano algorithm [30]. By combining the interpretability of model-based methods and the trainable characteristics of traditional deep learning models, they strike a good balance between reconstruction performance and interpretability. ...
... At present, there exist a few methods [32], [30], [33], [34], [17] which reconstruct images at different CS ratios using only one model, and they can be roughly cast into two categories. The first kind [32], [30] trains a single model with a set of sampling matrices with different CS ratios so that the model can adapt to all sampling matrices in this set. The second kind [33], [34] applies only one sampling matrix in a learning way and integrates its rows to achieve sampling and reconstruction at different CS ratios; we refer to such a strategy as scalable sampling and reconstruction (SSR) in this paper. ...
Preprint
Full-text available
Deep learning has been applied to image compressive sensing (CS) for enhanced reconstruction performance. However, most existing deep learning methods train different models for different subsampling ratios, which brings an additional hardware burden. In this paper, we develop a general framework named scalable deep compressive sensing (SDCS) for the scalable sampling and reconstruction (SSR) of all existing end-to-end-trained models. In the proposed framework, images are measured and initialized linearly. Two sampling masks are introduced to flexibly control the subsampling ratios used in sampling and reconstruction, respectively. To make the reconstruction model adapt to any subsampling ratio, a training strategy dubbed scalable training is developed. In scalable training, the model is trained with the sampling matrix and the initialization matrix at various subsampling ratios by integrating different sampling matrix masks. Experimental results show that models with SDCS can achieve SSR without changing their structure while maintaining good performance, and that SDCS outperforms other SSR methods.
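
A minimal sketch of the mask-based scalable sampling idea described in this abstract; the function and variable names are illustrative and are not the authors' implementation.

```python
import numpy as np

def scalable_sample(x_vec, Phi_full, ratio):
    """Sample a vectorized image block at an arbitrary ratio by masking rows of
    a full sampling matrix Phi_full (shape (n, n) for a length-n block)."""
    n = Phi_full.shape[1]
    m = int(round(ratio * n))
    mask = np.zeros(Phi_full.shape[0], dtype=bool)
    mask[:m] = True                      # row mask controls the subsampling ratio
    Phi = Phi_full[mask]                 # effective sampling matrix at this ratio
    y = Phi @ x_vec                      # measurements
    x_init = Phi.T @ y                   # linear initialization for the decoder
    return y, x_init
```

Training the reconstruction network with ratios drawn at random per batch (i.e. different masks) is what lets a single model serve every ratio.
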
... …, c^k, φ^k)⟩ (18), where, in Lemma 3, we take g(c) = L_ρ(v_{1:3}^{(k+1)}, … ...
... where we use 2⟨a, b⟩ ≤ ∥a∥₂² + ∥b∥₂² in the final inequality. Combining (18) and (19) produces the desired result. ...
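
The inequality used in this excerpt is just the expansion of a squared norm:

\[
0 \le \|a - b\|_2^2 = \|a\|_2^2 - 2\langle a, b\rangle + \|b\|_2^2
\;\Longrightarrow\;
2\langle a, b\rangle \le \|a\|_2^2 + \|b\|_2^2 .
\]
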
Preprint
Full-text available
Since most inverse problems arising in scientific and engineering applications are ill-posed, prior information about the solution space is incorporated, typically through regularization, to establish a well-posed problem with a unique solution. Often, this prior information is an assumed statistical distribution of the desired inverse problem solution. Recently, due to the unprecedented success of generative adversarial networks (GANs), the generative network from a GAN has been implemented as the prior information in imaging inverse problems. In this paper, we devise a novel iterative algorithm to solve inverse problems in imaging where a dual-structured prior is imposed by combining a GAN prior with the compound Gaussian (CG) class of distributions. A rigorous computational theory for the convergence of the proposed iterative algorithm, which is based upon the alternating direction method of multipliers, is established. Furthermore, elaborate empirical results for the proposed iterative algorithm are presented. By jointly exploiting the powerful CG and GAN classes of image priors, we find, in compressive sensing and tomographic imaging problems, our proposed algorithm outperforms and provides improved generalizability over competitive prior art approaches while avoiding performance saturation issues in previous GAN prior-based methods.
... These factors contribute to the success of algorithm unrolled methods, which have shown excellent performance in image estimation while offering interpretability of the network layers [18]. Algorithm unrolling has been applied to many iterative algorithms including ISTA [17]- [21], proximal gradient or gradient descent [22,23], the inertial proximal algorithm for non-convex optimization [24], and the primal-dual algorithm [25]. ...
... We compare DR-CG-Net against ten state-of-the-art methods: (i) compound Gaussian network (CG-Net) [8,26], (ii) memory augmented deep unfolding network (MADUN) [19], (iii) ISTA-Net+ [20], (iv) FISTA-Net [21], (v) iPiano-Net [24], (vi) ReconNet [11], (vii) LEARN++ [23], (viii) Learned Primal-Dual (LPD) [25], (ix) FBPConvNet [12], and (x) iRadonMAP [13]. Although the memory augmented proximal unrolled network (MAPUN) [43] was also considered, only MADUN results are shown due to the similarity of its performance to MADUN. ...
Article
Full-text available
Incorporating prior information into inverse problems, e.g. via maximum-a-posteriori estimation, is an important technique for facilitating robust inverse problem solutions. In this paper, we devise two novel approaches for linear inverse problems that permit problem-specific statistical prior selections within the compound Gaussian (CG) class of distributions. The CG class subsumes many commonly used priors in signal and image reconstruction methods including those of sparsity-based approaches. The first method developed is an iterative algorithm, called generalized compound Gaussian least squares (G-CG-LS), that minimizes a regularized least squares objective function where the regularization enforces a CG prior. G-CG-LS is then unrolled, or unfolded, to furnish our second method, which is a novel deep regularized (DR) neural network, called DR-CG-Net, that learns the prior information. A detailed computational theory on convergence properties of G-CG-LS and thorough numerical experiments for DR-CG-Net are provided. Due to the comprehensive nature of the CG prior, these experiments show that DR-CG-Net outperforms competitive prior art methods in tomographic imaging and compressive sensing, especially in challenging low-training scenarios.
... To address this, deep learning methods, known for their prowess in image processing, have been introduced into the realm of CS reconstruction. Deep-learning-based CS reconstruction algorithms can be broadly classified into two primary categories: deep non-unfolding networks (DNUNs) [18,19,21,25,26] and deep unfolding networks (DUNs) [8,27-33]. DNUN treats the reconstruction process as a black-box operation, relying on a data-driven approach to build an end-to-end neural network to address the CS reconstruction problem. ...
... Nonetheless, DUN typically operates in a single-channel form in many cases [27-30,37,38], as feature maps within the deep reconstruction network are transmitted between phases and updated within each phase. This structural characteristic limits the characterization ability of the feature maps, ultimately degrading the network's reconstruction performance. ...
Article
Full-text available
Deep Unfolding Networks (DUNs) serve as a predominant approach for Compressed Sensing (CS) reconstruction algorithms by harnessing optimization. However, a notable constraint within the DUN framework is the restriction to single-channel inputs and outputs at each stage during gradient descent computations. This constraint compels the feature maps of the proximal mapping module to undergo multi-channel to single-channel dimensionality reduction, resulting in limited feature characterization capabilities. Furthermore, most prevalent reconstruction networks rely on single-scale structures, neglecting the extraction of features from different scales, thereby impeding the overall reconstruction network’s performance. To address these limitations, this paper introduces a novel CS reconstruction network termed the Multi-channel and Multi-scale Unfolding Network (MMU-Net). MMU-Net embraces a multi-channel approach, featuring the incorporation of Adap-SKConv with an attention mechanism to facilitate the exchange of information between gradient terms and enhance the feature map’s characterization capacity. Moreover, a Multi-scale Block is introduced to extract multi-scale features, bolstering the network’s ability to characterize and reconstruct the images. Our study extensively evaluates MMU-Net’s performance across multiple benchmark datasets, including Urban100, Set11, BSD68, and the UC Merced Land Use Dataset, encompassing both natural and remote sensing images. The results of our study underscore the superior performance of MMU-Net in comparison to existing state-of-the-art CS methods.
... 33.16 35.29
NN (TIP2020) [23]: 23.90 29.20 30.26 32.31
MAC-Net (ECCV2020) [10]: 27.68 32.91 33.96 36.18
iPiano-Net (SPIC2020) [58]: 28.05 33.53 34.78 37.00
COAST (TIP2021) [70]: 28.74 33.98 35.11 37.11
DPC-DUN (TIP2023) [57]: 29.40 ...
In the above compared CS methods, the sampling matrix is jointly optimized with the reconstruction process. ...
... As above, we conduct more comparisons against recent deep network-based CS methods that use a random sampling matrix. Specifically, we select thirteen random matrix-based deep CS algorithms, including four deep black box CS networks (ReconNet [31], I-Recon [37], DR²-Net [68] and DPA-Net [61]) and nine deep unfolding CS networks (IRCNN [77], LDAMP [42], ISTA-Net+ [73], DPDNN [16], NN [23], MAC-Net [10], iPiano-Net [58], COAST [70] and DPC-DUN [57]). In our experiments, the orthogonalized Gaussian random matrix [61,62] is utilized, and during the training process, the pre-defined random sampling matrix remains unchanged. ...
... Song et al. [62] proposed a side-information-aided deep adaptive shrinkage network, which utilized side information to efficiently transmit high-throughput information between adjacent stages. Different from the optimization algorithms used in the above methods, Su et al. [63] expanded the classic iPiano algorithm into a network. They used a multi-scale residual network to replace the proximal operator, eliminating blocking artifacts and noise by utilizing image-level priors. ...
Article
In recent years, a large number of researchers have begun to embed convolutional neural networks into traditional Compressive Sensing (CS) reconstruction algorithms. They have proposed a series of Deep Unfolding Networks (DUNs) with the characteristics of having good interpretability and high-quality reconstruction. However, most DUNs only use the inherent CS model, which leads to the limited reconstruction performance. In addition, simple reconstruction networks cannot well remove noises from features. To address the above issues, this paper proposes a wavelet-domain consistency-constrained CS framework. We introduce discrete wavelet transform into the CS optimization model and unfold its optimization algorithm into a deep dual-domain hybrid reconstruction network. The overall network of the proposed method is named Memory-boosted Dual-domain Guidance Filtering Network (MDGF-Net). MDGF-Net utilizes frequency-domain information to supplement spatial-domain image reconstruction. Furthermore, guided filtering is leveraged to filter out noises in the reconstructed features. Here, we design two consecutive denoising modules, namely, dual-domain memory-boosted guided filtering module and self-guided memory-boosted filtering module. By combining the guided filtering and long short-term memory mechanism, these two modules can effectively reduce the information loss while removing the feature noise, and retain the valid information. Numerous experimental results on various datasets indicate that the proposed MDGF-Net achieves superior reconstruction performance compared to several state-of-the-art DUNs. Our codes are publicly available at https://github.com/mdcnn/MDGF-Net-Plus.
... One is to separate the RGB channels of a color image and then perform sampling and reconstruction channel by channel. At present, most encryption algorithms and deep learning compressed sensing methods process color images in this way (Su & Lian, 2020), but channel-by-channel sampling and reconstruction ignores the differences between channels and cannot obtain ideal reconstruction results. Another scheme is to separate the channels of color images and construct a measurement matrix that realizes cross-channel sampling. ...
... As above, to further evaluate the performance of the proposed CS framework, we conduct experimental comparisons against some recent deep network-based CS methods that use a Gaussian random sampling matrix. Specifically, we compare our proposed DUN-CSNet with twelve recent random matrix-based deep CS reconstruction algorithms, including five deep black box CS networks (ReconNet [21], I-Recon [36], DR²-Net [20], DPA-Net [38] and NL-CSNet [54]) and seven deep unfolding CS networks (IRCNN [32], LDAMP [26], ISTA-Net+ [27], DPDNN [66], NN [67], MAC-Net [65] and iPiano-Net [68]). In our experiments, the orthogonalized Gaussian random matrix [38], [52] is utilized, and during the training process, the predefined random sampling matrix remains unchanged. ...
Article
Full-text available
Inspired by certain optimization solvers, the deep unfolding network (DUN) has attracted much attention in recent years for image compressed sensing (CS). However, there still exist the following two issues: 1) In existing DUNs, most hyperparameters are usually content independent, which greatly limits their adaptability for different input contents. 2) In each iteration, a plain convolutional neural network is usually adopted, which weakens the perception of wider context prior and therefore depresses the expressive ability. In this paper, inspired by the traditional Proximal Gradient Descent (PGD) algorithm, a novel DUN for image compressed sensing (dubbed DUN-CSNet) is proposed to solve the above two issues. Specifically, for the first issue, a novel content adaptive gradient descent network is proposed, in which a well-designed step size generation sub-network is developed to dynamically allocate the corresponding step sizes for different textures of input image by generating a content-aware step size map, realizing a content-adaptive gradient updating. For the second issue, considering the fact that many similar patches exist in an image but have undergone a deformation, a novel deformation-invariant non-local proximal mapping network is developed, which can adaptively build the long-range dependencies between the nonlocal patches by deformation-invariant non-local modeling, leading to a wider perception on context priors. Extensive experiments manifest that the proposed DUN-CSNet outperforms existing state-of-the-art CS methods by large margins.
... where Θ denotes the trainable parameters and L(x̂_j, x_j) represents the loss function of the estimated clean image x̂_j with respect to the original image x_j. In the community of compressive sensing, DUN-based methods usually integrate some effective convolutional neural network (CNN) denoisers into some optimization methods, e.g., the proximal gradient descent (PGD) algorithm [7,9,10,39,41,53,54], approximate message passing (AMP) [60], and the inertial proximal algorithm for nonconvex optimization (iPiano) [43]. Different optimization methods lead to different optimization-inspired DUNs. ...
Preprint
Full-text available
By integrating certain optimization solvers with deep neural networks, deep unfolding network (DUN) with good interpretability and high performance has attracted growing attention in compressive sensing (CS). However, existing DUNs often improve the visual quality at the price of a large number of parameters and have the problem of feature information loss during iteration. In this paper, we propose an Optimization-inspired Cross-attention Transformer (OCT) module as an iterative process, leading to a lightweight OCT-based Unfolding Framework (OCTUF) for image CS. Specifically, we design a novel Dual Cross Attention (Dual-CA) sub-module, which consists of an Inertia-Supplied Cross Attention (ISCA) block and a Projection-Guided Cross Attention (PGCA) block. The ISCA block introduces multi-channel inertia forces and increases the memory effect via a cross attention mechanism between adjacent iterations. The PGCA block achieves enhanced information interaction by introducing the inertia force into the gradient descent step through a cross attention block. Extensive CS experiments manifest that our OCTUF achieves superior performance compared to state-of-the-art methods with lower training complexity. Codes are available at https://github.com/songjiechong/OCTUF.
... To ensure fairness, we evaluate the reconstruction performance with two widely used quality evaluation metrics, PSNR and SSIM, at various sampling ratios. [25], I-Recon [43], DR²-Net [46] and DPA-Net [41]) and seven DUNs (IRCNN [53], LD-AMP [32], ISTA-Net+ [50], DPDNN [13], NN [19], MAC-Net [9] and iPiano-Net [40]). These compared methods are trained with the same experimental configurations as [39]. ...
Preprint
Full-text available
By integrating certain optimization solvers with deep neural network, deep unfolding network (DUN) has attracted much attention in recent years for image compressed sensing (CS). However, there still exist several issues in existing DUNs: 1) For each iteration, a simple stacked convolutional network is usually adopted, which apparently limits the expressiveness of these models. 2) Once the training is completed, most hyperparameters of existing DUNs are fixed for any input content, which significantly weakens their adaptability. In this paper, by unfolding the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), a novel fast hierarchical DUN, dubbed FHDUN, is proposed for image compressed sensing, in which a well-designed hierarchical unfolding architecture is developed to cooperatively explore richer contextual prior information in multi-scale spaces. To further enhance the adaptability, series of hyperparametric generation networks are developed in our framework to dynamically produce the corresponding optimal hyperparameters according to the input content. Furthermore, due to the accelerated policy in FISTA, the newly embedded acceleration module makes the proposed FHDUN save more than 50% of the iterative loops against recent DUNs. Extensive CS experiments manifest that the proposed FHDUN outperforms existing state-of-the-art CS methods, while maintaining fewer iterations.
... Deep unfolding networks (DUNs) have been proposed to solve various image inverse problems [33,34,35,32,36,37]. For CS and compressive sensing MRI (CS-MRI) tasks, DUNs usually combine convolutional neural network (CNN) denoisers with some optimization algorithms, like alternating minimization (AM) [38,39,40], half quadratic splitting (HQS) [41,42,43], iterative shrinkage-thresholding algorithm (ISTA) [44,25,26,28], alternating direction method of multipliers (ADMM) [45] and inertial proximal algorithm for nonconvex optimization (iPiano) [46]. Although existing DUNs benefit from well-defined interpretability, their inherent design of image-domain-based unfolding limits the feature transmission capability. ...
Preprint
Full-text available
Mapping optimization algorithms into neural networks, deep unfolding networks (DUNs) have achieved impressive success in compressive sensing (CS). From the perspective of optimization, DUNs inherit a well-defined and interpretable structure from iterative steps. However, from the viewpoint of neural network design, most existing DUNs are inherently established based on traditional image-domain unfolding, which takes one-channel images as inputs and outputs between adjacent stages, resulting in insufficient information transmission capability and inevitable loss of the image details. In this paper, to break the above bottleneck, we first propose a generalized dual-domain optimization framework, which is general for inverse imaging and integrates the merits of both (1) image-domain and (2) convolutional-coding-domain priors to constrain the feasible region in the solution space. By unfolding the proposed framework into deep neural networks, we further design a novel Dual-Domain Deep Convolutional Coding Network (D3C2-Net) for CS imaging with the capability of transmitting high-throughput feature-level image representation through all the unfolded stages. Experiments on natural and MR images demonstrate that our D3C2-Net achieves higher performance and better accuracy-complexity trade-offs than other state-of-the-arts.
... On the other hand, using image measurements only in the initial reconstruction limits the final reconstruction performance. To address these problems, deep iteration-inspired methods build a cascaded network architecture based on the traditional iterative optimization algorithm framework, so that they have better interpretability and reconstruction quality than deep black box methods [20-26]. All the above-mentioned studies use BCS sampling to reduce the complexity of the encoder and decoder. ...
Article
Full-text available
Block compressed sensing (BCS), which is widely used in compressed image sensing (CIS), brings the advantages of lower complexity for sampling and reconstruction. However, it results in redundant sampling of smooth blocks and insufficient sampling of texture blocks, which makes the reconstruction quality of image details poor. To address this problem, we propose a novel Hybrid Sampling and Gradient Attention Network for CIS, dubbed HSGANet. In HSGANet, a new linear sampling strategy, called hybrid BCS (HBCS) sampling, is proposed to realize BCS sampling with balanced information entropy for sampling blocks. Specifically, HBCS is composed of the proposed block permutation sampling (BPS) and BCS sampling, where the BPS is used to increase the proportion of texture block information in the image measurement, which is realized by BCS sampling following image block permutation. Furthermore, a selection algorithm is developed to achieve optimal image information balanced permutation. To match HBCS, we design the initial reconstruction fusion sub-network and the deep reconstruction sub-network, which is constructed by cascading GA-Blocks with a gradient attention sub-network. In each phase, the gradient attention sub-network can achieve pixel-level adaptive fusion of the gradient maps obtained by minimizing the measurement errors of BPS and BCS. Extensive experimental results show that our HSGANet achieves a great improvement in reconstruction accuracy over the state-of-the-art methods with comparable running speed and model complexity.
... Many of the first-order algorithms that we discussed are now enhanced by CNNs using unrolling and reincarnated as learning-based methods. For example, FISTAnet [171], ADMM-net [172], learned primal-dual reconstruction [173], iPiano-net [174], SGD-net [175], and many others [176] are obtained in this manner from their namesake first-order algorithms. ...
Article
Full-text available
The past decade has seen the flourish of model based image reconstruction (MBIR) algorithms, which are often applications or adaptations of convex optimization algorithms from the optimization community. We review some state-of-the-art algorithms that have enjoyed wide popularity in medical image reconstruction, emphasize known connections between different algorithms, and discuss practical issues such as computation and memory cost. More recently, deep learning (DL) has forayed into medical imaging, where the latest development tries to exploit the synergy between DL and MBIR to elevate the MBIR's performance. We present existing approaches and emerging trends in DL-enhanced MBIR methods, with particular attention to the underlying role of convexity and convex algorithms on network architecture. We also discuss how convexity can be employed to improve the generalizability and representation power of DL networks in general.
Article
By mapping iterative optimization algorithms into neural networks (NNs), deep unfolding networks (DUNs) exhibit well-defined and interpretable structures and achieve remarkable success in the field of compressive sensing (CS). However, most existing DUNs solely rely on the image-domain unfolding, which restricts the information transmission capacity and reconstruction flexibility, leading to their loss of image details and unsatisfactory performance. To overcome these limitations, this paper develops a dual-domain optimization framework that combines the priors of (1) image- and (2) convolutional-coding-domains and offers generality to CS and other inverse imaging tasks. By converting this optimization framework into deep NN structures, we present a Dual-Domain Deep Convolutional Coding Network (D³C²-Net), which enjoys the ability to efficiently transmit high-capacity self-adaptive convolutional features across all its unfolded stages. Our theoretical analyses and experiments on simulated and real captured data, covering 2D and 3D natural, medical, and scientific signals, demonstrate the effectiveness, practicality, superior performance, and generalization ability of our method over other competing approaches and its significant potential in achieving a balance among accuracy, complexity, and interpretability. Code is available at https://github.com/lwq20020127/D3C2-Net.
Article
For solving linear inverse problems, particularly of the type that appears in tomographic imaging and compressive sensing, this paper develops two new approaches. The first approach is an iterative algorithm that minimizes a regularized least squares objective function where the regularization is based on a compound Gaussian prior distribution. The compound Gaussian prior subsumes many of the commonly used priors in image reconstruction, including those of sparsity-based approaches. The developed iterative algorithm gives rise to the paper’s second new approach, which is a deep neural network that corresponds to an “unrolling” or “unfolding” of the iterative algorithm. Unrolled deep neural networks have interpretable layers and outperform standard deep learning methods. This paper includes a detailed computational theory that provides insight into the construction and performance of both algorithms. The conclusion is that both algorithms outperform other state-of-the-art approaches to tomographic image formation and compressive sensing, especially in the difficult regime of low training.
Article
As a kind of network structure increasingly studied in compressive sensing, deep unfolding networks (DUNs), which unroll the iterative reconstruction procedure as DNNs for end-to-end training, have high interpretability and remarkable performance. Every phase of the DUN corresponds to one iteration. The input and output of each phase in most DUNs are inherently images, which heavily restricts information transmission. Besides, existing DUNs unfolded by ℓ₁-regularized optimization usually utilize fixed thresholds for soft-shrinkage operation, which lacks adaptability. To solve these issues, a novel Side-infOrmation-aided Deep Adaptive Shrinkage Network (SODAS-Net) is designed for compressive sensing. Utilizing the side information (SI) allows SODAS-Net to send large volumes of information between adjacent phases, substantially augmenting the network representation capacity and optimizing network performance. Furthermore, an effective adaptive soft-shrinkage strategy is developed, which enables our SODAS-Net to solve ℓ₁-regularized proximal mapping with content-aware thresholds. The results from extensive experiments on various testing datasets demonstrate that SODAS-Net achieves superior performance. Codes are available at https://github.com/songjiechong/SODAS-Net.
Article
Deep learning-based image compressive sensing (CS) methods have achieved great success in the past few years. However, most of them are content-independent, with a spatially uniform sampling rate allocation for the entire image. Such practices may potentially degrade the performance of image CS with block-based sampling, since the content of different blocks in an image is different. In this paper, we propose a novel rate-adaptive image CS neural network (dubbed RACSNet) to achieve adaptive sampling rate allocation based on the content characteristics of the image with a single model. Specifically, a measurement domain-based reconstruction distortion is first used to guide the sampling rate allocation for different blocks in an image without access to the ground truth image. Then, a step-wise training strategy is designed to train a reusable sampling matrix, which is capable of sampling image blocks to generate the compressed measurements under arbitrary sampling rates. Subsequently, a pyramid-shaped initial reconstruction sub-network and a hierarchical deep reconstruction sub-network that fuse the measurement information of different scales are put forward to reconstruct image blocks from the compressed measurements. Finally, a reconstruction distortion map and an improved loss function are developed to eliminate the blocking artifacts and further enhance the CS reconstruction. Experimental results on both objective metrics and subjective visual qualities show that the proposed RACSNet achieves significant improvements over the state-of-the-art methods.
Article
Deep unfolding network (DUN) that unfolds the optimization algorithm into a deep neural network has achieved great success in compressive sensing (CS) due to its good interpretability and high performance. Each stage in DUN corresponds to one iteration in optimization. At the test time, all the sampling images generally need to be processed by all stages, which comes at a price of computation burden and is also unnecessary for the images whose contents are easier to restore. In this paper, we focus on CS reconstruction and propose a novel Dynamic Path-Controllable Deep Unfolding Network (DPC-DUN). DPC-DUN with our designed path-controllable selector can dynamically select a rapid and appropriate route for each image and is slimmable by regulating different performance-complexity tradeoffs. Extensive experiments show that our DPC-DUN is highly flexible and can provide excellent performance and dynamic adjustment to get a suitable tradeoff, thus addressing the main requirements to become appealing in practice. Codes are available at https://github.com/songjiechong/DPC-DUN .
Article
As an emerging paradigm for signal acquisition and reconstruction, compressive sensing (CS) achieves high-speed sampling and compression jointly and has found its way into many applications. With the fast growth of deep learning in computer vision, various methods of applying neural networks (NNs) in CS imaging tasks have been proposed. One category of them, named the deep unrolling network, is inspired by the physical sampling model and combines the merits of both optimization model- and data-driven methods, becoming the mainstream of this realm. In this review article, we first review the inverse imaging model and optimization algorithms encountered in the CS research and then provide the recent representative developments of CS networks, which are grouped into deep physics-free and physics-inspired approaches with respect to the utilization of sampling matrix and measurement information. Following this, we analyze the conceptual connections and relationships among various existing methods and present our perspectives on recent advances and trends for future research.
Article
The problem of phase retrieval has attracted researchers for many years, due to its wide range of applications. Phase retrieval aims to recover an unknown signal from phase-free measurements. Classical alternating projection algorithms have the significant advantages of simplicity and few fine-tuning parameters. However, they suffer from non-convexity and often get stuck in local minima in the presence of noise disturbance. In this work, we develop an efficient hybrid model-based and data-driven approach to solve the phase retrieval problem with deep priors. To effectively utilize the inherent image priors, we propose a deep non-iterative (unfolded) network based on the classic HIO method, referred to as HIONet, which can adaptively learn inherent priors from the true data distribution. In particular, we replace the projection operator with a trainable deep network, so that learning a parameterized function with weights in a supervised manner is equivalent to learning prior knowledge from the true data distribution. In turn, the deep priors learned during training enable the unfolded network to obtain the optimal solution for the phase retrieval problem. In the pipeline of our method, deep priors are incorporated with the physical image formation algorithm, so that the proposed HIONet benefits from the representational capabilities of deep networks, as well as the interpretability and versatility of the traditional well-established algorithms. Moreover, inspired by compounding and aggregating diverse representations to benefit the network for more accurate inference, an enhanced version with cross-block feature fusion, referred to as HIONet⁺, is designed to further improve the reconstruction. Extensive experimental results on noisy phase-free measurements show that the developed methods outperform the competitors in terms of quantitative metrics such as PSNR, SSIM and visual effects at all noise levels. In addition, non-oversampling sparse phase retrieval experiments consistently demonstrate that our methods outperform compared methods.
Article
Block compressed sensing (BCS) is effective to process high-dimensional images or videos. Due to the block-wise sampling, most BCS methods only exploit local block priors and neglect inherent global image priors, thus resulting in blocky artifacts. To ameliorate this issue, this paper formulates a novel regularized optimization BCS model named BCS-LG to effectively characterize the complementarity of local and global priors. To be tractable, the data-fidelity and regularization terms in BCS-LG are flexibly decoupled with the aid of half quadratic splitting algorithm. Taking the merits of the interpretability of traditional iterative optimization methods and the powerful representation ability of deep learning based ones, the corresponding iterative algorithm of BCS-LG is further unfolded into an interpretable optimization inspired multi-stage progressive reconstruction network abbreviated as LG-Net. In the block-wise and image-level manners, an accelerated proximal gradient inspired sub-network and a global prior induced tiny U-type sub-network are alternately designed. In addition, a single model is trained to address the CS reconstruction with several measurement ratios. Extensive experiments on four benchmark datasets indicate that the proposed approach can effectively eliminate blocky artifacts. Meanwhile, it substantially outperforms existing CS reconstruction methods in terms of Peak Signal to Noise Ratio, Structural SIMilarity and visual effect.
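
The half quadratic splitting step mentioned in this abstract decouples the data-fidelity and regularization terms in the standard way: an auxiliary variable z is introduced and the two resulting sub-problems are solved alternately. Schematically (a generic form, not the exact BCS-LG equations):

\[
\min_{x,z}\ \tfrac{1}{2}\|\Phi x - y\|_2^2 + \lambda \mathcal{R}(z) + \tfrac{\mu}{2}\|x - z\|_2^2,
\]
\[
x^{k+1} = \arg\min_{x}\ \tfrac{1}{2}\|\Phi x - y\|_2^2 + \tfrac{\mu}{2}\|x - z^{k}\|_2^2,
\qquad
z^{k+1} = \arg\min_{z}\ \lambda \mathcal{R}(z) + \tfrac{\mu}{2}\|x^{k+1} - z\|_2^2 ,
\]

where the z-update is exactly a proximal (denoising) step, which is why a learned sub-network can be substituted for it.
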
Article
Full-text available
The signal degradation due to Poisson noise is a common problem in the low‐light imaging field. Recently, deep learning employing the convolution neural network for image denoising has drawn considerable attention owing to its favourable denoising performance. On the basis of the fact that the reconstruction of corrupted pixels can be facilitated by the context information in image denoising, the authors propose a deep multi‐scale cross‐path concatenation residual network (MC²RNet) which incorporates cross‐path concatenation modules for Poisson denoising. Multiple paths are achieved by the cross‐path concatenation operation and the skip connection. As a consequence, multi‐scale context representations of images under different receptive fields can be learnt by MC²RNet. With the residual learning strategy, MC²RNet learns the residual between the noisy image and the latent clean image rather than the direct mapping to facilitate model training. Notably, unlike existing discriminative Poisson denoising algorithms that train a model only for a specific noise level, the authors aim to train a single model for handling Poisson noise with different levels, i.e. blind Poisson denoising. Quantitative experiments demonstrate that the proposed model is superior to the state‐of‐the‐art Poisson denoising approaches in terms of peak signal‐to‐noise ratio and visual effect.
Article
Full-text available
In this paper we have described ReconNet -- a non-iterative algorithm for CS image reconstruction based on CNNs. The advantages of this algorithm are two-fold: it can be easily implemented and is 3 orders of magnitude faster than traditional iterative algorithms, essentially making reconstruction real-time, and it provides excellent reconstruction quality, retaining rich semantic information over a large range of measurement rates. We have also discussed novel ways to improve the basic version of our algorithm. We have proposed learning the measurement matrix jointly with the reconstruction network as well as training with adversarial loss based on recently popular GANs. In both cases, we have shown significant improvements in reconstruction quality over a range of measurement rates. Using the ReconNet + KCF pipeline, efficient real-time tracking is possible using CS measurements even at a very low measurement rate of 0.01. This also means that other high-level inference applications such as image recognition can be performed using a similar framework, i.e., ReconNet + Recognition from CS measurements. We hope that this work will generate more interest in building practical real-world devices and applications for compressive imaging.
Conference Paper
Full-text available
While variational methods have been among the most powerful tools for solving linear inverse problems in imaging, deep (convolutional) neural networks have recently taken the lead in many challenging benchmarks. A remaining drawback of deep learning approaches is that they require an expensive retraining whenever the specific problem, the noise level, noise type, or desired measure of fidelity changes. On the contrary, variational methods have a plug-and-play nature as they usually consist of separate data fidelity and regularization terms. In this paper we study the possibility of replacing the proximal operator of the regularization used in many convex energy minimization algorithms by a denoising neural network. The latter therefore serves as an implicit natural image prior, while the data term can still be chosen arbitrarily. Using a fixed denoising neural network in exemplary problems of image deconvolution with different blur kernels and image demosaicking, we obtain state-of-the-art results. Additionally, we discuss novel results on the analysis of possible convex optimization algorithms to incorporate the network into, as well as the choices of algorithm parameters and their relation to the noise level the neural network is trained on.
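
The idea of replacing the proximal operator of the regularizer by a denoising neural network, as described in this abstract, can be sketched in a few lines. The denoiser is any image-to-image callable, and the operator names, step size, and iteration count below are illustrative, not a specific library's API.

```python
import numpy as np

def pnp_gradient_descent(y, A, At, denoiser, shape, step=1.0, iters=50):
    """Plug-and-play proximal gradient sketch: a gradient step on the data
    fidelity 0.5*||A x - y||^2 followed by a denoiser acting as the (implicit)
    proximal operator of the image prior.

    A, At : forward and adjoint operators as callables on flat vectors
    """
    x = At(y).reshape(shape)                          # linear initialization
    for _ in range(iters):
        grad = At(A(x.ravel()) - y).reshape(shape)    # data-fidelity gradient
        x = denoiser(x - step * grad)                 # denoiser replaces prox
    return x
```

The same loop recovers classical proximal gradient descent when denoiser is the true proximal mapping of a convex regularizer.
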
Article
Full-text available
Traditional patch-based sparse representation modeling of natural images usually suffers from two problems. First, it has to solve a large-scale optimization problem with high computational complexity in dictionary learning. Second, each patch is considered independently in dictionary learning and sparse coding, which ignores the relationship among patches, resulting in inaccurate sparse coding coefficients. In this paper, instead of using patch as the basic unit of sparse representation, we exploit the concept of group as the basic unit of sparse representation, which is composed of nonlocal patches with similar structures, and establish a novel sparse representation modeling of natural images, called group-based sparse representation (GSR). The proposed GSR is able to sparsely represent natural images in the domain of group, which enforces the intrinsic local sparsity and nonlocal self-similarity of images simultaneously in a unified framework. Moreover, an effective self-adaptive dictionary learning method for each group with low complexity is designed, rather than dictionary learning from natural images. To make GSR tractable and robust, a split Bregman based technique is developed to solve the proposed GSR-driven minimization problem for image restoration efficiently. Extensive experiments on image inpainting, image deblurring and image compressive sensing recovery manifest that the proposed GSR modeling outperforms many current state-of-the-art schemes in both PSNR and visual perception.
Article
Full-text available
In this paper we study an algorithm for solving a minimization problem composed of a differentiable (possibly non-convex) and a convex (possibly non-differentiable) function. The algorithm iPiano combines forward-backward splitting with an inertial force. It can be seen as a non-smooth split version of the Heavy-ball method from Polyak. A rigorous analysis of the algorithm for the proposed class of problems yields global convergence of the function values and the arguments. This makes the algorithm robust for usage on non-convex problems. The convergence result is obtained based on the Kurdyka-Łojasiewicz (KL) inequality. This is a very weak restriction, which was used to prove convergence for several other gradient methods. First, an abstract convergence theorem for a generic algorithm is proved, and then iPiano is shown to satisfy the requirements of this theorem. Furthermore, a convergence rate is established for the general problem class. We demonstrate iPiano on computer vision problems: image denoising with learned priors and diffusion based image compression.
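
For reference, the iPiano iteration summarized in this abstract, for a composite objective f(x) + g(x) with f smooth (possibly nonconvex) and g convex (possibly nonsmooth), is a forward-backward step augmented with a heavy-ball inertial term:

\[
x^{k+1} = \operatorname{prox}_{\alpha_k g}\big(x^{k} - \alpha_k \nabla f(x^{k}) + \beta_k\,(x^{k} - x^{k-1})\big),
\]

where α_k is the step size and β_k the inertial weight. This is the update that iPiano-Net unfolds, with the proximal operator replaced by a learned network.
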
Article
Full-text available
A number of techniques for the compressed sensing of imagery are surveyed. Various imaging media are considered, including still images, motion video, as well as multiview image sets and multiview video. A particular emphasis is placed on block-based compressed sensing due to its advantages in terms of both lightweight reconstruction complexity as well as a reduced memory burden for the random-projection measurement operator. For multiple-image scenarios, including video and multiview imagery, motion and disparity compensation is employed to exploit frame-to-frame redundancies due to object motion and parallax, resulting in residual frames which are more compressible and thus more easily reconstructed from compressed-sensing measurements. Extensive experimental comparisons evaluate various prominent reconstruction algorithms for still-image, motion-video, and multiview scenarios in terms of both reconstruction quality as well as computational complexity.
Article
Full-text available
Compressive sensing (CS) is an alternative to Shannon/Nyquist sampling for the acquisition of sparse or compressible signals that can be well approximated by just K ≪ N elements from an N-dimensional basis. Instead of taking periodic samples, CS measures inner products with M < N random vectors and then recovers the signal via a sparsity-seeking optimization or greedy algorithm. Standard CS dictates that robust signal recovery is possible from M = O(K log(N/K)) measurements. It is possible to substantially decrease M without sacrificing robustness by leveraging more realistic signal models that go beyond simple sparsity and compressibility by including structural dependencies between the values and locations of the signal coefficients. This paper introduces a model-based CS theory that parallels the conventional theory and provides concrete guidelines on how to create model-based recovery algorithms with provable performance guarantees. A highlight is the introduction of a new class of structured compressible signals along with a new sufficient condition for robust structured compressible signal recovery that we dub the restricted amplification property, which is the natural counterpart to the restricted isometry property of conventional CS. Two examples integrate two relevant signal models (wavelet trees and block sparsity) into two state-of-the-art CS recovery algorithms and prove that they offer robust recovery from just M = O(K) measurements. Extensive numerical simulations demonstrate the validity and applicability of our new theory and algorithms.
Article
Full-text available
We propose a novel image denoising strategy based on an enhanced sparse representation in transform domain. The enhancement of the sparsity is achieved by grouping similar 2-D image fragments (e.g., blocks) into 3-D data arrays which we call "groups." Collaborative filtering is a special procedure developed to deal with these 3-D groups. We realize it using the three successive steps: 3-D transformation of a group, shrinkage of the transform spectrum, and inverse 3-D transformation. The result is a 3-D estimate that consists of the jointly filtered grouped image blocks. By attenuating the noise, the collaborative filtering reveals even the finest details shared by grouped blocks and, at the same time, it preserves the essential unique features of each individual block. The filtered blocks are then returned to their original positions. Because these blocks are overlapping, for each pixel, we obtain many different estimates which need to be combined. Aggregation is a particular averaging procedure which is exploited to take advantage of this redundancy. A significant improvement is obtained by a specially developed collaborative Wiener filtering. An algorithm based on this novel denoising strategy and its efficient implementation are presented in full detail; an extension to color-image denoising is also developed. The experimental results demonstrate that this computationally scalable algorithm achieves state-of-the-art denoising performance in terms of both peak signal-to-noise ratio and subjective visual quality.
Article
Full-text available
Humans are visual animals, and imaging sensors that extend our reach – cameras – have improved dramatically in recent times thanks to the introduction of CCD and CMOS digital technology. Consumer digital cameras in the mega-pixel range are now ubiquitous thanks to the happy coincidence that the semiconductor material of choice for large-scale electronics integration (silicon) also happens to readily convert photons at visual wavelengths into electrons. On the contrary, imaging at wavelengths where silicon is blind is considerably more complicated, bulky, and expensive. Thus, for comparable resolution, a $500 digital camera for the visible becomes a $50,000 camera for the infrared. In this paper, we present a new approach to building simpler, smaller, and cheaper digital cameras that can operate efficiently across a much broader spectral range than conventional silicon-based cameras. Our approach fuses a new camera architecture based on a digital micromirror device (DMD – see Sidebar: Spatial Light Modulators) with the new mathematical theory and algorithms of compressive sampling (CS – see Sidebar: Compressive Sampling in a Nutshell). CS combines sampling and compression into a single nonadaptive linear measurement process [1–4]. Rather than measuring pixel samples of the scene under view, we measure inner products
Article
Full-text available
The purpose of this paper is to investigate neural network capability systematically. The main results are: 1) every Tauber-Wiener function is qualified as an activation function in the hidden layer of a three-layered neural network; 2) for a continuous function in S'(R^1) to be a Tauber-Wiener function, the necessary and sufficient condition is that it is not a polynomial; 3) the capability of approximating nonlinear functionals defined on some compact set of a Banach space and nonlinear operators has been shown; and 4) the possibility by neural computation to approximate the output as a whole (not at a fixed point) of a dynamical system, thus identifying the system.
Article
Full-text available
This report demonstrates theoretically and empirically that a greedy algorithm called Orthogonal Matching Pursuit (OMP) can reliably recover a signal with m nonzero entries in dimension d given O(m ln d) random linear measurements of that signal. This is a massive improvement over previous results, which require O(m^2) measurements. The new results for OMP are comparable with recent results for another approach called Basis Pursuit (BP). In some settings, the OMP algorithm is faster and easier to implement, so it is an attractive alternative to BP for signal recovery problems.
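To make the greedy selection rule concrete, here is a minimal numpy sketch of OMP (illustrative, not the authors' code): at each step it picks the column most correlated with the current residual and re-fits all selected coefficients by least squares.

```python
import numpy as np

def omp(A, y, m, tol=1e-10):
    """Orthogonal Matching Pursuit: greedily pick the column of A most
    correlated with the residual, then re-fit the support by least squares."""
    n = A.shape[1]
    support, x = [], np.zeros(n)
    r = y.copy()
    for _ in range(m):
        k = int(np.argmax(np.abs(A.T @ r)))          # best-matching atom
        if k not in support:
            support.append(k)
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x = np.zeros(n)
        x[support] = coef
        r = y - A @ x                                # updated residual
        if np.linalg.norm(r) < tol:
            break
    return x

# Example: recover an m-sparse vector from O(m log d) Gaussian measurements.
rng = np.random.default_rng(1)
d, m = 256, 8
n_meas = 4 * m * int(np.log(d))
A = rng.standard_normal((n_meas, d)) / np.sqrt(n_meas)
x_true = np.zeros(d)
x_true[rng.choice(d, m, replace=False)] = rng.standard_normal(m)
x_hat = omp(A, A @ x_true, m)
```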
Article
Full-text available
Suppose x is an unknown vector in R^m (a digital image or signal); we plan to measure n general linear functionals of x and then reconstruct. If x is known to be compressible by transform coding with a known transform, and we reconstruct via the nonlinear procedure defined here, the number of measurements n can be dramatically smaller than the size m. Thus, certain natural classes of images with m pixels need only n = O(m^(1/4) log^(5/2)(m)) nonadaptive nonpixel samples for faithful recovery, as opposed to the usual m pixel samples. More specifically, suppose x has a sparse representation in some orthonormal basis (e.g., wavelet, Fourier) or tight frame (e.g., curvelet, Gabor), so the coefficients belong to an ℓp ball for 0 < p ≤ 1. The N most important coefficients in that expansion allow reconstruction with ℓ2 error O(N^(1/2 - 1/p)). It is possible to design n = O(N log(m)) nonadaptive measurements allowing reconstruction with accuracy comparable to that attainable with direct knowledge of the N most important coefficients. Moreover, a good approximation to those N important coefficients is extracted from the n measurements by solving a linear program, namely Basis Pursuit in signal processing. The nonadaptive measurements have the character of "random" linear combinations of basis/frame elements. Our results use the notions of optimal recovery, of n-widths, and information-based complexity. We estimate the Gel'fand n-widths of ℓp balls in high-dimensional Euclidean space in the case 0 < p ≤ 1, and give a criterion identifying near-optimal subspaces for Gel'fand n-widths. We show that "most" subspaces are near-optimal, and show that convex optimization (Basis Pursuit) is a near-optimal way to extract information derived from these near-optimal subspaces.
Article
The performance of computer vision algorithms can severely degrade in the presence of a variety of distortions. While image enhancement algorithms have evolved to optimize image quality as measured according to human visual perception, their relevance in maximizing the success of computer vision algorithms operating on the enhanced image has been much less investigated. We consider the problem of image enhancement to combat Gaussian noise and low resolution with respect to the specific application of image retrieval from a dataset. We define the notion of image quality as determined by the success of image retrieval and design a deep convolutional neural network (CNN) to predict this quality. This network is then cascaded with a deep CNN designed for image denoising or super resolution, allowing for optimization of the enhancement CNN to maximize retrieval performance. This framework allows us to couple enhancement to the retrieval problem. We also consider the problem of adapting image features for robust retrieval performance in the presence of distortions. We show through experiments on distorted images of the Oxford and Paris buildings datasets that our algorithms yield improved mean average precision when compared to using enhancement methods that are oblivious to the task of image retrieval.
Article
Magnetic resonance imaging (MRI) reconstruction is an active inverse problem which can be addressed by conventional compressed sensing (CS) MRI algorithms that exploit the sparse nature of MRI in an iterative optimization-based manner. However, two main drawbacks of iterative optimization-based CSMRI methods are that they are time-consuming and limited in model capacity. Meanwhile, one main challenge for recent deep learning-based CSMRI is the trade-off between model performance and network size. To address the above issues, we develop a new multi-scale dilated network for MRI reconstruction with high speed and outstanding performance. Compared to convolutional kernels with the same receptive fields, dilated convolutions reduce network parameters by using smaller kernels while expanding their receptive fields to capture almost the same information. To maintain the abundance of features, we present global and local residual learning to extract more image edges and details. Then we utilize concatenation layers to fuse multi-scale features and residual learning for better reconstruction. Compared with several non-deep and deep learning CSMRI algorithms, the proposed method yields better reconstruction accuracy and noticeable visual improvements. In addition, we evaluate the model in a noisy setting to verify its stability, and then extend the proposed model to an MRI super-resolution task.
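As an illustration of the dilation idea, the following PyTorch sketch (purely illustrative, not the paper's architecture) shows a residual block whose 3x3 convolutions use dilation to widen the receptive field at the same parameter count as ordinary 3x3 kernels.

```python
import torch
import torch.nn as nn

class DilatedResBlock(nn.Module):
    """Two dilated 3x3 convolutions with a local residual connection: same
    parameter count as ordinary 3x3 kernels, but a wider receptive field."""
    def __init__(self, channels=64, dilation=2):
        super().__init__()
        pad = dilation  # keeps the spatial size unchanged for 3x3 kernels
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=pad, dilation=dilation),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=pad, dilation=dilation),
        )

    def forward(self, x):
        return x + self.body(x)   # local residual learning

x = torch.randn(1, 64, 32, 32)
y = DilatedResBlock()(x)          # same spatial size, larger effective receptive field
```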
Article
In the study of compressed sensing (CS), the two main challenges are the design of the sampling matrix and the development of the reconstruction method. On the one hand, the commonly used random sampling matrices (e.g., GRM) are signal independent, which ignores the characteristics of the signal. On the other hand, the state-of-the-art image CS methods (e.g., GSR and MH) achieve quite good performance, but at much higher computational complexity. To deal with the two challenges, we propose an image CS framework using a convolutional neural network (dubbed CSNet) that includes a sampling network and a reconstruction network, which are optimized jointly. The sampling network adaptively learns the sampling matrix from the training images, which makes the CS measurements retain more image structural information for better reconstruction. Specifically, three types of sampling matrices are learned, i.e., a floating-point matrix, a {0,1}-binary matrix, and a {-1,+1}-bipolar matrix. The last two matrices are specially designed for easy storage and hardware implementation. The reconstruction network, which contains a linear initial reconstruction network and a non-linear deep reconstruction network, learns an end-to-end mapping between the CS measurements and the reconstructed images. Experimental results demonstrate that CSNet offers state-of-the-art reconstruction quality while achieving fast running speed. In addition, CSNet with the {0,1}-binary matrix and the {-1,+1}-bipolar matrix achieves performance comparable to existing deep learning based CS methods, and outperforms the traditional CS methods. Moreover, the experimental results further suggest that the learned sampling matrices can significantly improve traditional image CS reconstruction methods.
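The joint sampling/initial-reconstruction idea can be expressed compactly in PyTorch: a stride-B convolution acts as the learned block sampling matrix, and a 1x1 convolution followed by PixelShuffle acts as the linear initial reconstruction. The sketch below is a minimal illustration under these assumptions; the non-linear deep reconstruction network and the binary/bipolar constraints are omitted, and all names are illustrative.

```python
import torch
import torch.nn as nn

class LearnedBlockCS(nn.Module):
    """Learned block sampling plus linear initial reconstruction (sketch)."""
    def __init__(self, block=32, ratio=0.1):
        super().__init__()
        n_meas = max(1, int(ratio * block * block))
        # each output channel is one learned measurement of a BxB block
        self.sample = nn.Conv2d(1, n_meas, block, stride=block, bias=False)
        # linear initial reconstruction back to BxB pixels per block
        self.init_rec = nn.Conv2d(n_meas, block * block, 1, bias=False)
        self.fold = nn.PixelShuffle(block)

    def forward(self, x):
        y = self.sample(x)                 # CS measurements, one vector per block
        x0 = self.fold(self.init_rec(y))   # preliminary full-size image
        return y, x0

img = torch.randn(1, 1, 96, 96)
y, x0 = LearnedBlockCS(block=32, ratio=0.1)(img)   # x0 has the same size as img
```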
Article
Compressive sensing (CS) shows that a signal can be measured at a sampling rate lower than Nyquist and recovered from the obtained measurements. In most methods, the scene is recovered uniformly. However, some regions of a scene are more important than others, so we want these parts, the region of interest (ROI), to be recovered more precisely. In this paper, we propose an ROI-aware compressive sensing network (dubbed ROI-CSNet). The proposed network achieves higher reconstruction quality in the ROIs while retaining acceptable quality in the rest of the image, under the constraint of a fixed measurement budget. The sensing procedure consists of a preliminary sensing stage and an ROI sensing stage. In the preliminary stage, the full scene is measured in order to extract ROIs. After that, more sensing resources are allocated to the ROIs of the scene in the ROI sensing stage. For ROI recovery, we combine the measurements of these two stages to recover the ROI-aware image. Meanwhile, to further improve the recovery quality of the ROIs, we design an ROI-aware loss function which makes the network focus more on the reconstruction of the ROIs. The experimental results show that the proposed method outperforms uniform sampling methods in both recovery quality of ROIs and running speed, under the same measurement rate.
Article
In recent years, deep learning-based aesthetics assessment methods have shown promising results. However, existing methods can only achieve limited success because 1) most of the methods take one fixed-size patch as the training example, which loses the fine-grained details and the holistic layout information, and 2) most of the methods ignore ordinal issues in image aesthetic assessment, i.e., an image scored 5.3 is more likely to be in the high quality class than an image scored 4.5. To address these challenges, we present a novel convolutional network with two branches to encode global and local features. The first branch not only captures the spatial layout information but also feeds back the top-down neural attention. The second branch selects the important attended region to extract fine-detail features. A Sobel-based attention layer is integrated with the second branch to enhance fine-detail encoding. Regarding the second problem, we combine the strengths of the classification approach and the regression approach through a multi-task learning framework. Extensive experiments on the challenging Aesthetic and Visual Analysis (AVA) dataset and the Photo.net dataset indicate the effectiveness of the proposed method.
Article
Most traditional algorithms for compressive sensing image reconstruction suffer from intensive computation. Recently, deep learning-based reconstruction algorithms have been reported, which dramatically reduce the time complexity compared with iterative reconstruction algorithms. In this paper, we propose a novel Deep Residual Reconstruction Network (DR²-Net) to reconstruct the image from its Compressively Sensed (CS) measurement. The DR²-Net is proposed based on two observations: (1) linear mapping could reconstruct a high-quality preliminary image, and (2) residual learning could further improve the reconstruction quality. Accordingly, DR²-Net consists of two components, i.e., a linear mapping network and a residual network. Specifically, a fully-connected layer implements the linear mapping network. We then expand the linear mapping network to DR²-Net by adding several residual learning blocks to enhance the preliminary image. Extensive experiments demonstrate that DR²-Net outperforms traditional iterative methods and recent deep learning-based methods by large margins at measurement rates of 0.01, 0.04, 0.1, and 0.25. The code of DR²-Net has been released at: https://github.com/coldrainyht/caffe_dr2.
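The two-stage structure (a linear mapping followed by residual refinement) translates into a very small PyTorch sketch; the block size, layer widths, and class name below are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class DR2Style(nn.Module):
    """Sketch of the two-stage idea: a fully-connected linear mapping gives a
    preliminary block, and a small convolutional branch learns the residual."""
    def __init__(self, n_meas, block=33, width=64):
        super().__init__()
        self.block = block
        self.linear_map = nn.Linear(n_meas, block * block)   # preliminary block
        self.residual = nn.Sequential(                       # learned refinement
            nn.Conv2d(1, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, 1, 3, padding=1),
        )

    def forward(self, y):
        x0 = self.linear_map(y).view(-1, 1, self.block, self.block)
        return x0 + self.residual(x0)   # preliminary image + learned residual

y = torch.randn(4, 272)                 # roughly 0.25 * 33 * 33 measurements per block
x = DR2Style(n_meas=272)(y)             # reconstructed 33x33 blocks
```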
Article
The purpose of wireless video multicast is to send a video signal simultaneously to multiple heterogeneous users, each of whom desires video quality that matches its channel condition. Current compressed sensing (CS)-based wireless video multicast has some advantages but also some shortcomings, such as high computational complexity and low reconstruction quality, especially at high packet loss rates. This paper aims to improve current CS-based multicast, and proposes two deep compressed sensing networks for wireless video multicast, abbreviated as DCSN-Cast. We first consider a residual DCSN-Cast (DCSRN-Cast), which consists of two parts: a fully connected network and a deep residual network. The fully connected network takes the CS measurements as input and outputs a preliminary reconstructed image, while the deep residual network takes this preliminary reconstructed image and outputs the final reconstructed image. The second scheme is a fully connected deep neural network (DCSFCN-Cast) which prunes the convolutional neural network under consideration to reduce the complexity. Extensive experiments show that the proposed DCSN-Cast outperforms the state-of-the-art CS-based wireless video multicast methods, especially at high packet loss rates.
Article
A single image super-resolution (SR) algorithm that combines deep convolutional neural networks (CNNs) with multi-scale similarity is presented in this work. The aim of this method is to address the inability of existing CNN methods to exploit the information latent in the image itself. To mine this information, image patches that look similar within the same scale and across different scales are first searched for inside the input image. Subsequently, spatial transform networks (STNs) are embedded into the CNNs to align the similar patches well; the STNs give the CNNs the ability to spatially manipulate data. Finally, as SR is performed through the proposed pyramid-shaped CNNs, the high-resolution (HR) image is predicted gradually according to the complementary information provided by these aligned patches. The experimental results confirm the effectiveness of the proposed method and demonstrate that it is comparable to state-of-the-art approaches for single image SR.
Article
Speckle noise is one of the critical disturbances present in radar imagery. This noise degrades the quality of synthetic aperture radar (SAR) images and needs to be reduced before SAR images are used. This paper investigates a novel method for despeckling SAR images in the distributed compressed sensing (DCS) framework. A linear matrix-based formulation is developed for the received SAR raw data, and a compressive measurement and partitioning (CMP) scheme is proposed to collect and partition the SAR data into data subsets. Then, a DCS-based despeckling method is proposed in which all data subsets are jointly considered in image formation and noise reduction. One of the important features of the proposed method is that image formation and speckle reduction are done jointly. Finally, experimental results are provided to show the effectiveness of the proposed method in comparison to previous ones.
Article
Coded diffraction patterns (CDPs) recorded by optical detectors are often affected by Poisson noise in optical applications. How to recover the image of interest from few noisy CDPs is a challenge. In this paper, a double sparse regularization (DSR) model that exploits both the gradient sparsity and the structured sparsity is proposed to recover the image of interest from the recorded CDPs corrupted with Poisson noise. An image patch group matrix is formed by stacking similar image patches one by one. Owing to the similar structure of these image patches, the formed image patch group matrix is low rank. Based on this fact, a group low rank (GLR) regularization model is formulated. Combining the GLR model and the total variation (TV) model, we propose the so-called DSR model. The DSR model is utilized to formulate a phase retrieval optimization problem that consists of two terms: (i) the Poisson likelihood fidelity term, (ii) the proposed DSR model of utilizing TV and GLR. The accelerated gradient descent method that utilizes the adjustable gradient clipping technique is presented to solve the corresponding problem. Experimental results demonstrate that the proposed algorithm can recover the image with high quality from few CDPs, and can be robust to Poisson noise.
Article
Sparse signal recovery is a challenging problem that requires fast and accurate algorithms. Recently, neural networks have been applied to this problem with promising results. By exploiting massively parallel GPU processing architectures and oodles of training data, they are able to run orders of magnitude faster than existing methods. Unfortunately, these methods are difficult to train, oftentimes specific to a single measurement matrix, and largely unprincipled black boxes. It was recently demonstrated that iterative sparse-signal-recovery algorithms can be unrolled to form interpretable deep neural networks. Taking inspiration from this work, we develop novel neural network architectures that mimic the behavior of the denoising-based approximate message passing (D-AMP) and denoising-based vector approximate message passing (D-VAMP) algorithms. We call these new networks Learned D-AMP (LDAMP) and Learned D-VAMP (LDVAMP). The LDAMP/LDVAMP networks are easy to train, can be applied to a variety of different measurement matrices, and come with a state-evolution heuristic that accurately predicts their performance. Most importantly, our networks outperform the state-of-the-art BM3D-AMP and NLR-CS algorithms in terms of both accuracy and runtime. At high resolutions, and when used with matrices which have fast matrix multiply implementations, LDAMP runs over 50× faster than BM3D-AMP and hundreds of times faster than NLR-CS.
Article
The great content diversity of real-world digital images poses a grand challenge to image quality assessment (IQA) models, which are traditionally designed and validated on a handful of commonly used IQA databases with very limited content variation. To test the generalization capability and to facilitate the wide usage of IQA techniques in real-world applications, we establish a large-scale database named the Waterloo Exploration Database, which in its current state contains 4,744 pristine natural images and 94,880 distorted images created from them. Instead of collecting the mean opinion score for each image via subjective testing, which is extremely difficult if not impossible, we present three alternative test criteria to evaluate the performance of IQA models, namely the pristine/distorted image discriminability test (D-test), the listwise ranking consistency test (L-test), and the pairwise preference consistency test (P-test). We compare 20 well-known IQA models using the proposed criteria, which not only provide a stronger test in a more challenging testing environment for existing models, but also demonstrate the additional benefits of using the proposed database. For example, in the P-test, even for the best performing no-reference IQA model, more than 6 million failure cases against the model are "discovered" automatically out of over 1 billion test pairs. Furthermore, we discuss how the new database may be exploited using innovative approaches in the future, to reveal the weaknesses of existing IQA models, to provide insights on how to improve the models, and to shed light on how the next-generation IQA models may be developed. The database and codes are made publicly available at: https://ece.uwaterloo.ca/k29ma/exploration/.
Article
In this paper, we develop a new framework for sensing and recovering structured signals. In contrast to compressive sensing (CS) systems that employ linear measurements, sparse representations, and computationally complex convex/greedy algorithms, we introduce a deep learning framework that supports both linear and mildly nonlinear measurements, that learns a structured representation from training data, and that efficiently computes a signal estimate. In particular, we apply a stacked denoising autoencoder (SDA), as an unsupervised feature learner. SDA enables us to capture statistical dependencies between the different elements of certain signals and improve signal recovery performance as compared to the CS approach.
Article
This paper considers the model problem of reconstructing an object from incomplete frequency samples. Consider a discrete-time signal f ∈ C^N and a randomly chosen set of frequencies Ω. Is it possible to reconstruct f from the partial knowledge of its Fourier coefficients on the set Ω? A typical result of this paper is as follows. Suppose that f is a superposition of |T| spikes, f(t) = Σ_{τ∈T} f(τ)δ(t-τ), obeying |T| ≤ C_M · (log N)^(-1) · |Ω| for some constant C_M > 0. We do not know the locations of the spikes nor their amplitudes. Then with probability at least 1 - O(N^(-M)), f can be reconstructed exactly as the solution to the ℓ1 minimization problem. In short, exact recovery may be obtained by solving a convex optimization problem. We give numerical values for C_M which depend on the desired probability of success. Our result may be interpreted as a novel kind of nonlinear sampling theorem. In effect, it says that any signal made out of |T| spikes may be recovered by convex programming from almost every set of frequencies of size O(|T| · log N). Moreover, this is nearly optimal in the sense that any method succeeding with probability 1 - O(N^(-M)) would in general require a number of frequency samples at least proportional to |T| · log N. The methodology extends to a variety of other situations and higher dimensions. For example, we show how one can reconstruct a piecewise constant (one- or two-dimensional) object from incomplete frequency samples, provided that the number of jumps (discontinuities) obeys the condition above, by minimizing other convex functionals such as the total variation of f.
Article
Sparsity has been widely exploited for exact reconstruction of a signal from a small number of random measurements. Recent advances have suggested that structured or group sparsity often leads to more powerful signal reconstruction techniques in various compressed sensing (CS) studies. In this paper, we propose a nonlocal low-rank regularization (NLR) approach toward exploiting structured sparsity and explore its application to CS of both photographic and MRI images. We also propose the use of a nonconvex log det(X) as a smooth surrogate function for the rank instead of the convex nuclear norm, and justify the benefit of such a strategy using extensive experiments. To further improve the computational efficiency of the proposed algorithm, we have developed a fast implementation using the alternating direction method of multipliers (ADMM). Experimental results have shown that the proposed NLR-CS algorithm can significantly outperform existing state-of-the-art CS techniques for image recovery.
Article
A denoising algorithm seeks to remove perturbations or errors from a signal. The last three decades have seen extensive research devoted to this arena, and as a result, today's denoisers are highly optimized algorithms that effectively remove large amounts of additive white Gaussian noise. A compressive sensing (CS) reconstruction algorithm seeks to recover a structured signal acquired using a small number of randomized measurements. Typical CS reconstruction algorithms can be cast as iteratively estimating a signal from a perturbed observation. This paper answers a natural question: How can one effectively employ a generic denoiser in a CS reconstruction algorithm? In response, in this paper, we develop a denoising-based approximate message passing (D-AMP) algorithm that is capable of high-performance reconstruction. We demonstrate that, for an appropriate choice of denoiser, D-AMP offers state-of-the-art CS recovery performance for natural images. We explain the exceptional performance of D-AMP by analyzing some of its theoretical features. A critical insight in our approach is the use of an appropriate Onsager correction term in the D-AMP iterations, which coerces the signal perturbation at each iteration to be very close to the white Gaussian noise that denoisers are typically designed to remove.
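For reference, the D-AMP iteration described above can be written compactly (the notation below follows the common AMP convention rather than the paper verbatim): D_σ denotes the plug-in denoiser, A the m×n measurement matrix, z^t the residual, and the divergence term is the Onsager correction that keeps the effective perturbation close to white Gaussian noise.

```latex
\begin{aligned}
z^{t} &= y - A x^{t} + \frac{z^{t-1}}{m}\,\operatorname{div} D_{\hat\sigma^{t-1}}\!\big(x^{t-1} + A^{*} z^{t-1}\big),\\
\hat\sigma^{t} &= \frac{\lVert z^{t}\rVert_{2}}{\sqrt{m}},\qquad
x^{t+1} = D_{\hat\sigma^{t}}\!\big(x^{t} + A^{*} z^{t}\big).
\end{aligned}
```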
Article
Sparse representation based modeling has been successfully used in many image-related inverse problems such as deblurring, super-resolution and compressive sensing. The heart of sparse representation lies in how to find a space (spanned by a dictionary of atoms) where the local image patch exhibits high sparsity, and how to determine the image's local sparsity. To identify the locally varying sparsity, it is necessary to locally adapt the dictionary learning process and the sparsity-regularization parameters. However, spatial adaptation alone runs the risk of over-fitting the data, because variation and invariance are two sides of the same coin. In this work, we propose two sets of complementary ideas for regularizing the image reconstruction process: (1) the sparsity regularization parameters are locally estimated for each coefficient and updated along with adaptive learning of PCA-based dictionaries; (2) a nonlocal self-similarity constraint is introduced into the overall cost functional to improve the robustness of the model. An efficient alternating minimization algorithm is presented to solve the proposed objective function, and an effective image reconstruction algorithm is then obtained. The experimental results on image deblurring, super-resolution and compressive sensing demonstrate that the proposed image reconstruction method outperforms many existing image reconstruction methods in both PSNR and visual quality.
Conference Paper
Block-based random image sampling is coupled with a projection-driven compressed-sensing recovery that encourages sparsity in the domain of directional transforms simultaneously with a smooth reconstructed image. Both contourlets and complex-valued dual-tree wavelets are considered for their highly directional representation, while bivariate shrinkage is adapted to their multiscale decomposition structure to provide the requisite sparsity constraint. Smoothing is achieved via a Wiener filter incorporated into iterative projected Landweber compressed-sensing recovery, yielding fast reconstruction. The proposed approach yields images with quality that matches or exceeds that produced by a popular, yet computationally expensive, technique which minimizes total variation. Additionally, reconstruction quality is substantially superior to that from several prominent pursuit-based algorithms that do not include any smoothing.
Article
Compressed sensing is a technique to sample compressible signals below the Nyquist rate, whilst still allowing near optimal reconstruction of the signal. In this paper we present a theoretical analysis of the iterative hard thresholding algorithm when applied to the compressed sensing recovery problem. We show that the algorithm has the following properties (made more precise in the main text of the paper):
• It gives near-optimal error guarantees.
• It is robust to observation noise.
• It succeeds with a minimum number of observations.
• It can be used with any sampling operator for which the operator and its adjoint can be computed.
• The memory requirement is linear in the problem size.
• Its computational complexity per iteration is of the same order as the application of the measurement operator or its adjoint.
• It requires a fixed number of iterations depending only on the logarithm of a form of signal-to-noise ratio of the signal.
• Its performance guarantees are uniform in that they only depend on properties of the sampling operator and signal sparsity.
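The algorithm itself is only a few lines; the following numpy sketch (illustrative, not the authors' code) alternates a gradient step on the least-squares data term with hard thresholding to the k largest-magnitude entries.

```python
import numpy as np

def iht(A, y, k, n_iter=200, step=None):
    """Iterative Hard Thresholding: gradient step on ||y - Ax||^2 followed by
    keeping only the k largest-magnitude entries."""
    m, n = A.shape
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # conservative step size
    x = np.zeros(n)
    for _ in range(n_iter):
        g = x + step * A.T @ (y - A @ x)         # gradient step
        idx = np.argsort(np.abs(g))[-k:]         # indices of k largest entries
        x = np.zeros(n)
        x[idx] = g[idx]                          # hard thresholding
    return x

# Example: k-sparse recovery from Gaussian measurements.
rng = np.random.default_rng(2)
n, m, k = 256, 100, 10
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
x_hat = iht(A, A @ x_true, k)
```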
Article
Compressive sampling offers a new paradigm for acquiring signals that are compressible with respect to an orthonormal basis. The major algorithmic challenge in compressive sampling is to approximate a compressible signal from noisy samples. This paper describes a new iterative recovery algorithm called CoSaMP that delivers the same guarantees as the best optimization-based approaches. Moreover, this algorithm offers rigorous bounds on computational cost and storage. It is likely to be extremely efficient for practical problems because it requires only matrix-vector multiplies with the sampling matrix. For compressible signals, the running time is just O(N log^2 N), where N is the length of the signal.
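A minimal numpy sketch of the CoSaMP iteration described above is given below (illustrative only): form a residual proxy, select 2k candidate atoms, merge them with the current support, solve a least-squares problem on the merged support, and prune back to k entries.

```python
import numpy as np

def cosamp(A, y, k, n_iter=30):
    """CoSaMP: candidate identification, support merging, least-squares
    estimation, and pruning to the k largest entries."""
    n = A.shape[1]
    x = np.zeros(n)
    r = y.copy()
    for _ in range(n_iter):
        proxy = A.T @ r
        omega = np.argsort(np.abs(proxy))[-2 * k:]        # 2k new candidates
        support = np.union1d(omega, np.flatnonzero(x))    # merge with current support
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        b = np.zeros(n)
        b[support] = coef
        top_k = np.argsort(np.abs(b))[-k:]                # prune to k largest entries
        x = np.zeros(n)
        x[top_k] = b[top_k]
        r = y - A @ x                                     # update residual
        if np.linalg.norm(r) < 1e-10:
            break
    return x

# Example usage with a random Gaussian sensing matrix.
rng = np.random.default_rng(3)
n, m, k = 256, 100, 10
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
x_hat = cosamp(A, A @ x_true, k)
```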
Article
We address the image denoising problem, where zero-mean white and homogeneous Gaussian additive noise is to be removed from a given image. The approach taken is based on sparse and redundant representations over trained dictionaries. Using the K-SVD algorithm, we obtain a dictionary that describes the image content effectively. Two training options are considered: using the corrupted image itself, or training on a corpus of high-quality images. Since the K-SVD is limited in handling small image patches, we extend its deployment to arbitrary image sizes by defining a global image prior that forces sparsity over patches in every location in the image. We show how such Bayesian treatment leads to a simple and effective denoising algorithm. This leads to state-of-the-art denoising performance, equivalent to and sometimes surpassing recently published leading alternative denoising methods.
Conference Paper
Compressed sensing (CS) is a new technique for simultaneous data sampling and compression. In this paper, we propose and study block compressed sensing for natural images, where image acquisition is conducted in a block-by-block manner through the same operator. While simpler and more efficient than other CS techniques, the proposed scheme can sufficiently capture the complicated geometric structures of natural images. Our image reconstruction algorithm involves both linear and nonlinear operations such as Wiener filtering, projection onto the convex set and hard thresholding in the transform domain. Several numerical experiments demonstrate that the proposed block CS compares favorably with existing schemes at a much lower implementation cost.
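To make the block-by-block acquisition concrete, the numpy sketch below measures every BxB block of an image with the same small operator, so only that small operator has to be stored; the names and sizes are illustrative, and the reconstruction side (Wiener filtering, projection, and thresholding) is not included.

```python
import numpy as np

def block_cs_sample(img, phi_b):
    """Block-based CS acquisition: every BxB block is measured with the same
    small operator phi_b (shape M x B*B)."""
    B = int(np.sqrt(phi_b.shape[1]))
    H, W = img.shape
    meas = []
    for i in range(0, H, B):
        for j in range(0, W, B):
            block = img[i:i + B, j:j + B].reshape(-1)
            meas.append(phi_b @ block)        # per-block measurement vector
    return np.array(meas)                     # shape: (num_blocks, M)

# Example: 25% sampling of 32x32 blocks with an orthonormalized Gaussian operator.
rng = np.random.default_rng(4)
B, ratio = 32, 0.25
phi_b = rng.standard_normal((int(ratio * B * B), B * B))
q, _ = np.linalg.qr(phi_b.T)
phi_b = q.T                                   # rows are orthonormal
img = rng.standard_normal((96, 96))           # image size assumed divisible by B
y = block_cs_sample(img, phi_b)
```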
Conference Paper
This paper presents a database containing "ground truth" segmentations produced by humans for images of a wide variety of natural scenes. We define an error measure which quantifies the consistency between segmentations of differing granularities and find that different human segmentations of the same image are highly consistent. Use of this dataset is demonstrated in two applications: (1) evaluating the performance of segmentation algorithms and (2) measuring probability distributions associated with Gestalt grouping factors as well as statistics of image region properties.
Article
This lecture note presents a new method to capture and represent compressible signals at a rate significantly below the Nyquist rate. This method, called compressive sensing, employs nonadaptive linear projections that preserve the structure of the signal; the signal is then reconstructed from these projections using an optimization process.
Article
Suppose we are given a vector f in a class F ⊆ R^N, e.g., a class of digital signals or digital images. How many linear measurements do we need to make about f to be able to recover f to within precision ε in the Euclidean (ℓ2) metric? This paper shows that if the objects of interest are sparse in a fixed basis or compressible, then it is possible to reconstruct f to within very high accuracy from a small number of random measurements by solving a simple linear program. More precisely, suppose that the nth largest entry of the vector |f| (or of its coefficients in a fixed basis) obeys |f|_(n) ≤ R · n^(-1/p), where R > 0 and p > 0. Suppose that we take measurements y_k = ⟨f, X_k⟩, k = 1, ..., K, where the X_k are N-dimensional Gaussian vectors with independent standard normal entries. Then for each f obeying the decay estimate above for some 0 < p < 1 and with overwhelming probability, our reconstruction f#, defined as the solution to the constraints y_k = ⟨f#, X_k⟩ with minimal ℓ1 norm, obeys ||f - f#||_ℓ2 ≤ C_p · R · (K/log N)^(-r), r = 1/p - 1/2. There is a sense in which this result is optimal; it is generally impossible to obtain a higher accuracy from any set of K measurements whatsoever. The methodology extends to various other random measurement ensembles; for example, we show that similar results hold if one observes a few randomly sampled Fourier coefficients of f. In fact, the results are quite general and require only two hypotheses on the measurement ensemble which are detailed.
Article
The Time-Frequency and Time-Scale communities have recently developed a large number of overcomplete waveform dictionaries: stationary wavelets, wavelet packets, cosine packets, chirplets, and warplets, to name a few. Decomposition into overcomplete systems is not unique, and several methods for decomposition have been proposed, including the Method of Frames (MOF), Matching Pursuit (MP), and, for special dictionaries, the Best Orthogonal Basis (BOB). Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest ℓ1 norm of coefficients among all such decompositions. We give examples exhibiting several advantages over MOF, MP, and BOB, including better sparsity and super-resolution. BP has interesting relations to ideas in areas as diverse as ill-posed problems, abstract harmonic analysis, total variation de-noising, and multi-scale edge de-noising. Basis Pursuit in highly ...
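Since BP reduces to a linear program, it can be prototyped in a few lines. The following numpy/scipy sketch (illustrative, not the authors' implementation) solves min ||x||_1 subject to Ax = y using the standard split x = u - v with u, v >= 0.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Basis Pursuit: min ||x||_1 subject to Ax = y, solved as a linear
    program by splitting x = u - v with u, v >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)                       # objective: sum(u) + sum(v) = ||x||_1
    A_eq = np.hstack([A, -A])                # equality constraint A(u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y,
                  bounds=[(0, None)] * (2 * n), method="highs")
    u, v = res.x[:n], res.x[n:]
    return u - v

# Example: exact recovery of a sparse vector from random measurements.
rng = np.random.default_rng(5)
n, m, k = 128, 60, 6
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
x_hat = basis_pursuit(A, A @ x_true)
```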
Image compressive sensing recovery via collaborative sparsity
  • Zhang
Deep fully-connected networks for video compressive sensing
  • Iliadis