
Low-rank Bayesian Tensor Factorization for Hyperspectral Image Denoising

Kaixuan Wei, Ying Fu
School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
In this paper, we present a low-rank Bayesian tensor factorization approach for the hyperspectral image (HSI) denoising problem, where
zero-mean white and homogeneous Gaussian additive noise is removed from a given HSI. The approach is based on two intrinsic
properties underlying a HSI, i.e., the global correlation along spectrum (GCS) and nonlocal self-similarity across space (NSS).
We first adaptively construct the patch-based tensor representation for the HSI to extract the NSS knowledge while preserving
the property of GCS. Then, we employ the low-rank property in this representation to design a hierarchical probabilistic model
based on Bayesian tensor factorization to capture the inherent spatial-spectral correlation of the HSI, which can be effectively solved
under the variational Bayesian framework. Furthermore, by incorporating these two procedures in an iterative manner, we
build an effective HSI denoising model to recover the HSI from its corrupted observation. This leads to state-of-the-art denoising performance,
consistently surpassing recently published leading HSI denoising methods in terms of both comprehensive quantitative assessments
and subjective visual quality.
Keywords: Hyperspectral image denoising, full Bayesian CP factorization, nonlocal self-similarity, global correlation along
spectrum, variational Bayesian inference, tensor rank auto determination.
1. Introduction
A hyperspectral image (HSI) is made up of a large number of contiguous wavebands for each spatial position of a real scene and provides much richer information about the scene than multiple/RGB images. It has been widely used in remote sensing, including mineral identification [1, 2], land cover classification [3], vegetation studies [4], and atmospheric studies [5]. Besides, in the computer vision field, the detailed physical representation provided by HSIs has been shown to significantly enhance the performance of numerous computer vision tasks, such as inpainting [6], tracking [7], unmixing [8], super-resolution [9], and face recognition [10].
However, in real cases, a HSI is always corrupted by noise, which severely degrades the quality of the imagery and negatively impacts all the subsequent HSI processing tasks mentioned above. Noise is inevitable during acquisition and arises at different stages in both the optics and the photodetector [11]. Therefore, HSI denoising plays a vital role in the typical workflow of HSI analysis and processing.
From our observations of several state-of-the-art HSI denoising methods [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22], we find that the essence of a successful HSI denoising algorithm is to reasonably extract the useful prior structural knowledge underlying a HSI. The most commonly employed prior structures for HSI recovery include its global correlation along spectrum (GCS) and nonlocal self-similarity across space
(NSS). More specifically, the GCS prior denotes the large amount of redundancy across the spectral dimension: high correlation can generally be observed among images located in adjacent bands of a HSI. The NSS prior indicates that sparsity can be enhanced by grouping similar image fragments (i.e., blocks), which can further improve the performance of various HSI recovery methods [6, 16, 20].
Since traditional 2D image denoising is a well-studied yet still active topic, the simplest way of denoising a HSI is to apply these off-the-shelf techniques [23, 24, 25, 26, 27] band by band. However, this kind of coarsely extended method ignores the GCS prior completely, which leads to relatively low-quality results. To address this issue, carefully designed extensions of several high-performance 2D image denoising methods were proposed recently [12, 15, 14]. One notable example is a nonlocal transform-domain filter, generally referred to as BM4D [12], which is a non-trivial but straightforward extension of the well-known image denoising method BM3D [24]. Besides, a sparse representation based reconstruction method [15], which jointly utilizes the global and local redundancy and correlation in the spatial/spectral domain, is inspired by the previous outstanding work 3D cubic K-SVD [23]. Similarly, a spectral/spatial adaptive hyperspectral total variation (SSAHTV) denoising algorithm [14], in which both the spectral noise differences and the spatial information differences are considered in the process of noise reduction, is a HSI-oriented variation of the spatially adaptive TV (SATV) model [28].
Instead of constructing a method directly from an existing one, low-rank matrix recovery (LRMR) techniques are employed for HSI denoising, including convex relaxation based approaches
Preprint submitted to Journal of Neurocomputing March 9, 2020
[29, 30, 31, 32] and Bayesian inference approaches [33, 34, 35, 36, 37]. For example, Zhang et al. [19] directly adopted the Go Decomposition (GoDec) algorithm [32], an optimization algorithm that solves the LRMR model, to estimate the low-rank HSI patch. Chen et al. [38] employed a modified version of the robust principal component analysis (RPCA) algorithm originally proposed in [37], which models noise with a mixture of Gaussians in the context of the LRMR model, to deal with HSIs with a non-i.i.d. noise structure. Recently, He et al. [21] proposed a local matrix recovery method that considers the global spatial-spectral smoothness through TV regularization, which achieves great performance especially in complex noise removal for hyperspectral remote sensing images.
Although these LRMR-based approaches are effective in certain HSI denoising cases, they only consider the GCS prior knowledge. Since a HSI can be naturally represented as a 3D tensor instead of a 2D matrix, one obvious extension of these LRMR-based approaches is to exploit the power of tensor decomposition [39, 40], which has attracted growing attention in recent years. The two popular tensor factorization frameworks, namely Tucker and CANDECOMP/PARAFAC (CP), have been used to denoise HSIs with demonstrated denoising efficiency in [41] and [17], respectively. Nevertheless, the input tensor of these methods is just the original form of the HSI, with two spatial dimensions and one spectral dimension. This degrades their denoising performance, since the two separate spatial dimensions of a HSI are generally not low-rank representable. This intuition can be easily understood from Figure 1, which shows typical output bands obtained by directly applying the low-rank tensor factorization (LRTF) method and our integrated model. It can be seen in the LRTF recovery (top right corner) that some vertical or horizontal artifact textures are introduced due to the unsuitable assumption that the two spatial dimensions of a HSI are low-rank representable.
Figure 1: Typical output bands of the LRTF based methods, which show that it is unreasonable to directly apply LRTF based methods to HSI denoising.
This property of HSIs limits the direct application of low-rank tensor factorization (LRTF) based methods. To alleviate this problem, one question arises naturally: can we transform the initial form of a HSI into a more low-rank representable one without destroying the spatial and spectral structure information? Fortunately, this issue can be efficiently addressed by a remarkable technique called block matching, originally introduced by the benchmark BM3D [24] for image denoising. In general, there are two approaches to extending the block matching strategy to volumetric data. One approach is to impose the NSS prior on the spectral dimension as well, stacking cubes of similar voxels into 4D "groups" [12]. The other approach, which takes the GCS prior into account, is to simply extend a normal 2D patch to a full band patch (FBP) and then follow the same block matching procedure as BM3D [16, 13]. Given the spectral property of a HSI, the FBP based approach is more efficient than the voxel based approach while preserving effectiveness.
Once the FBP clusters, which can be viewed as a set of 3D tensors, have been constructed using the extended block matching strategy, the family of tensor decomposition methods can be employed to make the most of the underlying knowledge, including the aforementioned priors. Xie et al. built a new HSI denoising model, ITS-Reg [13], by applying an LRTF based method to tensors formed by nonlocal similar patches within the HSI. Zhuang et al. [22] combined the power of LRMR and LRTF based techniques to construct a global local factorization model called GLF. However, when the noise level varies drastically, these convex relaxation based low-rank approximation approaches are prone to overfitting due to incorrectly specified regularization parameters, resulting in severe deterioration of recovery performance. It is also worth noting that rank minimization based on convex optimization of the nuclear norm is affected by the tuning parameters, which may lead to over- or underestimation of the HSI.
In this paper, we present a hierarchical probabilistic model for HSI denoising based on full Bayesian CP tensor factorization (LBTF) [42], which can not only fit the underlying noise adaptively without knowing the specific noise intensity, but also determine the tensor rank automatically to address the overfitting issue. We first adaptively transform the original HSI into patch-based tensor representations (clusters) to extract the NSS knowledge while preserving the property of GCS in these new representations. Then we regard each cluster as a low-rank noisy observation in our hierarchical probabilistic model in order to capture the inherent spatial-spectral correlation of the HSI. This model is effectively solved by an elegant deterministic algorithm based on variational inference. The empirical study demonstrates the superiority of our method, which consistently outperforms other state-of-the-art HSI denoising methods both quantitatively and visually.
The rest of this paper is organized as follows. Section 2
presents preliminary multilinear operators and notations. Sec-
tion 3 introduces our Bayesian tensor factorization approach for
HSI denoising. The extensive experimental results on both syn-
thetic and real data are presented in Section 4, followed by con-
clusions in Section 5.
2. Notations and Preliminaries
The order of a tensor is the number of dimensions (a.k.a. ways or modes). For clarity, scalars (zero-order tensors) are denoted by lowercase letters, e.g., $a$. Vectors (first-order tensors) are denoted by boldface lowercase letters, e.g., $\mathbf{a}$. Matrices (second-order tensors) are denoted by boldface capital letters, e.g., $\mathbf{A}$, and the $k$-th row and column vectors are denoted by $\mathbf{A}_{k\cdot}$ and $\mathbf{A}_{\cdot k}$, respectively. General tensors (without order constraint) are denoted by boldface calligraphic letters, e.g., $\mathcal{A}$. Given an $N$-th order tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$, its $(i_1, \cdots, i_N)$-th entry is denoted by $\mathcal{A}_{i_1 \cdots i_N}$ without boldface.
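The notation for tensor operations follows [40]. The Kronecker product of matrices $\mathbf{A} \in \mathbb{R}^{I \times J}$ and $\mathbf{B} \in \mathbb{R}^{M \times N}$ is a matrix of size $IM \times JN$, denoted by $\mathbf{A} \otimes \mathbf{B}$. The Khatri-Rao product of matrices $\mathbf{A} \in \mathbb{R}^{I \times K}$ and $\mathbf{B} \in \mathbb{R}^{J \times K}$ is a matrix of size $IJ \times K$ defined by a columnwise Kronecker product, and denoted by $\mathbf{A} \odot \mathbf{B}$. For convenience, the Khatri-Rao product of a set of matrices $\{\mathbf{A}^{(n)} \mid n = 1, \cdots, N\}$ in reverse order is defined by
$$\bigodot_{n} \mathbf{A}^{(n)} = \mathbf{A}^{(N)} \odot \mathbf{A}^{(N-1)} \odot \cdots \odot \mathbf{A}^{(1)} \qquad (1)$$
while the Khatri-Rao product of a set of matrices except the $k$-th matrix is defined by
$$\bigodot_{n \neq k} \mathbf{A}^{(n)} = \mathbf{A}^{(N)} \odot \cdots \odot \mathbf{A}^{(k+1)} \odot \mathbf{A}^{(k-1)} \odot \cdots \odot \mathbf{A}^{(1)} \qquad (2)$$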
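As an illustration of Equations (1) and (2), the following NumPy sketch builds the columnwise Kronecker product and chains it in reverse order. The function names `khatri_rao` and `khatri_rao_reverse` and the 0-based `skip` argument are assumptions made for this example and do not come from the paper.

```python
import numpy as np

def khatri_rao(A, B):
    """Columnwise Kronecker product of A (I x K) and B (J x K); result is (I*J) x K."""
    I, K = A.shape
    J, K_b = B.shape
    if K != K_b:
        raise ValueError("A and B must have the same number of columns")
    # Row i*J + j of the result holds A[i, :] * B[j, :], as np.kron would per column.
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, K)

def khatri_rao_reverse(factors, skip=None):
    """Reverse-order product as in Eq. (1); `skip` (0-based) omits one mode as in Eq. (2)."""
    mats = [A for n, A in enumerate(factors) if n != skip]
    out = mats[-1]
    for A in reversed(mats[:-1]):
        out = khatri_rao(out, A)
    return out

rng = np.random.default_rng(0)
factors = [rng.standard_normal((I, 5)) for I in (2, 3, 4)]
full = khatri_rao_reverse(factors)                    # A(3) kr A(2) kr A(1), shape (24, 5)
without_mode_2 = khatri_rao_reverse(factors, skip=1)  # A(3) kr A(1), shape (8, 5)
```

Broadcasting is used instead of a loop over columns so that the row ordering matches `np.kron` applied to each pair of columns.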
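The Hadamard product (a.k.a. the Schur product or the entrywise product) of two tensors $\{\mathcal{A}, \mathcal{B}\} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ is denoted by $\mathcal{A} \circledast \mathcal{B}$. It is a tensor with the same dimensions as $\{\mathcal{A}, \mathcal{B}\}$, where each element indexed by $i_1 i_2 \cdots i_N$ is the product of the elements indexed by $i_1 i_2 \cdots i_N$ of the original two tensors.
The inner product of two tensors $\{\mathcal{A}, \mathcal{B}\}$ is defined by $\langle \mathcal{A}, \mathcal{B} \rangle = \sum_{i_1, \cdots, i_N} \mathcal{A}_{i_1, \cdots, i_N} \mathcal{B}_{i_1, \cdots, i_N}$. For a more general case, we define
$$\left\langle \mathcal{A}^{(1)}, \cdots, \mathcal{A}^{(N)} \right\rangle = \sum_{i_1, \cdots, i_N} \prod_{n} \mathcal{A}^{(n)}_{i_1, \cdots, i_N} \qquad (3)$$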
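Our framework for LRTF is based on the CP decomposition, which can be viewed as a higher-order generalization of the widely used matrix singular value decomposition (SVD). A tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ can be exactly factorized by a CP model, given by
$$\mathcal{X} = \sum_{r=1}^{R} \mathbf{A}^{(1)}_{\cdot r} \circ \cdots \circ \mathbf{A}^{(N)}_{\cdot r} = [[\mathbf{A}^{(1)}, \cdots, \mathbf{A}^{(N)}]] \qquad (4)$$
where $\circ$ denotes the outer product of vectors, $[[\cdots]]$ denotes the Kruskal operator of a set of matrices having the same number of columns, $\mathbf{A}^{(n)}$ is the mode-$n$ factor matrix of size $I_n \times R$, and $R$ is assumed to be an upper bound of the rank of tensor $\mathcal{X}$. Then each element of the tensor can be described as
$$\mathcal{X}_{i_1, \cdots, i_N} = \sum_{r=1}^{R} A^{(1)}_{i_1 r} A^{(2)}_{i_2 r} \cdots A^{(N)}_{i_N r} \qquad (5)$$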
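To make the CP model concrete, the sketch below assembles a third-order tensor from three factor matrices and checks the elementwise form against the sum of rank-one outer products. The sizes, the seed and the helper name `cp_to_tensor` are illustrative assumptions.

```python
import numpy as np

def cp_to_tensor(A1, A2, A3):
    """Kruskal operator [[A1, A2, A3]]: sum over r of the outer products of the r-th columns."""
    return np.einsum("ir,jr,kr->ijk", A1, A2, A3)

rng = np.random.default_rng(0)
R = 4
A1, A2, A3 = (rng.standard_normal((I, R)) for I in (5, 6, 7))
X = cp_to_tensor(A1, A2, A3)

# Elementwise form: one entry is the sum over r of the products of the matching row entries.
i1, i2, i3 = 2, 3, 4
entry = np.sum(A1[i1] * A2[i2] * A3[i3])

# Sum-of-outer-products form, written out as an explicit sum of rank-one terms.
X_sum = sum(
    np.multiply.outer(np.multiply.outer(A1[:, r], A2[:, r]), A3[:, r]) for r in range(R)
)
```

Because each outer product adds at most one to the rank, every unfolding of `X` has matrix rank at most `R`.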
3. The Proposed Method
3.1. Formulation
A HSI can be mathematically viewed as a 3D tensor. Given the additive noise, the noisy observation $\mathcal{Y}$ can be described as
$$\mathcal{Y} = \mathcal{X} + \varepsilon \qquad (6)$$
where $\mathcal{X}$ is the original, unknown clean HSI, and $\varepsilon \sim \mathcal{N}(0, \tau^{-1})$ is the additive noise. $\mathcal{Y} \in \mathbb{R}^{H \times W \times B}$, where $H$, $W$ and $B$ stand for the spatial height, the spatial width and the number of spectral bands of the HSI, respectively. $\mathcal{X}$ and $\varepsilon$ have the same dimensions as $\mathcal{Y}$. In this paper, we mainly consider independent and identically distributed (i.i.d.) Gaussian noise with unknown noise intensity.
For clarity and simplicity, we denote the proposed low-rank Bayesian tensor factorization technique as LBTF, and the whole HSI denoising algorithm as LBTF-HSI. The objective of the proposed LBTF-HSI is to provide an estimate $\hat{\mathcal{X}}$ of the original $\mathcal{X}$ from the noisy observation $\mathcal{Y}$, so that $\hat{\mathcal{X}}$ is as similar to $\mathcal{X}$ as possible under a commonly adopted error measure (e.g., the $\ell_2$ norm).
3.2. Iterative Denoising Framework
The LBTF-HSI is implemented in an iterative denoising framework motivated by [24], generally consisting of one to five nearly identical stages. Each stage comprises three steps: grouping, low-rank tensor recovery, and aggregation. The grouping and aggregation steps are required by the block matching technique, and the low-rank tensor recovery is the step in which we apply the proposed LBTF algorithm. The flow diagram of the LBTF-HSI implementation is illustrated in Figure 2.
In the grouping step, we first separate the noisy HSI $\mathcal{Y}$ into a set of overlapping FBPs. Then, for each local FBP, we construct a FBP cluster by performing block matching. Specifically, given a reference FBP, the $P_n$ most similar FBPs among all FBPs are matched and stacked together into a FBP cluster, where $P_n$ denotes the number of patches in a cluster, and the similarity is measured by the $\ell_2$ norm for simplicity. Note that this procedure is closely related to the GCS and NSS priors: after this operation, both GCS and NSS knowledge are well preserved and reflected by the new representation, along its spectral and nonlocal-patch-number modes, respectively.
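The grouping step can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the patch size, the stride, the exhaustive search over all FBPs and the function names are choices made for the example, and the cluster is arranged with spatial, spectral and nonlocal-patch-number modes in that order.

```python
import numpy as np

def extract_fbps(Y, patch=6, stride=4):
    """Return all overlapping full band patches of Y (H x W x B) and their top-left corners."""
    H, W, B = Y.shape
    coords = [(r, c)
              for r in range(0, H - patch + 1, stride)
              for c in range(0, W - patch + 1, stride)]
    fbps = np.stack([Y[r:r + patch, c:c + patch, :] for r, c in coords])
    return fbps, coords                      # fbps: (num_patches, patch, patch, B)

def group_fbps(fbps, ref_idx, Pn):
    """Indices of the Pn FBPs closest to the reference FBP in squared l2 distance."""
    flat = fbps.reshape(len(fbps), -1)
    dist = np.sum((flat - flat[ref_idx]) ** 2, axis=1)
    return np.argsort(dist, kind="stable")[:Pn]

def build_cluster(fbps, idx):
    """Stack matched FBPs into a 3rd-order tensor of size (patch*patch) x B x Pn."""
    P = fbps[idx]                            # (Pn, patch, patch, B)
    return P.reshape(len(idx), -1, P.shape[-1]).transpose(1, 2, 0)
```

With this stride the last rows and columns of an image may not be covered unless the image size is chosen accordingly; a practical implementation would add border patches.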
In the recovery step, we first initialize two all-zero tensors, denoted by $\mathcal{C}$ and $\mathcal{W}$, with the same size as the noisy observation $\mathcal{Y}$, and then directly apply the LBTF technique to each FBP cluster to exploit the intrinsic low-rank property underlying this new representation. After we reconstruct all clean FBP clusters, we map every estimate back to its original locations in $\mathcal{C}$ using a cumulative scheme. Note that $\mathcal{W}$ can be interpreted as the weight with respect to the cumulative $\mathcal{C}$, and it is obtained during the calculation of $\mathcal{C}$.
In the aggregation step, like an inverse Hadamard product, we simply divide the cumulative $\mathcal{C}$ by its corresponding weight $\mathcal{W}$ elementwise to generate an estimate $\hat{\mathcal{X}}$ of the original $\mathcal{X}$. Regarding this estimate $\hat{\mathcal{X}}$ as a regularization, we can construct an updated noisy observation $\mathcal{Y}$ by iterative regularization and then repeat the same steps described above. This produces an iterative denoising framework. The detailed procedure of this framework is summarized in Algorithm 1.
Figure 2: Flowchart of the proposed LBTF-HSI method. A noisy HSI is first split into a set of overlapping full band patches. After grouping by each reference FBP (step 1), each FBP cluster is fed into the LBTF model to acquire its clean estimate (step 2), and then each clean estimate is mapped onto its corresponding locations in the original noisy HSI by an accumulative scheme (step 3). The method then either follows an iterative regularization to repeat this process or outputs the estimated clean HSI as the final result.
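The cumulative scheme with $\mathcal{C}$ and $\mathcal{W}$ can be sketched as below. The uniform weight of one per contribution, the cluster layout ((patch*patch) x B x Pn) and the name `aggregate` are assumptions for illustration; the paper does not specify the weighting beyond $\mathcal{W}$ being accumulated alongside $\mathcal{C}$.

```python
import numpy as np

def aggregate(clusters, cluster_coords, shape, patch):
    """Accumulate denoised FBPs into C, count contributions in W, and return C / W."""
    C = np.zeros(shape)
    W = np.zeros(shape)
    B = shape[2]
    for Xi, coords in zip(clusters, cluster_coords):
        # Xi has size (patch*patch) x B x Pn; coords lists the Pn top-left corners.
        for p, (r, c) in enumerate(coords):
            C[r:r + patch, c:c + patch, :] += Xi[:, :, p].reshape(patch, patch, B)
            W[r:r + patch, c:c + patch, :] += 1.0
    # Positions never covered by any patch keep the value zero.
    return C / np.maximum(W, 1e-12)
```

Feeding unmodified patches through this function reproduces the covered part of the input, which is a quick sanity check for the index bookkeeping.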
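Algorithm 1 Iterative Denoising Framework
Input: Noisy HSI $\mathcal{Y}$;
1: Initialize $\hat{\mathcal{X}}^{(0)} = \mathcal{Y}$;
2: for $k = 1 : K$ do
3:   Iterative regularization $\mathcal{Y}^{(k)} = \hat{\mathcal{X}}^{(k-1)} + \delta(\mathcal{Y} - \hat{\mathcal{X}}^{(k-1)})$;
4:   Construct the entire FBP set for iteration $k$;
5:   Group matching FBP clusters $\{\mathcal{Y}_i\}_{i=1}^{L}$;
6:   for each FBP cluster $\mathcal{Y}_i$ do
7:     Recover $\mathcal{X}_i$ from $\mathcal{Y}_i$ by LBTF;
8:   end for
9:   Aggregate $\{\mathcal{X}_i\}_{i=1}^{L}$ to form the clean estimate $\hat{\mathcal{X}}^{(k)}$;
10: end for
11: Assign $\hat{\mathcal{X}} = \hat{\mathcal{X}}^{(K)}$;
Output: Estimate $\hat{\mathcal{X}}$ of $\mathcal{X}$;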
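The outer loop of Algorithm 1 is compact enough to write down directly. In the sketch below, `denoise_stage` is a hypothetical callable standing in for steps 4 to 9 (grouping, LBTF recovery and aggregation), and the default values of `K` and `delta` are placeholders rather than values reported in the paper.

```python
import numpy as np

def iterative_denoise(Y, denoise_stage, K=3, delta=0.1):
    """Outer loop of Algorithm 1; `denoise_stage` performs steps 4 to 9 on one observation."""
    X_hat = Y.copy()                            # step 1
    for _ in range(K):                          # step 2
        Y_k = X_hat + delta * (Y - X_hat)       # step 3: iterative regularization
        X_hat = denoise_stage(Y_k)              # steps 4 to 9
    return X_hat                                # step 11
```

The regularization step feeds a fraction `delta` of the removed residual back into the next stage, so detail that was over-smoothed in one stage can be recovered in the next.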
3.3. Hierarchical Probabilistic Model
Now we present the LBTF algorithm used in the recovery step, which is the key component of our method. In this step, each observation is a noisy FBP cluster formed by block matching. Given such a noisy observation, we apply LBTF to infer the underlying clean cluster.
From a Bayesian perspective, the CP tensor factorization can be formulated as a hierarchical probabilistic model, which is an instance of a probabilistic graphical model (PGM).
The CP generative model in Equation (4) together with the observation model in Equation (6) directly gives rise to the following hierarchical probabilistic model. As discussed in the iterative denoising framework, after we acquire a set of FBP clusters $\{\mathcal{Y}_i\}_{i=1}^{L}$, for each cluster $\mathcal{Y} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ (for brevity, the subscript $i$ is omitted), its probability density can be factorized over tensor elements as
$$p\left(\mathcal{Y} \mid \{\mathbf{A}^{(n)}\}_{n=1}^{3}, \tau\right) = \prod_{i_1=1}^{I_1} \prod_{i_2=1}^{I_2} \prod_{i_3=1}^{I_3} \mathcal{N}\left(\mathcal{Y}_{i_1 i_2 i_3} \mid \left\langle \mathbf{A}^{(1)}_{i_1 \cdot}, \mathbf{A}^{(2)}_{i_2 \cdot}, \mathbf{A}^{(3)}_{i_3 \cdot} \right\rangle, \tau^{-1}\right) \qquad (7)$$
where $\mathbf{A}^{(n)}$ is the latent mode-$n$ factor matrix of size $I_n \times R$, and $n = 1, 2, 3$ denote the spatial, spectral, and nonlocal-patch-number modes, respectively. $R_t \le R$ denotes the ground-truth rank of tensor $\mathcal{X}$. $\mathcal{N}(y \mid \mu, \tau^{-1})$ denotes a Gaussian distribution of the form
$$\mathcal{N}(y \mid \mu, \tau^{-1}) = \left(\frac{\tau}{2\pi}\right)^{1/2} \exp\left\{-\frac{\tau}{2}(y - \mu)^2\right\} \qquad (8)$$
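In order to further build our hierarchical probabilistic model, we need to enforce a suitable probabilistic structure on the underlying factor matrices $\{\mathbf{A}^{(n)}\}_{n=1}^{3}$. From the CP model in Equation (4), notice that each outer product contributes at most one to the rank of $\mathcal{X}$. Since a low-rank estimate of $\mathcal{X}$ is sought, our goal is to achieve column sparsity in $\mathbf{A}^{(n)}$, such that most columns of $\mathbf{A}^{(n)}$ are set to zero. To enforce this constraint, we associate the columns of $\mathbf{A}^{(n)}$ with Gaussian priors of precisions (inverse variances) $\lambda_r$, that is,
$$p\left(\mathbf{A}^{(n)} \mid \boldsymbol{\lambda}\right) = \prod_{i_n=1}^{I_n} \mathcal{N}\left(\mathbf{A}^{(n)}_{i_n \cdot} \mid \mathbf{0}, \boldsymbol{\Lambda}^{-1}\right), \quad n = 1, 2, 3 \qquad (9)$$
where $\boldsymbol{\Lambda} = \mathrm{diag}(\boldsymbol{\lambda})$ denotes an inverse variance matrix and is shared by the latent factor matrices in all modes. Thus, the $r$-th columns of $\{\mathbf{A}^{(n)}\}_{n=1}^{3}$ have the same sparsity profile, enforced by the common precision $\lambda_r$. As shown later, many of the precisions $\lambda_r$ will assume very large values during inference, which effectively removes the corresponding outer products from $\mathcal{X}$ and hence reduces the rank of the estimate. We can further define a hyperprior over the hyperparameter $\boldsymbol{\lambda}$, which is factorized over the latent dimensions due to the independence assumption
$$p(\boldsymbol{\lambda}) = \prod_{r=1}^{R} \mathrm{Gam}(\lambda_r \mid c_0, d_0) \qquad (10)$$
where $\mathrm{Gam}(x \mid a, b)$ denotes a Gamma distribution of the form
$$\mathrm{Gam}(x \mid a, b) = \frac{b^{a} x^{a-1} e^{-bx}}{\Gamma(a)} \qquad (11)$$
and $\Gamma(\cdot)$ is the Gamma function.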
Similarly, we also place a hyperprior over the noise precision $\tau$, i.e.,
$$p(\tau) = \mathrm{Gam}(\tau \mid a_0, b_0) \qquad (12)$$
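Combining Equations (7) and (9) to (12), we complete our hierarchical probabilistic model as a PGM, whose graph representation is illustrated in Figure 3.
For brevity of notation, we denote all unknowns, including both latent variables and hyperparameters, by $\Theta = \{\{\mathbf{A}^{(n)}\}_{n=1}^{3}, \boldsymbol{\lambda}, \tau\}$. From Figure 3, we can write the joint distribution of the observed data and all model parameters as
$$p(\mathcal{Y}, \Theta) = p\left(\mathcal{Y} \mid \{\mathbf{A}^{(n)}\}_{n=1}^{3}, \tau\right) \prod_{n=1}^{3} p\left(\mathbf{A}^{(n)} \mid \boldsymbol{\lambda}\right) p(\boldsymbol{\lambda})\, p(\tau) \qquad (13)$$
The goal then becomes inferring the posterior of all involved parameters, for which a point (maximum a posteriori) estimate could be obtained by maximizing Equation (13).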
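One way to read the hierarchical model is as a generative recipe: draw the precisions, draw the factor rows, form the low-rank tensor and add noise. The sketch below samples from such a model. The hyperprior values, the cluster dimensions and the name `sample_model` are assumptions picked only so that the draw is well behaved; note that NumPy parameterizes the Gamma distribution by shape and scale, so the rate $b$ enters as `1 / b`.

```python
import numpy as np

def sample_model(dims=(36, 31, 20), R=5, a0=2.0, b0=0.02, c0=2.0, d0=2.0, seed=0):
    """Draw one synthetic FBP cluster from the hierarchical CP model."""
    rng = np.random.default_rng(seed)
    lam = rng.gamma(shape=c0, scale=1.0 / d0, size=R)   # column precisions, shared by all modes
    tau = rng.gamma(shape=a0, scale=1.0 / b0)           # noise precision
    # Rows of each factor are N(0, diag(lam)^-1): column r has standard deviation lam[r]^(-1/2).
    A = [rng.standard_normal((I, R)) / np.sqrt(lam) for I in dims]
    X = np.einsum("ir,jr,kr->ijk", *A)                  # clean low-rank cluster
    Y = X + rng.standard_normal(dims) / np.sqrt(tau)    # noisy observation
    return Y, X, A, lam, tau
```

Columns with a large precision are drawn close to zero in every mode at once, which is the shared sparsity profile that the inference later exploits to reduce the rank.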
Figure 3: The probabilistic graphical model representation of Bayesian CP ten-
sor factorization.
3.4. Variational Inference
However, in contrast to point estimation, we aim to compute the full posterior distribution of all parameters in $\Theta$. Therefore, a deterministic approximate inference method under the variational Bayesian (VB) framework [43] is developed to learn the aforementioned hierarchical probabilistic model. Specifically, we seek a distribution $q(\Theta)$ to approximate the true posterior distribution $p(\Theta \mid \mathcal{Y})$ by solving the following optimization problem
$$\min_{q} \mathrm{KL}\left(q(\Theta) \,\|\, p(\Theta \mid \mathcal{Y})\right) = \int q(\Theta) \ln\left\{\frac{q(\Theta)}{p(\Theta \mid \mathcal{Y})}\right\} d\Theta \qquad (14)$$
where $\mathrm{KL}(q \| p)$ represents the KL divergence between two distributions $q$ and $p$. Since the posterior distribution $p(\Theta \mid \mathcal{Y})$ is computationally intractable in our model, the VB framework cannot be reduced to the expectation maximization (EM) framework. Thus, some constraints need to be imposed on the variational distribution $q(\Theta)$ to make this optimization feasible. Specifically, it is assumed that the variational distribution is factorized w.r.t. each parameter $\Theta_j$, so that
$$q(\Theta) = \prod_{j} q_j(\Theta_j) \qquad (15)$$
This factorized form of variational inference corresponds to an approximation framework developed in physics called mean field theory [44]. The closed-form optimal solution $q_j^{*}(\Theta_j)$ can then be obtained by
$$\ln q_j^{*}(\Theta_j) = \left\langle \ln p(\mathcal{Y}, \Theta) \right\rangle_{\Theta \setminus \Theta_j} + \mathrm{const} \qquad (16)$$
where $\langle \cdot \rangle$ is a unary operator denoting expectation and $\Theta \setminus \Theta_j$ denotes the set $\Theta$ with $\Theta_j$ removed. Since the distributions of all variables are drawn from the distributions over their parent variables, we can analytically infer the posterior distributions of the model parameters using Equations (13), (15) and (16).
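Estimation of mode-n factors $\mathbf{A}^{(n)}$.
$$q\left(\mathbf{A}^{(n)}\right) = \prod_{i_n=1}^{I_n} \mathcal{N}\left(\mathbf{A}^{(n)}_{i_n \cdot} \mid \tilde{\mathbf{A}}^{(n)}_{i_n \cdot}, \mathbf{V}^{(n)}_{i_n}\right), \quad \forall n \in [1, 3] \qquad (17)$$
where the posterior parameters can be updated by
$$\tilde{\mathbf{A}}^{(n)}_{i_n \cdot} = \langle \tau \rangle \mathbf{V}^{(n)}_{i_n} \left\langle \mathbf{B}^{(\backslash n)T} \right\rangle \mathrm{vec}\left(\mathcal{Y}_{\cdot i_n \cdot}\right) \qquad (18)$$
The most complex term is related to $\mathbf{B}^{(\backslash n)}$, which is of size $\prod_{k \neq n} I_k \times R$ and denotes the Khatri-Rao product of the latent factors in all modes except the $n$-th mode. $\mathrm{vec}(\mathcal{Y}_{\cdot i_n \cdot})$ denotes the vectorized slice of the FBP cluster, of size $\prod_{k \neq n} I_k$, whose mode-$n$ index is $i_n$.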
Estimation of hyperparameters $\boldsymbol{\lambda}$.
$$q(\boldsymbol{\lambda}) = \prod_{r=1}^{R} \mathrm{Gam}(\lambda_r \mid c_r, d_r) \qquad (21)$$
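Estimation of hyperparameter $\tau$.
$$q(\tau) = \mathrm{Gam}(\tau \mid a, b) \qquad (24)$$
where the posterior parameter $b$ depends on the residual $\mathcal{Y} - [[\mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \mathbf{A}^{(3)}]]$.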
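For a fully observed cluster, applying Equation (16) to the Gaussian likelihood, the Gaussian factor priors and the Gamma hyperpriors yields conjugate Gaussian and Gamma posteriors. The block below is a sketch of the resulting standard mean-field updates, derived here from the model rather than taken from the paper's own update equations, so the exact form and numbering of the original expressions may differ. The symbols $\tilde{\mathbf{A}}^{(n)}$ and $\mathbf{V}^{(n)}$ (posterior mean and row covariance, which is shared by all rows of a mode in the fully observed case) and $\mathbf{Y}_{(n)}$ (mode-$n$ unfolding, with columns ordered to match the rows of $\mathbf{B}^{(\backslash n)}$) are notational assumptions.

```latex
\begin{aligned}
\mathbf{V}^{(n)} &= \Big( \langle\tau\rangle \big\langle \mathbf{B}^{(\backslash n)T}\mathbf{B}^{(\backslash n)} \big\rangle + \langle\boldsymbol{\Lambda}\rangle \Big)^{-1}, \\
\big\langle \mathbf{B}^{(\backslash n)T}\mathbf{B}^{(\backslash n)} \big\rangle &= \underset{k \neq n}{\circledast} \Big( \tilde{\mathbf{A}}^{(k)T}\tilde{\mathbf{A}}^{(k)} + I_k \mathbf{V}^{(k)} \Big), \\
\tilde{\mathbf{A}}^{(n)} &= \langle\tau\rangle\, \mathbf{Y}_{(n)} \big\langle \mathbf{B}^{(\backslash n)} \big\rangle \mathbf{V}^{(n)}, \\
c_r &= c_0 + \tfrac{1}{2}\sum_{n=1}^{3} I_n, \qquad
d_r = d_0 + \tfrac{1}{2}\sum_{n=1}^{3} \big\langle \mathbf{A}^{(n)T}_{\cdot r}\mathbf{A}^{(n)}_{\cdot r} \big\rangle, \qquad
\langle\lambda_r\rangle = c_r / d_r, \\
a &= a_0 + \tfrac{1}{2} I_1 I_2 I_3, \qquad
b = b_0 + \tfrac{1}{2} \Big\langle \big\| \mathcal{Y} - [[\mathbf{A}^{(1)}, \mathbf{A}^{(2)}, \mathbf{A}^{(3)}]] \big\|_F^2 \Big\rangle, \qquad
\langle\tau\rangle = a / b.
\end{aligned}
```

The second line uses the independence of the factors across modes, so the expectation of the Gram matrix of the Khatri-Rao product factorizes into a Hadamard product of per-mode expectations.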
Algorithm 2 Low-rank Bayesian Tensor Factorization
Input: A FBP cluster $\mathcal{Y}_i$;
1: Initialize factor matrices and their covariances $\tilde{\mathbf{A}}^{(n)}$, $\mathbf{V}^{(n)}_{i_n}$, hyperpriors $a_0, b_0, c_0, d_0$ and hyperparameters $\tau = a_0 / b_0$, $\lambda_r = c_0 / d_0$;
2: while not converged do
3:   for $n = 1$ to $3$ do
4:     Update the posterior $q(\mathbf{A}^{(n)})$ using Equations (18) to (20);
5:   end for
6:   Update the posterior $q(\boldsymbol{\lambda})$ using Equations (22) and (23);
7:   Update the posterior $q(\tau)$ using Equations (25) and (26);
8:   Update the estimated rank $R$ by $\max_n \mathrm{Rank}(\mathbf{A}^{(n)})$;
9: end while
Output: Estimated FBP cluster $\hat{\mathcal{X}}_i$ and rank $R$;
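The loop structure of Algorithm 2 can be illustrated with the NumPy sketch below, which implements generic conjugate mean-field updates for a fully observed third-order tensor. It is a sketch under stated assumptions and not the authors' code: the function name `vb_cp`, the hyperprior values of 1e-6, the small initial covariance, the fixed iteration count in place of a convergence test, and the random initialization are all choices made for this example.

```python
import numpy as np

def khatri_rao(mats):
    """Khatri-Rao product of a list of matrices; the first matrix's row index varies slowest."""
    out = mats[0]
    for M in mats[1:]:
        out = (out[:, None, :] * M[None, :, :]).reshape(-1, out.shape[1])
    return out

def unfold(T, n):
    """Mode-n unfolding; the remaining modes are flattened in C order."""
    return np.moveaxis(T, n, 0).reshape(T.shape[n], -1)

def vb_cp(Y, R=15, n_iter=200, a0=1e-6, b0=1e-6, c0=1e-6, d0=1e-6, seed=0):
    """Mean-field VB for a 3rd-order CP model with shared column precisions and Gaussian noise."""
    rng = np.random.default_rng(seed)
    dims, N = Y.shape, Y.ndim
    A = [rng.standard_normal((I, R)) for I in dims]   # posterior means (random init)
    V = [1e-2 * np.eye(R) for _ in dims]              # row covariance, shared within a mode
    tau, lam = a0 / b0, np.full(R, c0 / d0)           # initial expectations
    for _ in range(n_iter):
        for n in range(N):
            others = [k for k in range(N) if k != n]
            # E[B^T B] is the Hadamard product of E[A^(k)T A^(k)] over the other modes.
            G = np.ones((R, R))
            for k in others:
                G *= A[k].T @ A[k] + dims[k] * V[k]
            V[n] = np.linalg.inv(tau * G + np.diag(lam))
            B = khatri_rao([A[k] for k in others])    # row order matches unfold(Y, n)
            A[n] = tau * unfold(Y, n) @ B @ V[n]
        # q(lambda): Gamma posterior, one precision per column, shared by all modes.
        c = c0 + 0.5 * sum(dims)
        d = d0 + 0.5 * sum(np.sum(A[n] ** 2, axis=0) + dims[n] * np.diag(V[n])
                           for n in range(N))
        lam = c / d
        # q(tau): Gamma posterior built from the expected squared residual.
        X = np.einsum("ir,jr,kr->ijk", *A)
        EAtA = np.stack([A[n].T @ A[n] + dims[n] * V[n] for n in range(N)])
        err = np.sum(Y ** 2) - 2.0 * np.sum(Y * X) + np.sum(np.prod(EAtA, axis=0))
        tau = (a0 + 0.5 * Y.size) / (b0 + 0.5 * err)
    return X, A, lam, tau

# Synthetic check: a rank-3 tensor with noise, fitted with a deliberately loose bound R = 8.
rng = np.random.default_rng(1)
true = [rng.standard_normal((I, 3)) for I in (10, 8, 12)]
X_clean = np.einsum("ir,jr,kr->ijk", *true)
Y_noisy = X_clean + 0.1 * rng.standard_normal(X_clean.shape)
X_hat, A_hat, lam_hat, tau_hat = vb_cp(Y_noisy, R=8)
```

Neither the noise level nor the true rank is passed to `vb_cp`; both are absorbed by the updates of `tau` and `lam`, which is the behavior the text attributes to LBTF.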
The whole procedure of model inference is summarized in Algorithm 2. It is worth noting that the tensor rank is determined automatically and implicitly. Specifically, during inference, most of the hyperparameters $\lambda_r$ are driven to very large values, which forces the posterior means of the corresponding columns to zero, effectively removing them from the model and reducing the rank. In our implementation, we keep the size of $\{\mathbf{A}^{(n)}\}$ unchanged during the iterations, while an alternative is to remove the zero components of $\{\mathbf{A}^{(n)}\}$ after each iteration.
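The alternative of removing zero components can be sketched as follows. The relative energy threshold `tol` and the function name are hypothetical; the paper does not state how a component is judged to be zero.

```python
import numpy as np

def prune_components(A_list, tol=1e-4):
    """Drop the columns whose energy, summed over all modes, is negligible."""
    energy = sum(np.sum(A ** 2, axis=0) for A in A_list)
    keep = energy > tol * energy.max()
    return [A[:, keep] for A in A_list], int(keep.sum())
```

The same mask is applied to every mode because the precisions are shared, so a pruned component disappears from all factor matrices together; the returned count can serve as the estimated rank.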
4. Experiment and Analysis
In this section, extensive simulated and real data experiments are conducted to validate the denoising capabilities of the proposed LBTF-HSI algorithm, and quantitative and visual results are presented. A detailed analysis of our method is given at the end.
Figure 4: Simulated pseudo color images from Columbia Dataset
4.1. Simulated HSI Denoising
Columbia Dataset. The Columbia HSI Dataset [46] is employed in our simulated experiments; it is commonly used to verify other algorithms [13, 16]. This dataset consists of 32 real-world scenes covering a wide variety of materials and objects, each with a spatial resolution of 512 × 512 and 31 spectral bands. Each HSI contains full spectral resolution reflectance data collected from 400 nm to 700 nm at 10 nm intervals. The simulated pseudo color images from this dataset are shown in Figure 4. In our experiments, the intensity of these HSIs is scaled into [0, 1].
Implementation Details. Additive white Gaussian noise (AWGN), which arises from many natural sources, is added to these test HSIs to generate $\mathcal{Y}$ according to our observation model, with noise intensity ranging from 15 to 100. (Note that we express the noise intensity on a base of 255, i.e., 15 means that the standard deviation of the Gaussian noise is 15/255; the same convention is used hereinafter.) Unlike other methods, which require the specific noise intensity as an input parameter, our method is not given this information, since the noise intensity can be learned automatically during the denoising process. Consequently, unless otherwise mentioned, we provide the real noise intensity to the comparison methods, while our method learns the noise model automatically.
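The noise convention can be checked numerically. The sketch below adds AWGN with standard deviation sigma/255 to a stand-in image in [0, 1] and compares the measured PSNR of the noisy input with the value implied by the noise level. The random stand-in image, the absence of clipping and the function names are assumptions of this example.

```python
import numpy as np

def add_awgn(X, sigma, seed=0):
    """Add zero-mean Gaussian noise with standard deviation sigma/255 to X in [0, 1]."""
    rng = np.random.default_rng(seed)
    return X + rng.normal(0.0, sigma / 255.0, size=X.shape)

def expected_noisy_psnr(sigma):
    """PSNR (peak 1) of the noisy input when the MSE equals (sigma/255)^2."""
    return 20.0 * np.log10(255.0 / sigma)

X = np.random.default_rng(1).random((64, 64, 31))   # stand-in for a clean HSI
Y = add_awgn(X, sigma=15)
measured = 10.0 * np.log10(1.0 / np.mean((Y - X) ** 2))
```

For sigma = 15, 25, 50, 75 and 100 the implied values are about 24.61, 20.17, 14.15, 10.63 and 8.13 dB, which agree with the PSNR of the noisy inputs listed in Table 1.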
For the parameter settings, we need to take care of the initialization strategy in LBTF (Algorithm 2). There are two parameters closely related to initialization. One is a binary parameter that selects the initialization scheme of the low-rank components, either SVD or random generation (following a standard normal distribution). Although the theory of the VB framework [43] guarantees that every initial point converges to a local minimum, we find that random generation achieves better performance than SVD in the context of HSI denoising. This phenomenon can be explained by the grouping and aggregation operations involved in our method, which benefit from diverse initial points rather than relatively stable ones. The other parameter, which dominantly affects the denoising capability of LBTF, is the upper bound $R$ of the rank of the low-rank components. It is worth noting that we only need to provide a rough estimate of the upper bound of the target rank, rather than the actual target rank required by other low-rank based methods [19, 45]. After one iteration of our algorithm, the true rank can be estimated automatically. We simply set $R$ to 15 in the first iteration, and use the mean of the estimated ranks of all clusters as $R$ in the next iteration in all of our experiments.
Figure 5: The images at band 590 nm of chart and stuffed toy under noise level σ=25 on the CAVE dataset. Two demarcated areas in each image are magnified 3 times for easy observation of details. Panels: (a) Clean image; (b) Noisy image (20.17, 0.19); (c) BM3D (34.91, 0.92); (d) BM4D (38.61, 0.95); (e) LRMR (33.27, 0.72); (f) LRTV (29.74, 0.89); (g) LRTA (34.53, 0.87); (h) LLRGTV (35.35, 0.90); (i) GLF (40.29, 0.96); (j) TDL (38.07, 0.96); (k) ITSReg (39.78, 0.95); (l) Ours.
Figure 6: The images at band 490 nm of watercolors under noise level σ=50 on the CAVE dataset. Two demarcated areas in each image are magnified 6 times for easy observation of details. Panels: (a) Clean image; (b) Noisy image (14.15, 0.11); (c) BM3D (28.89, 0.82); (d) BM4D (32.16, 0.89); (e) LRMR (27.20, 0.56); (f) LRTV (26.13, 0.77); (g) LRTA (29.63, 0.78); (h) LLRGTV (30.72, 0.85); (i) GLF (33.80, 0.91); (j) TDL (31.79, 0.88); (k) ITSReg (33.67, 0.93); (l) Ours.
Comparison Methods. The comparison methods include: band-wise BM3D [24], which represents the state of the art for the 2D extended band-wise approach; BM4D [12], which represents the state of the art for the 2D extended 3D-cube-based approach; LRMR [19], LRTV [45] and LLRGTV [21], which represent the state of the art for the low-rank matrix-based approach; and LRTA [41], GLF [22], TDL [16] and ITS-Reg [13], which represent the state of the art for the tensor-based approach. All parameters involved in the competing algorithms were either manually tuned to be optimal or chosen automatically as described in the reference papers.
Performance Metrics. To comprehensively assess the performance of all competing methods, we employ five quantitative picture quality indices (PQIs) for performance evaluation, including peak signal-to-noise ratio (PSNR), structure similarity (SSIM [47]), feature similarity (FSIM [48]), erreur relative globale adimensionnelle de synthèse (ERGAS [49]) and spectral angle mapper (SAM [50]). PSNR and SSIM are two conventional PQIs in image processing and computer vision. They evaluate the similarity between the target image and the reference image based on MSE and structural consistency, respectively. FSIM emphasizes the perceptual consistency with the reference image. The larger these three measures are, the closer the target HSI is to the reference one. ERGAS measures the fidelity of the restored image based on the weighted sum of MSE in each band. SAM measures the spectral fidelity between the restored image and the reference image across all spatial positions. Different from the former three measures, the smaller these two measures are, the better the target HSI estimates the reference one.
Figure 7: The images at band 640 nm of flowers under noise level σ=75 on the CAVE dataset. One demarcated area in each image is magnified 1.5 times for easy observation of details. Panels: (a) Clean image; (b) Noisy image (10.63, 0.02); (c) BM3D (31.66, 0.79); (d) BM4D (33.59, 0.74); (e) LRMR (25.93, 0.32); (f) LRTV (28.29, 0.70); (g) LRTA (30.99, 0.69); (h) LLRGTV (32.59, 0.76); (i) GLF (35.39, 0.83); (j) TDL (34.16, 0.85); (k) ITSReg (34.26, 0.82); (l) Ours.
Figure 8: The images at band 590 nm of cloth under noise level σ=100 on the CAVE dataset. Two demarcated areas in each image are magnified 6 times for easy observation of details. Panels: (a) Clean image; (b) Noisy image (8.13, 0.04); (c) BM3D (23.89, 0.50); (d) BM4D (26.23, 0.64); (e) LRMR (21.69, 0.40); (f) LRTV (22.79, 0.46); (g) LRTA (24.09, 0.46); (h) LLRGTV (25.31, 0.65); (i) GLF (27.59, 0.74); (j) TDL (26.08, 0.65); (k) ITSReg (26.69, 0.69); (l) Ours.
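Three of the five indices are simple enough to sketch directly. The versions below are common textbook forms and may differ in detail from the evaluation code behind the reported numbers: PSNR is averaged over bands, SAM is returned in radians, and ERGAS assumes a resolution ratio of one. SSIM and FSIM are omitted because they require considerably more machinery.

```python
import numpy as np

def psnr(X_hat, X, peak=1.0):
    """PSNR computed per band and then averaged over the spectral (last) axis."""
    mse = np.mean((X_hat - X) ** 2, axis=(0, 1))
    return float(np.mean(10.0 * np.log10(peak ** 2 / mse)))

def sam(X_hat, X, eps=1e-12):
    """Mean spectral angle in radians between estimated and reference spectra at each pixel."""
    dot = np.sum(X_hat * X, axis=2)
    norms = np.linalg.norm(X_hat, axis=2) * np.linalg.norm(X, axis=2)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))

def ergas(X_hat, X, ratio=1.0):
    """ERGAS with resolution ratio `ratio`: 100/ratio * sqrt(mean over bands of MSE_b / mean_b^2)."""
    mse = np.mean((X_hat - X) ** 2, axis=(0, 1))
    mean_sq = np.mean(X, axis=(0, 1)) ** 2
    return float(100.0 / ratio * np.sqrt(np.mean(mse / mean_sq)))
```

PSNR grows with quality, whereas SAM and ERGAS shrink, which matches the direction conventions stated above.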
Performance Evaluation. For each noise setting, all five PQI values of each competing HSI denoising method on all 32 scenes have been calculated and recorded. Table 1 lists the average performance of all methods over the different scenes under each noise setting. From this quantitative comparison, the advantage of the proposed method can be clearly observed. In particular, as the noise intensity increases, our method surpasses the second best ITS-Reg in terms of PSNR by a large margin (e.g., 0.96 dB under σ=75 and 2.5 dB under σ=100). This is due to the overfitting issue that commonly exists in state-of-the-art methods. Our method successfully addresses this issue by automatically determining the tensor rank, consequently achieving great performance especially in cases of severe corruption. Figures 5 to 8 illustrate the visual results of different methods under different noise levels. It can be seen that our method consistently outperforms the other methods, in line with the measurements in Table 1. Specifically, in Figure 6, we can see that, except for GLF and our method, none of the competing methods can successfully recover the exact edge shape of the cloud shown in the green box. In Figure 8, only GLF, ITS-Reg and our method produce smooth and noise-free results, while the fine-grained details of ours are much clearer and sharper than those of ITS-Reg. We also compute the PSNR value of each band in these four HSIs
Table 1: Average performance of 10 competing methods w.r.t. 5 PQIs. For each specific noise intensity setting, the results are obtained by averaging over the 32 scenes. The best results of each case among these methods are denoted by boldface.

| Sigma | Index | Noisy | BM3D [24] | BM4D [12] | LRMR [19] | LRTV [45] | LRTA [41] | LLRGTV [21] | GLF [22] | TDL [16] | ITS-Reg [13] | Ours |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 15 | PSNR | 24.61 | 39.81 | 42.38 | 37.21 | 33.54 | 39.21 | 38.46 | 43.41 | 42.30 | 43.43 | 43.46 |
| 15 | SSIM | 0.291 | 0.951 | 0.968 | 0.869 | 0.912 | 0.930 | 0.948 | 0.977 | 0.972 | 0.972 | 0.976 |
| 15 | FSIM | 0.794 | 0.973 | 0.981 | 0.974 | 0.938 | 0.971 | 0.978 | 0.989 | 0.987 | 0.989 | 0.988 |
| 15 | ERGAS | 325.24 | 56.41 | 41.35 | 76.49 | 124.88 | 60.89 | 71.05 | 38.49 | 41.98 | 37.26 | 36.72 |
| 15 | SAM | 0.785 | 0.157 | 0.151 | 0.391 | 0.204 | 0.183 | 0.175 | 0.128 | 0.101 | 0.138 | 0.103 |
| 25 | PSNR | 20.17 | 37.03 | 39.59 | 33.49 | 32.42 | 36.67 | 36.63 | 40.96 | 39.72 | 40.57 | 41.21 |
| 25 | SSIM | 0.148 | 0.919 | 0.943 | 0.736 | 0.895 | 0.893 | 0.913 | 0.957 | 0.957 | 0.945 | 0.964 |
| 25 | FSIM | 0.661 | 0.955 | 0.968 | 0.952 | 0.922 | 0.953 | 0.969 | 0.984 | 0.979 | 0.980 | 0.982 |
| 25 | ERGAS | 542.05 | 77.50 | 57.16 | 115.39 | 136.84 | 81.21 | 84.89 | 50.50 | 56.39 | 51.45 | 48.26 |
| 25 | SAM | 0.933 | 0.208 | 0.215 | 0.569 | 0.234 | 0.218 | 0.254 | 0.167 | 0.123 | 0.242 | 0.118 |
| 50 | PSNR | 14.15 | 33.49 | 35.65 | 28.35 | 29.82 | 33.16 | 33.45 | 37.15 | 36.16 | 37.55 | 37.83 |
| 50 | SSIM | 0.052 | 0.862 | 0.870 | 0.470 | 0.846 | 0.819 | 0.812 | 0.890 | 0.918 | 0.919 | 0.927 |
| 50 | FSIM | 0.465 | 0.922 | 0.938 | 0.890 | 0.891 | 0.919 | 0.944 | 0.970 | 0.956 | 0.963 | 0.966 |
| 50 | ERGAS | 1084.15 | 116.60 | 90.13 | 204.78 | 183.40 | 120.99 | 118.41 | 77.24 | 84.58 | 72.85 | 71.07 |
| 50 | SAM | 1.124 | 0.277 | 0.340 | 0.797 | 0.350 | 0.278 | 0.433 | 0.263 | 0.186 | 0.243 | 0.173 |
| 75 | PSNR | 10.63 | 31.36 | 33.28 | 25.27 | 27.98 | 31.17 | 31.28 | 34.75 | 34.08 | 34.78 | 35.74 |
| 75 | SSIM | 0.026 | 0.810 | 0.794 | 0.310 | 0.787 | 0.762 | 0.716 | 0.812 | 0.875 | 0.881 | 0.889 |
| 75 | FSIM | 0.362 | 0.894 | 0.908 | 0.826 | 0.870 | 0.892 | 0.921 | 0.957 | 0.934 | 0.945 | 0.951 |
| 75 | ERGAS | 1626.14 | 147.89 | 118.14 | 290.62 | 224.31 | 152.38 | 149.57 | 101.80 | 107.73 | 100.36 | 90.47 |
| 75 | SAM | 1.225 | 0.338 | 0.429 | 0.913 | 0.477 | 0.318 | 0.585 | 0.357 | 0.243 | 0.297 | 0.224 |
| 100 | PSNR | 8.13 | 29.83 | 31.56 | 23.03 | 26.50 | 29.69 | 29.64 | 33.03 | 32.56 | 31.77 | 34.26 |
| 100 | SSIM | 0.015 | 0.767 | 0.723 | 0.214 | 0.751 | 0.712 | 0.635 | 0.747 | 0.826 | 0.835 | 0.855 |
| 100 | FSIM | 0.299 | 0.871 | 0.879 | 0.766 | 0.853 | 0.869 | 0.899 | 0.944 | 0.911 | 0.914 | 0.938 |
| 100 | ERGAS | 2168.26 | 175.21 | 143.73 | 375.29 | 267.94 | 180.21 | 178.72 | 123.92 | 128.06 | 143.74 | 107.08 |
| 100 | SAM | 1.290 | 0.383 | 0.496 | 0.995 | 0.540 | 0.350 | 0.695 | 0.432 | 0.299 | 0.306 | 0.263 |
Table 2: Average performance of 10 competing methods w.r.t. 5 PQIs under unknown Gaussian noise level. The results are obtained by averaging through the 32 scenes. The best results of each case among these methods are denoted by boldface.

Method  PSNR          SSIM           FSIM           ERGAS             SAM
None    14.03 ± 4.62  0.079 ± 0.108  0.462 ± 0.197  1235.75 ± 613.62  1.124 ± 0.276
BM3D    33.36 ± 3.31  0.857 ± 0.052  0.919 ± 0.030  119.26 ± 35.93    0.292 ± 0.112
BM4D    35.73 ± 3.02  0.877 ± 0.055  0.934 ± 0.033  92.11 ± 27.73     0.320 ± 0.145
LRMR    29.35 ± 3.77  0.603 ± 0.170  0.893 ± 0.061  194.22 ± 71.11    0.610 ± 0.228
LRTV    29.38 ± 3.03  0.841 ± 0.072  0.894 ± 0.043  194.92 ± 63.90    0.361 ± 0.157
LRTA    33.34 ± 3.21  0.844 ± 0.065  0.924 ± 0.029  119.97 ± 37.56    0.236 ± 0.080
LLRGTV  32.38 ± 3.14  0.783 ± 0.092  0.931 ± 0.031  135.66 ± 42.20    0.410 ± 0.200
GLF     36.88 ± 3.06  0.859 ± 0.086  0.967 ± 0.015  82.62 ± 27.99     0.287 ± 0.168
TDL     36.20 ± 3.09  0.915 ± 0.035  0.952 ± 0.022  85.57 ± 24.05     0.183 ± 0.085
ITSReg  37.17 ± 3.17  0.916 ± 0.042  0.959 ± 0.023  78.21 ± 25.31     0.218 ± 0.150
Ours    37.70 ± 2.98  0.924 ± 0.034  0.965 ± 0.017  73.20 ± 21.36     0.174 ± 0.084
(i.e. watercolors, cloth, etc.). As Figure 9 shows, the PSNR values of all bands obtained by LBTF-HSI are significantly higher than those of the other methods.
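As an illustration of this band-wise evaluation, the following minimal Python sketch computes one PSNR value per band of an H × W × B cube. It is not the evaluation code used for Figure 9: the function name `psnr_per_band`, the peak value of 255 and the unclipped synthetic noise are assumptions made for the example. Under these assumptions, pure Gaussian noise with σ=15 gives roughly 20 log10(255/15) ≈ 24.6 dB, which is consistent with the first PSNR entry (24.61) of Table 1.

```python
import numpy as np


def psnr_per_band(clean, denoised, peak=255.0):
    """Return the PSNR (dB) of every band of an H x W x B cube.

    `peak` is the assumed maximum intensity (255 for an 8-bit scale).
    A band that is reproduced exactly has zero error and is not handled here.
    """
    clean = np.asarray(clean, dtype=np.float64)
    denoised = np.asarray(denoised, dtype=np.float64)
    # Mean squared error over the two spatial axes: one value per band.
    mse = np.mean((clean - denoised) ** 2, axis=(0, 1))
    return 10.0 * np.log10(peak ** 2 / mse)


# Synthetic stand-in data (hypothetical, not a scene from the dataset).
rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 255.0, size=(64, 64, 31))
noisy = clean + rng.normal(0.0, 15.0, size=clean.shape)  # sigma = 15, unclipped

band_psnr = psnr_per_band(clean, noisy)
print(band_psnr.shape)   # one PSNR value per band
print(band_psnr.mean())  # close to 20 * log10(255 / 15)
```

Plotting `band_psnr` against the band wavelengths for each method would give curves of the kind summarized in Figure 9.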
Denoising under Unknown Noise Level. Motivated by the noise-intensity self-adaptive property of our method mentioned above, we conduct experiments under an unknown Gaussian noise level to further demonstrate the advantages of the proposed method. Here, we again adopt the 32 real-world scene HSIs from the Columbia Dataset described above. Unlike the former experiment, which adds Gaussian noise at each intensity from 15 to 100 to the 32 clean HSIs to generate 160 corrupted HSIs, in this experiment we generate only 32 corrupted HSIs, with noise intensities randomly sampled from a uniform distribution over the range [15, 100]. Since the true noise intensities are not provided, we use an off-the-shelf noise estimation method [51] to estimate them, and the estimate is set as the input parameter for all compared methods except ours. Table 2 summarizes the quantitative
Figure 9: PSNR values across the spectrum corresponding to (a) chart and stuffed toy (Fig. 5), (b) watercolors (Fig. 6), (c) flowers (Fig. 7) and (d) cloth (Fig. 8), respectively.
results of this experiment, which show that our method surpasses the previous best-performing method, ITS-Reg, by about 0.48 dB in PSNR, while also being the most stable (smallest variance) among all the competing methods.
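The corruption protocol of this experiment can be sketched as follows. This is a minimal illustration rather than the script used to produce Table 2: the function name `corrupt_with_random_sigma`, the seed, the constant stand-in scenes and the 0-255 intensity scale are assumptions, and the noise estimation step of [51] is not reproduced.

```python
import numpy as np


def corrupt_with_random_sigma(clean_cubes, low=15.0, high=100.0, seed=0):
    """Add zero-mean Gaussian noise with one hidden sigma per scene.

    Each sigma is drawn from a uniform distribution over [low, high].
    Returns the noisy cubes and the sigmas that were actually used.
    """
    rng = np.random.default_rng(seed)
    noisy_cubes, sigmas = [], []
    for cube in clean_cubes:
        sigma = rng.uniform(low, high)
        noise = rng.normal(0.0, sigma, size=cube.shape)
        noisy_cubes.append(cube + noise)
        sigmas.append(sigma)
    return noisy_cubes, np.array(sigmas)


# 32 small constant cubes stand in for the 32 clean scenes (hypothetical data).
scenes = [np.full((16, 16, 31), 128.0) for _ in range(32)]
noisy, sigmas = corrupt_with_random_sigma(scenes)
print(len(noisy), sigmas.min(), sigmas.max())
```

In a blind setting the returned `sigmas` would be kept aside for analysis only, and each denoiser that needs a noise level would receive an estimate computed from the noisy cube instead.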
Run Time. In addition to visual quality, another important aspect of an HSI denoising method is its run time. We therefore compare the speed of all competing methods. All experiments are run in the Matlab 2016a environment on a machine with an Intel(R) Core(TM) i7-7700K CPU at 4.2 GHz and 16 GB of RAM. Figure 10 shows time vs. PSNR of the different methods for denoising HSIs of size 512 × 512 × 31. The results are obtained by
Figure 10: Time (seconds, logarithmic axis from 10^0 to 10^4) vs. PSNR (dB) of all competing methods for HSI denoising.
averaging over all 32 scenes with a variety of noise intensities. We can see that effectiveness often comes at the cost of efficiency. Our method is somewhat slower than TDL, BM4D and GLF. However, taking the large gain in denoising effectiveness into account, our method is still highly competitive with these state-of-the-art methods. On the other hand, our method is typically about twice as fast as ITS-Reg while also denoising better.
4.2. Real HSI Denoising
Here, the Hyperspectral Digital Imagery Collection Experiment (HYDICE) urban dataset⁴ and the Harvard real-world hyperspectral datasets (HHD) [52] are used to evaluate our method under real-world noise. The original HSI in HYDICE is of size 304 × 304 × 210. As bands 139-155 and 201-210 are seriously polluted by the atmosphere and water absorption and provide little useful information, we manually remove them, leaving test data of size 304 × 304 × 183, as in [13]. The HHD dataset consists of 50 noisy hyperspectral images of size 1040 × 1392 × 31, captured at wavelengths in the range of 420-720 nm at an interval of 10 nm. We scale these HSIs into the interval [0, 1], and employ implementation strategies and parameter settings similar to those of the previous simulated experiments for all competing methods. The noise estimation method [51] used before is also applied in this setting. We illustrate the experimental results in Figure 11 and Figure 12, respectively.
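The band removal and rescaling described above can be sketched in a few lines of Python. The helper names `drop_bands` and `scale_to_unit` are hypothetical, band indices are assumed to be 1-based and inclusive, and min-max scaling over the whole cube is an assumption, since the text only states that the data are scaled into [0, 1]. A small random cube replaces the real 304 × 304 × 210 data to keep the example light.

```python
import numpy as np


def drop_bands(cube, ranges):
    """Remove 1-based, inclusive band ranges from an H x W x B cube."""
    keep = np.ones(cube.shape[2], dtype=bool)
    for first, last in ranges:
        keep[first - 1:last] = False
    return cube[:, :, keep]


def scale_to_unit(cube):
    """Min-max scale a cube into [0, 1] (assumes it is not constant)."""
    cube = cube.astype(np.float64)
    lo, hi = cube.min(), cube.max()
    return (cube - lo) / (hi - lo)


# Random stand-in with 210 bands (hypothetical data, reduced spatial size).
raw = np.random.default_rng(0).integers(0, 4096, size=(8, 8, 210))
urban = scale_to_unit(drop_bands(raw, [(139, 155), (201, 210)]))
print(urban.shape)  # 210 - 17 - 10 = 183 bands remain
```

The band count follows directly from the two removed ranges: 17 bands (139-155) and 10 bands (201-210) are dropped from 210, leaving 183.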
Figure 11 includes the restorations of bands 1 and 109 of the urban HSI. We choose two demarcated areas with specific semantics to conveniently compare the denoising capability of the different methods. Specifically, the red box area of band 1 shows a housing estate in the urban area. It can be clearly observed that most of the competing methods (e.g. BM3D, BM4D, LRTA, TDL, ITS-Reg) cannot remove the stripes present in this area, while some methods (i.e. LRMR, LRTV) produce oversmoothed results, to some degree destroying
Figure 11: Real complex noise removal results at two bands (indexed by 1 and 109, respectively) of the HYDICE urban HSI. Two demarcated areas in each image are amplified at a 6 times larger scale for easy observation of details. Panels: (a) Noisy image, (b) BM3D, (c) BM4D, (d) LRMR, (e) LRTV, (f) LRTA, (g) LLRGTV, (h) GLF, (i) TDL, (j) ITSReg, (k) Ours.
Figure 12: Real random noise removal results on the HHD dataset. One demarcated area in each image is amplified at a 2 times larger scale for easy observation of details. Panels: (a) Noisy image, (b) BM3D, (c) BM4D, (d) LRMR, (e) LRTV, (f) LLRGTV, (g) GLF, (h) TDL, (i) ITSReg, (j) Ours.
the original structure of the objects in this housing estate. LLRGTV, GLF and our LBTF-HSI successfully get rid of the stripe noise while preserving the topological structure of this housing estate. At band 109, the image is highly corrupted by miscellaneous complex noise. Obvious artefacts still remain in the results of many competing methods (i.e. BM3D, BM4D, LRTA, TDL, ITS-Reg). While LLRGTV and GLF do produce appealing results with good perceptual quality, these results apparently deviate from the underlying ground truth (see the green box region at band 109). This phenomenon may be caused by an incorrectly specified subspace dimension (i.e. the target rank required by their low-rank approximation techniques). In comparison, our method does not suffer from the rank determination issue; thus it not only recovers the actual semantics of the demarcated area (i.e. the scene of the neighbourhood of the highway), but also produces results with high fidelity.
Figure 13: Effects of patch sizes on denoising performance (panels: PSNR values and SSIM values).
Figure 14: Effects of the number of nonlocal patches on denoising performance (panels: PSNR values and time in seconds vs. the number of nonlocal patches).
Figure 15: Effects of the number of bands on denoising performance (panels: PSNR values and SSIM values vs. the number of bands).
Figure 12 displays the real random noise removal results on the HHD dataset. From the demarcated window, we can observe that our LBTF-HSI method obtains an artifact-free image with clearer texture and line patterns. In summary, LBTF-HSI obtains better performance in terms of noise suppression, detail preservation, visual quality and PSNR value under different noise levels, even in the real-world unknown noise context.
4.3. Discussion
Figure 16: Effects of the number of iterations on denoising performance with respect to different noise levels (PSNR values vs. number of iterations; curves for σ=15, σ=25, σ=50, σ=75 and σ=100).
Besides the initialization strategy mentioned above, there are other parameters introduced by different stages of our model, i.e. the patch size, the number of nonlocal patches (for grouping) and the number of iterations (for the iterative framework). Figure 13 shows the PSNR/SSIM values with respect to different patch sizes. Among all candidates, patch sizes 6 (6×6) and 7 achieve the best PSNR and SSIM values, respectively. Figure 14 illustrates how the PSNR and run time vary with the number of nonlocal patches. We can see that the denoising results become gradually better with a larger number of nonlocal patches, implying that the nonlocal self-similarity can be sufficiently utilized by our model, even in a relaxed condition. Nevertheless, given the computational cost and the marginal improvement from increasing the number of nonlocal patches, we set it to 50 in all of our experiments.
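To make the roles of the patch size and the number of nonlocal patches concrete, the sketch below groups the patches most similar to a reference patch. It is only a stand-in for the grouping stage: plain Euclidean block matching inside a local search window is an assumption, and the function name `group_similar_patches` and the `window` parameter are hypothetical. Only the patch size of 6 and the group size of 50 come from the discussion above.

```python
import numpy as np


def group_similar_patches(cube, ref_row, ref_col, patch=6, num_patches=50, window=20):
    """Return top-left corners of the patches most similar to the reference.

    Every candidate is a full-band patch (patch x patch x B) whose corner lies
    within `window` pixels of the reference corner. Similarity is the squared
    Euclidean distance to the reference patch.
    """
    h, w, _ = cube.shape
    ref = cube[ref_row:ref_row + patch, ref_col:ref_col + patch, :]
    r0, r1 = max(0, ref_row - window), min(h - patch, ref_row + window)
    c0, c1 = max(0, ref_col - window), min(w - patch, ref_col + window)
    corners, dists = [], []
    for r in range(r0, r1 + 1):
        for c in range(c0, c1 + 1):
            cand = cube[r:r + patch, c:c + patch, :]
            corners.append((r, c))
            dists.append(np.sum((cand - ref) ** 2))
    # Keep the num_patches closest candidates; the reference itself has distance 0.
    order = np.argsort(dists, kind="stable")[:num_patches]
    return [corners[i] for i in order]


# Random stand-in cube (hypothetical data).
rng = np.random.default_rng(0)
cube = rng.random((64, 64, 31))
group = group_similar_patches(cube, 30, 30)
print(len(group), group[0])
```

The cost of this search grows with both the window area and the group size, which is in line with the trade-off between PSNR and run time reported for Figure 14.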
We also show how the number of bands of the HSI influences the denoising capacity of our model. From Figure 15, we can observe that the denoising results become gradually better with a larger number of bands. This suggests that the information contained in one band can be utilized to recover other bands, so that the global correlation along the spectrum is effectively exploited by our model.
Figure 16 displays the effects of the number of iterations on denoising performance with respect to different noise levels. Generally, stronger noise requires more iterations to achieve better performance, at the expense of computing efficiency. We can see that when the noise intensity is relatively small (e.g. σ=15), running the algorithm for more than 2 iterations progressively degrades the performance. Though this degradation is not observed within 5 iterations in strong corruption cases (e.g. σ=50, 75, 100), the performance gain from further iterations becomes limited while the computational cost increases significantly. Therefore, we suggest the use of {1, 2, 3, 4, 5} iterations for σ={15, 25, 50, 75, 100}, respectively, in the simulated data experiments.
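The suggested setting can be written as a small lookup, shown here as a hedged sketch. The tabulated pairs are those given above; the function name `suggested_iterations` and the rule for noise levels between or beyond the tabulated ones (use the next tabulated level, capped at 5 iterations) are assumptions introduced for illustration.

```python
import bisect

# Noise levels and iteration counts suggested for the simulated experiments.
SIGMAS = [15, 25, 50, 75, 100]
ITERATIONS = [1, 2, 3, 4, 5]


def suggested_iterations(sigma):
    """Return the iteration count of the smallest tabulated sigma >= `sigma`.

    Values above the largest tabulated sigma fall back to the last entry.
    This interpolation rule is an assumption, not part of the original setting.
    """
    idx = bisect.bisect_left(SIGMAS, sigma)
    return ITERATIONS[min(idx, len(ITERATIONS) - 1)]


for s in SIGMAS:
    print(s, suggested_iterations(s))
```

With an estimated rather than known noise level, the estimate would simply be passed to `suggested_iterations` in place of the true σ.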
5. Conclusion
In this paper, we presented an effective Low-rank Bayesian Tensor Factorization based HSI denoising method, which considers two intrinsic characteristics of HSIs: the nonlocal self-similarity across space and the global correlation across spectrum. In order to sufficiently embed these useful priors into our model, the LBTF is utilized to describe the spatial-spectral correlation of each FBP formed by block matching. This model is effectively solved by our deterministic algorithm derived under the variational Bayesian framework. Besides, an iterative denoising framework is introduced to further enhance the denoising capability of our method. The experimental results on simulated and real HSI denoising show that the proposed method outperforms many state-of-the-art methods, demonstrating its effectiveness.
We encode the noise structure as a Gaussian distribution in our hierarchical probabilistic model. Since in real cases the statistical distribution of the noise may be hard to determine, it is worth investigating more effective noise models for real-world noise in future work.
6. Acknowledgements
We thank the anonymous reviewers for their helpful comments and suggestions to improve this paper. This work was supported by the National Science Foundation of China under Grant No. 61672096.
References
[1] J. F. Mustard, C. M. Pieters, Photometric phase functions of common ge-
ologic minerals and applications to quantitative analysis of mineral mix-
ture reflectance spectra, Journal of Geophysical Research: Solid Earth
94 (B10) (1989) 13619–13634.
[2] R. Neville, Automatic endmember extraction from hyperspectral data for
mineral exploration, in: International Airborne Remote Sensing Conference
and Exhibition, 4th/21st Canadian Symposium on Remote Sensing,
Ottawa, Canada, 1999.
[3] M. Gianinetto, G. Lechi, The development of superspectral approaches
for the improvement of land cover classification, IEEE Transactions on
Geoscience and Remote Sensing 42 (11) (2004) 2670–2679.
[4] M. Lewis, V. Jooste, A. A. de Gasparis, Discrimination of arid vegetation
with airborne multispectral scanner hyperspectral imagery, IEEE Trans-
actions on Geoscience and Remote Sensing 39 (7) (2001) 1471–1479.
[5] R. Marion, R. Michel, C. Faye, Measuring trace gases in plumes from
hyperspectral remotely sensed data, IEEE Transactions on Geoscience
and Remote Sensing 42 (4) (2004) 854–864.
[6] A. Chen, The inpainting of hyperspectral images: A survey and adapta-
tion to hyperspectral data, SPIE Remote Sensing. International Society
for Optics and Photonics (2012) 85371–85371.
[7] H. Van Nguyen, A. Banerjee, R. Chellappa, Tracking via object re-
flectance using a hyperspectral video camera, in: The IEEE Confer-
ence on Computer Vision and Pattern Recognition Workshops (CVPRW),
2010, pp. 44–51.
[8] J. M. Bioucas-Dias, A. Plaza, N. Dobigeon, M. Parente, Q. Du, P. Gader,
J. Chanussot, Hyperspectral unmixing overview: Geometrical, statistical,
and sparse regression-based approaches, IEEE Journal of Selected Topics
in Applied Earth Observations and Remote Sensing 5 (2) (2012) 354–379.
[9] R. Kawakami, Y. Matsushita, J. Wright, M. Ben-Ezra, Y.-W. Tai,
K. Ikeuchi, High-resolution hyperspectral imaging via matrix factoriza-
tion, in: The IEEE Conference on Computer Vision and Pattern Recogni-
tion (CVPR), IEEE, 2011, pp. 2329–2336.
[10] M. Uzair, A. Mahmood, A. Mian, Hyperspectral face recognition with
spatiospectral information fusion and pls regression, IEEE Transactions
on Image Processing 24 (3) (2015) 1127–1137.
[11] F. Deger, A. Mansouri, M. Pedersen, J. Y. Hardeberg, Y. Voisin, A sensor-
data-based denoising framework for hyperspectral images, Optics express
23 (3) (2015) 1938–1950.
[12] M. Maggioni, V. Katkovnik, K. Egiazarian, A. Foi, Nonlocal transform-
domain filter for volumetric data denoising and reconstruction, IEEE
Transactions on Image Processing 22 (1) (2013) 119–133.
[13] Q. Xie, Q. Zhao, D. Meng, Z. Xu, S. Gu, W. Zuo, L. Zhang, Multispectral
images denoising by intrinsic tensor sparsity regularization, in: The IEEE
Conference on Computer Vision and Pattern Recognition (CVPR), 2016,
pp. 1692–1700.
[14] Q. Yuan, L. Zhang, H. Shen, Hyperspectral image denoising employing
a spectral–spatial adaptive total variation model, IEEE Transactions on
Geoscience and Remote Sensing 50 (10) (2012) 3660–3677.
[15] Y.-Q. Zhao, J. Yang, Hyperspectral image denoising via sparse represen-
tation and low-rank constraint, IEEE Transactions on Geoscience and Re-
mote Sensing 53 (1) (2015) 296–308.
[16] Y. Peng, D. Meng, Z. Xu, C. Gao, Y. Yang, B. Zhang, Decomposable
nonlocal tensor dictionary learning for multispectral image denoising, in:
Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, 2014, pp. 2949–2956.
[17] X. Liu, S. Bourennane, C. Fossati, Denoising of hyperspectral images us-
ing the parafac model and statistical performance analysis, IEEE Trans-
actions on Geoscience and Remote Sensing 50 (10) (2012) 3717–3724.
[18] G. Chen, S.-E. Qian, Denoising of hyperspectral imagery using principal
component analysis and wavelet shrinkage, IEEE Transactions on Geo-
science and Remote Sensing 49 (3) (2011) 973–980.
[19] H. Zhang, W. He, L. Zhang, H. Shen, Q. Yuan, Hyperspectral image
restoration using low-rank matrix recovery, IEEE Transactions on Geo-
science and Remote Sensing 52 (8) (2014) 4729–4743.
[20] Y. Fu, A. Lam, I. Sato, Y. Sato, Adaptive spatial-spectral dictionary learn-
ing for hyperspectral image restoration, International Journal of Com-
puter Vision 122 (2) (2017) 228–245.
[21] W. He, H. Zhang, H. Shen, L. Zhang, Hyperspectral image denoising us-
ing local low-rank matrix recovery and global spatial–spectral total varia-
tion, IEEE Journal of Selected Topics in Applied Earth Observations and
Remote Sensing 11 (3) (2018) 713–729.
[22] L. Zhuang, J. M. Bioucas-Dias, Hyperspectral image denoising based
on global and non-local low-rank factorizations, in: Image Processing
(ICIP), 2017 IEEE International Conference on, IEEE, 2017, pp. 1900–
[23] M. Elad, M. Aharon, Image denoising via sparse and redundant represen-
tations over learned dictionaries, IEEE Transactions on Image Processing
15 (12) (2006) 3736–3745.
[24] K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising by sparse
3-d transform-domain collaborative filtering, IEEE Transactions on Im-
age Processing 16 (8) (2007) 2080–2095.
[25] K. Zhang, W. Zuo, Y. Chen, D. Meng, L. Zhang, Beyond a gaussian de-
noiser: Residual learning of deep cnn for image denoising, IEEE Trans-
actions on Image Processing.
[26] S. Gu, L. Zhang, W. Zuo, X. Feng, Weighted nuclear norm minimization
with application to image denoising, in: The IEEE Conference on Com-
puter Vision and Pattern Recognition (CVPR), 2014, pp. 2862–2869.
[27] J. Xu, L. Zhang, W. Zuo, D. Zhang, X. Feng, Patch group based nonlocal
self-similarity prior learning for image denoising, in: The IEEE Interna-
tional Conference on Computer Vision (ICCV), 2015, pp. 244–252.
[28] A. Chopra, H. Lian, Total variation, adaptive total variation and noncon-
vex smoothly clipped absolute deviation penalty for denoising blocky im-
ages, Pattern Recognition 43 (8) (2010) 2609–2619.
[29] E. J. Candès, X. Li, Y. Ma, J. Wright, Robust principal component analysis?,
Journal of the ACM 58 (3) (2011) 11.
[30] Z. Lin, A. Ganesh, J. Wright, L. Wu, M. Chen, Y. Ma, Fast convex op-
timization algorithms for exact recovery of a corrupted low-rank matrix,
Computational Advances in Multi-Sensor Adaptive Processing 61 (6).
[31] Z. Lin, M. Chen, Y. Ma, The augmented lagrange multiplier method
for exact recovery of corrupted low-rank matrices, arXiv preprint
[32] T. Zhou, D. Tao, Godec: Randomized low-rank & sparse matrix decom-
position in noisy case, in: International Conference on Machine Learning
(ICML), Omnipress, 2011.
[33] X. Ding, L. He, L. Carin, Bayesian robust principal component analysis,
IEEE Transactions on Image Processing 20 (12) (2011) 3419–3430.
[34] Y. J. Lim, Y. W. Teh, Variational bayesian approach to movie rating pre-
diction, in: Proceedings of KDD cup and workshop, Vol. 7, 2007, pp.
[35] V. Y. Tan, C. Févotte, Automatic relevance determination in nonnegative
matrix factorization, in: SPARS’09-Signal Processing with Adaptive
Sparse Structured Representations, 2009.
[36] S. D. Babacan, M. Luessi, R. Molina, A. K. Katsaggelos, Sparse bayesian
methods for low-rank matrix estimation, IEEE Transactions on Signal
Processing 60 (8) (2012) 3964–3977.
[37] Q. Zhao, D. Meng, Z. Xu, W. Zuo, L. Zhang, Robust principal component
analysis with complex noise, in: International Conference on Machine
Learning (ICML), 2014, pp. 55–63.
[38] Y. Chen, X. Cao, Q. Zhao, D. Meng, Z. Xu, Denoising hyperspectral
image with non-iid noise structure, arXiv preprint arXiv:1702.00098.
[39] T. G. Kolda, B. W. Bader, Tensor decompositions and applications, SIAM
review 51 (3) (2009) 455–500.
[40] N. D. Sidiropoulos, L. De Lathauwer, X. Fu, K. Huang, E. E. Papalexakis,
C. Faloutsos, Tensor decomposition for signal processing and machine
learning, IEEE Transactions on Signal Processing 65 (13) (2017) 3551–
[41] N. Renard, S. Bourennane, J. Blanc-Talon, Denoising and dimensionality
reduction using multilinear tools for hyperspectral images, IEEE Geo-
science and Remote Sensing Letters 5 (2) (2008) 138–142.
[42] Q. Zhao, L. Zhang, A. Cichocki, Bayesian cp factorization of incomplete
tensors with automatic rank determination, IEEE Transactions on Pattern
Analysis and Machine Intelligence 37 (9) (2015) 1751–1763.
[43] C. M. Bishop, Pattern recognition and machine learning, Springer, 2006.
[44] A. Georges, G. Kotliar, W. Krauth, M. J. Rozenberg, Dynamical mean-
field theory of strongly correlated fermion systems and the limit of infinite
dimensions, Reviews of Modern Physics 68 (1) (1996) 13.
[45] W. He, H. Zhang, L. Zhang, H. Shen, Total-variation-regularized low-
rank matrix factorization for hyperspectral image restoration, IEEE
Transactions on Geoscience and Remote Sensing 54 (1) (2016) 178–188.
[46] F. Yasuma, T. Mitsunaga, D. Iso, S. K. Nayar, Generalized assorted pixel
camera: postcapture control of resolution, dynamic range, and spectrum,
IEEE Transactions on Image Processing 19 (9) (2010) 2241–2253.
[47] Z. Wang, A. C. Bovik, H. R. Sheikh, E. P. Simoncelli, Image quality as-
sessment: from error visibility to structural similarity, IEEE Transactions
on Image Processing 13 (4) (2004) 600–612.
[48] L. Zhang, L. Zhang, X. Mou, D. Zhang, Fsim: A feature similarity index
for image quality assessment, IEEE Transactions on Image Processing
20 (8) (2011) 2378–2386.
[49] L. Wald, Data fusion: definitions and architectures: fusion of images of
different spatial resolutions, Presses des MINES, 2002.
[50] R. H. Yuhas, J. W. Boardman, A. F. Goetz, Determination of semi-arid
landscape endmembers and seasonal trends using convex geometry spec-
tral unmixing techniques, in: Summaries of the 4th Annual JPL Airborne
Geoscience Workshop, 1993.
[51] X. Liu, M. Tanaka, M. Okutomi, Single-image noise level estimation for
blind denoising, IEEE Transactions on Image Processing 22 (12) (2013)
[52] A. Chakrabarti, T. Zickler, Statistics of real-world hyperspectral im-
ages, in: IEEE Conference on Computer Vision and Pattern Recognition
(CVPR), IEEE, 2011, pp. 193–200.
... Nevertheless, limited by imaging techniques, most existing HSI cameras still suffer from various types of noise that might degrade the performance of their applications, which urges the develop- ment of robust HSI denoising algorithms. Motivated by the intrinsic properties of HSI, traditional HSI denoising approaches [65,22] often exploit the optimization schemes with priors, e.g., low rankness [70,60], non-local similarities [41], spatial-spectral correlation [47], and global correlation along the spectrum [61]. Whilst offering appreciable performance, the efficacy of these methods is largely dependent on the degree of similarity between the handcrafted priors and the real-world noise model, and these methods are often challenging to accelerate with modern hardwares due to the complex processing pipelines. ...
... Among these optimization-based methods, non-local similarity [41] has been widely utilized for its ability to integrate the image patches across the spectral and spatial locations. To reduce the computational burden, global spectral low-rank correlation [60,53,70] has also been heavily studied. Besides, different enhanced total variation priors [46,58,65] are also adopted by considering the smoothness of local image patches. ...
... To consider the properties of HSIs, e.g., spatial-spectral correlations, QRNN3D [61] proposes to use 3D convolution and quasi-recurrent unit [6]. Our work adopts techniques, residual learning, 3D convolution, and U-shape architecture, but our blocks, e.g., S3Conv is more efficient than 3D convolution, and our GSSA could prevent vanished correlations for long-range spectral bands of QRU [60]. Separable convolution [28] is first introduced to replace 2D convolution. ...
Full-text available
In this paper, we present a Hybrid Spectral Denoising Transformer (HSDT) for hyperspectral image denoising. Challenges in adapting transformer for HSI arise from the capabilities to tackle existing limitations of CNN-based methods in capturing the global and local spatial-spectral correlations while maintaining efficiency and flexibility. To address these issues, we introduce a hybrid approach that combines the advantages of both models with a Spatial-Spectral Separable Convolution (S3Conv), Guided Spectral Self-Attention (GSSA), and Self-Modulated Feed-Forward Network (SM-FFN). Our S3Conv works as a lightweight alternative to 3D convolution, which extracts more spatial-spectral correlated features while keeping the flexibility to tackle HSIs with an arbitrary number of bands. These features are then adaptively processed by GSSA which per-forms 3D self-attention across the spectral bands, guided by a set of learnable queries that encode the spectral signatures. This not only enriches our model with powerful capabilities for identifying global spectral correlations but also maintains linear complexity. Moreover, our SM-FFN proposes the self-modulation that intensifies the activations of more informative regions, which further strengthens the aggregated features. Extensive experiments are conducted on various datasets under both simulated and real-world noise, and it shows that our HSDT significantly outperforms the existing state-of-the-art methods while maintaining low computational overhead.
... To sum up, traditional optimizationbased methods formulate the problem as an optimization objective that takes into account the unique properties of the spectrum and images. Various handcrafted regularization terms, such as total variation, wavelet constraint, and low-rank representation [42,44,50,54,59,61,71,83], have been incorporated in the optimization framework to enhance the denoising performance. These optimization-based methods are flexible enough to remove different types of noise [14,31] and can be extended to tasks beyond denoising [11,26,27,49]. ...
... In [72], a spatio-spectral deep residual CNN was proposed that utilizes 3D and 2D convolutional filters to capture the dependencies of the images. The MemNet [55] network and one variation MemNetRes [18] (which is a combination of MemNet with the Hyres approach [61]), can provide competitive hyperspectral denoising results. Inspired by the success of the 2D image denoising network DnCNN [79], Chang et al. [10] proposed HSI-DeNet, which learns multi-channel 2-D filters to model spectral correlation. ...
Hyperspectral imaging (HI) has emerged as a powerful tool in diverse fields such as medical diagnosis, industrial inspection, and agriculture, owing to its ability to detect subtle differences in physical properties through high spectral resolution. However, hyperspectral images (HSIs) are often quite noisy because of narrow band spectral filtering. To reduce the noise in HSI data cubes, both model-driven and learning-based denoising algorithms have been proposed. However, model-based approaches rely on hand-crafted priors and hyperparameters, while learning-based methods are incapable of estimating the inherent degradation patterns and noise distributions in the imaging procedure, which could inform supervised learning. Secondly, learning-based algorithms predominantly rely on CNN and fail to capture long-range dependencies, resulting in limited interpretability. This paper proposes a Degradation-Noise-Aware Unfolding Network (DNA-Net) that addresses these issues. Firstly, DNA-Net models sparse noise, Gaussian noise, and explicitly represent image prior using transformer. Then the model is unfolded into an end-to-end network, the hyperparameters within the model are estimated from the noisy HSI and degradation model and utilizes them to control each iteration. Additionally, we introduce a novel U-Shaped Local-Non-local-Spectral Transformer (U-LNSA) that captures spectral correlation, local contents, and non-local dependencies simultaneously. By integrating U-LNSA into DNA-Net, we present the first Transformer-based deep unfolding HSI denoising method. Experimental results show that DNA-Net outperforms state-of-the-art methods, and the modeling of noise distributions helps in cases with heavy noise.
... Traditionally, optimization algorithms are often adopted to solve HSI denoising with different hand-crafted priors exploring the domain knowledge of the HSI, i.e., global correlation along the spectrum and spatial-spectral correlation. Typical priors that are extensively studied includes, total variation [31,36], wavelet [23], low-rank [28,33,44], and etc. By considering the spectral and spatial redundancy, non-local patch-similarity [22] is also widely used in conjugation with variable splitting algorithms [15] and tensorbased dictionary learning [26]. ...
... Traditional methods solve the HSI denoising by treating it as an optimization problem, where they attempt to find unknown clean HSI by minimizing an optimization objective that incorporates the properties of spectrum and images. Such incorporation is generally achieved by designing different hand-crafted priors, e.g., total variation priors [31,36], wavelet priors [23], and low-rank priors [28,31,33,44]. By considering the non-local self-similarity in the spectral and spatial dimensions, many works such as block-matching and 4-D filtering (BM4D) [22] and the tensor dictionary learning [26] are also proposed. ...
Full-text available
Hyperspectral image denoising is unique for the highly similar and correlated spectral information that should be properly considered. However, existing methods show limitations in exploring the spectral correlations across different bands and feature interactions within each band. Besides, the low- and high-level features usually exhibit different importance for different spatial-spectral regions, which is not fully explored for current algorithms as well. In this paper, we present a Mixed Attention Network (MAN) that simultaneously considers the inter- and intra-spectral correlations as well as the interactions between low- and high-level spatial-spectral meaningful features. Specifically, we introduce a multi-head recurrent spectral attention that efficiently integrates the inter-spectral features across all the spectral bands. These features are further enhanced with a progressive spectral channel attention by exploring the intra-spectral relationships. Moreover, we propose an attentive skip-connection that adaptively controls the proportion of the low- and high-level spatial-spectral features from the encoder and decoder to better enhance the aggregated features. Extensive experiments show that our MAN outperforms existing state-of-the-art methods on simulated and real noise settings while maintaining a low cost of parameters and running time.
... Traditional optimization-based HSI restoration methods usually solve an inverse imaging problem with extra regularizations by exploiting the underlying characteristics of HSIs. By considering the spectral correlation, many works, such as total variational methods [9,10], wavelet methods [23], and low-rank models [24,25,10,26], have been developed. The non-local self-similarity is another important property of HSIs and was exploited in works like blockmatching and 4-D filtering (BM4D) [27] and the tensor dictionary learning [28]. ...
Full-text available
Deep-learning-based hyperspectral image (HSI) restoration methods have gained great popularity for their remarkable performance but often demand expensive network retraining whenever the specifics of task changes. In this paper, we propose to restore HSIs in a unified approach with an effective plug-and-play method, which can jointly retain the flexibility of optimization-based methods and utilize the powerful representation capability of deep neural networks. Specifically, we first develop a new deep HSI denoiser leveraging gated recurrent convolution units, short- and long-term skip connections, and an augmented noise level map to better exploit the abundant spatio-spectral information within HSIs. It, therefore, leads to the state-of-the-art performance on HSI denoising under both Gaussian and complex noise settings. Then, the proposed denoiser is inserted into the plug-and-play framework as a powerful implicit HSI prior to tackle various HSI restoration tasks. Through extensive experiments on HSI super-resolution, compressed sensing, and inpainting, we demonstrate that our approach often achieves superior performance, which is competitive with or even better than the state-of-the-art on each task, via a single model without any task-specific training.
... One method is to use probabilistic CP decomposition to estimate the CP rank by MAP estimation [3], which has been applied to channel estimation in MIMO systems [42]. Another approach is to use the EM algorithm to perform CP rank estimation and image denoising [34]. Although these studies are also stochastic approaches, they differ from our method in that they are based on point estimation and, therefore, cannot infer uncertainty in the solution. ...
Tensor completion, which completes high-dimensional data with missing entries, has many applications, such as recommender systems and image inpainting. Low-rank CP decomposition is one of the popular methods in tensor completion and is an extension of matrix decomposition to higher order tensors. However, unlike matrix factorization, it is NP-hard to obtain the rank of CP decomposition directly. In this paper, our objective is simultaneously achieving tensor completion and rank determination in CP decomposition. This can be achieved using Bayesian CP decomposition with Multiplicative Gamma Process (MGP) as the prior distribution. MGP is a distribution that decays the components. Using MGP, the proposed method avoids duplication of components and enables highly accurate rank estimation in Bayesian tensor modeling. In addition, MGP helps to reduce noise sensitivity and estimation time. Numerical experiments using artificial data and image data demonstrate the effectiveness of the proposed method.
The total variation (TV) regularizer is a widely used technique in image processing tasks to model an image’s local smoothness property. Intrinsically, the TV regularizer imposes sparsity constraints on the gradient maps of the image, which inevitably weakens the image texture structure and thus affects the quality of image restoration. To alleviate this issue, we propose a novel texture-preserved total variation (TPTV) regularizer for hyperspectral images (HSIs) by introducing a weighting scheme. Specifically, the weights are assigned to the gradient maps of the HSI, which helps relax the sparsity constraint for the pixels with large variations, thus preserving the texture structure. Additionally, we develop an empirical method to learn the weights adaptively from the observed HSI. Then, we propose an HSI denoising method based on the TPTV regularizer. Experimental results on synthetic and real HSIs illustrate the superiority of our proposed method over other state-of-the-art methods. In addition, the proposed weighting scheme can be embedded into other TV regularizers to protect the image texture. The experimental results also demonstrate that the denoising performance of the original method is significantly improved after embedding the weighting scheme.
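As a rough illustration of the weighting idea in the abstract above, the following minimal sketch computes a weighted anisotropic TV value for an HSI cube. The function names, the `(rows, cols, bands)` layout, and in particular the inverse-gradient weighting rule are assumptions made for illustration; the abstract does not specify how TPTV learns its weights.

```python
import numpy as np

def weighted_tv(hsi, weights_h=None, weights_v=None):
    """Weighted anisotropic TV of a cube shaped (rows, cols, bands).

    With no weights given this reduces to the plain anisotropic TV,
    i.e. the L1 norm of the horizontal and vertical gradient maps.
    """
    dh = np.diff(hsi, axis=1)  # horizontal differences
    dv = np.diff(hsi, axis=0)  # vertical differences
    if weights_h is None:
        weights_h = np.ones_like(dh)
    if weights_v is None:
        weights_v = np.ones_like(dv)
    return float(np.sum(weights_h * np.abs(dh)) + np.sum(weights_v * np.abs(dv)))

def inverse_gradient_weights(grad, eps=1e-2):
    """Hypothetical weighting rule (an assumption, not the TPTV rule):
    large gradients, which are likely texture or edges, get small weights."""
    return 1.0 / (np.abs(grad) + eps)

# Toy cube with a single vertical edge between columns 1 and 2.
cube = np.zeros((4, 4, 2))
cube[:, 2:, :] = 1.0

plain = weighted_tv(cube)
relaxed = weighted_tv(
    cube,
    weights_h=inverse_gradient_weights(np.diff(cube, axis=1)),
    weights_v=inverse_gradient_weights(np.diff(cube, axis=0)),
)
```

In this toy case the edge pixels are penalized less under the weighted version than under plain TV, which is the qualitative behavior the abstract describes for pixels with large variations.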
Hyperspectral images have multi-dimensional information and play an important role in many fields. Recently, spectral snapshot compressive imaging (SCI), which is based on compressed sensing (CS), has attracted more and more attention because it can balance spatial and spectral resolution better than traditional methods. The Plug-and-Play (PnP) framework based on spectral SCI can effectively reconstruct high-quality hyperspectral images, but it suffers from a serious problem of parameter dependence. In this paper, we propose a PnP hyperspectral reconstruction method based on reinforcement learning (RL), where a suitable policy network trained through deep reinforcement learning adaptively tunes the parameters in the PnP method to adjust the denoising strength, the penalty factor of the deep denoising network, and the terminal time of the iterative optimization. Compared with other model-based and learning-based methods and with methods using different parameter-tuning policies, the reconstruction results obtained by the proposed method have advantages in quantitative indicators and visual effects.
Recently, the successful applications of convolutional neural networks (CNNs) in computer vision have attracted considerable attention. In particular, deep learning models with attention mechanisms have shown impressive performance in hyperspectral (HS) pansharpening. However, most of these existing models follow an early/late-fusion strategy and do not take full advantage of the hierarchical features. In this paper, we design a novel end-to-end multi-level feature fusion network with cross-layer guided attention for HS pansharpening, termed HP-MFFN, which allows the network to extract the hierarchical features level by level. Specifically, as the different levels of the network have different receptive fields and contain different details, the hierarchical features extracted from the HS image and the panchromatic (PAN) image are refined by the cross-layer guided attention fusion module (called CLGAF) to yield more effective spatial-spectral features with fine details and rich semantics. The experimental results on widely used datasets demonstrate that HP-MFFN provides high-quality pansharpened HS images both perceptually and quantitatively.
Aero-engine hollow turbine blades feature a complicated inner air tract, which has a significant influence on the engine’s performance. Inspection of the internal structure and flaws of the blades therefore becomes indispensable. Non-destructive testing, such as computed tomography (CT), is an effective method for detecting internal problems. The purpose of this study is to demonstrate how an iterative excitation can be used to recover an incomplete projection in CT for a turbine blade. Firstly, the variance of the background was gathered as prior information. Then, to make up for the missed sample at the ill-angle position, a noise map was added and filtered as prep work for the forward projection. The original projections were retained, and the revised projections were used to extract additional characteristics from the damaged data. Finally, both simulation and actual tests were studied. The Normalised Mean Square Distance (NMSD) of the reconstructed image was reduced by over 20%. The Structural Similarity Index Measure (SSIM) and the Universal Quality Index (UQI) were enhanced by at least 60% and 41%, respectively. It was demonstrated that the technique can increase the accuracy of reconstruction for a hollow turbine blade.
Discriminative model learning for image denoising has recently been attracting considerable attention due to its favorable denoising performance. In this paper, we take one step forward by investigating the construction of feed-forward denoising convolutional neural networks (DnCNNs) to embrace the progress in very deep architecture, learning algorithm, and regularization method into image denoising. Specifically, residual learning and batch normalization are utilized to speed up the training process as well as boost the denoising performance. Different from the existing discriminative denoising models which usually train a specific model for additive white Gaussian noise (AWGN) at a certain noise level, our DnCNN model is able to handle Gaussian denoising with unknown noise level (i.e., blind Gaussian denoising). With the residual learning strategy, DnCNN implicitly removes the latent clean image in the hidden layers. This property motivates us to train a single DnCNN model to tackle several general image denoising tasks such as Gaussian denoising, single image super-resolution and JPEG image deblocking. Our extensive experiments demonstrate that our DnCNN model can not only exhibit high effectiveness in several general image denoising tasks, but also be efficiently implemented by benefiting from GPU computing.
Hyperspectral imaging is beneficial in a diverse range of applications from diagnostic medicine, to agriculture, to surveillance to name a few. However, hyperspectral images often suffer from degradation such as noise and low resolution. In this paper, we propose an effective model for hyperspectral image (HSI) restoration, specifically image denoising and super-resolution. Our model considers three underlying characteristics of HSIs: sparsity across the spatial-spectral domain, high correlation across spectra, and non-local self-similarity over space. We first exploit high correlation across spectra and non-local self-similarity over space in the degraded HSI to learn an adaptive spatial-spectral dictionary. Then, we employ the local and non-local sparsity of the HSI under the learned spatial-spectral dictionary to design an HSI restoration model, which can be effectively solved by an iterative numerical algorithm with parameters that are adaptively adjusted for different clusters and different noise levels. In experiments on HSI denoising, we show that the proposed method outperforms many state-of-the-art methods under several comprehensive quantitative assessments. We also show that our method performs well on HSI super-resolution.
In this paper, we present a spatial-spectral hyperspectral image (HSI) mixed-noise removal method named total variation (TV)-regularized low-rank matrix factorization (LRTV). In general, HSIs are not only assumed to lie in a low-rank subspace from the spectral perspective but also assumed to be piecewise smooth in the spatial dimension. The proposed method integrates the nuclear norm, TV regularization, and L1-norm together in a unified framework. The nuclear norm is used to exploit the spectral low-rank property, and the TV regularization is adopted to explore the spatial piecewise smooth structure of the HSI. At the same time, the sparse noise, which includes stripes, impulse noise, and dead pixels, is detected by the L1-norm regularization. To trade off the nuclear norm and TV regularization and to further remove the Gaussian noise of the HSI, we also restrict the rank of the clean image to be no larger than the number of endmembers. A number of experiments were conducted in both simulated and real data conditions to illustrate the performance of the proposed LRTV method for HSI restoration.
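The ingredients listed in this abstract can be collected into a single constrained objective. The following is only a sketch consistent with the description above, not necessarily the exact formulation in the LRTV paper; the symbols $Y$ (observed HSI unfolded into a pixels-by-bands matrix), $X$ (clean image), $S$ (sparse noise), $\tau$, $\lambda$, $\varepsilon$, and $r$ are notation assumed here.

```latex
\begin{aligned}
\min_{X,\,S}\quad & \|X\|_{*} \;+\; \tau\,\mathrm{TV}(X) \;+\; \lambda\,\|S\|_{1} \\
\text{s.t.}\quad  & \|Y - X - S\|_{F}^{2} \le \varepsilon, \qquad \operatorname{rank}(X) \le r .
\end{aligned}
```

Here $\|X\|_{*}$ is the nuclear norm promoting spectral low-rankness, $\mathrm{TV}(X)$ is the spatial total variation summed over bands, $\|S\|_{1}$ captures stripes, impulse noise, and dead pixels, the Frobenius-norm constraint accounts for the Gaussian noise, and $r$ plays the role of the upper bound given by the number of endmembers.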
Objective methods for assessing perceptual image quality have traditionally attempted to quantify the visibility of errors between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a Structural Similarity Index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MatLab implementation of the proposed algorithm is available online at
Tensors or multi-way arrays are functions of three or more indices $(i,j,k,\cdots)$ -- similar to matrices (two-way arrays), which are functions of two indices $(r,c)$ for (row,column). Tensors have a rich history, stretching over almost a century, and touching upon numerous disciplines; but they have only recently become ubiquitous in signal and data analytics at the confluence of signal processing, statistics, data mining and machine learning. This overview article aims to provide a good starting point for researchers and practitioners interested in learning about and working with tensors. As such, it focuses on fundamentals and motivation (using various application examples), aiming to strike an appropriate balance of breadth and depth that will enable someone having taken first graduate courses in matrix algebra and probability to get started doing research and/or developing tensor algorithms and software. Some background in applied optimization is useful but not strictly required. The material covered includes tensor rank and rank decomposition; basic tensor factorization models and their relationships and properties (including fairly good coverage of identifiability); broad coverage of algorithms ranging from alternating optimization to stochastic gradient; statistical performance analysis; and applications ranging from source separation to collaborative filtering, mixture and topic modeling, classification, and multilinear subspace learning.
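To make the notion of a rank decomposition mentioned in this overview concrete, the sketch below rebuilds a third-order tensor from the factor matrices of a CP (canonical polyadic) model, in which the tensor is a weighted sum of rank-one outer products. The function name `cp_reconstruct` and the toy sizes are illustrative assumptions and do not come from the article.

```python
import numpy as np

def cp_reconstruct(factors, weights=None):
    """Rebuild a 3-way tensor from CP factors A (I x R), B (J x R), C (K x R).

    Entry (i, j, k) equals sum_r weights[r] * A[i, r] * B[j, r] * C[k, r].
    """
    A, B, C = factors
    rank = A.shape[1]
    if weights is None:
        weights = np.ones(rank)
    return np.einsum("r,ir,jr,kr->ijk", weights, A, B, C)

# Toy example: a rank-2 tensor of size 4 x 5 x 6 built from random factors.
rng = np.random.default_rng(0)
R = 2
A = rng.standard_normal((4, R))
B = rng.standard_normal((5, R))
C = rng.standard_normal((6, R))
T = cp_reconstruct((A, B, C))

# The mode-1 unfolding of a CP rank-R tensor has matrix rank at most R.
unfolding_rank = np.linalg.matrix_rank(T.reshape(4, -1))
```

The unfolding check at the end only gives a lower bound on the tensor rank; as the abstracts in this list note, determining the CP rank itself is hard in general, which is why Bayesian rank-determination schemes are of interest.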
Patch-based image modeling has achieved great success in low-level vision tasks such as image denoising. In particular, the use of the image nonlocal self-similarity (NSS) prior, which refers to the fact that a local patch often has many nonlocal similar patches across the image, has significantly enhanced the denoising performance. However, in most existing methods only the NSS of the input degraded image is exploited, while how to utilize the NSS of clean natural images is still an open problem. In this paper, we propose a patch group (PG) based NSS prior learning scheme to learn explicit NSS models from natural images for high performance denoising. PGs are extracted from training images by putting nonlocal similar patches into groups, and a PG based Gaussian Mixture Model (PG-GMM) learning algorithm is developed to learn the NSS prior. We demonstrate that, owing to the learned PG-GMM, a simple weighted sparse coding model, which has a closed-form solution, can be used to perform image denoising effectively, resulting in high PSNR measure, fast speed, and particularly the best visual quality among all competing methods.