Chi-square unbiased risk estimate for denoising magnitude MR images.
CHI-SQUARE UNBIASED RISK ESTIMATE
FOR DENOISING MAGNITUDE MR IMAGES
Florian Luisier and Patrick J. Wolfe
Statistics and Information Sciences Laboratory
School of Engineering and Applied Sciences
Harvard University, Cambridge, MA 02138, USA
ABSTRACT
In this article we develop Stein-type results for unbiased estimation
of the risk associated with parametric estimators of the noncentrality
parameter of chi-squared random variables on two degrees of freedom.
These results allow for estimator adaptivity, and thus can be
used to optimize the parameters of a broad class of typical denoising
functions, subject only to weak smoothness assumptions. We show
how to apply these results to the problem of enhancing magnitude
magnetic resonance images, which are known to be corrupted by
Rician noise. As an example, we propose a transform-domain pointwise
estimator based on linear expansion of thresholds. Finally, we
apply this estimator to synthetic and real image data in conjunction
with the undecimated Haar wavelet transform, and conclude that it is
able to outperform previous wavelet-based techniques and compares
favorably with a more recent approach based on nonlocal means.
Index Terms— Magnetic resonance, Image denoising, Rician
noise, Chi-square, Unbiased MSE estimate, Shrinkage estimator
1. INTRODUCTION
Magnetic resonance (MR) imaging is an in vivo medical imaging
technique that is particularly useful for visualizing soft tissues in
the human body. As in many imaging modalities, any effort to lower
the scanning time, increase the spatiotemporal resolution, or reduce
the overall system cost has a negative impact on the signal-to-noise
ratio of the resulting image [1]. A prior noise reduction step thus
becomes decisive for enhanced visualization and meaningful
computer-aided diagnosis.
In MR image acquisition, the raw measurements consist of k-space
samples that are commonly assumed to be corrupted by i.i.d.
additive complex white Gaussian noise [2]. The most straightforward
way of reconstructing an image is to compute the inverse discrete
Fourier transform of the k-space samples. In magnitude MR
imaging, the phase is disregarded and only the magnitude forms the
final output image. Although the samples of the magnitude image
are still statistically independent, they are no longer Gaussian, but
Rician distributed [3].
The main difficulty in handling the Rician distribution directly is
that both its mean and its variance depend on the underlying
noise-free magnitude image. Although some Rician-based
maximum-likelihood estimators have been derived [4–6], most
state-of-the-art techniques operate on the squared magnitude [7–10],
where the bias induced by the Rician noise is no longer
signal-dependent.
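The difference between the two domains can be checked numerically. The following is a minimal sketch in plain Python (standard library only); the function name `rician_bias`, the amplitudes and the trial count are illustrative assumptions, not taken from the paper. It simulates complex Gaussian noise around a real amplitude and reports the empirical bias of the magnitude and of the squared magnitude. Under the noise model above, the squared-magnitude bias should stay close to 2σ² whatever the amplitude, whereas the magnitude bias changes with it.

```python
import math
import random

def rician_bias(mu, sigma, trials=100_000, seed=0):
    """Empirical bias of |m| and |m|^2 for m = mu + complex Gaussian noise."""
    rng = random.Random(seed)
    sum_mag = 0.0
    sum_sq = 0.0
    for _ in range(trials):
        re = mu + rng.gauss(0.0, sigma)   # real part ~ N(mu, sigma^2)
        im = rng.gauss(0.0, sigma)        # imaginary part ~ N(0, sigma^2)
        sq = re * re + im * im
        sum_sq += sq
        sum_mag += math.sqrt(sq)
    bias_mag = sum_mag / trials - mu      # varies with mu
    bias_sq = sum_sq / trials - mu * mu   # close to 2 * sigma^2 for any mu
    return bias_mag, bias_sq

sigma = 1.0
bias_low = rician_bias(mu=0.5, sigma=sigma)    # low-SNR sample
bias_high = rician_bias(mu=5.0, sigma=sigma)   # high-SNR sample
print(bias_low, bias_high)
```

Each returned pair is (magnitude bias, squared-magnitude bias); only the second entry should be roughly the same for both amplitudes.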
In this work, we derive a purely data-adaptive unbiased estimate
of the expected mean-squared error (MSE) between the unknown
noise-free squared-magnitude image and an estimate of it
(Section 3). Since the squared-magnitude image follows a (scaled)
noncentral chi-square distribution, we call this estimator a
chi-squared unbiased risk estimate. In practice, it can be used to
optimize the tuning parameters of a broad class of denoising
functions, provided that the latter satisfy some (weak) technical
constraints (mainly differentiability). As an illustrative example,
we devise a transform-domain pointwise estimator, which consists of
a linear expansion of thresholds [11], whose weights are globally
optimized by minimizing the proposed MSE estimate (Section 4).
Applied within the undecimated Haar wavelet transform, the resulting
denoising algorithm is shown to outperform (in the minimum MSE
sense) some previous wavelet-based techniques [7, 8] and to compare
favorably with a recent unbiased nonlocal means approach [9]
(Section 5).
2. OBSERVATION MODEL
Our objective is to estimate the original (unknown) magnitudes
$$|\mu_n| = \left(\Re\{\mu_n\}^2 + \Im\{\mu_n\}^2\right)^{1/2}, \quad n = 1 \ldots N,$$
from their corresponding noisy measurements $|m_n|$, where
$$\Re\{m_n\} \sim \mathcal{N}(\Re\{\mu_n\}, \sigma^2) \quad \text{and} \quad \Im\{m_n\} \sim \mathcal{N}(\Im\{\mu_n\}, \sigma^2).$$
If we define two $N$-dimensional vectors
$$x = \left[|\mu_n|^2/\sigma^2\right]_{1 \le n \le N} \in \mathbb{R}_+^N, \tag{1}$$
$$y = \left[|m_n|^2/\sigma^2\right]_{1 \le n \le N} \in \mathbb{R}_+^N, \tag{2}$$
then the statistical measurement model for $y$ given $x$ is the
product of $N$ independent noncentral chi-square distributions with
two degrees of freedom and noncentrality parameter $x_n$:
$$P(y|x) = \prod_{n=1}^{N} \frac{1}{2}\, e^{-\frac{x_n + y_n}{2}}\, I_0(\sqrt{x_n y_n}), \tag{3}$$
where $I_0(\sqrt{xy}) = \sum_{k \in \mathbb{N}} (xy)^k / (4^k\, k!^2)$ is the
zero-order modified Bessel function of the first kind.
Our original problem thus boils down to estimating the noncentrality
parameter $x_n$ of a noncentral chi-square distribution with two
degrees of freedom.
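As a sanity check on the single-sample density in (3), the sketch below evaluates it with the Bessel series quoted above and verifies by a midpoint Riemann sum that it integrates to one and has mean x + 2. The function names, the truncation at 60 series terms, the grid and the value x = 3 are illustrative assumptions.

```python
import math

def bessel_i0_sqrt(t, terms=60):
    """I_0(sqrt(t)) via the series sum_k t^k / (4^k (k!)^2), truncated."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= t / (4.0 * (k + 1) ** 2)   # ratio of consecutive series terms
    return total

def chi2_nc2_pdf(y, x):
    """Single-sample factor of (3): noncentral chi-square, 2 degrees of freedom."""
    return 0.5 * math.exp(-(x + y) / 2.0) * bessel_i0_sqrt(x * y)

x = 3.0                                    # arbitrary noncentrality parameter
dy = 0.01
grid = [(i + 0.5) * dy for i in range(8000)]   # midpoints covering [0, 80]
pdf = [chi2_nc2_pdf(y, x) for y in grid]
mass = sum(pdf) * dy                            # should be close to 1
mean = sum(y * p for y, p in zip(grid, pdf)) * dy   # should be close to x + 2
print(mass, mean)
```

The truncated series is adequate here because x·y stays below 240 on this grid; larger arguments would need more terms or a library Bessel routine.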
3. CHI-SQUARE UNBIASED RISK ESTIMATE
Since in practice we do not have access to the true MSE between $x$
and an estimate of it $\hat{x} = f(y)$, we want to find an unbiased
estimate of its expected value (risk) that depends solely on the
observed data $y$. The expected MSE is defined as
$$E\{\mathrm{MSE}\} = E\left\{\frac{1}{N}\|f(y) - x\|^2\right\} = \frac{1}{N}\sum_{n=1}^{N}\Big(E\{f_n(y)^2\} - 2\underbrace{E\{x_n f_n(y)\}}_{\text{(I)}} + \underbrace{x_n^2}_{\text{(II)}}\Big), \tag{4}$$
where $E\{\cdot\}$ is the mathematical expectation operator taken over
all possible realizations of the random vector $y$. Note that $x$ may
be either random (independent of $y$) or deterministic.
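Theorem 1. Let $y \in \mathbb{R}_+^N$ be a random vector following a
noncentral chi-square distribution with two degrees of freedom and
noncentrality parameter $x \in \mathbb{R}_+^N$. Let $f(y)$ be an
$N$-dimensional vector function such that
$$\begin{cases} E\left\{\left|\partial^2 f_n(y)/\partial y_n^2\right|\right\} < \infty \\ \lim_{y_n \to \infty} f_n(y)\, P(y|x) = 0 \\ \lim_{y_n \to \infty} \partial f_n(y)/\partial y_n\, P(y|x) = 0 \end{cases}$$
and define
$\partial f(y) = \left[\frac{\partial f_n(y)}{\partial y_n}\right]_{1 \le n \le N}$ and
$\partial^2 f(y) = \left[\frac{\partial^2 f_n(y)}{\partial y_n^2}\right]_{1 \le n \le N}$.
Then, the following random variable
$$\varepsilon = \frac{1}{N}\left(\|f(y)\|^2 + \|y\|^2 - 8 \cdot \mathbf{1}^T y\right) + 8 - \frac{2}{N}\left((y - 2)^T f(y) - 4\left((y - 1)^T \partial f(y) - y^T \partial^2 f(y)\right)\right) \tag{5}$$
is an unbiased estimate of the expected MSE; i.e.,
$$E\{\varepsilon\} = E\{\mathrm{MSE}\}.$$
Here $\mathbf{1}$ denotes the all-ones vector, and scalar offsets such
as $y - 2$ apply componentwise.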
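The statement of Theorem 1 can be illustrated with a small Monte Carlo experiment. The sketch below is an assumption-laden toy, not the authors' code: it draws noncentral chi-square data for an arbitrary signal x, applies the simple linear estimator f_n(y) = a(y_n − 2) (whose first derivative is a and whose second derivative is zero), and compares the data-only quantity ε from (5) with the true MSE, which requires x. The names `sample_y` and `cure`, the weight a = 0.6 and the signal range are all illustrative.

```python
import math
import random

def sample_y(x, rng):
    """One noncentral chi-square draw (2 degrees of freedom, noncentrality x)."""
    return (math.sqrt(x) + rng.gauss(0.0, 1.0)) ** 2 + rng.gauss(0.0, 1.0) ** 2

def cure(y, f, df, d2f):
    """Risk estimate (5) from the data y, f(y) and its first and second derivatives."""
    n = len(y)
    quad = sum(fi * fi for fi in f) + sum(yi * yi for yi in y) - 8.0 * sum(y)
    cross = sum((yi - 2.0) * fi for yi, fi in zip(y, f))
    corr = sum((yi - 1.0) * d1 - yi * d2 for yi, d1, d2 in zip(y, df, d2f))
    return quad / n + 8.0 - 2.0 / n * (cross - 4.0 * corr)

rng = random.Random(1)
n = 100_000
x = [10.0 * rng.random() for _ in range(n)]   # arbitrary noise-free signal
y = [sample_y(xi, rng) for xi in x]

a = 0.6                                       # arbitrary shrinkage weight
f = [a * (yi - 2.0) for yi in y]              # estimator f_n(y) = a (y_n - 2)
df = [a] * n                                  # d f_n / d y_n
d2f = [0.0] * n                               # d^2 f_n / d y_n^2

eps = cure(y, f, df, d2f)                     # uses the noisy data only
mse = sum((fi - xi) ** 2 for fi, xi in zip(f, x)) / n   # needs the unknown x
print(eps, mse)
```

With this many samples the two printed values should agree closely, in line with the remark in this section that the variance of ε decreases with N.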
The formal proof of this theorem will appear elsewhere. We just
give hereafter the main steps leading to the desired result.
Estimation of (I)
We need to find an unbiased estimate of
$$E\{x_n f_n(y)\} = \frac{1}{2}\int_{\mathbb{R}_+^N} x_n f_n(y)\, e^{-\frac{x_n + y_n}{2}}\, I_0(\sqrt{x_n y_n})\, d^N y. \tag{6}$$
After a few algebraic manipulations, we obtain the following
useful differential relation:
$$x_n I_0(\sqrt{x_n y_n}) = 4\,\frac{\partial}{\partial y_n}\left(y_n \frac{\partial}{\partial y_n} I_0(\sqrt{x_n y_n})\right). \tag{7}$$
Substituting (7) into (6) and performing two successive integrations
by parts finally leads to
$$E\{x_n f_n(y)\} = E\{(y_n - 2) f_n(y)\} - 4E\left\{(y_n - 1)\frac{\partial f_n(y)}{\partial y_n} - y_n \frac{\partial^2 f_n(y)}{\partial y_n^2}\right\}. \tag{8}$$
Estimation of (II)
Since $E\{y_n\} = x_n + 2$, we can successively write
$$x_n^2 = E\{x_n (y_n - 2)\} \overset{(8)}{=} E\{y_n^2\} - 8E\{y_n\} + 8. \tag{9}$$
Note that the variance of $\varepsilon$ is inversely proportional to the
number of samples $N$. The use of $\varepsilon$ as an estimate of the
actual MSE is therefore particularly reliable for multidimensional
signal processing, where the number of available samples is high.
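The step labelled (8) in the derivation of (9) can be spelled out by applying (8) to the particular function f_n(y) = y_n − 2, whose first derivative is 1 and whose second derivative vanishes. The following worked check is added for illustration and is not part of the original derivation:

```latex
\begin{align*}
x_n^2 &= x_n \left( E\{y_n\} - 2 \right) = E\{x_n (y_n - 2)\} \\
      &= E\{(y_n - 2)^2\} - 4\,E\{(y_n - 1)\cdot 1 - y_n \cdot 0\} \\
      &= E\{y_n^2\} - 4E\{y_n\} + 4 - 4E\{y_n\} + 4 \\
      &= E\{y_n^2\} - 8E\{y_n\} + 8 .
\end{align*}
```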
4. TRANSFORM-DOMAIN POINTWISE ESTIMATOR
We propose the following generic class of estimators
$$f(y; a) = R\,\theta(Dy, \bar{D}y; a). \tag{10}$$
This class extends the one proposed in [11] for the effective
reduction of additive white Gaussian noise.
• The $L \times N$ matrix $D = [D_0^T \ldots D_J^T]^T$ is made of $(J+1)$
circulant matrices $D_j = [d^j_{k,l}]_{1 \le k,l \le N}$. Each of these
$N \times N$ decomposition matrices implements a channel of an
undecimated filterbank transform, where the underlying analysis
filters have unit norm. We assume that each $D_j$ implements
a highpass channel for $j = 1 \ldots J$; i.e., $\forall l, \sum_n d^j_{l,n} = 0$,
whereas $D_0$ implements the complementary lowpass channel
with $\sum_n d^0_{l,n} = 2^{J/2}, \forall l$. Hence, denoting by
$w^j = D_j y = [w^j_n]_{1 \le n \le N}$ (resp. $\omega^j = D_j x = [\omega^j_n]_{1 \le n \le N}$)
each noisy (resp. noise-free) transform coefficient, we have,
$\forall n = 1 \ldots N$,
$$E\{w^j_n\} = \omega^j_n, \ \text{for } j = 1 \ldots J, \qquad E\{w^0_n\} = \omega^0_n + 2^{J/2+1}. \tag{11}$$
While the noisy highpass coefficients are unbiased estimates
of their noise-free counterparts, the lowpass coefficients need
to be debiased.
• The $N \times L$ matrix $R = [R_0 \ldots R_J]$ is made of $(J+1)$
$N \times N$ circulant matrices $R_j$ such that the perfect
reconstruction condition $RD = \mathrm{Id}$ is satisfied.
• The $L \times N$ matrix $\bar{D} = [\bar{D}_0^T \ldots \bar{D}_J^T]^T$ is made of
$(J+1)$ circulant matrices $\bar{D}_j$. Each of these $N \times N$ matrices
implements a coarse estimation of the variance
$\bar{w}^j_n = [\bar{D}_j y]_n$ of each noisy transform coefficient $w^j_n$.
• The vector $a = [a_0^T \ldots a_J^T]^T$ contains a set of
subband-adaptive parameters $a_j \in \mathbb{R}^K$, $K \ll N$, that will be
optimized by minimizing the MSE estimate (5).
• The vector function $\theta : \mathbb{R}^L \times \mathbb{R}^L \to \mathbb{R}^L$ is pointwise
and subband-adaptive; i.e.,
$$\theta(w, \bar{w}; a) = \left[\theta^j_n(w^j_n, \bar{w}^j_n; a_j)\right]_{0 \le j \le J,\, 1 \le n \le N}. \tag{12}$$
From now on, we drop the subband index $j$ and denote by
$D = [d_{k,l}]_{1 \le k,l \le N}$ and $w_n$ (resp. $\bar{D} = [\bar{d}_{k,l}]_{1 \le k,l \le N}$
and $\bar{w}_n$) any of the $D_j$ and $w^j_n$ (resp. $\bar{D}_j$ and $\bar{w}^j_n$).
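The conditions placed on the analysis and synthesis matrices can be made concrete with a one-level (J = 1), one-dimensional undecimated Haar transform. The sketch below is illustrative only: the signal length N = 8, the helper names, and the synthesis choice R_j = D_j^T / 2 are assumptions (the latter is one choice that satisfies perfect reconstruction for this filter pair, checked numerically here, and is not stated in the paper).

```python
import math

N = 8                                  # signal length (illustrative)
s = 1.0 / math.sqrt(2.0)
lowpass = [s, s]                       # unit-norm Haar analysis filters
highpass = [s, -s]

def circulant(taps, n):
    """n x n circulant matrix; row l applies `taps` at positions l, l+1, ... (mod n)."""
    rows = []
    for l in range(n):
        row = [0.0] * n
        for k, t in enumerate(taps):
            row[(l + k) % n] += t
        rows.append(row)
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

D0 = circulant(lowpass, N)             # lowpass channel
D1 = circulant(highpass, N)            # highpass channel
# Assumed synthesis matrices for this single-level example: R_j = D_j^T / 2.
R0 = [[v / 2.0 for v in row] for row in transpose(D0)]
R1 = [[v / 2.0 for v in row] for row in transpose(D1)]

P0, P1 = matmul(R0, D0), matmul(R1, D1)
RD = [[P0[i][j] + P1[i][j] for j in range(N)] for i in range(N)]   # R D = sum_j R_j D_j

row_sum_low = sum(D0[0])               # equals 2^(J/2) with J = 1
row_sum_high = sum(D1[0])              # equals 0 for the highpass channel
print(row_sum_low, row_sum_high)
```

`RD` should be the identity matrix up to rounding, and the two row sums match the lowpass and highpass conditions listed above.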
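4.1. Choice of $\bar{D}$
The variance of each noisy transform coefficient is given by
$$\mathrm{var}\{w_l\} = \sum_{n=1}^{N} d_{l,n}^2\, \mathrm{var}\{y_n\} = 4\sum_{n=1}^{N} d_{l,n}^2 (x_n + 1). \tag{13}$$
Since $x_n = E\{y_n\} - 2$, we can simply choose
$\bar{D} = [d_{l,n}^2]_{1 \le l,n \le N}$ and, because the analysis filters
have unit norm ($\sum_n d_{l,n}^2 = 1$), we have
$$\mathrm{var}\{w_l\} = 4\left(E\{\bar{w}_l\} - 1\right). \tag{14}$$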
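Relation (14) can be checked empirically for a single Haar highpass coefficient. In the sketch below (illustrative values and names throughout), w combines two samples with unit-norm taps and the companion coefficient uses the squared taps, as prescribed by the choice of squared filter coefficients above. The empirical variance of w is compared with four times the empirical mean of the companion coefficient minus one.

```python
import math
import random

def sample_y(x, rng):
    """Noncentral chi-square draw with 2 degrees of freedom and noncentrality x."""
    return (math.sqrt(x) + rng.gauss(0.0, 1.0)) ** 2 + rng.gauss(0.0, 1.0) ** 2

rng = random.Random(2)
x1, x2 = 4.0, 9.0                                   # arbitrary noise-free values
d = [1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0)]   # unit-norm Haar highpass taps
d_bar = [t * t for t in d]                          # squared taps: [1/2, 1/2]

trials = 200_000
w, w_bar = [], []
for _ in range(trials):
    y1, y2 = sample_y(x1, rng), sample_y(x2, rng)
    w.append(d[0] * y1 + d[1] * y2)                 # noisy highpass coefficient
    w_bar.append(d_bar[0] * y1 + d_bar[1] * y2)     # coarse variance predictor

mean_w = sum(w) / trials
var_w = sum((v - mean_w) ** 2 for v in w) / trials  # empirical var{w}
var_from_14 = 4.0 * (sum(w_bar) / trials - 1.0)     # right-hand side of (14)
print(var_w, var_from_14)
```

For these values, (13) gives 4(0.5·5 + 0.5·10) = 30, so both printed numbers should be near 30.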
4.2. Choice of θ
We need to specify a particular shrinkage/thresholding rule to
estimate the unknown highpass coefficients $\omega_l$ from the observed
noisy highpass coefficients $w_l$. From now on, we drop the subscript
index $l$, and denote by $w$ (resp. $\omega$) any of the transform
coefficients $w_l$ (resp. $\omega_l$). In the minimum expected MSE sense,
the optimal pointwise shrinkage factor is given by
$$a^* = \arg\min_a E\{(aw - \omega)^2\} \overset{(11),(14)}{=} 1 - \frac{4\left(E\{\bar{w}\} - 1\right)}{E\{w^2\}}. \tag{15}$$
There are various possible implementations of the above formula
to come up with an effective shrinkage function. For instance,
Nowak proposed in [7]
$$\theta(w, \bar{w}) = \max\left(1 - \lambda\,\frac{4 \cdot \max(\bar{w} - 1, 1)}{w^2},\, 0\right) w, \tag{16}$$
where the particular choice $\lambda = 3$ was motivated by a Gaussian prior
on the noisy coefficients $w$.
Our experiments indicate that replacing $\max(\bar{w} - 1, 1)$ by $\bar{w}$
gives slightly better MSE performance. Following the recent idea
of linear expansion of thresholds (LET) [11], we thus propose the
following shrinkage/thresholding rule:
$$\theta(w, \bar{w}; a) = \sum_{k=1}^{K} a_k \underbrace{\max\left(1 - \lambda_k \frac{4\bar{w}}{w^2},\, 0\right) w}_{\theta_k(w, \bar{w})}, \tag{17}$$
which can be seen as an optimized generalization of the
estimator (16). To satisfy the requirements of Theorem 1, we have
used a continuously differentiable approximation of the $\max(\cdot)$
function. We found that the best (in the minimum MSE/variance sense)
results were obtained with $K = 2$ terms (per subband). The nonlinear
parameters $\lambda_1$ and $\lambda_2$, as well as the set of linear
parameters $a = [a_1\ a_2]^T$, could be optimized by minimizing the MSE
estimate (5). However, for much faster parameter optimization, we
suggest fixing the two nonlinear parameters and optimizing only the
remaining linear parameters $a$. In all our experiments, the best
results were achieved with $\lambda_1 = 3$ and $\lambda_2 = 3\lambda_1$.
[Figure 1: plot of $\theta(w, \bar{w}; a)$ against $w$ and $\bar{w}$]
Fig. 1. Possible realizations of the proposed shrinkage function (17)
with $\lambda_1 = 3$, $\lambda_2 = 9$ and $a = [0.75\ 0.25]^T$.
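The shrinkage rule (17) is straightforward to evaluate pointwise. The sketch below uses the parameters of Fig. 1; the smooth replacement for max(t, 0), here (t + sqrt(t² + δ²))/2 with a small δ, is an assumption, since the text only states that some continuously differentiable approximation was used. Function names are likewise illustrative.

```python
import math

def smooth_pos(t, delta=1e-2):
    """Smooth stand-in for max(t, 0); the exact approximation used is not specified."""
    return 0.5 * (t + math.sqrt(t * t + delta * delta))

def theta_k(w, w_bar, lam):
    """One LET basis function: max(1 - lam * 4 * w_bar / w^2, 0) * w, smoothed."""
    if w == 0.0:
        return 0.0                     # limit of the smoothed expression as w -> 0
    return smooth_pos(1.0 - lam * 4.0 * w_bar / (w * w)) * w

def theta(w, w_bar, a, lams=(3.0, 9.0)):
    """Shrinkage rule (17) with K = len(a) terms."""
    return sum(ak * theta_k(w, w_bar, lam) for ak, lam in zip(a, lams))

a = (0.75, 0.25)                       # weights used in Fig. 1
small = theta(5.0, 10.0, a)            # w^2 well below both thresholds: nearly zero
large = theta(100.0, 10.0, a)          # w^2 far above both thresholds: nearly w
print(small, large)
```

Coefficients whose energy is small relative to the local variance estimate are set almost to zero, while large coefficients pass through nearly unchanged; the rule is odd in w.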
4.3. Implementation
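The whole transform-domain pointwise estimator can finally be
expressed as:
$$f(y; a) = \sum_{j=1}^{J}\sum_{k=1}^{2} a^j_k R_j \theta^j_k(w^j, \bar{w}^j) + a^0_1 \underbrace{(R_0 w^0 - 2)}_{\text{Debiased lowpass}} = \sum_{k \in \mathcal{K}} a_k f_k(y),$$
where $\mathcal{K}$ is the set $[0, J] \times [1, 2]$.
The vector of optimal (in the minimum $\varepsilon$ sense) parameters
$a = [a_k]_{k \in \mathcal{K}}$ is then the solution of the linear system
of equations $Ma = c$, where
$$c = \left[(y - 2)^T f_k(y) - 4\left((y - 1)^T \partial f_k(y) - y^T \partial^2 f_k(y)\right)\right]_{k \in \mathcal{K}},$$
$$M = \left[f_k(y)^T f_l(y)\right]_{k,l \in \mathcal{K}}.$$
We then have several ways of finally estimating the vector of
unknown magnitudes $|\mu| = [|\mu_n|]_{1 \le n \le N}$, mainly
$$|\hat{\mu}| = \sigma\left[\sqrt{|f_n(y)|}\right]_{1 \le n \le N} \quad \text{or} \quad |\hat{\mu}| = \sigma\left[\sqrt{\max(f_n(y), 0)}\right]_{1 \le n \le N}.$$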
To get the best out of these two possible inversions, we propose
to consider a convex combination of both; i.e., for $\lambda \in [0, 1]$,
$$|\hat{\mu}| = \sigma\left[\lambda\sqrt{|f_n(y)|} + (1 - \lambda)\sqrt{\max(f_n(y), 0)}\right]_{1 \le n \le N}. \tag{18}$$
5. EXPERIMENTS
We have applied the proposed denoising strategy in the framework
of the dyadic 2D undecimated Haar wavelet transform. We have
performed 4 levels of decomposition, which leads to J = 3 · 4 =
12 subbands and thus 24 linear parameters optimized with the MSE
estimate (5). We have used formula (18) with λ = 0.5 to get the
final estimated magnitude image.
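To make the parameter optimization of Section 4.3 and the inversion (18) concrete, the following much-simplified sketch solves Ma = c for two pointwise basis functions applied directly in the signal domain, with no wavelet transform. Everything specific here is a hypothetical choice made for illustration: the two-level toy signal, the basis functions f_1(y) = y − 2 and f_2(y) = (y − 2)exp(−y/T), the constant T, and the use of the absolute value under the first square root of (18) so that it is defined when f_n(y) is negative.

```python
import math
import random

def sample_y(x, rng):
    """Noncentral chi-square draw with 2 degrees of freedom and noncentrality x."""
    return (math.sqrt(x) + rng.gauss(0.0, 1.0)) ** 2 + rng.gauss(0.0, 1.0) ** 2

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

rng = random.Random(3)
n = 50_000
x = [0.0 if rng.random() < 0.5 else 25.0 for _ in range(n)]   # toy two-level signal
y = [sample_y(xi, rng) for xi in x]

T = 8.0   # hypothetical decay constant of the second basis function
# Basis functions with their first and second derivatives with respect to y_n.
f1 = [yi - 2.0 for yi in y]
df1 = [1.0] * n
d2f1 = [0.0] * n
e = [math.exp(-yi / T) for yi in y]
f2 = [(yi - 2.0) * ei for yi, ei in zip(y, e)]
df2 = [ei * (1.0 - (yi - 2.0) / T) for yi, ei in zip(y, e)]
d2f2 = [ei * ((yi - 2.0) / T ** 2 - 2.0 / T) for yi, ei in zip(y, e)]

def c_entry(f, df, d2f):
    """(y - 2)^T f - 4 ((y - 1)^T df - y^T d2f): one entry of the vector c."""
    return (dot([yi - 2.0 for yi in y], f)
            - 4.0 * (dot([yi - 1.0 for yi in y], df) - dot(y, d2f)))

m11, m12, m22 = dot(f1, f1), dot(f1, f2), dot(f2, f2)   # entries of M
c1, c2 = c_entry(f1, df1, d2f1), c_entry(f2, df2, d2f2)
det = m11 * m22 - m12 * m12
a1 = (c1 * m22 - c2 * m12) / det       # Cramer's rule for the 2 x 2 system M a = c
a2 = (m11 * c2 - m12 * c1) / det

f_opt = [a1 * u + a2 * v for u, v in zip(f1, f2)]
mse_opt = sum((fi - xi) ** 2 for fi, xi in zip(f_opt, x)) / n
mse_raw = sum((fi - xi) ** 2 for fi, xi in zip(f1, x)) / n     # a = [1, 0]

sigma, lam = 1.0, 0.5                  # inversion (18) with sigma = 1
mu_hat = [sigma * (lam * math.sqrt(abs(fi)) + (1.0 - lam) * math.sqrt(max(fi, 0.0)))
          for fi in f_opt]
print(a1, a2, mse_opt, mse_raw)
```

The weights are obtained from the noisy data alone, yet the resulting true MSE (`mse_opt`) should be clearly lower than that of the unweighted debiased data (`mse_raw`). The full method applies the same linear solve to the 24 subband basis images.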
As benchmarks for comparisons, we have selected three competitive
denoising techniques specifically adapted to magnitude MR
image denoising: two wavelet-based algorithms [7,8] and a recent
unbiased nonlocal means approach [9]. In all experiments, we used
the default values of the parameters involved in each of these
methods.
5.1. Synthetic Data
We have conducted a set of experiments on simulated noisy data
with various noise levels obtained by varying the Rician noise
parameter σ in the range [5,100]. Here, σ is assumed to be known.
In practice, it can be accurately estimated in signal-free or constant
regions of the image by moment matching [7,10] or maximum
likelihood [4,6,12] techniques. If no background is available, more
sophisticated approaches can be considered [13]. The noise-free test
images (Fig. 2) consist of two high-resolution MR images freely
available at http://pubimage.hcuge.ch:8080/. The denoising
performance has been evaluated in terms of peak signal-to-noise
ratio (PSNR), defined as:
$$\mathrm{PSNR} = 10\log_{10}\frac{N\, \|\, |\mu| \,\|_\infty^2}{\|\, |\hat{\mu}| - |\mu| \,\|^2}$$
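The PSNR definition above translates directly into code. The sketch below is a minimal illustration in which images are flattened to lists of magnitudes and the sample values are made up for the example.

```python
import math

def psnr(mu_hat, mu):
    """PSNR in dB: 10 log10( N * max|mu|^2 / ||mu_hat - mu||^2 )."""
    n = len(mu)
    peak_sq = max(abs(v) for v in mu) ** 2          # squared sup-norm of the reference
    err_sq = sum((a - b) ** 2 for a, b in zip(mu_hat, mu))
    return 10.0 * math.log10(n * peak_sq / err_sq)

mu = [0.0, 50.0, 100.0, 200.0]         # toy noise-free magnitudes
mu_hat = [2.0, 48.0, 101.0, 199.0]     # toy estimate
value = psnr(mu_hat, mu)
print(round(value, 2))
```

Note that the peak value is taken from the noise-free image itself rather than fixed at a nominal maximum such as 255.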
The PSNR comparisons of the various algorithms reported in
Table 1 indicate that the proposed approach achieves the best
performance on average. The improvements (up to +2 dB) are even
more significant at low SNR, where the signal-dependent nature of
the Rician noise is more pronounced.
Note that the computational complexity of the proposed algorithm
is dominated by that of the FFT. Consequently, it only takes
around 2 s to denoise a 256 × 256 image under Matlab (R2010a)
running on a 2.66 GHz Mac Intel Core 2 Duo. Under the same
conditions, the execution time of the unbiased nonlocal means
algorithm (full Matlab implementation without precompiled
MEX-files) is about 21 s.
[Figure 2 panels: Brain 512 × 512; Pelvis 256 × 256]
Fig. 2. Test images used in the simulations.
Table 1. PSNR Comparisons (dB)

Brain 512 × 512
  σ      Noisy    [7]      [8]      [9]      Proposed Approach
  5      32.62    38.68    33.79    39.85    39.42
  10     26.48    34.09    32.17    35.98    35.95
  20     20.36    29.41    26.64    31.72    32.06
  30     16.80    26.56    22.00    28.89    29.53
  50     12.32    22.88    16.55    25.08    26.28
  100     6.14    17.87     9.47    19.61    21.55

Pelvis 256 × 256
  σ      Noisy    [7]      [8]      [9]      Proposed Approach
  5      33.84    35.73    28.20    36.16    36.20
  10     27.73    31.50    26.32    32.24    32.24
  20     21.64    27.37    24.63    28.55    28.58
  30     18.13    25.03    24.15    26.43    26.49
  50     13.71    22.14    20.98    23.43    23.80
  100     7.36    18.15    12.31    18.68    20.04

Output PSNRs have been averaged over 10 noise realizations.
5.2. Real Data
We have tested the various denoising algorithms on a very noisy real
magnitude MR image of the knee (see Fig. 3). The noise parameter
σ has been estimated by matching the mean of the signal-free
background extracted from the squared-magnitude image with the
theoretical mean of a noncentral chi-square distribution. An estimated
value of $\hat{\sigma} = 41$ was found, confirming the visual impression
of a poor input PSNR. We observe in Fig. 3 that the output of the
proposed denoising algorithm exhibits fewer artifacts and has a higher
contrast than the other wavelet-based outputs. Due to the lack of
space, the output of the unbiased nonlocal means is not displayed,
but the same observations hold.
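The moment-matching estimate of σ described above reduces to a one-line formula: in a signal-free region the normalized squared magnitude has mean 2, so the mean of the raw squared magnitude equals 2σ². The sketch below applies this to simulated background pixels; the data are synthetic (not the knee image) and the function name is an illustrative assumption.

```python
import math
import random

def estimate_sigma(background_magnitudes):
    """Moment matching on signal-free pixels: E{|m|^2} = 2 sigma^2 when mu = 0."""
    mean_sq = sum(m * m for m in background_magnitudes) / len(background_magnitudes)
    return math.sqrt(mean_sq / 2.0)

rng = random.Random(4)
true_sigma = 41.0                      # same order as the value reported for the knee image
background = [math.hypot(rng.gauss(0.0, true_sigma), rng.gauss(0.0, true_sigma))
              for _ in range(20_000)]  # simulated background, not real data
sigma_hat = estimate_sigma(background)
print(sigma_hat)
```

In practice the background region would first have to be segmented from the image, a step this sketch leaves out.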
[Figure 3 panels: Raw Data; [8]; [7]; Proposed Approach]
Fig. 3. Wavelet-based denoising of a real magnitude MR image of
the knee.
6. ACKNOWLEDGEMENTS
The authors would like to thank Prof. Thierry Blu for useful
discussions, as well as Prof. Wen-Yih Isaac Tseng for providing the
MRI data used in the experimental section. They are also grateful
to Aleksandra Pižurica and José Manjón for making their respective
software available online.
This work was supported by the Swiss National Science Foundation
Fellowship BELP2133245, and by the National Science Foundation
Grant DMS0652743.
7. REFERENCES
[1] G. Wright, “Magnetic Resonance Imaging,” IEEE Signal
Processing Magazine, vol. 14, no. 1, pp. 56–66, January 1997.
[2] R. M. Henkelman, “Measurement of Signal Intensities in the
Presence of Noise in MR Images,” Medical Physics, vol. 12,
no. 2, pp. 232–233, 1985.
[3] H. Gudbjartsson and S. Patz, “The Rician Distribution of Noisy
MRI Data,” Magnetic Resonance in Medicine, vol. 34, no. 6,
pp. 910–914, 1995.
[4] J. Sijbers, A. J. den Dekker, P. Scheunders, and D. V. Dyck,
“Maximum-Likelihood Estimation of Rician Distribution
Parameters,” IEEE Transactions on Medical Imaging, vol. 17,
no. 3, pp. 357–361, June 1998.
[5] J. Sijbers and A. J. den Dekker, “Maximum Likelihood
Estimation of Signal Amplitude and Noise Variance from MR Data,”
Magnetic Resonance in Medicine, vol. 51, no. 3, pp. 586–594,
2004.
[6] L. He and I. Greenshields, “A Nonlocal Maximum Likelihood
Estimation Method for Rician Noise Reduction in MR Images,”
IEEE Transactions on Medical Imaging, vol. 28, no. 2,
pp. 165–172, February 2009.
[7] R. D. Nowak, “Wavelet-Based Rician Noise Removal for Magnetic
Resonance Imaging,” IEEE Transactions on Image Processing,
vol. 8, no. 10, October 1999.
[8] A. Pižurica, W. Philips, I. Lemahieu, and M. Acheroy, “A
Versatile Wavelet Domain Noise Filtration Technique for Medical
Imaging,” IEEE Transactions on Medical Imaging, vol. 22,
no. 3, pp. 323–331, March 2003.
[9] J. V. Manjón, J. Carbonell-Caballero, J. J. Lull, G. García-Martí,
L. Martí-Bonmatí, and M. Robles, “MRI Denoising Using
Non-Local Means,” Medical Image Analysis, vol. 12, no. 2,
pp. 514–523, 2008.
[10] S. Aja-Fernandez, C. Alberola-Lopez, and C.-F. Westin,
“Noise and Signal Estimation in Magnitude MRI and Rician
Distributed Images: A LMMSE Approach,” IEEE Transactions
on Image Processing, vol. 17, no. 8, pp. 1383–1398,
August 2008.
[11] T. Blu and F. Luisier, “The SURE-LET Approach to Image
Denoising,” IEEE Transactions on Image Processing, vol. 16,
no. 11, pp. 2778–2786, November 2007.
[12] J. Sijbers, D. Poot, A. J. den Dekker, and W. Pintjens,
“Automatic Estimation of the Noise Variance from the Histogram of
a Magnetic Resonance Image,” Physics in Medicine and Biology,
vol. 52, no. 5, pp. 1335–1348, 2007.
[13] J. Rajan, D. Poot, J. Juntu, and J. Sijbers, “Noise Measurement
from Magnitude MRI Using Local Estimates of Variance and
Skewness,” Physics in Medicine and Biology, vol. 55, no. 16,
pp. 441–449, 2010.