Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems
Kaixuan Wei ¹, Angelica Aviles-Rivero ², Jingwei Liang ³, Ying Fu ¹, Carola-Bibiane Schönlieb ³, Hua Huang ¹
Abstract
Plug-and-play (PnP) is a non-convex framework that combines ADMM or other proximal algorithms with advanced denoiser priors. Recently, PnP has achieved great empirical success, especially with the integration of deep learning-based denoisers. However, a key problem of PnP-based approaches is that they require manual parameter tweaking, which is necessary to obtain high-quality results across widely varying imaging conditions and scene content. In this work, we present a tuning-free PnP proximal algorithm, which can automatically determine the internal parameters including the penalty parameter, the denoising strength and the terminal time. A key part of our approach is to develop a policy network for automatic search of parameters, which can be effectively learned via mixed model-free and model-based deep reinforcement learning. We demonstrate, through numerical and visual experiments, that the learned policy can customize different parameters for different states, and is often more efficient and effective than existing handcrafted criteria. Moreover, we discuss the practical considerations of the plugged denoisers, which together with our learned policy yield state-of-the-art results. This holds for both linear and nonlinear exemplary inverse imaging problems, and in particular, we show promising results on Compressed Sensing MRI and phase retrieval.
1. Introduction
The problem of recovering an underlying unknown image $x \in \mathbb{R}^N$ from noisy and/or incomplete measured data $y \in \mathbb{R}^M$ is fundamental in computational imaging, with applications including magnetic resonance imaging (MRI) (Fessler, 2010), computed tomography (CT) (Elbakri & Fessler, 2002), microscopy (Aguet et al., 2008; Zheng et al., 2013), and inverse scattering (Katz et al., 2014; Metzler et al., 2017b). This image recovery task is often formulated as an optimization problem that minimizes a cost function, i.e.,

$$\underset{x \in \mathbb{R}^N}{\mathrm{minimize}} \quad D(x) + \lambda R(x), \tag{1}$$

where $D$ is a data-fidelity term that ensures consistency between the reconstructed image and the measured data, and $R$ is a regularizer that imposes certain prior knowledge regarding the unknown image, e.g. smoothness (Osher et al., 2005; Ma et al., 2008), sparsity (Yang et al., 2010; Liao & Sapiro, 2008; Ravishankar & Bresler, 2010), low rank (Semerci et al., 2014; Gu et al., 2017) and nonlocal self-similarity (Mairal et al., 2009; Qu et al., 2014). The problem in Eq. (1) is often solved by first-order iterative proximal algorithms, e.g. the fast iterative shrinkage/thresholding algorithm (FISTA) (Beck & Teboulle, 2009) and the alternating direction method of multipliers (ADMM) (Boyd et al., 2011), to tackle the nonsmoothness of the regularizers.

¹ School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China. ² DPMMS, University of Cambridge, Cambridge, United Kingdom. ³ DAMTP, University of Cambridge, Cambridge, United Kingdom.
To handle the nonsmoothness caused by regularizers, first-order algorithms rely on the proximal operators (Beck & Teboulle, 2009; Boyd et al., 2011; Chambolle & Pock, 2011; Parikh et al., 2014; Geman, 1995; Esser et al., 2010) defined by

$$\mathrm{Prox}_{\sigma^2 R}(v) = \underset{x}{\operatorname{argmin}} \; R(x) + \frac{1}{2\sigma^2} \|x - v\|_2^2. \tag{2}$$
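To make Eq. (2) concrete, the sketch below evaluates the proximal operator for one illustrative choice of regularizer, $R(x) = \|x\|_1$, for which the minimizer is the well-known soft-thresholding rule with threshold $\sigma^2$. The choice of $R$ and the function names `prox_l1` and `prox_objective` are assumptions made for illustration only; they are not part of the method described here.

```python
import numpy as np

def prox_l1(v, sigma):
    """Prox of Eq. (2) for the illustrative choice R(x) = ||x||_1.

    Minimizing |x| + (x - v)^2 / (2 sigma^2) coordinate-wise gives
    soft-thresholding of v with threshold sigma^2.
    """
    return np.sign(v) * np.maximum(np.abs(v) - sigma**2, 0.0)

def prox_objective(x, v, sigma):
    """Objective of Eq. (2) for R(x) = ||x||_1 (used only to check the minimizer)."""
    return np.sum(np.abs(x)) + np.sum((x - v) ** 2) / (2.0 * sigma**2)

v = np.array([-2.0, -0.3, 0.0, 0.5, 3.0])
sigma = 0.8
x_star = prox_l1(v, sigma)   # small entries are set to zero, large ones shrink by sigma^2
```

Read as a denoiser, the operator suppresses small (noise-like) entries and shrinks large ones, which is the sense in which a proximal operator acts as regularized denoising.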
Interestingly, given the mathematical equivalence of the proximal operator to regularized denoising, the proximal operator $\mathrm{Prox}_{\sigma^2 R}$ can be replaced by any off-the-shelf denoiser $H_\sigma$ with noise level $\sigma$, yielding a new framework, namely the plug-and-play (PnP) prior (Venkatakrishnan et al., 2013). The resulting algorithms, e.g. PnP-ADMM, can be written as

$$x_{k+1} = \mathrm{Prox}_{\sigma_k^2 R}(z_k - u_k) = H_{\sigma_k}(z_k - u_k), \tag{3}$$
$$z_{k+1} = \mathrm{Prox}_{\frac{1}{\mu_k} D}(x_{k+1} + u_k), \tag{4}$$
$$u_{k+1} = u_k + x_{k+1} - z_{k+1}, \tag{5}$$
where $k \in [0, \tau)$ denotes the $k$-th iteration, $\tau$ is the terminal time, and $\sigma_k$ and $\mu_k$ indicate the denoising strength (of the denoiser) and the penalty parameter used in the $k$-th iteration respectively.

arXiv:2002.09611v1 [eess.IV] 22 Feb 2020
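The iteration in Eqs. (3)-(5) can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not the implementation used in this work: the forward model is the identity, so that $D(x) = \frac{1}{2}\|y - x\|^2$ and the proximal step in Eq. (4) has the closed form $(y + \mu v)/(1 + \mu)$, and the plugged denoiser is a hypothetical Gaussian smoother (`toy_denoiser`) standing in for $H_\sigma$. All function names are placeholders.

```python
import numpy as np

def toy_denoiser(v, sigma):
    """Hypothetical stand-in for H_sigma: Gaussian smoothing of width sigma."""
    width = max(float(sigma), 1e-3)
    radius = int(np.ceil(3 * width))
    taps = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (taps / width) ** 2)
    kernel /= kernel.sum()
    padded = np.pad(v, radius, mode="edge")          # replicate borders
    return np.convolve(padded, kernel, mode="valid")  # same length as v

def prox_data(v, y, mu):
    """Prox of (1/mu) * D at v, assuming D(x) = 0.5 * ||y - x||^2."""
    return (y + mu * v) / (1.0 + mu)

def pnp_admm(y, sigmas, mus):
    """Run Eqs. (3)-(5) once per (sigma_k, mu_k) pair and return the last x."""
    x, z, u = y.copy(), y.copy(), np.zeros_like(y)
    for sigma_k, mu_k in zip(sigmas, mus):
        x = toy_denoiser(z - u, sigma_k)   # Eq. (3): denoising step
        z = prox_data(x + u, y, mu_k)      # Eq. (4): data-fidelity step
        u = u + x - z                      # Eq. (5): dual update
    return x

rng = np.random.default_rng(0)
t = np.arange(200)
clean = np.sin(2 * np.pi * t / 100)
y = clean + 0.3 * rng.standard_normal(t.size)
x_hat = pnp_admm(y, sigmas=[2.0] * 10, mus=[0.5] * 10)
```

The sequences `sigmas` and `mus` and the length of the loop are exactly the internal parameters $(\sigma_k, \mu_k, \tau)$ whose selection is the subject of this paper; here they are fixed by hand purely for illustration.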
In this formulation, the regularizer $R$ can be implicitly defined by a plugged denoiser, which opens a new door to leveraging the vast progress made on the image denoising front to solve more general inverse imaging problems. Plugging well-known image denoisers, e.g. BM3D (Dabov et al., 2007) and NLM (Buades et al., 2005), into optimization algorithms often leads to sizeable performance gains compared to explicitly defined regularizers, e.g. total variation. Moreover, PnP as a stand-alone framework can combine the benefits of both deep learning-based denoisers and optimization methods, e.g. (Zhang et al., 2017b; Rick Chang et al., 2017; Meinhardt et al., 2017). These highly desirable benefits are fast and effective inference, while circumventing the need for expensive network retraining whenever the specific problem changes.
Whilst a PnP framework offers promising image recovery results, a major drawback is that its performance is highly sensitive to the internal parameter selection, which generically includes the penalty parameter $\mu$, the denoising strength (of the denoiser) $\sigma$ and the terminal time $\tau$. The literature often utilizes manual tweaking, e.g. (Rick Chang et al., 2017; Meinhardt et al., 2017), or handcrafted criteria, e.g. (Chan et al., 2017; Zhang et al., 2017b; Eksioglu, 2016; Tirer & Giryes, 2018), to select parameters for each specific problem setting. However, manual parameter tweaking requires several trials, which is cumbersome and time-consuming. Semi-automated handcrafted criteria (for example, monotonically decreasing the denoising strength) can, to some degree, ease the burden of exhaustively searching a large parameter space, but often lead to suboptimal local minima. Moreover, the optimal parameter setting differs image by image, depending on the measurement model, noise level, noise type and the unknown image itself. These differences can be seen in the comparison in Fig. 1, where peak signal-to-noise ratio (PSNR) curves are displayed for four images under varying denoising strength.
This paper is devoted to addressing the aforementioned challenge: how to deal with the manual parameter tuning problem in a PnP framework. To this end, we formulate the internal parameter selection as a sequential decision-making problem, in which a policy is adopted to select a sequence of internal parameters to guide the optimization. Such a problem fits naturally into a reinforcement learning (RL) framework, where a policy agent seeks to map observations to actions, with the aim of maximizing the cumulative reward. The reward reflects the to-do or not-to-do events for the agent, and a desirably high reward can be obtained if the policy leads to faster convergence and better restoration accuracy.
Figure 1. Compressed Sensing MRI using a radial sampling pattern with a 20% sampling rate, where PSNR curves of four medical images are displayed using PnP-ADMM with different denoising strengths. Different images require different denoising strengths to reach the optimal performance.

We demonstrate, through extensive numerical and visual experiments, the advantage of our algorithmic approach on Compressed Sensing MRI and phase retrieval problems. We show that the policy approximates well the intrinsic function that maps the input state to its optimal parameter setting. By using the learned policy, the guided optimization can reach results comparable to those using oracle parameters tuned via the inaccessible ground truth. An overview of our algorithm is shown in Fig. 2. Our contributions are as follows:
1. We present a tuning-free PnP algorithm that can customize parameters for diverse images, which often demonstrates faster practical convergence and better empirical performance than handcrafted criteria.
2. We introduce an efficient mixed model-free and model-based RL algorithm. It can jointly optimize the discrete terminal time and the continuous denoising strength/penalty parameters.
3. We validate our approach with an extensive range of numerical and visual experiments, and show how the performance of PnP is affected by the parameters. We also show that our well-designed approach leads to better results than state-of-the-art techniques on compressed sensing MRI and phase retrieval.
2. Related Work
The body of literature has reported several PnP algorithmic
techniques. In this section, we provide a short overview of
these techniques.
Plug-and-play (PnP). The definitional concept of PnP was first introduced in (Danielyan et al., 2010; Zoran & Weiss, 2011; Venkatakrishnan et al., 2013), and has attracted great attention owing to its effectiveness and flexibility in handling a wide range of inverse imaging problems. Following this philosophy, several works have been developed, which can be roughly categorized in terms of four aspects, i.e., proximal algorithms, imaging applications, denoiser priors, and convergence. (i) Proximal algorithms include half-quadratic splitting (Zhang et al., 2017b), the primal-dual method (Ono, 2017), generalized approximate message passing (Metzler et al., 2016b) and the (stochastic) accelerated proximal gradient method (Sun et al., 2019a). (ii) Imaging applications include bright field electronic tomography (Sreehari et al., 2016); diffraction tomography (Sun et al., 2019a); low-dose CT imaging (He et al., 2018); Compressed Sensing MRI (Eksioglu, 2016); electron microscopy (Sreehari et al., 2017); single-photon imaging (Chan et al., 2017); phase retrieval (Metzler et al., 2018); Fourier ptychography microscopy (Sun et al., 2019b); light-field photography (Chun et al., 2019); hyperspectral sharpening (Teodoro et al., 2018); denoising (Rond et al., 2016); and image processing, e.g. demosaicking, deblurring, super-resolution and inpainting (Heide et al., 2014; Meinhardt et al., 2017; Zhang et al., 2019a; Tirer & Giryes, 2018).

Figure 2. Overview of our tuning-free plug-and-play framework (taking the CS-MRI problem as an example). (Diagram labels: Fourier-domain sampling, zero filling, policy $\pi$ at steps 1, 2, ..., $\tau$, state $s$, actions $a_1$ and $a_2$, x/z/u steps, recovered image.)
Moreover, (iii) denoiser priors include BM3D (Heide et al., 2014; Dar et al., 2016; Rond et al., 2016; Sreehari et al., 2016; Chan et al., 2017), nonlocal means (Venkatakrishnan et al., 2013; Heide et al., 2014; Sreehari et al., 2016), Gaussian mixture models (Teodoro et al., 2016; 2018), weighted nuclear norm minimization (Kamilov et al., 2017), and deep learning-based denoisers (Meinhardt et al., 2017; Zhang et al., 2017b; Rick Chang et al., 2017). Finally, (iv) theoretical analyses of convergence include the symmetric gradient (Sreehari et al., 2016), the bounded denoiser (Chan et al., 2017) and the nonexpansiveness assumptions (Sreehari et al., 2016; Teodoro et al., 2018; Sun et al., 2019a; Ryu et al., 2019; Chan, 2019).
Differing from these aspects, in this work we focus on the challenge of parameter selection in PnP, where a bad choice of parameters often leads to severe degradation of the results (Romano et al., 2017; Chan et al., 2017). Unlike existing semi-automated parameter tuning criteria (Wang & Chan, 2017; Chan et al., 2017; Zhang et al., 2017b; Eksioglu, 2016; Tirer & Giryes, 2018), our method is fully automatic and is purely learned from the data, which significantly eases the burden of manual parameter tuning.
Automated Parameter Selection. Some works consider automatic parameter selection in inverse problems. However, the prior term in these works is restricted to certain types of regularizers, e.g. Tikhonov regularization (Hansen & O'Leary, 1993; Golub et al., 1979), smoothed versions of the $\ell_p$ norm (Eldar, 2008; Giryes et al., 2011), or general convex functions (Ramani et al., 2012). To the best of our knowledge, none of them is applicable to the PnP framework with sophisticated non-convex and learned priors.
Deep Unrolling. Perhaps the concept most easily confused with PnP in the deep learning era is the so-called deep unrolling methods (Gregor & LeCun, 2010; Hershey et al., 2014; Wang et al., 2016; Yang et al., 2016; Zhang & Ghanem, 2018; Diamond et al., 2017; Metzler et al., 2017a; Adler & Öktem, 2018; Dong et al., 2018; Xie et al., 2019), which explicitly unroll/truncate iterative optimization algorithms into learnable deep architectures. In this way, the penalty parameters (and the denoiser prior) are treated as trainable parameters, while the number of iterations has to be fixed to enable end-to-end training. By contrast, our PnP approach can adaptively select a stop time and penalty parameters given varying input states, while using an off-the-shelf denoiser as prior.
Reinforcement Learning for Image Recovery. Although Reinforcement Learning (RL) has been applied in a range of domains, from game playing (Mnih et al., 2013; Silver et al., 2016) to robotic control (Schulman et al., 2015), only a few works have successfully employed RL for image recovery tasks. The authors of (Yu et al., 2018) learned an RL policy to select appropriate tools from a toolbox to progressively restore corrupted images. The work of (Zhang et al., 2019b) proposed a recurrent image restorer whose endpoint was dynamically controlled by a learned policy. In (Furuta et al., 2019), the authors used RL to select a sequence of classic filters to process images gradually. The work of (Yu et al., 2019) learned network path selection for image restoration in a multi-path CNN. In contrast to these works, we apply a mixed model-free and model-based deep RL approach to automatically select the parameters for the PnP image recovery algorithm.
3. Tuning-free PnP Proximal Algorithm
In this work, we elaborate on our tuning-free PnP proximal algorithm, as described in (3)-(5). This section describes our approach in detail, which contains three main parts. Firstly, we describe how the automated parameter selection is driven. Secondly, we introduce our environment model, and finally, we introduce the policy learning, which is guided by a mixed model-free and model-based RL.
It is worth mentioning that our method is generic, and is also applicable to PnP methods derived from other proximal algorithms, e.g. forward-backward splitting. The reason is that, although these are distinct methods, they share the same fixed points as PnP-ADMM (Meinhardt et al., 2017).
3.1. RL Formulation for Automated Parameter Selection
This work mainly focuses on the automated parameter selection problem in the PnP framework, where we aim to select a sequence of parameters $(\sigma_0, \mu_0, \sigma_1, \mu_1, \cdots, \sigma_{\tau-1}, \mu_{\tau-1})$ to guide the optimization such that the recovered image $x_\tau$ is close to the underlying image $x$. We formulate this problem as a Markov decision process (MDP), which can be addressed via reinforcement learning (RL).
We denote the MDP by the tuple $(\mathcal{S}, \mathcal{A}, p, r)$, where $\mathcal{S}$ is the state space, $\mathcal{A}$ is the action space, $p$ is the transition function describing the environment dynamics, and $r$ is the reward function. Specifically, for our task, $\mathcal{S}$ is the space of optimization variable states, which includes the initialization $(x_0, z_0, u_0)$ and all intermediate results $(x_k, z_k, u_k)$ in the optimization process. $\mathcal{A}$ is the space of internal parameters, including both the discrete terminal time $\tau$ and the continuous denoising strength/penalty parameters ($\sigma_k$, $\mu_k$). The transition function $p: \mathcal{S} \times \mathcal{A} \mapsto \mathcal{S}$ maps an input state $s \in \mathcal{S}$ to its outcome state $s' \in \mathcal{S}$ after taking action $a \in \mathcal{A}$. The state transition can be expressed as $s_{t+1} = p(s_t, a_t)$, which is composed of one or several iterations of optimization. On each transition, the environment emits a reward according to the reward function $r: \mathcal{S} \times \mathcal{A} \mapsto \mathbb{R}$, which evaluates actions given the state. Applying a sequence of parameters to the initial state $s_0$ results in a trajectory $\mathcal{T}$ of states, actions and rewards: $\mathcal{T} = \{s_0, a_0, r_0, \cdots, s_N, a_N, r_N\}$. Given a trajectory $\mathcal{T}$, we define the return $r_t^\gamma$ as the summation of discounted rewards after $s_t$,

$$r_t^\gamma = \sum_{t'=0}^{N-t} \gamma^{t'} \, r(s_{t+t'}, a_{t+t'}), \tag{6}$$

where $\gamma \in [0, 1]$ is a discount factor that prioritizes earlier rewards over later ones.
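As a small worked illustration of Eq. (6), the return can be computed directly from a list of per-step rewards. The helper name `discounted_return` and the reward values are hypothetical.

```python
def discounted_return(rewards, gamma, t=0):
    """Return r_t^gamma of Eq. (6): sum over t' of gamma**t' * rewards[t + t']."""
    return sum(gamma**k * r for k, r in enumerate(rewards[t:]))

rewards = [1.0, 0.5, 0.25]                             # r_0, r_1, r_2 of a toy trajectory
r0 = discounted_return(rewards, gamma=0.5)             # 1.0 + 0.5*0.5 + 0.25*0.25
r1 = discounted_return(rewards, gamma=0.5, t=1)        # 0.5 + 0.5*0.25
```

With $\gamma < 1$, a reward obtained $t'$ steps later is weighted by $\gamma^{t'}$, so the same PSNR gain counts for more when it is achieved earlier.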
Our goal is to learn a policy $\pi$, denoted as $\pi(a|s): \mathcal{S} \mapsto \mathcal{A}$, for the decision-making agent, in order to maximize the objective defined as

$$J(\pi) = \mathbb{E}_{s_0 \sim S_0, \, \mathcal{T} \sim \pi}\left[ r_0^\gamma \right], \tag{7}$$

where $\mathbb{E}$ represents expectation, $s_0$ is the initial state, and $S_0$ is the corresponding initial state distribution. Intuitively, the objective describes the expected return over all possible trajectories induced by the policy $\pi$. The expected returns on states and state-action pairs under the policy $\pi$ are defined by the state-value function $V^\pi$ and the action-value function $Q^\pi$ respectively, i.e.,

$$V^\pi(s) = \mathbb{E}_{\mathcal{T} \sim \pi}\left[ r_0^\gamma \mid s_0 = s \right], \tag{8}$$
$$Q^\pi(s, a) = \mathbb{E}_{\mathcal{T} \sim \pi}\left[ r_0^\gamma \mid s_0 = s, a_0 = a \right]. \tag{9}$$
In our task, we decompose actions into two parts: a discrete decision $a_1$ on the terminal time and a continuous decision $a_2$ on the denoising strength and penalty parameter. The policy also consists of two sub-policies: $\pi = (\pi_1, \pi_2)$, a stochastic policy and a deterministic policy that generate $a_1$ and $a_2$ respectively. The role of $\pi_1$ is to decide whether to terminate the iterative algorithm when the next state is reached. It samples a boolean-valued outcome $a_1$ from a two-class categorical distribution $\pi_1(\cdot|s)$, whose probability mass function is calculated from the current state $s$. We move forward to the next iteration if $a_1 = 0$; otherwise the optimization is terminated to output the final state. In contrast to the stochastic policy $\pi_1$, we treat $\pi_2$ deterministically, i.e. $a_2 = \pi_2(s)$, since $\pi_2$ is differentiable with respect to the environment, such that its gradient can be precisely estimated.
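The split between a sampled decision $a_1$ and a deterministic decision $a_2$ can be sketched as follows. This is only a schematic under assumptions: `p_stop` stands for the termination probability $\pi_1(a_1 = 1|s)$ and `raw_params` for unnormalized network outputs, both of which would come from the policy network; the sigmoid squashing to $(0, 1)$ and the function name `select_action` are illustrative choices.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def select_action(p_stop, raw_params, rng):
    """Return (a_1, a_2) for one state.

    a_1 is sampled from a two-class distribution with P(a_1 = 1) = p_stop;
    a_2 is a deterministic function of the (hypothetical) raw outputs.
    """
    a1 = int(rng.random() < p_stop)                      # stochastic: terminate or not
    a2 = sigmoid(np.asarray(raw_params, dtype=float))    # deterministic, values in (0, 1)
    return a1, a2

rng = np.random.default_rng(0)
a1, a2 = select_action(p_stop=0.2, raw_params=[0.0, 1.5], rng=rng)
```

Calling `select_action` twice on the same inputs can give different values of `a1` but always the same `a2`, which mirrors the stochastic/deterministic distinction between $\pi_1$ and $\pi_2$.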
3.2. Environment Model
In RL, the environment is characterized by two components: the environment dynamics and the reward function. In our task, the environment dynamics is described by the transition function $p$ related to PnP-ADMM. Here, we elucidate the detailed setting of PnP-ADMM as well as the reward function used for training the policy.
Denoiser Prior. A differentiable environment makes the policy learning more efficient. To make the environment differentiable with respect to $\pi_2$¹, we take a convolutional neural network (CNN) denoiser as the image prior. In practice, we use a residual U-Net (Ronneberger et al., 2015) architecture, which was originally designed for medical image segmentation, but was recently found to be useful in image denoising. Besides, we incorporate an additional tunable noise level map into the input as in (Zhang et al., 2018), enabling us to provide continuous noise level control (i.e. different denoising strengths) within a single network.
Proximal operator of data-fidelity term. Enforcing consistency with the measured data requires evaluating the proximal operator in (4). For inverse problems, there might exist fast solutions due to the special structure of the observation model. We adopt the fast solution if feasible (e.g. a closed-form solution using the fast Fourier transform, rather than general matrix inversion); otherwise a single step of gradient descent is performed as an inexact solution for (4).
¹ $\pi_1$ is non-differentiable with respect to the environment regardless of the formulation of the environment.
Transition function. To reduce the computation cost, we define the transition function $p$ to involve $m$ iterations of the optimization. At each time step, the agent thus needs to decide the internal parameters for $m$ iterates. We set $m = 5$ and the max time step $N = 6$ in our algorithm, leading to at most 30 iterations of the optimization.
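The interplay between the time steps, the $m$ inner iterations and the termination decision can be sketched as a simple loop. The sketch assumes, as one reading of the description in Section 3.1, that the $m$ iterations chosen at a time step are executed before the termination decision takes effect. The callables `policy` and `pnp_iteration` are hypothetical placeholders for the policy network and for one pass of Eqs. (3)-(5).

```python
def rollout(state, policy, pnp_iteration, m=5, max_steps=6):
    """Apply the policy for at most max_steps time steps of m iterations each.

    policy(state) returns (stop, params), where params holds m (sigma, mu) pairs;
    pnp_iteration(state, sigma, mu) returns the next optimization state.
    """
    iterations = 0
    for _ in range(max_steps):
        stop, params = policy(state)
        for sigma_k, mu_k in params[:m]:
            state = pnp_iteration(state, sigma_k, mu_k)
            iterations += 1
        if stop:                 # terminate once the next state has been reached
            break
    return state, iterations

# Toy check with an integer "state" that simply counts iterations.
count_step = lambda s, sigma, mu: s + 1
never_stop = lambda s: (False, [(0.1, 0.5)] * 5)
stop_now = lambda s: (True, [(0.1, 0.5)] * 5)
final_state, n_iter = rollout(0, never_stop, count_step)
```

With the defaults above the loop runs 30 iterations when the policy never terminates and 5 iterations when it terminates at the first time step.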
Reward function. To take both image recovery performance and runtime efficiency into account, we define the reward function as

$$r(s_t, a_t) = \zeta(p(s_t, a_t)) - \zeta(s_t) - \eta. \tag{10}$$

The first term, $\zeta(p(s_t, a_t)) - \zeta(s_t)$, denotes the PSNR increment made by the policy, where $\zeta(s_t)$ denotes the PSNR of the recovered image at step $t$. A higher reward is acquired if the policy leads to a higher performance gain in terms of PSNR. The second term, $\eta$, penalizes the policy when it does not choose to terminate at step $t$, where $\eta$ sets the degree of penalty. A negative reward is given if the PSNR gain does not exceed the degree of penalty, thereby encouraging the policy to stop the iteration early when returns diminish. We set $\eta = 0.05$ in our algorithm².
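Eq. (10) can be written out as a short function. The sketch below assumes real-valued images scaled to a peak value of 1 and uses the standard PSNR definition; the names `psnr` and `reward` are placeholders.

```python
import numpy as np

def psnr(estimate, reference, peak=1.0):
    """Standard PSNR in dB; assumes the estimate differs from the reference."""
    mse = np.mean((np.asarray(estimate) - np.asarray(reference)) ** 2)
    return 10.0 * np.log10(peak**2 / mse)

def reward(x_before, x_after, reference, eta=0.05):
    """Eq. (10): PSNR gain of one transition minus the per-step penalty eta."""
    return psnr(x_after, reference) - psnr(x_before, reference) - eta

reference = np.zeros(4)
x_before = np.full(4, 0.1)    # MSE 1e-2, i.e. 20 dB
x_after = np.full(4, 0.01)    # MSE 1e-4, i.e. 40 dB
r = reward(x_before, x_after, reference)
```

A transition that leaves the image unchanged yields a reward of $-\eta$, which is what pushes the termination policy to stop once further iterations no longer pay for themselves.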
3.3. RL-based policy learning
In this section, we present a mixed model-free and model-based RL algorithm to learn the policy. Specifically, model-free RL (agnostic to the environment dynamics) is used to train $\pi_1$, while model-based RL is utilized to optimize $\pi_2$ to make full use of the environment model³. We apply the actor-critic framework (Sutton et al., 2000), which uses a policy network $\pi_\theta(a_t|s_t)$ (actor) and a value network $V^\pi_\phi(s_t)$ (critic) to formulate the policy and the state-value function respectively⁴. The policy and the value networks are learned in an interleaved manner. For each gradient step, we optimize the value network parameters $\phi$ by minimizing

$$L_\phi = \mathbb{E}_{s \sim \mathcal{D}, \, a \sim \pi_\theta(s)}\left[ \frac{1}{2} \left( r(s, a) + \gamma V^\pi_{\hat{\phi}}(p(s, a)) - V^\pi_\phi(s) \right)^2 \right], \tag{11}$$

where $\mathcal{D}$ is the distribution of previously sampled states, practically implemented by a state buffer. This partly plays the role of the experience replay mechanism (Lin, 1992), which is observed to "smooth" the training data distribution (Mnih et al., 2013). The update makes use of a target value network $V^\pi_{\hat{\phi}}$, where $\hat{\phi}$ is the exponential moving average of the value network weights, which has been shown to stabilize training (Mnih et al., 2015).
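The critic objective in Eq. (11) and the moving-average target can be illustrated on plain arrays. This is a sketch only: the value predictions are passed in as numbers rather than produced by a network, and the decay rate `decay=0.999` is an assumed placeholder, not a value reported here.

```python
import numpy as np

def critic_loss(r, v_s, v_next_target, gamma):
    """Eq. (11) on a batch: 0.5 * mean((r + gamma * V_target(s') - V(s))^2)."""
    td_error = r + gamma * v_next_target - v_s
    return 0.5 * np.mean(td_error**2)

def ema_update(target_weights, weights, decay=0.999):
    """Exponential moving average of the value-network weights (decay is assumed)."""
    return [decay * t + (1.0 - decay) * w for t, w in zip(target_weights, weights)]

r = np.array([1.0, 0.0])               # rewards r(s, a)
v_s = np.array([0.5, 0.2])             # V_phi(s)
v_next = np.array([1.0, 0.0])          # V_phi_hat(p(s, a)), from the target network
loss = critic_loss(r, v_s, v_next, gamma=0.9)
new_target = ema_update([np.array([0.0])], [np.array([1.0])], decay=0.9)
```

In an actual training loop the target value `v_next` would be treated as a constant when differentiating the loss with respect to $\phi$.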
² The choice of the hyperparameters $m$, $N$ and $\eta$ is discussed in the suppl. material.
³ $\pi_2$ can also be optimized in a model-free manner. The comparison can be found in Section 4.2.
⁴ Details of the networks are given in the suppl. material.
Table 1. Comparisons of different CNN-based denoisers: we show the results of (1) Gaussian denoising performance (PSNR) under noise level $\sigma = 50$; (2) the CS-MRI performance (PSNR) when plugged into PnP-ADMM; (3) the GPU runtime (ms) of denoisers when processing an image of size 256 × 256.

Performance        DnCNN    MemNet   U-Net
Denoising perf.    27.18    27.32    27.40
PnP perf.          25.43    25.67    25.76
Times (ms)          8.09    64.65     5.65
The policy network consists of two sub-policies, which employ shared convolutional layers to extract image features, followed by two separate groups of fully-connected layers to produce the termination probability $\pi_1(\cdot|s)$ (after softmax) or the denoising strength/penalty parameters $\pi_2(s)$ (after sigmoid). We denote the parameters of the sub-policies as $\theta_1$ and $\theta_2$ respectively, and we seek to optimize $\theta = (\theta_1, \theta_2)$ so that the objective $J(\pi_\theta)$ is maximized. The policy network is trained using policy gradient methods (Peters & Schaal, 2006). The gradient of $\theta_1$ is estimated in a model-free manner by a likelihood estimator, while the gradient of $\theta_2$ is estimated by backpropagation through the environment dynamics in a model-based manner. Specifically, for the discrete terminal time decision $\pi_1$, we apply the policy gradient theorem (Sutton et al., 2000) to obtain an unbiased Monte Carlo estimate of $\nabla_{\theta_1} J(\pi_\theta)$ using the advantage function $A^\pi(s, a) = Q^\pi(s, a) - V^\pi(s)$ as target, i.e.,

$$\nabla_{\theta_1} J(\pi_\theta) = \mathbb{E}_{s \sim \mathcal{D}, \, a \sim \pi_\theta(s)}\left[ \nabla_{\theta_1} \log \pi_1(a_1|s) \, A^\pi(s, a) \right]. \tag{12}$$

For the continuous denoising strength and penalty parameter selection $\pi_2$, we utilize the deterministic policy gradient theorem (Silver et al., 2014) to formulate its gradient, i.e.,

$$\nabla_{\theta_2} J(\pi_\theta) = \mathbb{E}_{s \sim \mathcal{D}, \, a \sim \pi_\theta(s)}\left[ \nabla_{a_2} Q^\pi(s, a) \, \nabla_{\theta_2} \pi_2(s) \right], \tag{13}$$

where we approximate the action-value function $Q^\pi(s, a)$ by $r(s, a) + \gamma V^\pi_\phi(p(s, a))$ given its unfolded definition. Using the chain rule, we can directly obtain the gradient of $\theta_2$ by backpropagation via the reward function, the value network and the transition function, in contrast to relying on the gradient backpropagated from only the learned action-value function in the model-free DDPG algorithm (Lillicrap et al., 2016).
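The two gradient estimators in Eqs. (12) and (13) can be contrasted on a deliberately tiny example. The sketch assumes scalar parameters, a single state, a Bernoulli termination policy $\pi_1(a_1 = 1) = \mathrm{sigmoid}(\theta_1)$ with fixed per-action advantages, and a scalar deterministic policy $a_2 = \mathrm{sigmoid}(\theta_2)$ with a known derivative of $Q$. None of these toy choices come from the method itself, and the function names are placeholders.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def score_function_grad(theta1, advantages, rng, n_samples=200_000):
    """Monte Carlo estimate of Eq. (12) for pi_1(a_1 = 1) = sigmoid(theta1).

    advantages[a] is the (toy) advantage of action a in {0, 1}.
    """
    p = sigmoid(theta1)
    a1 = (rng.random(n_samples) < p).astype(int)
    grad_log_pi = a1 - p               # d/dtheta1 of log pi_1(a1) for this policy
    return np.mean(grad_log_pi * advantages[a1])

def pathwise_grad(theta2, q_grad):
    """Eq. (13) for a_2 = sigmoid(theta2): dQ/da_2 times da_2/dtheta2."""
    a2 = sigmoid(theta2)
    return q_grad(a2) * a2 * (1.0 - a2)

rng = np.random.default_rng(0)
advantages = np.array([-1.0, 2.0])
g1 = score_function_grad(0.3, advantages, rng)           # noisy, needs many samples
g2 = pathwise_grad(0.3, lambda a: -2.0 * (a - 0.7))      # exact, for Q(a) = -(a - 0.7)^2
```

The first estimator only needs samples and advantages, so it applies to the non-differentiable termination decision; the second requires a derivative of $Q$ with respect to the action, which is what a differentiable environment provides.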
4. Experiments
In this section, we detail the experiments and evaluate our proposed algorithm. We mainly focus on the tasks of Compressed Sensing MRI (CS-MRI) and phase retrieval (PR), which are representative linear and nonlinear inverse imaging problems respectively.
Table 2. Comparisons of different policies used in the PnP-ADMM algorithm for CS-MRI on seven widely used medical images under various acceleration factors (×2/×4/×8) and noise level 15. We show both the PSNR and the number of iterations (#IT.) used to produce the results. * denotes reporting the best PSNR over all iterations (i.e. with optimal early stopping). The best results are indicated by orange color and the second best results are denoted by blue color.

                      ×2              ×4              ×8
Policies          PSNR   #IT.     PSNR   #IT.     PSNR   #IT.
handcrafted       30.05  30.0     27.90  30.0     25.76  30.0
handcrafted*      30.06  29.1     28.20  18.4     26.06  19.4
fixed             23.94  30.0     24.26  30.0     22.78  30.0
fixed*            28.45   1.6     26.67   3.4     24.19   7.3
fixed optimal     30.02  30.0     28.27  30.0     26.08  16.7
fixed optimal*    30.03   6.7     28.34  12.6     26.16  30.0
oracle            30.25  30.0     28.60  30.0     26.41  30.0
oracle*           30.26   8.0     28.61  13.9     26.45  21.6
model-free        28.79  30.0     27.95  30.0     26.15  30.0
Ours              30.33   5.0     28.42   5.0     26.44  15.0
4.1. Implementation Details
Our algorithm requires two training processes: one for the denoising network and one for the policy network (and value network). For training the denoising network, we follow the common practice of using 87,000 overlapping patches (with size 128 × 128) drawn from 400 images from the BSD dataset (Martin et al., 2001). For each patch, we add white Gaussian noise with noise level sampled from [1, 50]. The denoising networks are trained for 50 epochs using the $L_1$ loss and the Adam optimizer (Kingma & Ba, 2014) with batch size 32. The base learning rate is set to $10^{-4}$ and halved at epoch 30, then reduced to $10^{-5}$ at epoch 40.
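The step schedule for the denoiser learning rate can be stated as a small function. The function name is hypothetical, and the sketch assumes epochs are counted from 0 and that each change takes effect at the start of the named epoch; the text does not specify either convention.

```python
def denoiser_learning_rate(epoch):
    """Learning rate at a given epoch (0-based indexing is an assumption)."""
    if epoch < 30:
        return 1e-4      # base learning rate
    if epoch < 40:
        return 5e-5      # halved at epoch 30
    return 1e-5          # reduced at epoch 40

schedule = [denoiser_learning_rate(e) for e in range(50)]
```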
To train the policy network and value network, we use the 17,125 resized images with size 128 × 128 from the PASCAL VOC dataset (Everingham et al., 2014). Both networks are trained using the Adam optimizer with batch size 48 and 1500 iterations, with a base learning rate of $3 \times 10^{-4}$ for the policy network and $10^{-3}$ for the value network. We then set these learning rates to $10^{-4}$ and $3 \times 10^{-4}$ at iteration 1000. We perform 10 gradient steps at every iteration.
For the CS-MRI application, a single policy network is trained to handle multiple sampling ratios (with ×2/×4/×8 acceleration) and noise levels (5/10/15) simultaneously. Similarly, one policy network is learned for phase retrieval under different settings.
4.2. Compressed sensing MRI
The forward model of CS-MRI can be mathematically described as $y = \mathcal{F}_p x + \omega$, where $x \in \mathbb{C}^N$ is the underlying image, the operator $\mathcal{F}_p: \mathbb{C}^N \to \mathbb{C}^M$, with $M < N$, denotes the partially-sampled Fourier transform, and $\omega \sim \mathcal{N}(0, \sigma_n I_M)$ is the additive white Gaussian noise. The data-fidelity term is $D(x) = \frac{1}{2}\|y - \mathcal{F}_p x\|^2$, whose proximal operator is given in (Eksioglu, 2016).
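For this data-fidelity term the proximal step in Eq. (4) admits a closed form in the Fourier domain. The sketch below derives it under explicit assumptions: the Fourier transform is orthonormal, sampling is described by a boolean mask on the full frequency grid, and the measurement `y` is stored zero-filled on that grid. Under these assumptions, minimizing $\frac{1}{2\mu}\|y - \mathcal{F}_p z\|^2 + \frac{1}{2}\|z - v\|^2$ frequency by frequency gives $\hat{z} = (y + \mu \hat{v})/(1 + \mu)$ at sampled locations and $\hat{z} = \hat{v}$ elsewhere. The function names are placeholders, and the formula is our derivation rather than a quotation of the cited reference.

```python
import numpy as np

def forward(x, mask):
    """F_p x: orthonormal 2-D FFT, kept only where mask is True (zeros elsewhere)."""
    return mask * np.fft.fft2(x, norm="ortho")

def prox_data_fidelity(v, y, mask, mu):
    """Closed-form Prox of (1/mu) * D at v for D(x) = 0.5 * ||y - F_p x||^2."""
    v_hat = np.fft.fft2(v, norm="ortho")
    z_hat = (mask * y + mu * v_hat) / (mask + mu)   # sampled: (y + mu v)/(1 + mu); else v
    return np.fft.ifft2(z_hat, norm="ortho")

rng = np.random.default_rng(0)
x_true = rng.standard_normal((8, 8))
mask = rng.random((8, 8)) < 0.5                 # hypothetical random sampling pattern
y = forward(x_true, mask)                       # noise-free, zero-filled measurements
v = rng.standard_normal((8, 8))
mu = 0.7
z = prox_data_fidelity(v, y, mask, mu)
```

Because the update is a pointwise operation between two FFTs, it avoids the general matrix inversion mentioned in Section 3.2.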
Denoiser priors. To show how denoiser priors affect the performance of PnP, we train three state-of-the-art CNN-based denoisers, i.e. DnCNN (Zhang et al., 2017a), MemNet (Tai et al., 2017) and residual U-Net (Ronneberger et al., 2015), with a tunable noise level map. We compare both the Gaussian denoising performance and the PnP performance⁵ using these denoisers. As shown in Table 1, the residual U-Net and MemNet consistently outperform DnCNN in terms of denoising and CS-MRI. This seems to imply that a better Gaussian denoiser is also a better denoiser prior for the PnP framework⁶. Since U-Net is significantly faster than MemNet, we choose U-Net as our denoiser prior.
Comparisons of different policies. We start by giving some insights into our learned policy by comparing the performance of PnP-ADMM with different policies: i) the handcrafted policy used in IRCNN (Zhang et al., 2017b); ii) the fixed policy that uses fixed parameters ($\sigma = 15$, $\mu = 0.1$); iii) the fixed optimal policy that adopts fixed parameters searched to maximize the average PSNR across all testing images; iv) the oracle policy that uses different parameters for different images such that the PSNR of each image is maximized; and v) our learned policy, based on a learned policy network that optimizes parameters for each image. We remark that all compared policies are run for 30 iterations, whilst ours automatically chooses the terminal time.
To understand the usefulness of the early stopping mechanism, we also report the results of these policies with optimal early stopping⁷. Moreover, we analyze whether the model-based RL benefits our algorithm by comparing it with the policy learned by model-free RL, whose $\pi_2$ is optimized using the model-free DDPG algorithm (Lillicrap et al., 2016).
The results of all aforementioned policies are provided in Table 2. We can see that a bad choice of parameters (see "fixed") induces poor results, in which case early stopping is much needed to rescue performance (see "fixed*"). When the parameters are properly assigned, early stopping helps to reduce the computation cost. Our learned policy leads to fast practical convergence as well as excellent performance, and sometimes even outperforms the oracle policy tuned via the inaccessible ground truth (in the ×2 case). We note that this is owing to the varying parameters across iterations generated automatically in our algorithm, which yield extra flexibility compared with parameters held constant over iterations. Besides, we find that the learned model-free policy produces suboptimal denoising strength/penalty parameters compared with our mixed model-free and model-based policy, and it also fails to learn early stopping behavior.

⁵ We exhaustively search the best denoising strength/penalty parameters to exclude the impact of internal parameters.
⁶ Further investigation of this argument can be found in the suppl. material.
⁷ It should be noted that some policies (e.g. "fixed optimal" and "oracle") require access to the ground truth to determine parameters, which is generally impractical in real testing scenarios.

Table 3. Quantitative results (PSNR) of different CS-MRI methods on two datasets under various acceleration factors $f$ and noise levels $\sigma_n$. The best results are indicated by orange color and the second best results are denoted by blue color.

                       Traditional        Deep unrolling          PnP
Dataset    f    σn     RecPF    FCSA      ADMM-Net  ISTA-Net      BM3D-MRI  IRCNN    Ours
Medical7   ×2    5     32.46    31.70     33.10     34.58         33.33     34.67    34.78
           ×2   10     29.48    28.33     31.37     31.81         29.44     31.80    32.00
           ×2   15     27.08    25.52     29.16     29.99         26.90     29.96    30.27
           ×4    5     28.67    28.21     30.24     31.34         30.33     31.36    31.62
           ×4   10     26.98    26.67     29.20     29.71         28.30     29.52    29.68
           ×4   15     25.58    24.93     27.87     28.38         26.66     27.94    28.43
           ×8    5     24.72    24.62     26.57     27.65         26.53     27.32    28.26
           ×8   10     23.94    24.04     26.21     26.90         25.81     26.44    27.35
           ×8   15     23.18    23.36     25.49     26.23         25.09     25.53    26.41
MICCAI     ×2    5     36.39    34.90     36.74     38.17         36.00     38.42    38.57
           ×2   10     31.95    30.12     34.20     34.81         31.39     34.93    35.06
           ×2   15     28.91    26.68     31.42     32.65         28.46     32.81    33.09
           ×4    5     33.05    32.30     34.15     35.46         34.79     35.80    36.11
           ×4   10     30.21    29.56     32.58     33.13         31.63     32.99    33.07
           ×4   15     28.13    26.93     30.55     31.48         29.35     30.98    31.42
           ×8    5     28.35    28.71     30.36     31.62         31.34     31.66    32.64
           ×8   10     26.86    27.68     29.78     30.54         29.86     30.16    30.89
           ×8   15     25.70    26.35     28.83     29.50         28.53     28.72    29.65

Table 4. Quantitative results of different PR algorithms on four CDP measurements and varying amounts of Poisson noise (large $\alpha$ indicates low signal-to-noise ratio).

                 α = 9    α = 27   α = 81
Algorithms       PSNR     PSNR     PSNR
HIO              35.96    25.76    14.82
WF               34.46    24.96    15.76
DOLPHIn          29.93    27.45    19.35
SPAR             35.20    31.82    22.44
BM3D-prGAMP      40.25    32.84    25.43
prDeep           39.70    33.54    26.82
Ours             40.33    33.90    27.23
Comparisons with state-of-the-art methods. We compare our method against six state-of-the-art methods for CS-MRI, including the traditional optimization-based approaches (RecPF (Yang et al., 2010) and FCSA (Huang et al., 2010)), the PnP approaches (BM3D-MRI (Eksioglu, 2016) and IRCNN (Zhang et al., 2017b)), and the deep unrolling approaches (ADMM-Net (Yang et al., 2016) and ISTA-Net (Zhang & Ghanem, 2018)). To keep the comparison fair, for each deep unrolling method, only a single network is trained to tackle all the cases using the same dataset as ours. Table 3 shows the performance of each method on two sets of medical images, i.e., 7 widely used medical images (Medical7) (Huang et al., 2010) and 50 medical images from the MICCAI 2013 grand challenge dataset (https://my.vanderbilt.edu/masi/). The visual comparison can be found in Fig. 3. It can be seen that our approach outperforms the state-of-the-art PnP method (IRCNN) in every setting of Table 3, with the largest margins (roughly 0.7 to 1.0 dB) under the difficult ×8 case. In the simple cases (e.g., ×2), our algorithm runs only 5 iterations to arrive at the desirable performance, in contrast with the 30 and 70 iterations required by IRCNN and BM3D-MRI, respectively.
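The margin over IRCNN can be read directly off Table 3. The short sketch below recomputes it from the IRCNN and Ours columns of that table; the dictionary layout and function names are illustrative assumptions and are not part of the authors' code.

```python
from collections import defaultdict

# (dataset, acceleration factor, noise level) -> (IRCNN PSNR, Ours PSNR),
# copied from the IRCNN and Ours columns of Table 3.
TABLE3 = {
    ("Medical7", 2, 5): (34.67, 34.78),
    ("Medical7", 2, 10): (31.80, 32.00),
    ("Medical7", 2, 15): (29.96, 30.27),
    ("Medical7", 4, 5): (31.36, 31.62),
    ("Medical7", 4, 10): (29.52, 29.68),
    ("Medical7", 4, 15): (27.94, 28.43),
    ("Medical7", 8, 5): (27.32, 28.26),
    ("Medical7", 8, 10): (26.44, 27.35),
    ("Medical7", 8, 15): (25.53, 26.41),
    ("MICCAI", 2, 5): (38.42, 38.57),
    ("MICCAI", 2, 10): (34.93, 35.06),
    ("MICCAI", 2, 15): (32.81, 33.09),
    ("MICCAI", 4, 5): (35.80, 36.11),
    ("MICCAI", 4, 10): (32.99, 33.07),
    ("MICCAI", 4, 15): (30.98, 31.42),
    ("MICCAI", 8, 5): (31.66, 32.64),
    ("MICCAI", 8, 10): (30.16, 30.89),
    ("MICCAI", 8, 15): (28.72, 29.65),
}


def psnr_gains(table):
    """PSNR gain (dB) of Ours over IRCNN for every setting."""
    return {key: round(ours - ircnn, 2) for key, (ircnn, ours) in table.items()}


def mean_gain_by_acceleration(gains):
    """Average the per-setting gains within each acceleration factor."""
    buckets = defaultdict(list)
    for (_, accel, _), gain in gains.items():
        buckets[accel].append(gain)
    return {accel: sum(vals) / len(vals) for accel, vals in sorted(buckets.items())}


gains = psnr_gains(TABLE3)
for accel, mean_gain in mean_gain_by_acceleration(gains).items():
    print(f"x{accel}: mean gain over IRCNN = {mean_gain:.2f} dB")
```

Averaged over both datasets and all noise levels, the gain is roughly 0.2 dB at ×2, 0.3 dB at ×4 and 0.9 dB at ×8, which is consistent with the observation that the advantage is most pronounced in the hardest case.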
4.3. Phase retrieval
The goal of phase retrieval (PR) is to recover the underlying image from only the amplitude, or intensity, of the output of a complex linear system. Mathematically, PR can be defined as the problem of recovering a signal $x \in \mathbb{R}^N$ or $\mathbb{C}^N$ from measurements $y$ of the form $y = \sqrt{|Ax|^2 + \omega}$, where the measurement matrix $A$ represents the forward operator of the system, and $\omega$ represents shot noise. We approximate it with $\omega \sim \mathcal{N}(0, \alpha^2 |Ax|^2)$. The term $\alpha$ controls the signal-to-noise ratio in this problem.
We test algorithms with coded diffraction patterns (CDP) (Candès et al., 2015). Multiple measurements, with different random spatial light modulator (SLM) patterns, are recorded. We model the capture of four measurements using a phase-only SLM as in (Metzler et al., 2018). Each measurement operator can be mathematically described as $A_i = F D_i$, $i \in \{1, 2, 3, 4\}$, where $F$ can be represented by the 2D Fourier transform and $D_i$ is a diagonal matrix with nonzero elements drawn uniformly from the unit circle in the complex plane.
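The measurement setup described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation. It assumes the intensity-noise model of Metzler et al. (2018), i.e. $y = \sqrt{|Ax|^2 + \omega}$ with $\omega \sim \mathcal{N}(0, \alpha^2 |Ax|^2)$, a unitary 2D FFT for $F$, and clipping of negative noisy intensities before the square root. All function names are hypothetical, and the image intensity scale that the reported values of α presume is not specified here.

```python
import numpy as np


def make_cdp_masks(shape, num_masks=4, rng=None):
    """Phase-only masks D_i: entries drawn uniformly from the complex unit circle."""
    rng = np.random.default_rng() if rng is None else rng
    phases = rng.uniform(0.0, 2.0 * np.pi, size=(num_masks, *shape))
    return np.exp(1j * phases)


def cdp_forward(x, masks):
    """Apply A_i = F D_i for every mask: modulate, then take a unitary 2D FFT."""
    return np.fft.fft2(masks * x[None, ...], norm="ortho")


def simulate_measurements(x, masks, alpha, rng=None):
    """Amplitude measurements y = sqrt(|Ax|^2 + w), with w ~ N(0, alpha^2 |Ax|^2)."""
    rng = np.random.default_rng() if rng is None else rng
    intensity = np.abs(cdp_forward(x, masks)) ** 2
    # Standard deviation alpha * |Ax| gives the variance alpha^2 * |Ax|^2.
    noise = alpha * np.sqrt(intensity) * rng.standard_normal(intensity.shape)
    # Assumption: clip negative values so that the square root stays real.
    return np.sqrt(np.maximum(intensity + noise, 0.0))


rng = np.random.default_rng(0)
x = rng.random((16, 16))                      # toy real-valued image
masks = make_cdp_masks(x.shape, num_masks=4, rng=rng)
y = simulate_measurements(x, masks, alpha=0.1, rng=rng)
print(y.shape)
```

Because each $D_i$ has unit-modulus entries and the FFT is taken with `norm="ortho"`, every $A_i$ in this sketch is unitary, so each noiseless measurement preserves the energy of $x$.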
We compare our method with three classic approaches (HIO (Fienup, 1982), WF (Candes et al., 2014), and DOLPHIn (Mairal et al., 2016)) and three PnP approaches (SPAR (Katkovnik, 2017), BM3D-prGAMP (Metzler et al., 2016a) and prDeep (Metzler et al., 2018)). Table 4 and Fig. 4 summarize the results of all competing methods on twelve images used in (Metzler et al., 2018). It can be seen that our method still leads to state-of-the-art performance in this nonlinear inverse problem, and produces cleaner and clearer results than other competing methods.

Figure 3. CS-MRI reconstruction results of different algorithms on medical images (best viewed on screen with zoom). The last panel of each row is the ground truth. PSNR of each displayed example:

RecPF   FCSA    ADMM-Net  ISTA-Net  BM3D-MRI  IRCNN   Ours
22.57   22.27   24.15     24.61     23.64     24.16   25.28
18.74   19.23   20.48     21.37     20.62     20.91   22.02
24.89   24.47   26.85     27.90     26.72     27.74   28.65

Figure 4. Recovered images from noisy intensity-only CDP measurements with seven PR algorithms (details are better appreciated on screen). The last panel of each row is the ground truth. PSNR of each displayed example:

HIO     WF      DOLPHIn  SPAR    BM3D-prGAMP  prDeep  Ours
14.40   15.52   19.35    22.48   25.66        27.72   28.01
15.10   16.27   19.62    22.51   23.61        24.59   25.12
5. Conclusion
In this work, we introduce RL into the PnP framework, yielding a novel tuning-free PnP proximal algorithm for a wide range of inverse imaging problems. We underline the main message of our approach: the main strength of our proposed method is the policy network, which can customize well-suited parameters for different images. Through numerical experiments, we demonstrate that our learned policy often generates highly effective parameters, which often even reach performance comparable to that of the "oracle" parameters tuned via the inaccessible ground truth.
References
Adler, J. and Öktem, O. Learned primal-dual reconstruction. IEEE Transactions on Medical Imaging, 37(6):1322–1332, 2018.
Aguet, F., Van De Ville, D., and Unser, M. Model-based 2.5-D deconvolution for extended depth of field in brightfield microscopy. IEEE Transactions on Image Processing, 17(7):1144–1153, 2008.
Beck, A. and Teboulle, M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183–202, 2009.
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J., et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1–122, 2011.
Buades, A., Coll, B., and Morel, J.-M. A non-local algorithm for image denoising. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 60–65, 2005.
Candes, E., Li, X., and Soltanolkotabi, M. Phase retrieval via Wirtinger flow: Theory and algorithms. IEEE Transactions on Information Theory, 61, 07 2014.
Candès, E. J., Li, X., and Soltanolkotabi, M. Phase retrieval from coded diffraction patterns. Applied and Computational Harmonic Analysis, 39(2):277–299, 2015.
Chambolle, A. and Pock, T. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011.
Chan, S. H. Performance analysis of plug-and-play ADMM: A graph signal processing perspective. IEEE Transactions on Computational Imaging, 5(2):274–286, 2019.
Chan, S. H., Wang, X., and Elgendy, O. A. Plug-and-play ADMM for image restoration: Fixed-point convergence and applications. IEEE Transactions on Computational Imaging, 3(1):84–98, 2017.
Chun, I. Y., Huang, Z., Lim, H., and Fessler, J. A. Momentum-Net: Fast and convergent iterative neural network for inverse problems. arXiv preprint arXiv:1907.11818, 2019.
Dabov, K., Foi, A., Katkovnik, V., and Egiazarian, K. Image denoising by sparse 3D transform-domain collaborative filtering. IEEE Transactions on Image Processing, 16(8):2080, 2007.
Danielyan, A., Katkovnik, V., and Egiazarian, K. Image deblurring by augmented Lagrangian with BM3D frame prior. In Workshop on Information Theoretic Methods in Science and Engineering, pp. 16–18, 2010.
Dar, Y., Bruckstein, A. M., Elad, M., and Giryes, R. Postprocessing of compressed images via sequential denoising. IEEE Transactions on Image Processing, 25(7):3044–3058, 2016.
Diamond, S., Sitzmann, V., Heide, F., and Wetzstein, G. Unrolled optimization with deep priors. arXiv preprint arXiv:1705.08041, 2017.
Dong, W., Wang, P., Yin, W., Shi, G., Wu, F., and Lu, X. Denoising prior driven deep neural network for image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(10):2305–2318, 2018.
Eksioglu, E. M. Decoupled algorithm for MRI reconstruction using nonlocal block matching model: BM3D-MRI. Journal of Mathematical Imaging and Vision, 56(3):430–440, 2016.
Elbakri, I. A. and Fessler, J. A. Segmentation-free statistical image reconstruction for polyenergetic X-ray computed tomography. In IEEE International Symposium on Biomedical Imaging, pp. 828–831, 2002.
Eldar, Y. C. Generalized SURE for exponential families: Applications to regularization. IEEE Transactions on Signal Processing, 57(2):471–481, 2008.
Esser, E., Zhang, X., and Chan, T. F. A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science. SIAM Journal on Imaging Sciences, 3(4):1015–1046, 2010.
Everingham, M., Eslami, S., Van Gool, L., Williams, C., Winn, J., and Zisserman, A. The PASCAL visual object classes challenge: A retrospective. International Journal of Computer Vision, 111, 01 2014.
Fessler, J. A. Model-based image reconstruction for MRI. IEEE Signal Processing Magazine, 27(4):81–89, 2010.
Fienup, J. R. Phase retrieval algorithms: a comparison. Applied Optics, 21(15):2758–2769, 1982.
Furuta, R., Inoue, N., and Yamasaki, T. Fully convolutional network with multi-step reinforcement learning for image processing. In AAAI Conference on Artificial Intelligence, pp. 3598–3605, 2019.
Geman, D. Nonlinear image recovery with half-quadratic regularization. IEEE Transactions on Image Processing, 4(7):932–946, 1995.
Giryes, R., Elad, M., and Eldar, Y. C. The projected GSURE for automatic parameter tuning in iterative shrinkage methods. Applied and Computational Harmonic Analysis, 30(3):407–422, 2011.
Golub, G. H., Heath, M., and Wahba, G. Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics, 21(2):215–223, 1979.
Gregor, K. and LeCun, Y. Learning fast approximations of sparse coding. In International Conference on Machine Learning (ICML), pp. 399–406, 2010.
Gu, S., Xie, Q., Meng, D., Zuo, W., Feng, X., and Zhang, L. Weighted nuclear norm minimization and its applications to low level vision. International Journal of Computer Vision, 121(2):183–208, 2017.
Hansen, P. C. and O'Leary, D. P. The use of the L-curve in the regularization of discrete ill-posed problems. SIAM Journal on Scientific Computing, 14(6):1487–1503, 1993.
He, J., Yang, Y., Wang, Y., Zeng, D., Bian, Z., Zhang, H., Sun, J., Xu, Z., and Ma, J. Optimizing a parameterized plug-and-play ADMM for iterative low-dose CT reconstruction. IEEE Transactions on Medical Imaging, 38(2):371–382, 2018.
Heide, F., Steinberger, M., Tsai, Y.-T., Rouf, M., Pajak, D., Reddy, D., Gallo, O., Liu, J., Heidrich, W., Egiazarian, K., et al. FlexISP: A flexible camera image processing framework. ACM Transactions on Graphics, 33(6):231, 2014.
Hershey, J. R., Roux, J. L., and Weninger, F. Deep unfolding: Model-based inspiration of novel deep architectures. arXiv preprint arXiv:1409.2574, 2014.
Huang, J., Zhang, S., and Metaxas, D. Efficient MR image reconstruction for compressed MR imaging. Medical Image Analysis, 15:135–142, 2010.
Kamilov, U. S., Mansour, H., and Wohlberg, B. A plug-and-play priors approach for solving nonlinear imaging inverse problems. IEEE Signal Processing Letters, 24(12):1872–1876, 2017.
Katkovnik, V. Phase retrieval from noisy data based on sparse approximation of object phase and amplitude. arXiv preprint arXiv:1709.01071, 2017.
Katz, O., Heidmann, P., Fink, M., and Gigan, S. Non-invasive single-shot imaging through scattering layers and around corners via speckle correlations. Nature Photonics, 8(10):784, 2014.
Kingma, D. P. and Ba, J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
Liao, H. Y. and Sapiro, G. Sparse representations for limited data tomography. In IEEE International Symposium on Biomedical Imaging: From Nano to Macro, pp. 1375–1378. IEEE, 2008.
Lillicrap, T., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. Continuous control with deep reinforcement learning. In International Conference on Learning Representations (ICLR), 2016.
Lin, L. Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 8(3):293–321, 1992.
Ma, S., Yin, W., Zhang, Y., and Chakraborty, A. An efficient algorithm for compressed MR imaging using total variation and wavelets. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–8. IEEE, 2008.
Mairal, J., Tillmann, A. M., and Eldar, Y. C. DOLPHIn: Dictionary learning for phase retrieval. IEEE Transactions on Signal Processing, 2016.
Mairal, J., Bach, F. R., Ponce, J., Sapiro, G., and Zisserman, A. Non-local sparse models for image restoration. In IEEE International Conference on Computer Vision (ICCV), volume 29, pp. 54–62, 2009.
Martin, D., Fowlkes, C., Tal, D., and Malik, J. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In IEEE International Conference on Computer Vision (ICCV), pp. 416–423, 2001.
Meinhardt, T., Moller, M., Hazirbas, C., and Cremers, D. Learning proximal operators: Using denoising networks for regularizing inverse imaging problems. In IEEE International Conference on Computer Vision (ICCV), Oct 2017.
Metzler, C., Mousavi, A., and Baraniuk, R. Learned D-AMP: Principled neural network based compressive image recovery. In Advances in Neural Information Processing Systems (NIPS), pp. 1772–1783, 2017a.
Metzler, C., Schniter, P., Veeraraghavan, A., et al. prDeep: Robust phase retrieval with a flexible deep network. In International Conference on Machine Learning (ICML), pp. 3498–3507, 2018.
Metzler, C. A., Maleki, A., and Baraniuk, R. G. BM3D-prGAMP: Compressive phase retrieval based on BM3D denoising. In IEEE International Conference on Image Processing, 2016a.
Metzler, C. A., Maleki, A., and Baraniuk, R. G. From denoising to compressed sensing. IEEE Transactions on Information Theory, 62(9):5117–5144, 2016b.
Metzler, C. A., Sharma, M. K., Nagesh, S., Baraniuk, R. G., Cossairt, O., and Veeraraghavan, A. Coherent inverse scattering via transmission matrices: Efficient phase retrieval algorithms and a public dataset. In IEEE International Conference on Computational Photography (ICCP), pp. 1–16, 2017b.
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602, 2013.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.
Ono, S. Primal-dual plug-and-play image restoration. IEEE Signal Processing Letters, 24(8):1108–1112, 2017.
Osher, S., Burger, M., Goldfarb, D., Xu, J., and Yin, W. An iterative regularization method for total variation-based image restoration. Multiscale Modeling and Simulation, 4(2):460–489, 2005.
Parikh, N., Boyd, S., et al. Proximal algorithms. Foundations and Trends in Optimization, 1(3):127–239, 2014.
Peters, J. and Schaal, S. Policy gradient methods for robotics. In International Conference on Intelligent Robots and Systems (IROS), pp. 2219–2225, 2006.
Qu, X., Hou, Y., Lam, F., Guo, D., Zhong, J., and Chen, Z. Magnetic resonance image reconstruction from undersampled measurements using a patch-based nonlocal operator. Medical Image Analysis, 18(6):843–856, 2014.
Ramani, S., Liu, Z., Rosen, J., Nielsen, J.-F., and Fessler, J. A. Regularization parameter selection for nonlinear iterative image restoration and MRI reconstruction using GCV and SURE-based methods. IEEE Transactions on Image Processing, 21(8):3659–3672, 2012.
Ravishankar, S. and Bresler, Y. MR image reconstruction from highly undersampled k-space data by dictionary learning. IEEE Transactions on Medical Imaging, 30(5):1028–1041, 2010.
Rick Chang, J. H., Li, C.-L., Poczos, B., Vijaya Kumar, B. V. K., and Sankaranarayanan, A. C. One network to solve them all – solving linear inverse problems using deep projection models. In IEEE International Conference on Computer Vision (ICCV), 2017.
Romano, Y., Elad, M., and Milanfar, P. The little engine that could: Regularization by denoising (RED). SIAM Journal on Imaging Sciences, 10(4):1804–1844, 2017.
Rond, A., Giryes, R., and Elad, M. Poisson inverse problems by the plug-and-play scheme. Journal of Visual Communication and Image Representation, 41:96–108, 2016.
Ronneberger, O., Fischer, P., and Brox, T. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241, 2015.
Ryu, E., Liu, J., Wang, S., Chen, X., Wang, Z., and Yin, W. Plug-and-play methods provably converge with properly trained denoisers. In International Conference on Machine Learning (ICML), pp. 5546–5557, 2019.
Schulman, J., Levine, S., Abbeel, P., Jordan, M., and Moritz, P. Trust region policy optimization. In International Conference on Machine Learning (ICML), pp. 1889–1897, 2015.
Semerci, O., Hao, N., Kilmer, M. E., and Miller, E. L. Tensor-based formulation and nuclear norm regularization for multienergy computed tomography. IEEE Transactions on Image Processing, 23(4):1678–1693, 2014.
Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., and Riedmiller, M. Deterministic policy gradient algorithms. In International Conference on Machine Learning (ICML), 2014.
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., Schrittwieser, J., Antonoglou, I., Panneershelvam, V., Lanctot, M., et al. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484, 2016.
Sreehari, S., Venkatakrishnan, S. V., Wohlberg, B., Buzzard, G. T., Drummy, L. F., Simmons, J. P., and Bouman, C. A. Plug-and-play priors for bright field electron tomography and sparse interpolation. IEEE Transactions on Computational Imaging, 2(4):408–423, 2016.
Sreehari, S., Venkatakrishnan, S., Bouman, K. L., Simmons, J. P., Drummy, L. F., and Bouman, C. A. Multi-resolution data fusion for super-resolution electron microscopy. In IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 88–96, 2017.
Sun, Y., Wohlberg, B., and Kamilov, U. S. An online plug-and-play algorithm for regularized image reconstruction. IEEE Transactions on Computational Imaging, 2019a.
Sun, Y., Xu, S., Li, Y., Tian, L., Wohlberg, B., and Kamilov, U. S. Regularized Fourier ptychography using an online plug-and-play algorithm. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 7665–7669, 2019b.
Sutton, R., McAllester, D., Singh, S., and Mansour, Y. Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems (NIPS), 2000.
Tai, Y., Yang, J., Liu, X., and Xu, C. MemNet: A persistent memory network for image restoration. In IEEE International Conference on Computer Vision (ICCV), Oct 2017.
Teodoro, A. M., Bioucas-Dias, J. M., and Figueiredo, M. A. Image restoration and reconstruction using variable splitting and class-adapted image priors. In IEEE International Conference on Image Processing, pp. 3518–3522, 2016.
Teodoro, A. M., Bioucas-Dias, J. M., and Figueiredo, M. A. A convergent image fusion algorithm using scene-adapted Gaussian-mixture-based denoising. IEEE Transactions on Image Processing, 28(1):451–463, 2018.
Tirer, T. and Giryes, R. Image restoration by iterative denoising and backward projections. IEEE Transactions on Image Processing, 28(3):1220–1234, 2018.
Venkatakrishnan, S. V., Bouman, C. A., and Wohlberg, B. Plug-and-play priors for model based reconstruction. In IEEE Global Conference on Signal and Information Processing, pp. 945–948, 2013.
Wang, S., Fidler, S., and Urtasun, R. Proximal deep structured models. In Advances in Neural Information Processing Systems (NIPS), pp. 865–873, 2016.
Wang, X. and Chan, S. H. Parameter-free plug-and-play ADMM for image restoration. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1323–1327, 2017.
Xie, X., Wu, J., Liu, G., Zhong, Z., and Lin, Z. Differentiable linearized ADMM. In International Conference on Machine Learning (ICML), pp. 6902–6911, 2019.
Yang, J., Zhang, Y., and Yin, W. A fast alternating direction method for TVL1-L2 signal reconstruction from partial Fourier data. IEEE Journal of Selected Topics in Signal Processing, 4(2):288–297, 2010.
Yang, Y., Sun, J., Li, H., and Xu, Z. Deep ADMM-Net for compressive sensing MRI. In Advances in Neural Information Processing Systems (NIPS), pp. 10–18, 2016.
Yu, K., Dong, C., Lin, L., and Change Loy, C. Crafting a toolchain for image restoration by deep reinforcement learning. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2443–2452, 2018.
Yu, K., Wang, X., Dong, C., Tang, X., and Loy, C. C. Path-Restore: Learning network path selection for image restoration. arXiv preprint arXiv:1904.10343, 2019.
Zhang, J. and Ghanem, B. ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
Zhang, K., Zuo, W., Chen, Y., Meng, D., and Zhang, L. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 26(7):3142–3155, 2017a.
Zhang, K., Zuo, W., Gu, S., and Zhang, L. Learning deep CNN denoiser prior for image restoration. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017b.
Zhang, K., Zuo, W., and Zhang, L. FFDNet: Toward a fast and flexible solution for CNN-based image denoising. IEEE Transactions on Image Processing, 27(9):4608–4622, 2018.
Zhang, K., Zuo, W., and Zhang, L. Deep plug-and-play super-resolution for arbitrary blur kernels. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019a.
Zhang, X., Lu, Y., Liu, J., and Dong, B. Dynamically unfolding recurrent restorer: A moving endpoint control method for image restoration. In International Conference on Learning Representations (ICLR), 2019b.
Zheng, G., Horstmeyer, R., and Yang, C. Wide-field, high-resolution Fourier ptychographic microscopy. Nature Photonics, 7(9):739, 2013.
Zoran, D. and Weiss, Y. From learning models of natural image patches to whole image restoration. In IEEE International Conference on Computer Vision (ICCV), pp. 479–486, 2011.