Solving RED with Weighted Proximal Methods

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

REgularization by Denoising (RED) is an attractive framework for solving inverse problems by incorporating state-of-the-art denoising algorithms as the priors. A drawback of this approach is the high computational complexity of denoisers, which dominate the computation time. In this paper, we apply a general framework called weighted proximal methods (WPMs) to solve RED efficiently. We first show that two recently introduced RED solvers (using the fixed point and accelerated proximal gradient methods) are particular cases of WPMs. Then we show by numerical experiments that slightly more sophisticated variants of WPM can lead to reduced run times for RED by requiring a significantly smaller number of calls to the denoiser.

💡 Research Summary

The paper addresses the computational bottleneck of REgularization by Denoising (RED), a framework that incorporates modern image denoisers as explicit priors for solving general inverse problems. In RED the objective function is
  E(x)= (1/(2σ²))‖Hx−y‖² + α R(x), R(x)=½‖x−f(x)‖²,
where f(·) is a denoiser. The gradient of the regularizer is simply the denoiser residual, ∇R(x)=x−f(x), so each iteration of any RED solver requires a full denoiser call, which dominates the total runtime.

The authors propose to view RED as a special case of Weighted Proximal Methods (WPM). For a composite problem
  minₓ  g(x)+h(x) with g(x)=αR(x), h(x)= (1/(2σ²))‖Hx−y‖²,
WPM iterates
  x_{k+1}=prox_{a_k h}^{B_k}\bigl(x_k – a_k B_k^{-1}∇g(x_k)\bigr),
where B_k is a positive‑definite weighting matrix and a_k a step size. The proximal operator with respect to B_k reduces to solving a linear system that involves HᵀH and B_k.

The paper first shows that two recent RED solvers are particular instances of WPM:

Fixed‑Point (FP) method corresponds to B_k=αI and a_k=1, yielding exactly the FP update used in the original RED paper.
Accelerated Proximal Gradient (APG) is the same weighted scheme with Nesterov’s momentum applied, i.e., the accelerated version of WPM.

The main contribution is a more sophisticated choice of B_k that captures curvature information of the regularizer. Because the exact Hessian of R(x) is unavailable (the denoiser is a black‑box), the authors employ a Symmetric Rank‑One (SR1) quasi‑Newton update to approximate it. At iteration k they compute
  s_k = x_k – x_{k‑1}, m_k = ∇g(x_k) – ∇g(x_{k‑1}),
and construct a scalar τ = γ (m_kᵀ s_k) / (‖s_k‖²). If τ is positive and a stability condition holds, they set
  B_k = τ I + (u_k u_kᵀ) / (s_kᵀ u_k), u_k = m_k – τ s_k,
otherwise they fall back to B_k = αI. The SR1 matrix is never formed explicitly; it is stored as a linear‑operator that can be applied to vectors, keeping memory and computational overhead low. The linear system in the proximal step is solved approximately with Conjugate Gradient (CG).

Step size a_k is fixed to 1 for simplicity, and only reduced by half if the objective value increases beyond a small relative threshold (ε = 10⁻²). This avoids costly line‑searches that would otherwise require extra denoiser evaluations.

Experimental evaluation uses the Trainable Nonlinear Reaction Diffusion (TNRD) denoiser on two classic RED tasks: image deblurring (uniform 9×9 blur and Gaussian blur σ=1.6) and super‑resolution (7×7 Gaussian blur followed by 3× down‑sampling). Performance is measured in terms of PSNR, number of denoiser calls, and CPU time on a modest laptop (i7‑6500U, 8 GB RAM).

Key findings:

For a target PSNR around 30 dB, the proposed WPM requires roughly 30‑40 % fewer denoiser evaluations than both FP‑MPE (vector‑extrapolated FP) and APG.
Correspondingly, wall‑clock time is reduced by 20‑35 % across all test images.
On a set of eight additional images, WPM achieves the smallest number of denoiser calls in all but two cases (the “Boats” and “House” deblurring experiments).
Visual quality of the reconstructions is comparable to or slightly better than the baselines.

The authors acknowledge that RED’s theoretical assumptions—linearity of the denoiser (f(cx)=c f(x)) and a Jacobian spectral radius ≤ 1—are often violated in practice (as noted in prior work

Solving RED with Weighted Proximal Methods

💡 Research Summary

Comments & Academic Discussion

Leave a Comment