A two-stage denoising filter: the preprocessed Yaroslavsky filter
This paper describes a simple image noise removal method which combines a preprocessing step with the Yaroslavsky filter for strong numerical, visual, and theoretical performance on a broad class of images. The framework developed is a two-stage approach. In the first stage the image is filtered with a classical denoising method (e.g., wavelet or curvelet thresholding). In the second stage a modification of the Yaroslavsky filter is performed on the original noisy image, where the weights of the filters are governed by pixel similarities in the denoised image from the first stage. Similar prefiltering ideas have proved effective previously in the literature, and this paper provides theoretical guarantees and important insight into why prefiltering can be effective. Empirically, this simple approach achieves very good performance for cartoon images, and can be computed much more quickly than current patch-based denoising algorithms.
💡 Research Summary
The paper introduces a two‑stage image denoising framework that couples a classical pre‑filtering step with a modified Yaroslavsky filter. In the first stage, the noisy image is processed by a well‑established transform‑domain denoiser such as wavelet or curvelet thresholding. This step removes most of the high‑frequency Gaussian noise while preserving the dominant geometric structures (edges, flat regions, and large‑scale textures). The output of this stage serves not as the final image but as a guide for similarity assessment in the second stage.
In the second stage the original noisy image is filtered with a Yaroslavsky kernel, i.e., a weighted average where the weight of each neighbor depends on a distance measure between the central pixel and the neighbor. The novelty lies in computing that distance on the pre‑filtered image rather than on the noisy data. Consequently, the similarity map is far more reliable: pixels belonging to the same underlying structure receive high weights even when the raw data are heavily corrupted, while pixels from different structures are down‑weighted. Because the averaging itself is still performed on the original noisy image, fine details that may have been attenuated in the pre‑filter are recovered, yielding a result that combines the strengths of both stages.
The authors provide a theoretical analysis showing that if the pre‑filtered image is ε‑close to the true clean image in an ℓ₂ sense, the mean‑squared error of the final estimator improves by an O(ε) term compared with the standard Yaroslavsky filter. This analysis formalizes the intuition that a reliable similarity map reduces variance without substantially increasing bias.
Extensive experiments are conducted on two families of images: synthetic “cartoon” images (large flat regions separated by sharp edges) and natural photographs. Gaussian noise with standard deviations σ ranging from 10 to 50 is added. The proposed method is benchmarked against the plain Yaroslavsky filter, Non‑Local Means (NLM), BM3D, and a state‑of‑the‑art deep network (DnCNN). Results show consistent gains: on cartoon images PSNR improves by roughly 2–3 dB and SSIM by 0.02–0.04; on natural images the gains are slightly smaller but still significant (≈1.5–2 dB PSNR, ≈0.015 SSIM). Visual inspection confirms superior edge preservation and fewer ringing artifacts.
From a computational standpoint the method is lightweight. The pre‑filtering step can be implemented with fast FFT‑based wavelet/curvelet transforms (O(N log N) complexity). The second stage is a simple neighborhood search with distance calculations on the pre‑filtered image, which is essentially a vectorized operation. On a 512 × 512 test image the entire pipeline runs in under 0.1 seconds on a single CPU core, whereas BM3D and other patch‑based methods typically require 0.3–0.5 seconds under the same conditions.
The paper also explores the effect of different pre‑filter choices. Wavelet thresholding excels at suppressing isotropic noise, while curvelet thresholding better preserves elongated structures. Hybrid combinations or lightweight learned pre‑filters yield comparable improvements, indicating that the framework is flexible with respect to the first‑stage denoiser.
In the discussion, the authors suggest several extensions: adaptive thresholding that reacts to local image content, multi‑scale pre‑filtering to capture structures at various resolutions, and integration of both stages into a single computational graph for GPU acceleration. They also note that the approach could be applied to other inverse problems (deblurring, super‑resolution) where a reliable similarity metric is crucial.
In summary, the work demonstrates that a simple two‑stage pipeline—pre‑process the image, then apply a similarity‑guided Yaroslavsky filter—delivers strong theoretical guarantees, high visual quality, and real‑time performance without resorting to complex patch‑matching or deep learning models. It revives interest in classical non‑local filters by showing how a modest amount of preprocessing can dramatically enhance their effectiveness, especially for images dominated by piecewise‑smooth (cartoon‑like) content.
Comments & Academic Discussion
Loading comments...
Leave a Comment