DCT-like Transform for Image Compression Requires 14 Additions Only


A low-complexity 8-point orthogonal approximate DCT is introduced. The proposed transform requires no multiplications or bit-shift operations. The derived fast algorithm requires only 14 additions, fewer than any existing DCT approximation. Moreover, in several image compression scenarios, the proposed transform could outperform the well-known signed DCT, as well as state-of-the-art algorithms.


💡 Research Summary

The paper introduces a novel 8‑point orthogonal approximation of the discrete cosine transform (DCT) that can be computed using only 14 addition operations, with no multiplications or bit‑shifts required. The authors start from the CB‑2011 approximate DCT matrix, replace selected non‑essential entries with zeros, and obtain a sparse matrix T whose elements belong to the set {0, ±1}. Multiplying T by a diagonal scaling matrix D (whose diagonal holds normalizing constants such as 1/√8, 1/√2, and ½) yields the final approximate transform Ĉ = D·T. Because the nonzero entries of T are all ±1, the transform is multiplication‑free; the scaling matrix can be merged into the quantization step of a JPEG‑like codec, avoiding any extra arithmetic.
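The construction Ĉ = D·T can be checked numerically. The matrix below is a hypothetical reconstruction consistent with the description (entries in {0, ±1}, pairwise‑orthogonal rows); the authoritative entries of T are given in the paper itself.

```python
import numpy as np

# Hypothetical reconstruction of the sparse matrix T described above
# (entries in {0, +1, -1}, pairwise-orthogonal rows); the paper is the
# authoritative source for the exact entries.
T = np.array([
    [1,  1,  1,  1,  1,  1,  1,  1],
    [0,  1,  0,  0,  0,  0, -1,  0],
    [1,  0,  0, -1, -1,  0,  0,  1],
    [1,  0,  0,  0,  0,  0,  0, -1],
    [1, -1, -1,  1,  1, -1, -1,  1],
    [0,  0,  1,  0,  0, -1,  0,  0],
    [0, -1,  1,  0,  0,  1, -1,  0],
    [0,  0,  0,  1, -1,  0,  0,  0],
])

# D rescales each row of T to unit norm, so C_hat = D @ T is orthogonal;
# its diagonal contains values such as 1/sqrt(8), 1/sqrt(2), and 1/2.
D = np.diag(1.0 / np.sqrt((T * T).sum(axis=1)))
C_hat = D @ T

# Orthogonality check: C_hat @ C_hat.T must equal the 8x8 identity.
print(np.allclose(C_hat @ C_hat.T, np.eye(8)))  # True
```

Because D is diagonal, it contributes no arithmetic of its own: it only rescales each output, which is exactly what lets it be folded into quantization later.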

To achieve the low addition count, the authors factor T into the product P·A₃·A₂·A₁ of a permutation matrix and three sparse matrices that implement sign changes and pairwise additions. The resulting signal‑flow graph (Fig. 1) shows that the entire transform can be realized with exactly 14 additions. A complexity table (Table 1) compares the proposed method with several state‑of‑the‑art DCT approximations: the Signed DCT (SDCT) requires 24 additions, the Level‑1 approximation likewise 24, and the BAS‑2008/2009/2011 series needs 18 additions (plus occasional bit‑shifts). The new transform therefore cuts the addition count by roughly 22 %–42 % relative to these competitors.
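The 14‑addition count can be sketched by sharing butterfly sums across output rows. The routine below uses the same hypothetical reconstruction of T as an assumption (the paper's Fig. 1 gives the authoritative signal‑flow graph) and tallies exactly 8 + 3 + 3 = 14 additions/subtractions:

```python
import numpy as np

# Hypothetical reconstruction of T; the paper's Fig. 1 is authoritative.
T = np.array([
    [1,  1,  1,  1,  1,  1,  1,  1],
    [0,  1,  0,  0,  0,  0, -1,  0],
    [1,  0,  0, -1, -1,  0,  0,  1],
    [1,  0,  0,  0,  0,  0,  0, -1],
    [1, -1, -1,  1,  1, -1, -1,  1],
    [0,  0,  1,  0,  0, -1,  0,  0],
    [0, -1,  1,  0,  0,  1, -1,  0],
    [0,  0,  0,  1, -1,  0,  0,  0],
])

def fast_T(x):
    """Compute y = T @ x using 14 additions/subtractions in three stages."""
    # Stage 1 (8 adds): butterflies on mirrored sample pairs.
    u = [x[0] + x[7], x[1] + x[6], x[2] + x[5], x[3] + x[4]]
    v = [x[0] - x[7], x[1] - x[6], x[2] - x[5], x[3] - x[4]]
    # Stage 2 (3 adds): combine the even-part sums.
    s03, s12 = u[0] + u[3], u[1] + u[2]
    dc = s03 + s12
    # Stage 3 (3 adds): remaining differences; the v values pass through.
    return np.array([dc, v[1], u[0] - u[3], v[0],
                     s03 - s12, v[2], u[2] - u[1], v[3]])

x = np.arange(8.0)
print(np.allclose(fast_T(x), T @ x))  # True
```

A naive row-by-row evaluation of this T would cost 24 additions; reusing the stage‑1 butterflies in the DC and difference rows is what brings the count down to 14.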

For empirical validation, the authors performed image compression experiments on a set of 45 standard 512 × 512 grayscale images from the USC‑SIPI database. Each image was partitioned into 8 × 8 blocks and transformed with the proposed method and with the reference DCT, SDCT, and BAS‑2011; the resulting coefficients were then ordered with the standard JPEG zig‑zag scan. Compression ratios were varied by retaining only the first r zig‑zag‑ordered coefficients of each block (2 ≤ r ≤ 45), corresponding to overall compression rates from 96.9 % down to 2.9 %. Reconstruction quality was measured by average peak‑signal‑to‑noise ratio (PSNR) and mean‑square error (MSE) across all images, providing a robust assessment less sensitive to individual image idiosyncrasies.
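This protocol can be sketched per block as follows. The code assumes the hypothetical reconstruction of T used above, a synthetic 8 × 8 block in place of a USC‑SIPI image, and plain coefficient truncation in place of full JPEG quantization:

```python
import numpy as np

# Hypothetical reconstruction of T; the paper gives the authoritative entries.
T = np.array([
    [1,  1,  1,  1,  1,  1,  1,  1],
    [0,  1,  0,  0,  0,  0, -1,  0],
    [1,  0,  0, -1, -1,  0,  0,  1],
    [1,  0,  0,  0,  0,  0,  0, -1],
    [1, -1, -1,  1,  1, -1, -1,  1],
    [0,  0,  1,  0,  0, -1,  0,  0],
    [0, -1,  1,  0,  0,  1, -1,  0],
    [0,  0,  0,  1, -1,  0,  0,  0],
])
d = 1.0 / np.sqrt((T * T).sum(axis=1))
C = np.diag(d) @ T                      # orthogonal approximate DCT

def zigzag_indices():
    """All (i, j) positions of an 8x8 block in JPEG zig-zag scan order."""
    return sorted(((i, j) for i in range(8) for j in range(8)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def reconstruct(block, r):
    """2-D transform, keep the first r zig-zag coefficients, invert."""
    coeffs = C @ block @ C.T
    keep = np.zeros((8, 8))
    for i, j in zigzag_indices()[:r]:
        keep[i, j] = 1.0
    return C.T @ (coeffs * keep) @ C    # C is orthogonal: inverse = transpose

def psnr(a, b):
    """Peak signal-to-noise ratio for 8-bit images, in dB."""
    return 10.0 * np.log10(255.0 ** 2 / np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
block = rng.integers(0, 256, size=(8, 8)).astype(float)
print(f"PSNR with r=25 of 64 coefficients: {psnr(block, reconstruct(block, 25)):.1f} dB")
```

With r = 64 the reconstruction is exact (orthogonality guarantees perfect inversion); smaller r trades PSNR for compression, which is the sweep the authors report.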

Results show that when many coefficients are retained (large r, i.e., modest compression), the proposed transform matches the exact DCT and outperforms SDCT and BAS‑2011. In the mid‑range (20 < r ≤ 35) the new method consistently yields higher PSNR and lower MSE than both competitors, despite its dramatically lower computational load. At very high compression (r ≤ 20) the transform still delivers PSNR values comparable to SDCT and superior to BAS‑2011. A visual example using the classic Lena image at r = 25 (≈60 % compression) shows a reconstruction that is visually close to the exact‑DCT result (PSNR ≈ 31.4 dB versus 37.2 dB for the exact DCT) and within a few tenths of a decibel of the other approximations.

The authors emphasize three main contributions: (1) an ultra‑low‑complexity DCT approximation requiring only 14 additions, the smallest reported for an 8‑point orthogonal transform; (2) a method to absorb the scaling matrix into the quantization step, eliminating any extra arithmetic overhead; and (3) a comprehensive experimental validation showing that the reduction in arithmetic does not compromise, and in many cases improves, compression quality compared with existing approximations.
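Contribution (2) can be illustrated concretely: since Ĉ = D·T, the 2‑D coefficients satisfy Ĉ·B·Ĉᵀ = (d·dᵀ) ⊙ (T·B·Tᵀ), so dividing the quantization table entrywise by the outer product d·dᵀ makes the integer‑only transform T·B·Tᵀ feed the quantizer the same values, at no extra arithmetic cost. The flat table Q below is a placeholder, and T is the hypothetical reconstruction used earlier:

```python
import numpy as np

# Hypothetical reconstruction of T; the paper gives the authoritative entries.
T = np.array([
    [1,  1,  1,  1,  1,  1,  1,  1],
    [0,  1,  0,  0,  0,  0, -1,  0],
    [1,  0,  0, -1, -1,  0,  0,  1],
    [1,  0,  0,  0,  0,  0,  0, -1],
    [1, -1, -1,  1,  1, -1, -1,  1],
    [0,  0,  1,  0,  0, -1,  0,  0],
    [0, -1,  1,  0,  0,  1, -1,  0],
    [0,  0,  0,  1, -1,  0,  0,  0],
])
d = 1.0 / np.sqrt((T * T).sum(axis=1))
C = np.diag(d) @ T

Q = np.full((8, 8), 16.0)        # placeholder quantization table (assumption)
rng = np.random.default_rng(1)
B = rng.integers(-128, 128, size=(8, 8)).astype(float)

# Direct route: full scaled transform, then divide by Q.
direct = (C @ B @ C.T) / Q

# Folded route: integer-only transform, with D absorbed into the table.
Q_folded = Q / np.outer(d, d)    # scaling rows/columns folded into Q
folded = (T @ B @ T.T) / Q_folded

# Both routes hand the rounding step identical values (up to float round-off).
print(np.allclose(direct, folded))  # True
```

The folded route never touches a real-valued multiply during the transform itself; the only scaling happens inside the division by Q_folded, which a codec performs anyway.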

Limitations are acknowledged. The transform is defined only for 8‑point blocks, so extensions to larger block sizes (16, 32 points) remain to be investigated. All tests were performed on grayscale images; color image or video coding would require additional analysis of inter‑channel correlations and possibly different quantization strategies. Finally, while the theoretical operation count is minimal, a concrete hardware implementation (ASIC or FPGA) with measured area, power, and latency is not presented, leaving the practical gains in embedded systems to be demonstrated in future work.

In conclusion, the paper delivers a compelling solution for low‑power, low‑complexity image compression scenarios such as wireless image sensor networks, battery‑operated cameras, and other embedded vision platforms where arithmetic resources are scarce. By achieving orthogonality, preserving energy compaction, and delivering competitive PSNR/MSE performance with only 14 additions, the proposed DCT‑like transform stands out as a valuable addition to the toolbox of signal‑processing engineers focused on efficient multimedia coding.

