Improved Hybrid Layered Image Compression using Deep Learning and Traditional Codecs
Recently, deep learning-based methods have been applied to image compression and have achieved promising results. In this paper, we propose an improved hybrid layered image compression framework that combines deep learning with traditional image codecs. At the encoder, we first use a convolutional neural network (CNN) to obtain a compact representation of the input image, which is losslessly encoded by the FLIF codec as the base layer of the bit stream. A coarse reconstruction of the input is obtained by another CNN from the reconstructed compact representation. The residual between the input and the coarse reconstruction is then encoded by the H.265/HEVC-based BPG codec as the enhancement layer of the bit stream. Experimental results on the Kodak and Tecnick datasets show that, when images are coded in the RGB444 domain, the proposed scheme outperforms the state-of-the-art deep learning-based layered coding scheme as well as traditional codecs including BPG, in both PSNR and MS-SSIM, across a wide range of bit rates.
💡 Research Summary
The paper introduces an improved hybrid layered image compression framework that combines a deep learning autoencoder with traditional codecs, specifically FLIF for lossless base‑layer coding and BPG (HEVC‑based) for lossily coding the residual. The authors simplify the previous DSSLIC architecture by removing the semantic‑segmentation‑based synthesis layer, resulting in a two‑layer system: (1) a compact representation of the input image is generated by a convolutional neural network (CompNet), downsampled to 1/16 of the original resolution, and losslessly encoded with FLIF; (2) a second network (RecNet) reconstructs a coarse image from this compact code. The difference between the original image and the coarse reconstruction (the residual) is then encoded by BPG.
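The two-layer flow above can be sketched end to end with hypothetical stand-ins for each component: a 16× average-pool plays the role of the learned CompNet, nearest-neighbour upsampling the role of RecNet, and uniform quantisation the role of the lossy BPG residual coder (FLIF is lossless, so the base layer is passed through unchanged). All function names and the quantisation step are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

F = 16  # total downsampling factor (e.g. four stride-2 stages)

def comp_net(x):
    # Stand-in for CompNet: block-average the image down to 1/16 resolution.
    h, w = x.shape
    return x.reshape(h // F, F, w // F, F).mean(axis=(1, 3))

def rec_net(c):
    # Stand-in for RecNet: upsample the compact code back to full size.
    return np.repeat(np.repeat(c, F, axis=0), F, axis=1)

def bpg_like(r, step=4.0):
    # Crude stand-in for lossy residual coding: uniform quantisation.
    return np.round(r / step) * step

x = np.random.rand(64, 64) * 255      # toy grayscale input image
compact = comp_net(x)                 # base layer (coded losslessly by FLIF)
coarse = rec_net(compact)             # coarse reconstruction at the decoder
residual = x - coarse                 # enhancement-layer signal
x_hat = coarse + bpg_like(residual)   # final reconstruction
```

Because the base layer is lossless, all reconstruction error comes from the quantised residual; here it is bounded by half the quantisation step.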
Both CompNet and RecNet consist of 13 layers, featuring four down‑sampling (CompNet) and four up‑sampling (RecNet) stages. Between each pair of down‑/up‑sampling layers a modified residual block is inserted. The authors adapt the classic ResNet block by (i) removing the non‑linear activation after the skip connection, (ii) inserting a dropout layer between the two 3×3 convolutions, (iii) replacing ReLU with parametric ReLU (PReLU) for better gradient flow, and (iv) using tanh in the final layer to keep outputs in the range [−1, 1].
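Modifications (i)–(iii) to the residual block can be sketched in PyTorch as follows. The channel count, padding, and dropout probability are illustrative assumptions; only the block structure follows the description above.

```python
import torch
import torch.nn as nn

class ModifiedResBlock(nn.Module):
    """ResNet-style block with the paper's modifications: PReLU instead
    of ReLU, dropout between the two 3x3 convolutions, and no non-linear
    activation after the skip connection."""

    def __init__(self, channels: int, p_drop: float = 0.5):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.act = nn.PReLU(channels)       # (iii) parametric ReLU
        self.drop = nn.Dropout2d(p_drop)    # (ii) dropout between the convs
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x):
        y = self.conv1(x)
        y = self.act(y)
        y = self.drop(y)
        y = self.conv2(y)
        return x + y  # (i) skip connection with no activation afterwards
```

In the full networks, a tanh on the final layer (modification iv) would bound the output to [−1, 1]; it is omitted here because it belongs to CompNet/RecNet as a whole, not to each block.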