A Training-Free Defense Framework for Robust Learned Image Compression
Authors: Myungseo Song, Jinyoung Choi, Bohyung Han
Computer Vision Laboratory, Seoul National University
{micmic123, jin0.choi, bhhan}@snu.ac.kr

Abstract

We study the robustness of learned image compression models against adversarial attacks and present a training-free defense technique based on simple image transform functions. Recent learned image compression models are vulnerable to adversarial attacks that result in a poor compression rate, low reconstruction quality, or weird artifacts. To address these limitations, we propose a simple but effective two-way compression algorithm with random input transforms, which is conveniently applicable to existing image compression models. Unlike naïve approaches, our approach preserves the original rate-distortion performance of the models on clean images. Moreover, the proposed algorithm requires no additional training or modification of existing models, making it more practical. We demonstrate the effectiveness of the proposed techniques through extensive experiments under multiple compression models, evaluation metrics, and attack scenarios.

1 Introduction

It is well known that deep neural networks trained for image recognition are vulnerable to adversarial attacks [Szegedy et al., 2014]. With small and imperceptible perturbations on input images, the networks are easily deceived into behaving according to the intent of the attackers. The performance of the models often drops significantly, which directly hampers the security and robustness of a whole system. As in other fields, adversarial attacks against learned image compression models are possible as well. There are two feasible threats to lossy image compression, i.e., failure of bitrate reduction and severe distortion of decoded images.
Figure 1 presents an example of a perturbed image and the corresponding image decoded by an image compression model, which exhibits weird artifacts. These limitations of image compression have far-reaching consequences for subsequent downstream tasks such as classification and detection. In this respect, it is worth paying attention to the robustness of image compression models and their defense techniques against attacks.

Figure 1: Demonstration of the vulnerability of a learned image compression model to adversarial attacks and the effectiveness of our defense method. The yellow annotations in each reconstructed image denote bits per pixel (bpp)/PSNR (dB)/MS-SSIM.

Compared to the recognition domains, the robustness of deep image compression models has not been studied comprehensively. Some attack algorithms proposed for other tasks have turned out to be generalizable to image compression models [Chen and Ma, 2023; Liu et al., 2023; Sui et al., 2023; Yu et al., 2023]. However, defense techniques for image compression are not mature yet, and a naïve application of defense methods designed for other tasks may not work properly in image compression.

To enhance the robustness of image compression models, one can adopt approaches such as adversarial fine-tuning, a straightforward method suggested in [Chen and Ma, 2023]. However, this approach requires additional model training and consequently degrades the original compression performance of the models on normal, unattacked images. Another defense strategy performs preprocessing on input images, such as Gaussian blurring and bit depth reduction [Xu et al., 2018]. However, these methods inevitably increase reconstruction errors on normal images due to the content loss caused by the preprocessing, as discussed in [Yu et al., 2023].

Figure 2: Examples of adversarially perturbed images (top) and corresponding reconstructed images (bottom).

This work investigates the vulnerability of learned image compression models and introduces a training-free defense strategy.
We show that the performance of recent image compression models is easily harmed by basic attack algorithms in terms of rate and distortion. To avoid these risks, we propose a simple yet effective image compression framework for defense. Our framework improves the stability of compression performance under diverse adversarial attacks with negligible performance degradation on clean images. It leverages input randomization in a safe way based on the self-supervised nature of the image compression problem. Our approach is directly applicable to pretrained compression models without additional training, and hence practical. The effectiveness of our defense method against the attack is illustrated in Figure 1.

The main contributions of this paper are summarized as (i) the investigation of adversarial attacks on learned image compression models, (ii) the proposal of simple and effective defense techniques against the attacks, and (iii) the evaluation of the robustness of the proposed compression framework.

2 Related Works

This section briefly describes adversarial attack and defense methods in the classification and compression fields.

2.1 Adversarial Robustness of Image Classification

After Szegedy et al. [2014] first showed the adversarial vulnerabilities of classifiers, several attack methods have been introduced, including FGSM [Goodfellow et al., 2015], C&W [Carlini and Wagner, 2017], DeepFool [Moosavi-Dezfooli et al., 2016], and PGD [Madry et al., 2018]. They share the key idea of iteratively adding minimal perturbations to an image toward the decision boundary of a classifier. FDA [Ganeshan et al., 2019] perturbs an image by disrupting the statistics of the intermediate features of a model. For defense, adversarial training, which adds adversarial examples to the training dataset, is a mainstream technique [Goodfellow et al., 2015; Madry et al., 2018; Tramèr et al., 2018; Kannan et al., 2018].
As another line of research, [Guo et al., 2018; Xie et al., 2018] attempt to reduce the chance of successful attacks by randomizing inputs, while [Xu et al., 2018; Samangouei et al., 2018] defend the models by denoising through optimization.

2.2 Adversarial Robustness of Image Compression

Learned image compression methods typically adopt autoencoder networks with auxiliary entropy models for probability distribution estimation of latent representations [Ballé et al., 2018; Minnen et al., 2018; Cheng et al., 2020]. Adversarial attacks on image compression models are achieved by either increasing the bitstream lengths of latent representations or degrading the quality of decoded images. Recently, researchers have started to explore the adversarial robustness of image compression models. For example, Chen and Ma [2023] corrupt the reconstruction quality of the models via a distortion attack. Although they leverage adversarial fine-tuning to address the vulnerabilities of the models, it leads to compression quality degradation on unattacked images. Liu et al. [2023] conduct transfer attacks [Papernot et al., 2016] using a JPEG-like substitution model in a black-box attack scenario. Sui et al. [2023] propose a distortion attack algorithm with less perceptible perturbations, and Yu et al. [2023] introduce a trigger injection model for backdoor attacks.

Model   | Low bitrate | High bitrate
SH      | 5M          | 12M
M&S     | 7M          | 18M
M&S+C   | 14M         | 26M
Anchor  | 12M         | 27M

Table 1: The number of parameters of the compression models used in our experiments with respect to their target bitrates.

3 Adversarial Attack on Learned Image Compression

This section presents the basic techniques of learned image compression and adversarial attacks on it. Next, we discuss the vulnerability of image compression in diverse aspects.
3.1 Preliminaries

The goal of lossy image compression is to minimize the bitstream length of an image while preserving the content of the image as much as possible. Typically, a compression system consists of an encoder E, a decoder D, a quantizer Q, and an entropy model P.

Given a source image x, E transforms x to a latent representation y = E(x), which is then converted to a quantized latent representation ŷ = Q(y). To save ŷ, an entropy coding algorithm such as arithmetic coding [Rissanen and Langdon, 1981] encodes ŷ into a bitstream using the probability distribution of ŷ estimated by P. The length of the resulting bitstream is approximately −log P(ŷ) with minor overhead, and hence is often used as a surrogate of the rate loss term. For decoding, D generates the reconstructed image x̂ from the quantized latent representation ŷ, i.e., x̂ = D(ŷ). Given a distortion metric d(·,·) such as the mean squared error (MSE), the rate-distortion loss L_RD is given by the sum of the rate loss L_rate = −log P(ŷ) and the distortion loss L_dist = d(x, x̂) as follows:

    L_RD = L_rate + λ L_dist = −log P(ŷ) + λ d(x, x̂),    (1)

where a Lagrangian multiplier λ controls the rate-distortion trade-off. Then, the objective of the image compression model is given by

    min E_{x∼p_x}[L_RD].    (2)

Figure 3: Results of adversarial attacks on image compression models for poor compression rates with various ϵ values for the PGD algorithm. Panels: (a) SH, (b) M&S, (c) M&S+C, (d) Anchor. Top: results of low-bitrate models. Bottom: results of high-bitrate models. Clean denotes the performance on clean (i.e., unperturbed) images.

Our experiments use four pretrained lossy image compression models available in an open-source compression library [Bégaint et al., 2020]: Scale Hyperprior (SH) [Ballé et al., 2018], Mean & Scale Hyperprior (M&S) [Minnen et al., 2018], Mean & Scale Hyperprior with context model (M&S+C) [Minnen et al., 2018], and Anchor [Cheng et al., 2020]. Table 1 shows the number of parameters of the models. Note that the models for high bitrates have more parameters than their low-bitrate counterparts.

3.2 Attack Algorithm for Image Compression

Among the adversarial attack strategies, we mainly adopt a famous optimization-based attack method, the PGD algorithm [Madry et al., 2018]. To generate an adversarial example from a source image x, PGD iteratively updates x with a step size α under the ℓ∞-norm constraint of the maximum per-pixel perturbation ϵ, which is given by

    x_{t+1} = x_t + α · sgn(∇L),    (3)

where L denotes a task-specific loss and sgn(·) ∈ {−1, 1} is the sign function. Since compression models minimize the rate-distortion trade-off L_RD, one can attack a model in terms of rate and distortion, for which the objective functions L are defined as L_rate and L_dist, respectively. It is also possible to employ the joint rate-distortion objective for the attack by setting L = L_RD, but this makes the analysis more complex due to the conflicting properties of the two terms. For lossless image compression, only the rate loss is treated as a target since the source image content should be perfectly recovered.

3.3 Results on Adversaries

Qualitative results. Figure 2 illustrates several adversaries of distortion attacks on M&S and their corresponding reconstructed images. The weird artifacts in the reconstructed images are easily induced by the attack, which shows the vulnerability of the model.

Quantitative results. Figure 3 presents the results of adversarial attacks on the four compression models with respect to the rate, varying the value of ϵ for the PGD algorithm. The larger ϵ is, the more performance degradation is observed, consistently for all models.
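The rate attack of Equation (3) with L = L_rate can be sketched end to end. This is a minimal illustration assuming a toy linear encoder y = Wx and a Gaussian-prior rate proxy proportional to ||y||², so the gradient is available in closed form; the names `W`, `rate_loss`, and `pgd_rate_attack` are ours, not from the paper's code, and the real attack instead backpropagates the rate loss through the learned codec.

```python
# PGD rate-attack sketch (Equation (3)) on a toy linear "encoder".
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 64)) * 0.1   # toy encoder weights (illustrative)
x_src = rng.uniform(0.2, 0.8, size=64)    # toy "image" with pixels in [0, 1]

def rate_loss(x):
    y = W @ x
    return float(np.sum(y ** 2))          # surrogate for -log P(y_hat)

def pgd_rate_attack(x, eps=4/255, alpha=2/255, steps=50):
    x_adv = x.copy()
    for _ in range(steps):
        grad = 2.0 * (W.T @ (W @ x_adv))          # d L_rate / d x, closed form
        x_adv = x_adv + alpha * np.sign(grad)     # ascent step with sgn(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project onto l_inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep pixels valid
    return x_adv

x_adv = pgd_rate_attack(x_src)
assert np.max(np.abs(x_adv - x_src)) <= 4/255 + 1e-9  # perturbation bounded
assert rate_loss(x_adv) > rate_loss(x_src)            # rate loss increased
```

The same loop attacks distortion by swapping in a distortion surrogate for `rate_loss`; the α = 2/255, ϵ = 4/255, 50-iteration setting matches the vanilla attack used in Section 5.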
Also, the high-bitrate models tend to be more vulnerable to the attacks than the low-bitrate ones. This is partly because (i) the high-bitrate models, with more parameters, suffer more from overfitting than the low-bitrate ones, and (ii) the low-bitrate models have high reconstruction errors, especially for high-frequency signals, and hence tend to be robust to the adversarial noise added to input images. The relationship between model complexity and vulnerability is discussed further in Appendix A. The result of the distortion attack is presented in Appendix B. To mitigate these adversarial effects, appropriate defense techniques are required.

4 Defending Adversarial Attacks

This section reviews the input randomization defense technique [Xie et al., 2018] proposed for image classification and discusses the limitations of its direct application to image compression. Then, we present our main idea of a training-free defense technique for image compression models.

4.1 Input Randomization for Image Classification

Input randomization [Xie et al., 2018] is a training-free technique for mitigating the adversarial effects on image classification models. It first defines a set of image transformations T = {τ_1, ..., τ_n}, where τ_θ is an image transformation (e.g., cropping). For an input image x, a transform τ_θ is randomly sampled from T and the transformed image is given by

    x_t = τ_θ(x), where τ_θ ∈ T.    (4)

Then, x_t is fed to the classification model for prediction. Specifically, [Xie et al., 2018] adopts resizing followed by zero padding for the transforms T.

Figure 4: (a) Input randomization for image classification. (b), (c) Input randomization for the encoder and decoder of image compression.

The randomness provided by random transforms improves the robustness of the model. The attackers cannot perform precise inference due to the randomness; the attack is suboptimal because the attackers would have to consider all possible transforms if n is sufficiently large. Next, we describe how to apply this technique to image compression and its challenges.

4.2 Input Randomization for Image Compression

To alleviate the adversarial effects on image compression models without additional training, we leverage the aforementioned input randomization technique [Xie et al., 2018]. Figure 4 compares input randomization between image classification and image compression.

Suppose that we have a pretrained image compression model consisting of an encoder E, a quantizer Q, and a decoder D.
To encode an input image x, we first sample a transformation τ_θ from T and transform x to get x_t as in Equation (4). Then, we encode x_t instead of x as follows:

    ŷ = Q(E(x_t)).    (5)

The decoding is given by

    x̂_t = D(ŷ) and x̂ = τ_θ^{-1}(x̂_t),    (6)

where τ_θ^{-1} is the inverse transform of τ_θ. Note that T consists of (pseudo-)invertible transforms for reconstruction, and the additional cost to store the transform index θ, log n bits, is negligible (about 4 × 10^{-4} bpp in our experiments) compared to the bitstream of an image.

Although such a naïve randomization approach improves adversarial robustness, the compression performance on normal images is degraded by some input transforms, which is further discussed below:

• The cropping operations used in [Xie et al., 2018] are inappropriate due to the incomplete reconstruction caused by the missing content.
• Transforms such as rotation, resizing, and shifting have corresponding inverse transforms, but the inversions are imperfect in general because of the information loss caused by the transforms, i.e., x ≠ τ^{-1}(x_t).
• The zero padding operations utilized in [Xie et al., 2018] allow us to recover the original image, but the performance of the models would be degraded since the paddings lead to out-of-distribution images.

Figure 5: Performance degradation of an image compression model caused by a variety of input transforms (random shifting, padding, resizing, and rotating).

Figure 5 demonstrates the performance degradation of the image compression model [Minnen et al., 2018] on clean images when various input transforms are applied. Refer to Appendix C for details. It is not trivial to maintain the performance under these input transforms without additional training.
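In contrast, the flip and 90-degree-rotation transforms used later in Section 5.1 avoid these invertibility issues entirely. A minimal sketch of such an exactly invertible transform set and its inverse (helper names such as `make_dihedral_transforms` are illustrative, not from the paper's code):

```python
# Exactly invertible transform set: the 8 flip/rotation symmetries.
import numpy as np

def make_dihedral_transforms():
    """Build (forward, inverse) pairs for the 8 flip/rot90 symmetries."""
    pairs = []
    for flip in (False, True):
        for k in range(4):                     # number of 90-degree rotations
            def fwd(z, flip=flip, k=k):
                return np.rot90(np.fliplr(z) if flip else z, k)
            def inv(z, flip=flip, k=k):
                z = np.rot90(z, -k)            # undo rotation first
                return np.fliplr(z) if flip else z
            pairs.append((fwd, inv))
    return pairs

T = make_dihedral_transforms()                 # n = |T| = 8 here
rng = np.random.default_rng(1)
x = rng.uniform(size=(32, 32))

theta = int(rng.integers(len(T)))              # transform index: log2(n) bits
tau, tau_inv = T[theta]
x_t = tau(x)                                   # Equation (4)
assert np.array_equal(tau_inv(x_t), x)         # exact inversion, no content loss
```

Because each forward map is a permutation of pixels, `tau_inv(tau(x))` recovers `x` bit-exactly, unlike resizing or arbitrary-angle rotation.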
4.3 Two-way Compression

To defend against adversarial perturbations while preserving the performance on clean images, we propose a straightforward and training-free defense technique via two-way compression. Our method is applicable to existing compression models without performance degradation on clean images by effectively leveraging the random transform. In our framework, we select the better option out of two compression results: that of the original image and that of the randomly transformed image. We summarize the encoding and decoding processes of the proposed approach in Algorithm 1 and Algorithm 2, respectively, where the entropy coding process is omitted for simplicity.

Our core idea is to choose the compression strategy with the lowest loss value out of two different types of compression, which is feasible due to the availability of self-supervision in image compression. The encoding process for an input image x is as follows. First, we compute the rate-distortion loss of x given by encoding followed by decoding, without an input transform. The encoding and decoding are expressed as

    ŷ_1 = Q(E(x)) and x̂_1 = D(ŷ_1),    (7)

respectively. Then, the rate-distortion loss of the input image without the transform is calculated by

    L_1 = −log_2 P(ŷ_1) + λ d(x, x̂_1),    (8)

where d(·,·) is a distortion metric and λ is a Lagrangian multiplier. Next, we compute the rate-distortion loss of x with the input randomization as described in Section 4.2. The encoding and decoding with the random input transformation are given by Equations (4) to (6), but we redefine the latent representation and the reconstructed image as ŷ_2 and x̂_2, respectively. The rate-distortion loss of the input image with the random transform is given by

    L_2 = −log_2 P(ŷ_2) + λ d(x, x̂_2).    (9)

Finally, we determine the optimal compression result ŷ* and use it as the encoding result, which is given by

    ŷ* = ŷ_1 if L_1 < L_2, and ŷ* = ŷ_2 otherwise.    (10)

For reconstruction, we save the transform index θ* yielding the better result. The decoding process is similar to Equation (6) with an input of ŷ*.

Algorithm 1: Encoding phase of two-way compression
Require: Pretrained image compression model with encoder E, decoder D, quantizer Q, and entropy model P.
Require: Distortion metric d(·,·), Lagrangian multiplier λ, and image transform set T = {τ_1, ..., τ_n}.
Input: Source image x.
Output: Compressed latent representation ŷ* and transform index θ*.
1. Compute the loss for encoding without transform:
   Encode: ŷ_1 ← Q(E(x)).
   Decode: x̂_1 ← D(ŷ_1).
   Compute loss: L_1 ← −log_2 P(ŷ_1) + λ d(x, x̂_1).
2. Compute the loss for encoding with a random transform:
   Sample τ_θ ∈ T.
   Apply the transformation: x_t ← τ_θ(x).
   Encode: ŷ_2 ← Q(E(x_t)).
   Decode: x̂_t ← D(ŷ_2).
   Apply the inverse transformation: x̂_2 ← τ_θ^{-1}(x̂_t).
   Compute loss: L_2 ← −log_2 P(ŷ_2) + λ d(x, x̂_2).
3. Select the latent representation with the lowest loss:
   if L_1 < L_2 then ŷ* ← ŷ_1; θ* ← 0.
   else ŷ* ← ŷ_2; θ* ← θ.
   end if

Algorithm 2: Decoding phase of two-way compression
Require: Pretrained decoder D.
Require: Image transform set T = {τ_1, ..., τ_n}.
Input: Compressed latent representation ŷ* and transform index θ*.
Output: Reconstructed image x̂.
Decode: x̂_t ← D(ŷ*).
if θ* = 0 then x̂ ← x̂_t.
else apply the inverse transform: x̂ ← τ_{θ*}^{-1}(x̂_t).
end if

The proposed two-way compression approach prevents compression quality degradation on the original images while improving the adversarial robustness of the compression model. The original model performance (L_1) is guaranteed at least, because we select the better option for compression by comparing L_1 and L_2. This attribute is especially valuable for normal images. Besides, the risk of the adversarial attack is mitigated by our input randomization scheme. The proposed framework is simple, easy to implement, and even free from additional training. Note that this strategy is feasible due to the nature of the image compression problem, namely the availability of self-supervision: the ground truth that the model has to reconstruct is identical to the input image of the encoder.

Computational efficiency. Our approach requires more computation in the encoding phase because it has to perform an extra encoding for the transformed image and decode two encoded images, for both the clean and transformed inputs. However, learned compression algorithms involve several time-consuming modules other than encoders and decoders, such as entropy coders and entropy models. Also, we can adopt a lightweight encoding algorithm in our encoding phase based on masked convolution instead of expensive serial prediction, which saves computational cost significantly, especially in high-performance models adopting autoregressive entropy models [Minnen et al., 2018; Cheng et al., 2020]. This trick is frequently used for training models with heavy entropy models [Minnen et al., 2018; Minnen and Singh, 2020]. Moreover, the costly operation of decoding the bitstream to ŷ is not needed because ŷ is already available. The computational cost in the decoding phase is almost identical, except for the overhead of applying the inverse transform, which is negligible in practice. We present empirical results related to computational cost in Section 5.

Scalability. We can generalize the proposed framework to K-way compression for further gains in robustness. We sample K − 1 transforms from T and choose the best among the K compression results, including the one with no transform.
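The selection logic of Algorithms 1 and 2 can be sketched with a stand-in codec. This is a minimal sketch assuming a toy uniform quantizer and a magnitude-based rate proxy in place of the learned model; `DELTA`, `LAM`, and all function names are illustrative, and only the branch selection mirrors the paper's method.

```python
# Two-way compression sketch (Algorithms 1 and 2) around a toy codec.
import numpy as np

DELTA, LAM = 1 / 32, 10.0                # toy quantization step, R-D weight

def encode(x):                           # Q(E(x)) with a toy identity encoder
    return np.round(x / DELTA)

def decode(y_hat):                       # D(y_hat)
    return y_hat * DELTA

def rate_proxy(y_hat):                   # stand-in for -log2 P(y_hat)
    return float(np.sum(np.abs(y_hat)))

def two_way_encode(x, tau, tau_inv):
    y1 = encode(x)                       # branch 1: no transform (Eq. 7-8)
    L1 = rate_proxy(y1) + LAM * float(np.mean((x - decode(y1)) ** 2))
    y2 = encode(tau(x))                  # branch 2: random transform (Eq. 9)
    L2 = rate_proxy(y2) + LAM * float(np.mean((x - tau_inv(decode(y2))) ** 2))
    return (y1, 0) if L1 < L2 else (y2, 1)   # Eq. 10: (latent, theta*)

def two_way_decode(y_star, theta, tau_inv):
    x_hat = decode(y_star)
    return x_hat if theta == 0 else tau_inv(x_hat)   # Algorithm 2

rng = np.random.default_rng(2)
x = rng.uniform(size=(8, 8))
tau = tau_inv = np.fliplr                # a self-inverse transform
y_star, theta = two_way_encode(x, tau, tau_inv)
x_hat = two_way_decode(y_star, theta, tau_inv)
assert float(np.mean((x - x_hat) ** 2)) <= DELTA ** 2   # quantization-level error
```

Because both branches are always evaluated against the same source image, the reconstruction error never exceeds that of the better branch, which is the self-supervision property the paper relies on.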
In this way, we can easily scale up the robustness of the model, trading off robustness against encoding cost. However, we show that K = 2 (i.e., two-way compression) is practically sufficient in Section 5.

5 Experiments

We now present the experimental results of the proposed defense framework.

5.1 Experimental Setup

The main experiments are conducted on 1000 validation images of 256 × 256 size randomly sampled from the ImageNet dataset [Russakovsky et al., 2015]. We use the pretrained high-bitrate models, Mean & Scale Hyperprior (M&S) [Minnen et al., 2018], Mean & Scale Hyperprior with context model (M&S+C) [Minnen et al., 2018], and Anchor [Cheng et al., 2020], as in Section 3. For the image transforms, we use the combinations of all elements in T, which include (1) horizontal & vertical flipping and rotation in multiples of 90 degrees (8 cases), (2) horizontal & vertical stretching from 0 to 64 pixels (65 × 65 = 4225 cases), and (3) horizontal & vertical shifting from 0 to 64 pixels (65 × 65 = 4225 cases). These combinations result in n = |T| ≈ 1.43 × 10^8 transforms, where we require fewer than 30 bits to store all possible indices.

Figure 6: Rate-distortion performance of models without the defense method (Original) and models with our defense method (Two-way) on clean images (Clean) and adversarial examples (Vanilla / EoT), for (a) M&S, (b) M&S+C, and (c) Anchor. Best viewed in color.

Figure 7: Bitrate histogram of test samples under rate attacks.

Attack scenarios. We assume that the model weights are known to the attacker. Our defense technique is tested in the following two scenarios, depending on whether the attacker is aware of the existence of the defense method:

• Vanilla attack: The attacker is not aware of the defense method in the encoding algorithm and hence assumes input images are always fed to the model without modification (i.e., a gray-box attack).
• Expectation over Transformation (EoT) attack: The attacker is aware of our two-way compression algorithm and the transforms in T, and hence ideally aims to fool all the input transforms including the identity transform (i.e., a white-box attack).

For the vanilla attack, we use the PGD algorithm as in Section 3 with α = 2/255, ϵ = 4/255, and 50 iterations. The EoT attack [Athalye et al., 2018] is a strong white-box attack method against two-way compression, which is often effective against input randomization-based defense techniques [Xie et al., 2018; Guo et al., 2018] in classification. Specifically, the EoT attack randomly selects 24 target transforms from T and averages the losses of the target transforms at each optimization step of the PGD algorithm.

Model   | Original | Two-way
M&S     | 0.0219   | 0.0391
M&S+C   | 0.7617   | 0.7952
Anchor  | 0.7649   | 0.8437

Table 2: Average encoding time of the models in seconds.

5.2 Results

Main results. Figure 6 presents the performance of the proposed defense technique against rate attacks. Overall, the proposed approach consistently improves the robustness of the models against the attacks. In comparison to the severe performance degradation of the original models under the attacks ('Original + Vanilla'), our method mitigates the adversarial effects ('Two-way + Vanilla' and 'Two-way + EoT').
Furthermore, the performance of our method on clean images ('Two-way + Clean') is almost identical to the original one ('Original + Clean'). The attacks with multiple targets in EoT are more effective than the vanilla attack, which is most pronounced for the Anchor model.

Figure 7 visualizes the bitrate distribution of test samples for the highest-bitrate models tested in the experiments of Figure 6(a). Note that the results of our method exhibit low bpps by avoiding failure cases with high probability. The histogram of the rate-distortion loss is provided in Appendix D.

Scalability and naïve input randomization. Figure 8(a) shows the defense results obtained by varying K in the K-way compression. We used the Kodak dataset [Kodak, 1993] and iteratively evaluated the performance 40 times for each sample. Using a larger K further improves the robustness of the model, although the performance gains saturate; two-way compression is sufficient for defense in practice. Also, we test the naïve approach, applying the input randomization to image compression as described in Section 4.2, and report the results denoted by 'Naïve' in Figure 8(a). The difference between the naïve and two-way compression is that the former always encodes an input image with a random input transform while sharing T. Our defense framework clearly outperforms the naïve approach for both clean and perturbed images.
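The K-way generalization can be sketched in the same spirit: sample K − 1 transforms, always keep the identity as candidate θ* = 0, and take the argmin of the rate-distortion loss. This is a minimal sketch assuming a toy quantizer-based codec; all names are illustrative, not from the paper's code.

```python
# K-way compression sketch: argmin over K candidate encodings.
import numpy as np

DELTA, LAM = 1 / 32, 10.0

def codec_loss(x, tau, tau_inv):
    """Encode tau(x) with a toy quantizer; return (loss, latent)."""
    y_hat = np.round(tau(x) / DELTA)           # Q(E(tau(x)))
    x_hat = tau_inv(y_hat * DELTA)             # tau^{-1}(D(y_hat))
    rd = np.sum(np.abs(y_hat)) + LAM * np.mean((x - x_hat) ** 2)
    return float(rd), y_hat

def k_way_encode(x, transforms, K, rng):
    identity = (lambda z: z, lambda z: z)
    pool = [identity] + list(transforms)       # index 0 = no transform
    picks = [0] + [1 + int(i) for i in
                   rng.choice(len(transforms), size=K - 1, replace=False)]
    scored = [codec_loss(x, *pool[i]) + (i,) for i in picks]
    loss, y_star, theta = min(scored, key=lambda t: t[0])
    return y_star, theta, loss

rng = np.random.default_rng(3)
T = [(np.fliplr, np.fliplr), (np.flipud, np.flipud),
     (lambda z: np.rot90(z, 1), lambda z: np.rot90(z, -1))]
x = rng.uniform(size=(8, 8))
_, _, loss_k1 = k_way_encode(x, T, K=1, rng=rng)   # identity candidate only
_, theta, loss_k3 = k_way_encode(x, T, K=3, rng=rng)
assert loss_k3 <= loss_k1   # more candidates never increase the chosen loss
```

Since the identity candidate is always in the pool, the selected loss is monotonically non-increasing in K, which matches the saturating gains observed in Figure 8(a).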
Figure 8: Rate-distortion results of M&S models for extensive studies. (a) Results of K-way compression for multiple K values and direct application of input randomization to image compression (Naïve). (b) Performance comparison between two-way compression and adversarial training (Advt). (c) Results of FDA attacks on the original models and those with our defense method.

Figure 9: Qualitative results of the distortion attack (top) and the rate attack (bottom). The first and second columns: original images and their decoded results. The third and fourth columns: perturbed images and their decoded results without our defense method. The last column: decoded results for the adversarial examples with our defense method. The yellow annotations denote bits per pixel (bpp)/PSNR (dB)/MS-SSIM.
Comparison with adversarial training Figure 8(b) compares our defense method with adversarial training, which is typically used in classification tasks. We fine-tune the pretrained M&S models using both the original images and the adversarial examples generated by FGSM with random initializations, following [Wong et al., 2020]. Our method outperforms adversarial training in terms of both robustness to the attacks and performance on clean images, even without any training.

Generalizability To demonstrate the generalizability of the proposed defense method, we additionally test a feature-based attack method, the feature disruptive attack (FDA) [Ganeshan et al., 2019]. For faster evaluation, we randomly sample 100 images from the test set and measure the performance 10 times for each sample. As shown in Figure 8(c), our method consistently improves robustness to FDA.

Encoding time Table 2 compares the encoding time of the original models and the models with our two-way compression technique on a single Titan Xp GPU. The results show the efficiency of our defense method. In particular, the increase in encoding time is marginal for the high-performance models (M&S+C and Anchor) because they utilize masked convolutions for the loss computation, as discussed in Section 4.3. Note that the extra cost for decoding is negligible and not tested.

Qualitative results Figure 9 qualitatively compares the impact of the attacks and our defense method along with the reconstructions of clean images. Our defense method decodes the adversarial images as well as the clean ones, while maintaining a low bitrate that is competitive with that of the clean images.

6 Conclusion

We investigated the vulnerability of learned image compression models and designed a simple yet effective defense method for image compression.
We observe that the performance of recent image compression models can be easily harmed by basic adversarial attacks in terms of both rate and distortion. Naïve defense approaches for image compression inevitably lead to performance degradation on clean images. To address this, we present a robust defense framework for image compression that requires no additional training and preserves the original performance on clean images by exploiting input randomization and the characteristics of the self-supervised task. The proposed algorithm computes the rate-distortion losses of the source image under a random input transform and the identity transform, and chooses the better option for encoding. The combination of these two operations turns out to be effective while incurring only a small additional cost in the encoding phase. Our framework requires no extensive training or modification of existing models, and can be easily integrated with various existing models. This property is particularly desirable for robust image compression algorithms exposed to white-box adversarial attacks, where any trained model is vulnerable and unreliable. We demonstrate the effectiveness of the proposed algorithm in white-box and gray-box attack scenarios and analyze the characteristics of our approach.

References

[Athalye et al., 2018] Anish Athalye, Logan Engstrom, Andrew Ilyas, and Kevin Kwok. Synthesizing robust adversarial examples. In International Conference on Machine Learning, pages 284–293. PMLR, 2018.

[Ballé et al., 2018] Johannes Ballé, David Minnen, Saurabh Singh, Sung Jin Hwang, and Nick Johnston. Variational image compression with a scale hyperprior. In ICLR, 2018.

[Bégaint et al., 2020] Jean Bégaint, Fabien Racapé, Simon Feltman, and Akshay Pushparaja. CompressAI: a PyTorch library and evaluation platform for end-to-end compression research. arXiv preprint arXiv:2011.03029, 2020.
[Carlini and Wagner, 2017] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39–57. IEEE, 2017.

[Chen and Ma, 2023] Tong Chen and Zhan Ma. Towards robust neural image compression: Adversarial attack and model finetuning. TCSVT, 2023.

[Cheng et al., 2020] Zhengxue Cheng, Heming Sun, Masaru Takeuchi, and Jiro Katto. Learned image compression with discretized Gaussian mixture likelihoods and attention modules. In CVPR, pages 7939–7948, 2020.

[Ganeshan et al., 2019] Aditya Ganeshan, Vivek B S, and R Venkatesh Babu. FDA: Feature disruptive attack. In ICCV, 2019.

[Goodfellow et al., 2015] Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. ICLR, 2015.

[Guo et al., 2018] Chuan Guo, Mayank Rana, Moustapha Cisse, and Laurens Van Der Maaten. Countering adversarial images using input transformations. ICLR, 2018.

[Kannan et al., 2018] Harini Kannan, Alexey Kurakin, and Ian Goodfellow. Adversarial logit pairing. arXiv preprint arXiv:1803.06373, 2018.

[Kodak, 1993] Eastman Kodak. Kodak lossless true color image suite (PhotoCD PCD0992), 1993.

[Liu et al., 2023] Kang Liu, Di Wu, Yangyu Wu, Yiru Wang, Dan Feng, Benjamin Tan, and Siddharth Garg. Manipulation attacks on learned image compression. TAI, 2023.

[Madry et al., 2018] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. ICLR, 2018.

[Minnen and Singh, 2020] David Minnen and Saurabh Singh. Channel-wise autoregressive entropy models for learned image compression. In ICIP, pages 3339–3343. IEEE, 2020.

[Minnen et al., 2018] David Minnen, Johannes Ballé, and George D Toderici. Joint autoregressive and hierarchical priors for learned image compression. NeurIPS, 31:10771–10780, 2018.
[Moosavi-Dezfooli et al., 2016] Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. DeepFool: a simple and accurate method to fool deep neural networks. In CVPR, pages 2574–2582, 2016.

[Papernot et al., 2016] Nicolas Papernot, Patrick McDaniel, and Ian Goodfellow. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. arXiv preprint arXiv:1605.07277, 2016.

[Rissanen and Langdon, 1981] Jorma Rissanen and Glen Langdon. Universal modeling and coding. IEEE Trans. Inf. Theory, 27(1):12–23, 1981.

[Russakovsky et al., 2015] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, et al. ImageNet large scale visual recognition challenge. IJCV, 115(3):211–252, 2015.

[Samangouei et al., 2018] Pouya Samangouei, Maya Kabkab, and Rama Chellappa. Defense-GAN: Protecting classifiers against adversarial attacks using generative models. ICLR, 2018.

[Sui et al., 2023] Yang Sui, Zhuohang Li, Ding Ding, Xiang Pan, Xiaozhong Xu, Shan Liu, and Zhenzhong Chen. Reconstruction distortion of learned image compression with imperceptible perturbations. ICMLW, 2023.

[Szegedy et al., 2014] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. ICLR, 2014.

[Tramèr et al., 2018] Florian Tramèr, Alexey Kurakin, Nicolas Papernot, Ian Goodfellow, Dan Boneh, and Patrick McDaniel. Ensemble adversarial training: Attacks and defenses. ICLR, 2018.

[Wong et al., 2020] Eric Wong, Leslie Rice, and J Zico Kolter. Fast is better than free: Revisiting adversarial training. arXiv preprint arXiv:2001.03994, 2020.

[Xie et al., 2018] Cihang Xie, Jianyu Wang, Zhishuai Zhang, Zhou Ren, and Alan Yuille. Mitigating adversarial effects through randomization. In ICLR, 2018.
[Xu et al., 2018] Weilin Xu, David Evans, and Yanjun Qi. Feature squeezing: Detecting adversarial examples in deep neural networks. NDSS, 2018.

[Yu et al., 2023] Yi Yu, Yufei Wang, Wenhan Yang, Shijian Lu, Yap-Peng Tan, and Alex C Kot. Backdoor attacks against deep image compression via adaptive frequency trigger. In CVPR, 2023.

Appendix

A Impact of Model Complexity on Robustness

To investigate the robustness of image compression models depending on model complexity, we trained a lightweight variant of the high-bitrate M&S model by halving its channel size. Figure 10 compares the results of the original model (18M parameters) and the lightweight model (7M parameters) under rate attacks. While the model with higher capacity achieves slightly better performance on clean images, it suffers from significant failures on perturbed images. This implies that the higher-capacity model is more susceptible to adversarial attacks and rather overfitted.

B Results of Distortion Attacks

Figure 11 presents the result of a distortion attack on the M&S model with ϵ = 4/255 for the PGD algorithm. The attack targeting poor reconstruction quality successfully degraded the model performance.

C Details of Input Transforms

This section explains the details of the image transforms used in the experiments for Figure 5 of the main paper. Examples of the transformed images are illustrated in Figure 12. For the image transforms, we use the following operations: (1) horizontal and vertical shifting from 0 to 64 pixels, (2) horizontal and vertical zero-padding from 0 to 32 pixels, (3) horizontal and vertical stretching from 0 to 64 pixels, and (4) rotation from −10 to 10 degrees.

D Loss Histogram Under Attacks

Figure 13 visualizes the rate-distortion loss distribution of the test samples for the highest-bitrate models tested in the experiments for Figure 6(a) of the main paper.
Note that the results of our method exhibit low losses by avoiding extreme failure cases with high probability.

E Comparison to Hand-crafted Codecs

Figure 14 compares the compression performance of the attacked models and hand-crafted codecs. We observe severe performance degradation of the attacked models, which is even worse than that of the hand-crafted codecs.

Figure 10: Rate-distortion results of the original model and its lightweight version with half the channel size.

Figure 11: Rate-distortion results of distortion attacks.

Figure 12: Examples of the image transforms used in the experiments (original, shifting, padding, rotating, resizing).

Figure 13: Rate-distortion loss histogram for test samples under rate attacks.

Figure 14: Rate-distortion results of attacked learned image compression models and traditional codecs (BPG 4:4:4, WebP, JPEG 4:2:0) on clean and PGD-attacked inputs.
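The transform families listed in Appendix C can be sketched with NumPy as follows. This is a minimal, nearest-neighbor sketch under assumed helper names; rotation is omitted because it requires an interpolating rotate (e.g. `scipy.ndimage.rotate`), and the sampling ranges follow the paper (shift 0–64 px, padding 0–32 px, stretch 0–64 px):

```python
import numpy as np

# Hypothetical sketch of the random input transforms: shifting, zero-padding,
# and stretching of an HxWxC image array. Function names are assumptions.

rng = np.random.default_rng(0)

def random_shift(img, max_px=64):
    """Cyclically shift the image by 0..max_px pixels in each direction."""
    dy, dx = rng.integers(0, max_px + 1, size=2)
    return np.roll(img, (dy, dx), axis=(0, 1))

def random_pad(img, max_px=32):
    """Zero-pad the image by 0..max_px pixels on each side."""
    py, px = rng.integers(0, max_px + 1, size=2)
    return np.pad(img, ((py, py), (px, px), (0, 0)))

def random_stretch(img, max_px=64):
    """Enlarge the image by 0..max_px pixels per axis (nearest neighbor)."""
    h, w = img.shape[:2]
    nh = h + rng.integers(0, max_px + 1)
    nw = w + rng.integers(0, max_px + 1)
    ys = np.arange(nh) * h // nh  # source row index for each output row
    xs = np.arange(nw) * w // nw  # source column index for each output column
    return img[ys][:, xs]
```

Since the transforms only need to break the alignment between the adversarial perturbation and the encoder's input, even these cheap pixel-level operations suffice in the two-way scheme.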