PAPER: Privacy-Preserving Convolutional Neural Networks using Low-Degree Polynomial Approximations and Structural Optimizations on Leveled FHE
Recent work using Fully Homomorphic Encryption (FHE) has made non-interactive privacy-preserving inference of deep Convolutional Neural Networks (CNNs) possible. However, the performance of these methods remains limited by their heavy reliance on bootstrapping, a costly FHE operation applied across multiple layers that severely slows inference. Moreover, they depend on high-degree polynomial approximations of non-linear activations, which increase multiplicative depth and reduce accuracy by 2-5% compared to plaintext ReLU models. In this work, we close the accuracy gap between FHE-based non-interactive CNNs and their plaintext counterparts, while also achieving faster inference than existing methods. We propose a quadratic polynomial approximation of ReLU, which achieves the theoretical minimum multiplicative depth for non-linear activations, together with a penalty-based training strategy. We further introduce structural optimizations that reduce the required FHE levels in CNNs by a factor of five compared to prior work, allowing us to run deep CNN models under leveled FHE without bootstrapping. To further accelerate inference and recover accuracy typically lost with polynomial approximations, we introduce parameter clustering along with a joint strategy of data layout and ensemble techniques. Experiments with VGG and ResNet models on CIFAR and Tiny-ImageNet datasets show that our approach achieves up to $4\times$ faster private inference than prior work, with accuracy comparable to plaintext ReLU models.
💡 Research Summary
This paper addresses the long‑standing scalability bottleneck of non‑interactive privacy‑preserving inference for deep convolutional neural networks (CNNs) under Fully Homomorphic Encryption (FHE). While recent works have demonstrated that FHE enables a “fire‑and‑forget” inference paradigm, they rely heavily on bootstrapping to support the large multiplicative depth required by modern networks. Bootstrapping is computationally expensive and dominates inference latency, making FHE‑based services impractical for real‑world applications.
The authors propose a comprehensive framework that eliminates the need for bootstrapping by drastically reducing the multiplicative depth of the entire computation graph. Their approach consists of four tightly integrated techniques:
- Quadratic Polynomial Approximation of ReLU – They replace the ReLU activation with a degree‑2 polynomial ($x^2 + c_1 x + c_0$). This achieves the theoretical minimum multiplicative depth of one for non‑linear layers, since the only ciphertext‑ciphertext multiplication is $x \cdot x$. To overcome the well‑known "escaping activation" problem that plagues low‑degree approximations, they introduce a penalty term in the loss function that explicitly constrains intermediate activations to remain within a bounded interval.
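The activation and penalty described above can be sketched as follows. This is a minimal, framework-free illustration, not the paper's implementation: the coefficient values, the penalty bound, and the function names are all illustrative assumptions.

```python
def quad_relu(x, c1=0.5, c0=0.25):
    # Degree-2 polynomial stand-in for ReLU: x^2 + c1*x + c0.
    # The coefficients shown here are illustrative; in the paper's
    # setting they would be chosen (or trained) to track ReLU over
    # the interval where activations are constrained to live.
    # Under FHE, x*x is the only ciphertext-ciphertext multiplication,
    # giving multiplicative depth one; c1*x is a plaintext multiply.
    return x * x + c1 * x + c0

def range_penalty(activations, bound=1.0):
    # Illustrative "escaping activation" penalty: quadratically punish
    # pre-activations whose magnitude exceeds `bound`, the region where
    # the quadratic no longer approximates ReLU. During training this
    # term would be added (with some weight) to the task loss.
    return sum(max(0.0, abs(a) - bound) ** 2 for a in activations)
```

Activations inside the interval contribute zero penalty, so the gradient pressure only acts on values that escape the region where the quadratic is a faithful ReLU surrogate.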