Efficiently Training a Flat Neural Network Before It Has Been Quantized

Reading time: 1 minute

📝 Original Info

  • Title: Efficiently Training a Flat Neural Network Before It Has Been Quantized
  • ArXiv ID: 2511.01462
  • Date: 2025-11-03
  • Authors: Not provided in the source data.

📝 Abstract

Post-training quantization (PTQ) for vision transformers (ViTs) has garnered significant attention due to its efficiency in compressing models. However, existing methods typically overlook the relationship between a well-trained full-precision network and its quantized counterpart, leading to considerable quantization error in PTQ. Moreover, it is unclear how to efficiently train a model-agnostic neural network tailored for a predefined low-bit precision. In this paper, we first show that a flat full-precision neural network is crucial for low-bit quantization. To achieve this, we propose a framework that proactively pre-conditions the model by measuring and disentangling the error sources. Specifically, both the Activation Quantization Error (AQE) and the Weight Quantization Error (WQE) are statistically modeled as independent Gaussian noises. We study several noise-injection optimization methods to obtain a flat minimum. Experimental results attest to the effectiveness of our approach and open novel pathways for obtaining low-bit PTQ models.
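
The abstract describes modeling AQE and WQE as independent Gaussian noises and injecting them during full-precision training so the optimizer settles into a flat minimum that tolerates quantization error. The sketch below illustrates this general noise-injection idea under stated assumptions; the layer name `NoisyLinear`, the noise scales `sigma_w`/`sigma_a`, and the toy training loop are illustrative assumptions, not the authors' code or hyperparameters.

```python
# Minimal sketch, assuming AQE and WQE are simulated as independent zero-mean
# Gaussian noise on activations and weights during full-precision training.
import torch
import torch.nn as nn


class NoisyLinear(nn.Module):
    """Linear layer that perturbs weights (WQE proxy) and outputs (AQE proxy)
    with independent Gaussian noise while training; exact at evaluation time."""

    def __init__(self, in_features, out_features, sigma_w=0.01, sigma_a=0.01):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.sigma_w = sigma_w  # assumed scale of weight quantization error
        self.sigma_a = sigma_a  # assumed scale of activation quantization error

    def forward(self, x):
        if self.training:
            # Simulate WQE: perturb the weights for this forward pass only.
            w_noise = torch.randn_like(self.linear.weight) * self.sigma_w
            out = nn.functional.linear(x, self.linear.weight + w_noise, self.linear.bias)
            # Simulate AQE: perturb the layer output.
            return out + torch.randn_like(out) * self.sigma_a
        return self.linear(x)


# Toy usage on random data: averaging the loss over noisy forward passes
# biases optimization toward flatter minima that are robust to quantization.
model = nn.Sequential(NoisyLinear(16, 32), nn.ReLU(), NoisyLinear(32, 4))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(64, 16), torch.randint(0, 4, (64,))
for _ in range(10):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
```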


Reference

This content is AI-processed based on open access ArXiv data.
