OJBKQ: Objective-Joint Babai-Klein Quantization

Notice: This research summary and analysis were generated automatically using AI technology. For full accuracy, please refer to the original arXiv source.

Post-training quantization (PTQ) is widely used to compress large language models without retraining. However, many existing weight-only methods rely on heuristic objectives and greedy rounding, leading to noticeable degradation at low bit widths. In this work, we introduce OJBKQ (Objective-Joint Babai-Klein Quantization with K-Best Sampling), a layer-wise PTQ method that formulates weight quantization as a joint optimization problem over activations and weights. This formulation yields a multiple-right-hand-side box-constrained integer least squares (BILS) problem in each layer, which is NP-hard. For each column of the weight matrix, we apply an extended Babai nearest-plane algorithm and an extended version of Klein's randomized Babai algorithm to find the minimum-residual Babai-Klein point, a sub-optimal solution to the BILS problem. Experiments on large language models show that OJBKQ achieves lower perplexity at 3-4 bits than existing PTQ approaches, at comparable computational cost.


💡 Research Summary

The paper introduces OJBKQ (Objective‑Joint Babai‑Klein Quantization with K‑Best Sampling), a novel layer‑wise post‑training quantization (PTQ) framework designed for large language models (LLMs) under aggressive 3‑bit and 4‑bit constraints. The authors begin by reformulating the weight‑only PTQ problem as a multiple‑right‑hand‑side box‑constrained integer least‑squares (BILS) problem for each layer. This formulation reveals that PTQ can be interpreted as a lattice decoding task, which is NP‑hard in general.
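To make the BILS formulation concrete, here is a minimal sketch of the layer-wise objective. All shapes, the scale/zero-point convention, and the round-to-nearest baseline are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

# Hypothetical shapes for illustration:
# X: calibration activations (n_samples x d_in), W: layer weights (d_in x d_out).
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 64))
W = rng.standard_normal((64, 32))

bits = 3
lo, hi = 0, 2**bits - 1            # box constraint on the integer codes

# A simple per-column asymmetric scale/zero-point (one common convention).
w_min, w_max = W.min(axis=0), W.max(axis=0)
scale = (w_max - w_min) / hi
zero = w_min

def objective(Q):
    """Layer-wise BILS objective: ||X W - X (scale*Q + zero)||_F^2."""
    W_hat = scale * Q + zero
    return np.linalg.norm(X @ W - X @ W_hat) ** 2

# Greedy round-to-nearest baseline: each column is an independent sub-problem.
Q_rtn = np.clip(np.rint((W - zero) / scale), lo, hi)
```

Because the Frobenius norm decomposes over columns of `W`, the multiple-right-hand-side problem splits into one box-constrained integer least-squares problem per output column, which is the decomposition the method exploits.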

To obtain tractable sub‑optimal solutions, the method treats each column of the weight matrix as an independent BILS sub‑problem. For each column, a deterministic Babai nearest‑plane algorithm is first applied in its box‑constrained variant, yielding a fast but potentially sub‑optimal integer candidate. To improve upon the greedy nature of Babai, the authors extend Klein’s randomized Babai algorithm to the box‑constrained setting and run it K independent times (the “Random‑K” or K‑Best strategy). Each run samples integer values according to a temperature‑controlled distribution around the real‑valued least‑squares solution, thereby exploring a richer set of lattice points. The candidate with the smallest residual is selected as the “Babai‑Klein point.”
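The per-column procedure above can be sketched as follows. This is a generic box-constrained Babai nearest-plane routine with a Klein-style randomized variant and a best-of-K selection; the temperature parameterization and the `K` default are assumptions, not the paper's exact algorithm:

```python
import numpy as np

def babai_box(A, b, lo, hi, rng=None, temp=0.0):
    """Box-constrained Babai nearest-plane step for min ||A z - b||, z integer in [lo, hi].
    With temp > 0 and an rng, this becomes a Klein-style randomized variant that
    perturbs each nearest-plane center with noise inversely scaled by |R_ii|."""
    m, n = A.shape
    Qm, R = np.linalg.qr(A)          # A = Q R, R upper triangular
    y = Qm.T @ b
    z = np.zeros(n, dtype=int)
    for i in range(n - 1, -1, -1):   # back-substitute from the last coordinate
        c = (y[i] - R[i, i + 1:] @ z[i + 1:]) / R[i, i]
        if temp > 0.0 and rng is not None:
            c = rng.normal(c, temp / abs(R[i, i]))  # wider where the lattice is denser
        z[i] = int(np.clip(np.rint(c), lo, hi))
    return z

def babai_klein_point(A, b, lo, hi, K=16, temp=1.0, seed=0):
    """Run deterministic Babai plus K randomized Klein draws; keep the smallest residual."""
    rng = np.random.default_rng(seed)
    best = babai_box(A, b, lo, hi)
    best_res = np.linalg.norm(A @ best - b)
    for _ in range(K):
        cand = babai_box(A, b, lo, hi, rng=rng, temp=temp)
        res = np.linalg.norm(A @ cand - b)
        if res < best_res:
            best, best_res = cand, res
    return best
```

Note that the deterministic Babai candidate is always in the pool, so the returned "Babai-Klein point" can only match or improve on the greedy solution.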

A second major contribution is the Joint Target Alignment (JTA) scoring function. Existing PTQ methods either align quantized outputs to runtime‑consistent activations (which may be noisy) or to full‑precision references (which ignore error propagation). JTA introduces a continuous knob µ∈
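The summary text is truncated here, so the exact JTA definition and the range of the knob µ are not available. As a hedged sketch only, a continuous knob that interpolates between the two alignment targets described above could look like this (the blending convention is an assumption, not from the paper):

```python
import numpy as np

def jta_target(X_fp, X_rt, W, mu):
    """Blend a full-precision reference X_fp @ W with a runtime-consistent
    target X_rt @ W. Convention assumed here: mu = 0 -> pure full-precision,
    mu = 1 -> pure runtime-consistent (hypothetical, not the paper's definition)."""
    return (1.0 - mu) * (X_fp @ W) + mu * (X_rt @ W)
```

Under this reading, intermediate µ values trade off noise in the runtime-consistent activations against the error propagation ignored by the full-precision reference.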

