Calibrated Adaptation: Bayesian Stiefel Manifold Priors for Reliable Parameter-Efficient Fine-Tuning

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Parameter-efficient fine-tuning methods such as LoRA enable practical adaptation of large language models but provide no principled uncertainty estimates, leading to poorly calibrated predictions and unreliable behavior under domain shift. We introduce Stiefel-Bayes Adapters (SBA), a Bayesian PEFT framework that places a Matrix Langevin prior over orthonormal adapter factors on the Stiefel manifold $\mathrm{St}$ and performs approximate posterior inference via tangent space Laplace approximation with geodesic retraction. Unlike Gaussian priors in flat space projected onto orthogonality constraints, our prior on the manifold naturally encodes the inductive bias that adapter subspaces should be well conditioned and orthogonal, while the posterior provides calibrated predictive uncertainty without recalibration. We prove formally that the tangent space approximation strictly avoids the structural variance inflation inherent in projecting from ambient space, establishing a rigorous theoretical advantage for intrinsic manifold inference. Across GLUE and SuperGLUE benchmarks on RoBERTa-large, LLaMA-2-7B, LLaMA-2-13B, Mistral-7B, and Qwen2.5-7B, domain shift evaluations, selective prediction protocols, and an abstractive summarization task, SBA achieves task performance comparable to LoRA and DoRA while reducing Expected Calibration Error by 18 to 34% over deterministic baselines, improving selective prediction AUROC by 12 to 25% under domain shift, and outperforming deep ensembles of five LoRA models on OOD detection at a fraction of the parameter cost. Our results demonstrate that where you place uncertainty, on the right geometric structure, matters more than simply adding any Bayesian treatment to adapters.


💡 Research Summary

The paper addresses a critical gap in parameter‑efficient fine‑tuning (PEFT) for large language models (LLMs): the lack of principled uncertainty estimates. While methods such as LoRA, DoRA, and orthogonal adapters achieve strong task performance by updating only a low‑rank subspace of the pretrained weights, they produce point estimates and consequently suffer from over‑confidence, especially under domain shift. Post‑hoc calibration techniques (e.g., temperature scaling) can improve in‑distribution calibration but degrade when the data distribution changes, limiting their practical usefulness in safety‑critical applications.
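As a concrete reference for the calibration metric named in the abstract, the following is a minimal sketch of Expected Calibration Error (ECE) in plain Python. It assumes the common equal-width binning over top-class confidence; the number of bins and the function name `expected_calibration_error` are illustrative choices, not details taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width bins: sum over bins of (|B| / N) * |acc(B) - conf(B)|.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     1 if the prediction was right, 0 otherwise.
    n_bins:      assumed bin count (the paper's setting is not given here).
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Bin b covers [b / n_bins, (b + 1) / n_bins); a confidence of 1.0
        # is clamped into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# An over-confident toy model: 90% confidence, but only 3 of 4 answers right.
overconfident = expected_calibration_error([0.9] * 4, [1, 1, 1, 0])
# A calibrated toy model: 75% confidence and 3 of 4 answers right.
calibrated = expected_calibration_error([0.75] * 4, [1, 1, 1, 0])
```

In this toy case the over-confident model is penalized by the gap between its stated confidence and its accuracy, while the calibrated one scores zero. A lower ECE therefore means confidences that track accuracy more closely, which is the quantity the reported 18 to 34% reduction refers to.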

To remedy this, the authors propose Stiefel‑Bayes Adapters (SBA), the first Bayesian PEFT framework that places a probability distribution directly on the Stiefel manifold, the space of matrices with orthonormal columns. Each adapted layer’s weight update is expressed as a singular‑value‑decomposition‑style factorization with orthonormal adapter factors.
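To make the geometry concrete, here is a minimal NumPy sketch of three basic operations on the Stiefel manifold: sampling a point with orthonormal columns, projecting an ambient matrix onto the tangent space, and retracting back onto the manifold. It is an illustration under stated assumptions rather than the authors' implementation. In particular, it uses a QR-based retraction as a simple stand-in for the geodesic retraction named in the abstract, and the function names and the dimensions `n` and `p` are hypothetical.

```python
import numpy as np


def random_stiefel(n, p, rng):
    """Draw an n x p matrix with orthonormal columns (a point on the manifold)."""
    # Reduced QR of a Gaussian matrix: q is n x p with q.T @ q = I.
    q, _ = np.linalg.qr(rng.standard_normal((n, p)))
    return q


def project_to_tangent(x, z):
    """Orthogonally project an ambient n x p matrix z onto the tangent space at x.

    Tangent vectors xi at x satisfy x.T @ xi + xi.T @ x = 0. Under the
    Euclidean metric the projection is z - x @ sym(x.T @ z).
    """
    xtz = x.T @ z
    sym = 0.5 * (xtz + xtz.T)
    return z - x @ sym


def qr_retraction(x, xi):
    """Map x + xi back onto the manifold using the Q factor of a QR decomposition.

    Assumption: this replaces the geodesic retraction mentioned in the abstract.
    """
    q, r = np.linalg.qr(x + xi)
    # Flip column signs so that diag(r) > 0, which makes the Q factor unique.
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs


rng = np.random.default_rng(0)
n, p = 8, 3                                      # illustrative sizes only
x = random_stiefel(n, p, rng)                    # orthonormal adapter factor
xi = project_to_tangent(x, rng.standard_normal((n, p)))
y = qr_retraction(x, 0.1 * xi)                   # small step, still orthonormal
```

The tangent-space step is where an intrinsic method would place its Gaussian (Laplace) approximation: perturbations are drawn in the tangent space at `x` and then retracted, so every sample keeps exactly orthonormal columns rather than being projected from the ambient space afterwards.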

