Exploring Diverse Generation Paths via Inference-time Stiefel Activation Steering
Language models often default to a narrow set of high-probability outputs, leaving their generation paths homogeneous and prone to mode collapse. Sampling-based strategies inject randomness but still struggle to guarantee diversity across multiple concurrent generation runs. We address this limitation by introducing STARS ($\textbf{St}$iefel-based $\textbf{A}$ctivation Steering for Diverse $\textbf{R}$ea$\textbf{S}$oning), a training-free, inference-time intervention that turns activation steering into an exploration engine. At each token, STARS collects the hidden activations of the concurrent runs and jointly optimizes multiple additive steering directions on the Stiefel manifold: the objective maximizes the geometric volume of the steered activations, while the manifold constraint enforces orthogonality of the steering interventions. This formulation explicitly promotes divergent activation vectors across concurrent runs, and implicitly promotes divergent generation trajectories. The resulting manifold optimization can be solved by a Riemannian gradient descent algorithm with convergence guarantees, but that algorithm is too slow for real-time inference. To keep latency low, we further design a lightweight one-step update with an aggressive, closed-form stepsize. On test-case generation and scientific-discovery benchmarks, STARS consistently outperforms standard sampling methods, achieving greater diversity without sacrificing output quality.
💡 Research Summary
The paper tackles the pervasive “diversity collapse” problem in large language models, where multiple parallel generations converge on the same high‑probability reasoning path, limiting the effectiveness of best‑of‑N strategies. Existing sampling‑based methods (temperature, nucleus, beam, self‑speculative decoding) inject randomness locally at the token level but lack a global objective that coordinates multiple runs. To address this, the authors propose STARS (Stiefel‑based Activation Steering for Diverse Reasoning), a training‑free, inference‑time technique that actively pushes the hidden states of concurrent generations apart.
The core idea is to collect the hidden activation vectors $h_i$ from a chosen transformer layer for all $N$ parallel runs at each decoding step and to add a set of steering vectors $v_i$. These steering vectors are constrained to be orthogonal and of equal norm ($V^\top V = \alpha I$), which places them on a scaled Stiefel manifold $\mathrm{St}(d,N,\alpha)$. The objective maximizes the geometric volume spanned by the steered activations $\{h_i+v_i\}$ by minimizing $-\log\det\big((H+V)^\top(H+V)\big)$. Maximizing this volume forces the trajectories to occupy distinct regions of activation space, thereby encouraging divergent generation paths.
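The objective and its constraint are compact enough to sketch in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; the dimensions `d`, `N` and the scale `alpha` below are arbitrary placeholders:

```python
import numpy as np

def volume_objective(H, V):
    """Negative log-volume objective: -log det((H+V)^T (H+V)).

    H: (d, N) hidden activations of N parallel runs at one token.
    V: (d, N) steering vectors; the constraint V^T V = alpha * I
       places V on a scaled Stiefel manifold.
    """
    S = H + V
    _, logdet = np.linalg.slogdet(S.T @ S)  # stable log-determinant
    return -logdet

def random_stiefel_point(d, N, alpha, seed=0):
    """A random feasible V: orthonormal columns scaled by sqrt(alpha)."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((d, N)))  # reduced QR
    return np.sqrt(alpha) * Q

d, N, alpha = 64, 8, 0.5          # illustrative sizes only
H = np.random.default_rng(1).standard_normal((d, N))
V = random_stiefel_point(d, N, alpha)
assert np.allclose(V.T @ V, alpha * np.eye(N))  # V is on the manifold
```

Minimizing `volume_objective` over this feasible set is exactly the volume-maximization problem described above.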
Because the feasible set is non‑convex, a naïve Euclidean gradient step with projection can fail to converge. The authors therefore formulate a Riemannian gradient descent on the Stiefel manifold, deriving the Riemannian gradient from the Euclidean one and guaranteeing convergence under standard assumptions. However, full Riemannian optimization requires an SVD of the activation matrix and a line search at every token, which is too costly for real‑time inference.
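The summary does not reproduce the paper's exact derivation, but a generic Riemannian gradient-descent step on the scaled Stiefel manifold can be sketched as follows, assuming the embedded (Euclidean) metric, a tangent-space projection, and a QR-based retraction; the function names are illustrative:

```python
import numpy as np

def euclidean_grad(H, V):
    """Gradient of f(V) = -log det((H+V)^T (H+V)) with respect to V."""
    S = H + V
    return -2.0 * S @ np.linalg.inv(S.T @ S)

def tangent_project(V, G, alpha):
    """Project G onto the tangent space of {V : V^T V = alpha I}."""
    sym = 0.5 * (V.T @ G + G.T @ V)
    return G - V @ sym / alpha

def qr_retract(X, alpha):
    """Map an off-manifold point back onto the scaled Stiefel manifold."""
    Q, R = np.linalg.qr(X)
    signs = np.where(np.diag(R) >= 0, 1.0, -1.0)  # sign fix for continuity
    return np.sqrt(alpha) * (Q * signs)

def rgd_step(H, V, alpha, eta):
    """One Riemannian gradient-descent step with stepsize eta."""
    xi = tangent_project(V, euclidean_grad(H, V), alpha)
    return qr_retract(V - eta * xi, alpha)
```

Each such step costs a $d \times N$ QR factorization plus an $N \times N$ inverse, and a full solver would also run a line search per token, which is the overhead the paper's one-step update is designed to avoid.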
To make the method practical, the paper introduces a lightweight one‑step update. An initialization $V_0$ is obtained by taking the orthogonal complement of the column space of $H$ (via SVD) and scaling selected basis vectors by $\sqrt{\alpha}$. Then, using a quadratic approximation of the objective, a closed‑form step size $\eta$ is derived, allowing the steering vectors to be updated in a single pass without additional line search. This reduces per‑token overhead to $O(dN^2)$, making STARS feasible even for very large models.
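The initialization is specified concretely enough to sketch (NumPy, assuming $d \ge 2N$ so the orthogonal complement contains at least $N$ directions). The closed-form step size $\eta$ is not detailed in this summary, so only $V_0$ is shown:

```python
import numpy as np

def init_steering(H, alpha):
    """V_0: N orthonormal directions in the orthogonal complement of
    col(H), scaled by sqrt(alpha) so that V_0^T V_0 = alpha * I."""
    d, N = H.shape
    assert d >= 2 * N, "need room for N directions orthogonal to col(H)"
    U, _, _ = np.linalg.svd(H, full_matrices=True)  # U: (d, d) orthogonal
    return np.sqrt(alpha) * U[:, N:2 * N]           # columns beyond col(H)

H = np.random.default_rng(2).standard_normal((64, 8))
V0 = init_steering(H, alpha=0.5)
assert np.allclose(H.T @ V0, 0.0)               # orthogonal to col(H)
assert np.allclose(V0.T @ V0, 0.5 * np.eye(8))  # on the scaled Stiefel manifold
```

Since $H^\top V_0 = 0$, the steered Gram matrix is $H^\top H + \alpha I$, whose determinant strictly exceeds $\det(H^\top H)$, so the initialization alone already increases the spanned volume before the single refinement step is applied.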
Experiments span four benchmarks: code generation (HumanEval), mathematical proof synthesis (MiniF2F), scientific hypothesis generation (SciGen), and multi‑choice QA. STARS is compared against temperature, top‑k, nucleus, beam, and self‑speculative decoding. Evaluation metrics include diversity measures (Distinct‑n, Self‑BLEU, volume‑based) and quality measures (Exact Match, Pass@k, human ratings). Across all tasks, STARS achieves 15–30% higher diversity while incurring less than a 2% drop in quality. Notably, with $N=16$ candidates, the probability of finding a superior solution increases by about 22% compared to baselines. Ablation studies show that removing the orthogonality constraint reduces volume and harms quality, while using full Riemannian descent dramatically increases latency (≈5×). The magnitude hyperparameter $\alpha$ must be tuned: values that are too large overwrite useful information, while values that are too small yield little diversification.
The authors discuss limitations: the choice of layer and attention heads for steering is currently heuristic; the magnitude $\alpha$ requires domain‑specific tuning; and the method only steers at the same token position across runs. Future work includes automatic layer/head selection, multi‑scale steering, and joint optimization with downstream selectors (e.g., RL‑based ranking) to further improve exploration efficiency.
In summary, STARS repurposes activation steering from a convergence tool into an exploration engine, leveraging Stiefel manifold optimization to enforce orthogonal, volume‑maximizing perturbations of hidden states. It delivers a practical, low‑latency solution that markedly improves the diversity of language‑model generations without sacrificing output quality, opening new avenues for reasoning, scientific discovery, and safety‑critical red‑team testing.