MoEEdit: Efficient and Routing-Stable Knowledge Editing for Mixture-of-Experts LLMs


Knowledge editing (KE) enables precise modifications to factual content in large language models (LLMs). Existing KE methods are largely designed for dense architectures, limiting their applicability to the increasingly prevalent sparse Mixture-of-Experts (MoE) models that underpin modern scalable LLMs. Although MoEs offer strong efficiency and capacity scaling, naively adapting dense-model editors is both computationally costly and prone to routing distribution shifts that undermine stability and consistency. To address these challenges, we introduce MoEEdit, the first routing-stable framework for parameter-modifying knowledge editing in MoE LLMs. Our method reparameterizes expert updates via per-expert null-space projections that keep router inputs invariant and thereby suppress routing shifts. The resulting block-structured optimization is solved efficiently with a block coordinate descent (BCD) solver. Experiments show that MoEEdit attains state-of-the-art efficacy and generalization while preserving high specificity and routing stability, with superior compute and memory efficiency. These results establish a robust foundation for scalable, precise knowledge editing in sparse LLMs and underscore the importance of routing-stable interventions.


💡 Research Summary

Knowledge editing (KE) aims to modify specific factual associations in large language models (LLMs) without harming their overall capabilities. While many KE methods have been developed for dense Transformers, they do not translate well to the increasingly popular Mixture-of-Experts (MoE) architecture, where only a small subset of experts is activated per token. Directly applying dense-model editors to MoEs leads to three major problems: (1) computational explosion, because updating all experts multiplies cost by the total number of experts; (2) inter-expert coupling, because the final MoE output is a weighted sum over several experts, so a change in one expert is diluted or spills over into side effects; and (3) routing distribution shift, where perturbations in one layer change the inputs to downstream routers, causing them to select different experts and destabilizing the model.
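To make the third problem concrete, here is a minimal NumPy sketch of standard softmax Top-K routing (toy sizes; `topk_route`, `W_router`, and all dimensions are illustrative assumptions, not the paper's setup). A small perturbation of the router input changes the gate probabilities and can flip which experts land in the Top-K set:

```python
import numpy as np

def topk_route(logits, k=2):
    """Softmax gate over expert logits; return the Top-K expert set and probs."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return set(np.argsort(probs)[-k:].tolist()), probs

rng = np.random.default_rng(0)
W_router = rng.standard_normal((8, 4))    # 8 experts, router input dim 4 (toy)
u = rng.standard_normal(4)                # router input (hidden state)

experts, probs = topk_route(W_router @ u)
# An upstream edit perturbs u, which perturbs the logits and can change
# the selected expert set -- the routing distribution shift MoEEdit suppresses.
experts_shifted, probs_shifted = topk_route(W_router @ (u + 0.3 * rng.standard_normal(4)))
```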

MoEEdit is introduced as the first routing‑stable, parameter‑modifying KE framework for MoE LLMs. The core idea is to re‑parameterize each expert’s weight update so that it lies in the null‑space of a preservation set of prompts. Concretely, for each expert n a matrix of preservation keys K₀ⁿ is collected, its covariance K₀ⁿK₀ⁿᵀ is eigendecomposed, and a projector Pⁿ onto the orthogonal complement of the span of K₀ⁿ is built. The actual edit Δⁿ is expressed as ˆΔⁿ Pⁿ, where ˆΔⁿ is a free variable. Because Pⁿ kᵢⁿ = 0 for any preservation prompt i, the update cannot affect the expert’s output on those prompts, guaranteeing that the downstream router input change δuℓ ≈ 0 and, via the softmax Jacobian, that the routing shift δgℓ is essentially zero. Thus, routing stability is mathematically enforced without needing an explicit regularizer.
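The per-expert projector can be sketched in a few lines of NumPy (a toy reconstruction under the summary's definitions; the function name, dimensions, and tolerance are illustrative): eigendecompose the key covariance K₀ⁿK₀ⁿᵀ, keep the eigenvectors with (near-)zero eigenvalues, and assemble Pⁿ from them. Any update of the form ˆΔⁿPⁿ then annihilates every preservation key by construction:

```python
import numpy as np

def null_space_projector(K0, tol=1e-10):
    """Projector onto the orthogonal complement of the span of K0's columns.

    K0: (d_k, m) preservation keys for one expert, columns are keys k_i.
    Returns P with P @ K0 ~= 0, so any edit Delta_hat @ P leaves the
    expert's output on the preservation prompts unchanged.
    """
    cov = K0 @ K0.T                                 # key covariance K0 K0^T
    eigvals, eigvecs = np.linalg.eigh(cov)          # symmetric eigendecomposition
    null_basis = eigvecs[:, eigvals < tol * eigvals.max()]
    return null_basis @ null_basis.T

rng = np.random.default_rng(0)
d_k, m = 16, 5
K0 = rng.standard_normal((d_k, m))
P = null_space_projector(K0)

Delta_hat = rng.standard_normal((8, d_k))           # free update variable (toy)
# The reparameterized edit cannot touch the preservation keys:
assert np.allclose((Delta_hat @ P) @ K0, 0.0, atol=1e-8)
```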

After this projection, the optimization problem becomes block-structured: each expert's Δⁿ appears only in terms involving that expert's projected keys. Computing the global closed-form solution would require inverting an (N·dₖ)×(N·dₖ) matrix, which is infeasible for realistic MoE sizes. MoEEdit therefore adopts a randomized block coordinate descent (BCD) algorithm. In each iteration a random subset of experts is selected, and the corresponding sub-problem (a small ridge regression) is solved exactly. This procedure scales linearly with the expert hidden dimension dₖ rather than with the total number of experts N, and it naturally respects the router weights gᵢ,ₙ, focusing updates on the experts that are actually active. The BCD loop repeats until convergence, yielding an efficient solution that satisfies both the edit objective and the routing-stability constraint.
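The block structure can be illustrated with a toy NumPy sketch (all dimensions, the ridge weight `lam`, and the block size are illustrative assumptions, not the paper's settings). The edit constraint couples experts through the router weights g[n, i]; each BCD step solves one expert's small ridge regression in closed form while holding the others fixed:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d_k, d_v, m = 8, 16, 12, 6            # experts, key dim, value dim, edits (toy)

K = rng.standard_normal((d_k, m))        # edit keys k_i
R = rng.standard_normal((d_v, m))        # desired output change r_i per edit
G = rng.dirichlet(np.ones(N), size=m).T  # router weights g[n, i]; columns sum to 1
P = [np.eye(d_k) for _ in range(N)]      # per-expert projectors (identity for brevity)
Delta = [np.zeros((d_v, d_k)) for _ in range(N)]
lam = 1e-2                               # ridge regularizer

def residual():
    """r_i minus the current weighted-sum prediction sum_n g[n,i] Delta_n P_n k_i."""
    out = R.copy()
    for n in range(N):
        out -= G[n] * (Delta[n] @ P[n] @ K)
    return out

for _ in range(200):
    # Randomized BCD: pick a random block of experts; solve each expert's
    # ridge sub-problem exactly with all other experts held fixed.
    for n in rng.choice(N, size=2, replace=False):
        Kn = P[n] @ K                                # projected keys for expert n
        T = residual() + G[n] * (Delta[n] @ Kn)      # this expert's own target
        C = Kn * G[n]                                # gate-weighted key columns
        Delta[n] = T @ C.T @ np.linalg.inv(C @ C.T + lam * np.eye(d_k))

assert np.linalg.norm(residual()) < np.linalg.norm(R)   # edit error shrank
```

Each block solve only inverts a dₖ×dₖ matrix, which is the source of the per-iteration cost being independent of the total expert count N.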

Experiments are conducted on a 30‑billion‑parameter MoE model (Qwen‑3‑30B‑A3B) and evaluated on the COUNTERFACT and zsRE benchmarks. Compared with naïvely adapting dense editors (MEMIT, ROME, etc.) to MoE, MoEEdit achieves: (i) a >90 % reduction in routing distribution shift measured by KL divergence and Jaccard similarity of Top‑K expert sets; (ii) 3–5× lower GPU memory consumption and wall‑clock time thanks to the block‑wise solver; (iii) higher or comparable edit success rates (accuracy 92 % vs. 88 % on COUNTERFACT) and specificity; and (iv) robust performance under sequential edits, with negligible accumulation of errors. Ablation studies confirm that the per‑expert null‑space projection is essential for routing stability, while the randomized BCD provides the necessary computational tractability.
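The two routing-stability metrics reported above can be sketched directly (the function and the probability vectors are toy illustrations, not values from the paper): KL divergence compares the full routing distributions before and after an edit, while Jaccard similarity compares the Top-K expert sets.

```python
import numpy as np

def routing_shift(p, q, topk_p, topk_q):
    """KL(p || q) between routing distributions, plus Jaccard similarity
    of the Top-K expert sets selected before (p) and after (q) an edit."""
    kl = float(np.sum(p * np.log(p / q)))
    jaccard = len(topk_p & topk_q) / len(topk_p | topk_q)
    return kl, jaccard

p = np.array([0.70, 0.20, 0.05, 0.05])   # routing probs before edit (toy)
q = np.array([0.65, 0.25, 0.05, 0.05])   # routing probs after edit
kl, jac = routing_shift(p, q, {0, 1}, {0, 1})

# Identical routing gives zero divergence and full Top-K overlap:
kl0, jac0 = routing_shift(p, p, {0, 1}, {0, 1})
assert abs(kl0) < 1e-12 and jac0 == 1.0 and kl > 0
```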

In summary, MoEEdit delivers a principled, efficient, and routing‑stable method for knowledge editing in sparse MoE LLMs. By explicitly preserving router inputs through null‑space projections and solving the resulting block‑structured problem with a scalable BCD algorithm, it overcomes the unique challenges of MoE architectures. This work opens the door to safe, precise, and scalable factual updates in the next generation of massive, expert‑based language models, and suggests future extensions to dynamic expert selection and multimodal MoE systems.

