Knowledge Vector Weakening: Efficient Training-free Unlearning for Large Vision-Language Models
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Large Vision-Language Models (LVLMs) are widely adopted for their strong multimodal capabilities, yet they raise serious concerns such as privacy leakage and harmful content generation. Machine unlearning has emerged as a promising solution for removing the influence of specific data from trained models. However, existing approaches largely rely on gradient-based optimization, incurring substantial computational costs for large-scale LVLMs. To address this limitation, we propose Knowledge Vector Weakening (KVW), a training-free unlearning method that directly intervenes in the full model without gradient computation. KVW identifies knowledge vectors that are activated during the model’s output generation on the forget set and progressively weakens their contributions, thereby preventing the model from exploiting undesirable knowledge. Experiments on the MLLMU and CLEAR benchmarks demonstrate that KVW achieves a stable forget-retain trade-off while significantly improving computational efficiency over gradient-based and LoRA-based unlearning methods.


💡 Research Summary

The paper tackles the pressing problem of removing specific unwanted data (the “forget set”) from large vision‑language models (LVLMs) without the prohibitive cost of full retraining. Existing unlearning approaches either rely on gradient‑based optimization, which is computationally intensive for models with billions of parameters, or on parameter‑efficient fine‑tuning techniques such as LoRA. While LoRA reduces the number of updated parameters, it confines changes to a low‑rank subspace; if the knowledge to be erased is distributed across the full representation space, LoRA may fail or require extensive hyper‑parameter search, negating its efficiency benefits.

The authors propose Knowledge Vector Weakening (KVW), a training‑free unlearning method that operates solely with forward passes. KVW is built on the observation that a large portion of factual and conceptual knowledge in transformer‑based models resides in the Feed‑Forward Network (FFN) layers. An FFN can be expressed as FFN(x)=f(xKᵀ) · V, where K is a key matrix, V is a value matrix, and f is a non‑linear activation. For a given input, the vector C = f(xKᵀ) contains knowledge coefficients that weight each row vᵢ of V, which the authors term knowledge vectors. The output is thus a weighted sum of these vectors, making the FFN a parametric memory that retrieves knowledge in an input‑dependent manner.
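The key‑value view of the FFN described above can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' code: the dimensions, the choice of ReLU for the activation f, and the function name `ffn` are assumptions for demonstration.

```python
import numpy as np

def ffn(x, K, V):
    """FFN as a parametric memory: coefficients C = f(x K^T) weight the rows of V."""
    C = np.maximum(x @ K.T, 0.0)  # f = ReLU here; real LVLMs typically use GELU/SiLU
    return C @ V, C               # output is a C-weighted sum of the knowledge vectors v_i

rng = np.random.default_rng(0)
d_model, d_hidden = 8, 32
x = rng.standard_normal(d_model)
K = rng.standard_normal((d_hidden, d_model))  # key matrix
V = rng.standard_normal((d_hidden, d_model))  # value matrix (rows = knowledge vectors)

y, C = ffn(x, K, V)
# The output decomposes exactly into the coefficient-weighted knowledge vectors.
assert np.allclose(y, sum(C[i] * V[i] for i in range(d_hidden)))
```

Because the output is linear in the rows of V given the coefficients, scaling an individual row vᵢ directly attenuates that vector's contribution to every output it participates in — the property KVW exploits.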

KVW proceeds in three stages:

  1. Coefficient Extraction – The model processes the forget set and the retain set separately, recording the knowledge coefficients Cᶠ and Cʳ for every FFN layer at the time steps corresponding to answer tokens.

  2. Forget Knowledge Accessor (FKA) Computation – To isolate vectors that are highly active for the forget set but not for the retain set, KVW computes an element‑wise log‑ratio A = max(0, log(Cᶠ / Cʳ)). This yields a non‑negative relevance score per knowledge vector; a larger Aᵢ marks that vector as a stronger candidate for weakening.

  3. Vector Weakening – Each knowledge vector vᵢ is scaled by a gate function g(Aᵢ) = exp(−γ·Aᵢ), where γ is a user‑defined strength hyper‑parameter. The scaled vectors ṽᵢ = g(Aᵢ)·vᵢ replace the original rows of V during inference. Because the scaling is applied during the forward pass, no gradients are computed, no optimizer is needed, and the model's parameters are not permanently altered.

The method is evaluated on two representative LVLM unlearning benchmarks:

  • MLLMU‑Bench – a suite of forget ratios (5%, 10%, 15%) with strict retain‑performance constraints.
  • CLEAR – a VQA‑style benchmark where specific question‑answer pairs must be forgotten.

Experiments use the LLaVA‑1.5‑7B model and compare KVW against several gradient‑based baselines (GA, GD, KL‑divergence, NPO) and a LoRA‑based baseline (MMU*). Results show that KVW consistently achieves a superior forget‑retain trade‑off: forget accuracy is comparable to or better than the oracle (full retraining) while retain scores stay within the required thresholds. Importantly, KVW requires only a single hyper‑parameter (γ), whereas LoRA’s performance is highly sensitive to the chosen rank r — the authors demonstrate a >4.5× variation in forget accuracy across ranks on CLEAR.

From a computational standpoint, KVW delivers dramatic savings. Because it eliminates back‑propagation, GPU memory consumption drops by a factor of 3–5 relative to gradient‑based methods, and total runtime is reduced by over 60 %. This makes KVW practical for real‑world deployment where rapid response to privacy requests or content‑policy updates is essential.

The paper also discusses limitations and future directions. KVW currently focuses on FFN layers; extending the approach to attention matrices or multimodal fusion modules could capture additional knowledge pathways. Automated selection of γ or adaptive per‑layer weakening strategies are promising avenues. Moreover, integrating KVW with continual‑learning pipelines could enable seamless “forget‑on‑the‑fly” capabilities for ever‑growing LVLMs.

In summary, Knowledge Vector Weakening introduces a novel, gradient‑free paradigm for machine unlearning in large vision‑language models. By directly attenuating the contribution of knowledge vectors identified as forget‑specific, KVW achieves effective data removal with minimal computational overhead, addressing a critical gap between model utility, privacy compliance, and operational efficiency.
