In-Context Learning Without Copying


Induction heads are attention heads that perform inductive copying by matching patterns from earlier context and copying their continuations verbatim. As models develop induction heads, they experience a sharp drop in training loss, a phenomenon cited as evidence that induction heads may underlie a wide range of in-context learning (ICL) capabilities. In this work, we investigate whether induction heads are a necessary building block for learning abstractive ICL capabilities (i.e., tasks where the answer is not contained in the input context), or whether such capabilities can emerge independently. We propose Hapax, a training regime that omits the loss contribution of tokens predictable by induction heads. Despite a significant reduction in inductive copying, abstractive ICL capabilities are preserved, with the model achieving higher accuracy than the vanilla model on 13 out of 21 tasks, even though 31.7% of tokens are omitted from the loss. Furthermore, our model achieves lower loss values on token positions that induction heads cannot predict. Mechanistic analysis shows that models trained with Hapax develop fewer and weaker induction heads despite preserving abstractive ICL capabilities. Our findings suggest that the developmental link between induction heads and abstractive ICL capabilities is weaker than previously hypothesized.


💡 Research Summary

The paper investigates whether induction heads—attention heads that perform “inductive copying” by matching repeated n‑grams in the context and reproducing their continuations—are a necessary prerequisite for abstract in‑context learning (ICL) capabilities, i.e., tasks where the answer is not present in the prompt. Prior work has linked the sudden emergence of induction heads during training to sharp drops in training loss and to a broad set of ICL abilities, suggesting a causal relationship. To test this hypothesis, the authors introduce a training regime called Hapax that deliberately removes the loss contribution of tokens that can be correctly predicted by induction heads. Concretely, any token position that participates in a repeated n‑gram (n > 1) within the same context window is added to a mask M; the loss is computed only over the unmasked set U = S \ M. This prevents gradients from reinforcing the pattern‑matching behavior that induction heads exploit, effectively suppressing the incentive to learn inductive copying.
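The masking rule above can be sketched in a few lines. This is a minimal reconstruction, not the authors' implementation: the exact n-gram matching, the value of n, and the loss aggregation are assumptions. A position enters the mask M when the n-gram ending at it has already occurred in the same context, and the loss then averages only over the unmasked set U = S \ M.

```python
def hapax_mask(tokens, n=2):
    """Mark positions whose length-n n-gram repeats one seen earlier in
    the context, i.e. positions an induction head could predict by copying."""
    masked = [False] * len(tokens)
    seen = set()  # n-grams ending at earlier positions
    for i in range(n - 1, len(tokens)):
        gram = tuple(tokens[i - n + 1:i + 1])
        if gram in seen:
            masked[i] = True  # add position i to the mask M
        seen.add(gram)
    return masked

def hapax_loss(per_token_losses, masked):
    """Average per-token loss over the unmasked set U = S \\ M only."""
    kept = [l for l, m in zip(per_token_losses, masked) if not m]
    return sum(kept) / len(kept) if kept else 0.0
```

On `["a", "b", "c", "a", "b", "d"]` with n = 2, only the second "b" is masked: it is the continuation of the repeated bigram ("a", "b"), exactly the kind of token an induction head predicts for free, so its gradient signal is withheld.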

Both a standard “vanilla” GPT‑NeoX‑style model and a Hapax model (1B parameters, trained on the Pile for 20,000 steps) are trained with identical hyperparameters. In the Hapax regime, 31.7% of tokens are masked; a stricter variant, Thresholded‑Hapax, uses cosine‑similarity‑based matching to mask 52.5% of tokens. The authors evaluate three dimensions:

  1. Inductive copying performance – measured with synthetic random‑token repetition and natural‑text repetition tasks. Hapax suffers a 66% drop in random‑repetition accuracy, and Thresholded‑Hapax an 89% drop, confirming that inductive copying is strongly weakened. Natural‑text repetition also declines over training, indicating the model learns to avoid copying.

  2. Abstract ICL performance – assessed on 26 benchmark tasks from the “extractive vs. abstractive” suite (e.g., country‑capital mapping, multi‑language word‑level translation). Despite its weakened induction heads, Hapax achieves higher accuracy than the vanilla model on 13 of 21 tasks with statistically significant differences. When the few‑shot context is filtered to remove label leakage, Hapax outperforms vanilla on 24 of 25 tasks, showing that the vanilla advantage in some cases stemmed from copying labels rather than genuine abstraction. Thresholded‑Hapax underperforms vanilla on 18 of 24 abstract tasks, but surprisingly achieves higher translation accuracy, suggesting that the stricter masking preferentially preserves cross‑lingual signal while suppressing same‑language repetition.

  3. Mechanistic analysis – using the token‑loss‑difference metric (Yoon & Steinhardt, 2025), the authors find that Hapax attains lower loss on positions that induction heads cannot predict, implying that other circuits (function‑vector heads, concept‑induction circuits) compensate. Direct inspection shows fewer and weaker induction heads in Hapax models.
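The synthetic copying probe from point 1 can be sketched as follows. This is a hedged reconstruction, not the paper's exact protocol: `predict_next`, the prompt construction, and the scoring are assumptions. A sequence is repeated once, and the model is scored on next‑token accuracy over the second copy, where every target is recoverable by matching the first copy.

```python
def repetition_accuracy(predict_next, s):
    """Next-token accuracy over the second copy of the prompt [s; s].
    predict_next maps a token prefix to a predicted next token."""
    prompt = s + s
    hits = sum(predict_next(prompt[:i]) == prompt[i]
               for i in range(len(s), 2 * len(s)))
    return hits / len(s)

def induction_predictor(prefix):
    """Idealized induction head: find the most recent earlier occurrence
    of the last token and emit the token that followed it."""
    for j in range(len(prefix) - 2, -1, -1):
        if prefix[j] == prefix[-1]:
            return prefix[j + 1]
    return prefix[-1]  # no match: fall back to repeating the last token
```

With distinct tokens, the idealized induction predictor scores 0.9 on a length‑10 sequence: it recovers every position in the second copy except the first, whose context contains no repetition yet. In the paper's setting the sequence would instead be sampled uniformly at random from the vocabulary, which is what a model trained under Hapax can no longer copy reliably.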

The key insight is that while induction heads and the associated inductive copying phenomenon coincide with rapid loss reduction early in training, they are not causally required for abstract ICL abilities. Models can develop robust reasoning and generation capabilities even when the gradient signal for copying is largely removed. This challenges the prevailing narrative that induction heads are the “engine” behind all emergent ICL and opens a research avenue focused on nurturing alternative mechanisms—such as function‑vector heads or higher‑level concept representations—rather than relying on pattern‑matching copying.

In summary, the Hapax training regime demonstrates that abstract in‑context learning can emerge independently of induction heads. The findings suggest a weaker developmental link between induction circuits and general ICL than previously hypothesized, encouraging future work to explore more diverse architectural and training strategies for building genuinely reasoning‑capable language models.

