Language Models Struggle to Use Representations Learned In-Context

Notice: This research summary and analysis were automatically generated using AI technology. For accuracy, please refer to the original arXiv source.

Though large language models (LLMs) have enabled great success across a wide variety of tasks, they still appear to fall short of one of the loftier goals of artificial intelligence research: creating an artificial system that can adapt its behavior to radically new contexts upon deployment. One important step towards this goal is to create systems that can induce rich representations of data that are seen in-context, and then flexibly deploy these representations to accomplish goals. Recently, Park et al. (2024) demonstrated that current LLMs are indeed capable of inducing such representations from context (i.e., in-context representation learning). The present study investigates whether LLMs can use these representations to complete simple downstream tasks. We first assess whether open-weights LLMs can use in-context representations for next-token prediction, and then probe models using a novel task, adaptive world modeling. In both tasks, we find evidence that open-weights LLMs struggle to deploy representations of novel semantics that are defined in-context, even when they encode these semantics in their latent representations. Furthermore, we assess closed-source, state-of-the-art reasoning models on the adaptive world modeling task, demonstrating that even the most performant LLMs cannot reliably leverage novel patterns presented in-context. Overall, this work seeks to inspire novel methods for encouraging models not only to encode information presented in-context, but to do so in a manner that supports flexible deployment of this information.


💡 Research Summary

The paper investigates whether large language models (LLMs) can not only learn novel representations from in‑context data but also deploy those representations to solve downstream tasks. Building on Park et al. (2025a), the authors first replicate the “graph‑tracing” task, where a random walk traverses a latent state space (either a 1‑D line of 16 or 25 states or a 2‑D grid of 4×4 or 5×5 states). Each state is labeled with an arbitrary token, so the model must infer the underlying topology solely from the sequence of tokens. Using Dirichlet Energy (which penalizes distance between representations of adjacent states) and Distance Correlation (which measures alignment between representation distances and Manhattan distances in the true topology), they confirm that four open‑source instruction‑tuned models—Gemma‑3‑4b, Gemma‑3‑12b, Gemma‑3‑27b, and OLMo‑2‑13b—gradually reshape their hidden‑layer embeddings to reflect the geometry of the state space as context length grows.
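The two geometry metrics can be made concrete with a small sketch. This assumes state representations are stored as rows of a NumPy array; the function names are illustrative, not taken from the paper's code, and the distance-alignment score uses a simple Pearson correlation between pairwise distances as a stand-in for the paper's distance-correlation measure.

```python
import itertools
import numpy as np

def grid_edges(n):
    """Edges of an n x n grid whose states are indexed row-major."""
    edges = []
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                edges.append((i * n + j, (i + 1) * n + j))
            if j + 1 < n:
                edges.append((i * n + j, i * n + j + 1))
    return edges

def dirichlet_energy(reps, edges):
    """Sum of squared distances between representations of adjacent states:
    low energy means neighboring states have nearby representations."""
    return sum(np.sum((reps[a] - reps[b]) ** 2) for a, b in edges)

def distance_alignment(reps, n):
    """Pearson correlation between pairwise representation distances and
    Manhattan distances on the grid (stand-in for distance correlation)."""
    rep_d, man_d = [], []
    for s, t in itertools.combinations(range(n * n), 2):
        rep_d.append(np.linalg.norm(reps[s] - reps[t]))
        man_d.append(abs(s // n - t // n) + abs(s % n - t % n))
    return np.corrcoef(rep_d, man_d)[0, 1]
```

With representations that already mirror the grid (e.g., the 2-D coordinates themselves), the energy is small and the alignment score approaches 1; as context length grows, the paper's measurements show hidden-layer embeddings moving in this direction.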

Having established that in‑context representation learning occurs, the authors ask whether these learned embeddings are “deployable” for actual reasoning. They design two downstream tasks. The first is next‑token prediction under two prompting regimes: (i) Prefilled, where the random walk is already present in the model’s response, allowing immediate use of the learned representation; and (ii) Instruction, where the walk appears in the user’s message, forcing the model to delay its prediction until after several intervening special tokens. Across 1,000 random token‑state assignments, all models achieve high accuracy in the Prefilled condition (above 90%), replicating prior work, but performance collapses in the Instruction condition (to roughly 50% or lower). Analyses show that this drop is not due to weaker representations—distance‑correlation scores are comparable—but rather to an inability to retrieve and apply the learned structure after a delay.
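The contrast between the two regimes is easiest to see in the chat-message layout. The prompt wording and role names below are hypothetical; the essential difference is whether the walk sits in a prefilled assistant turn or in the user turn.

```python
# Illustrative construction of the two prompting regimes.
walk = "tok_7 tok_12 tok_3 tok_8"  # arbitrary token labels from the random walk

prefilled_messages = [
    {"role": "user", "content": "Continue the sequence."},
    # Prefilled: the walk occupies the assistant turn, so the next token is
    # predicted immediately after the learned sequence, with nothing between.
    {"role": "assistant", "content": walk},
]

instruction_messages = [
    # Instruction: the walk sits in the user turn, so the model must carry the
    # learned structure across the chat template's special tokens
    # (end-of-user, start-of-assistant, etc.) before predicting.
    {"role": "user", "content": f"Continue the sequence: {walk}"},
]
```

When a tokenizer's chat template is applied, only the Instruction variant inserts role-delimiting special tokens between the walk and the prediction point, which is exactly the delay the paper identifies as harmful.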

The second downstream task, Adaptive World Modeling (AWM), combines the graph‑tracing phase with a few‑shot mapping phase. After observing the random walk, the model is given a few examples that define a rule mapping a token at position (i, j) to a token at (i + k, j) (e.g., a “two‑step down” rule). The model must then apply this rule to novel inputs. The authors test both a “1‑step down” rule (which is directly witnessed in the walk) and a “2‑step down” rule (which is not). Despite encoding the topology in their hidden states, all open‑source models perform poorly on AWM, with accuracies hovering near chance, especially for the unseen 2‑step rule.
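The mapping phase of AWM can be sketched as follows, assuming a 4×4 grid whose state labels live in a (hypothetical) `tokens[i][j]` lookup. A k-step-down rule maps the token at (i, j) to the token at (i + k, j); with k = 1 the paired tokens co-occur as adjacent steps in the walk, while with k = 2 the pairing is never directly witnessed.

```python
def awm_examples(tokens, k, n=4, n_shots=3):
    """Few-shot (input, output) pairs defining the k-step-down rule on an
    n x n grid. The first n_shots pairs serve as demonstrations; the rest
    are held-out queries the model must answer by applying the rule."""
    pairs = [(tokens[i][j], tokens[i + k][j])
             for i in range(n - k) for j in range(n)]
    return pairs[:n_shots], pairs[n_shots:]
```

The held-out queries are what make the task diagnostic: answering them requires composing the rule defined by the shots with the grid topology learned from the walk, rather than pattern-matching either one alone.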

To probe whether more sophisticated reasoning mechanisms help, the authors evaluate closed‑source, state‑of‑the‑art “reasoning” models that generate chain‑of‑thought (CoT) explanations. These models show modest improvements on AWM, suggesting that externalizing reasoning steps can sometimes reactivate latent representations, but the gains are inconsistent and far from reliable.

Overall, the study demonstrates a clear dissociation: in‑context representation learning ≠ flexible deployment. Current LLMs can reorganize internal embeddings to mirror novel structures, yet they treat those embeddings as inert when required to use them later or to apply abstract rules. The findings imply that simply scaling models or adding CoT is insufficient; new training objectives, architectural changes, or meta‑learning strategies are needed to teach models how to use the knowledge they acquire on the fly. Potential future directions include mechanisms for explicit grounding of in‑context embeddings, retrieval‑augmented decoding, or curriculum‑style meta‑training that couples representation formation with downstream utilization. The work thus highlights a critical bottleneck on the path toward truly adaptable AI agents capable of rapid, on‑the‑fly learning.

