Reading time: 8 minutes

📝 Original Info

  • Title:
  • ArXiv ID: 2512.18489
  • Date:
  • Authors: Unknown

📝 Abstract

Large Language Models (LLMs) demonstrate strong few-shot generalization through in-context learning (ICL), yet their reasoning in dynamic and stochastic environments remains opaque. Prior studies mainly address static tasks, overlooking the online adaptation required when beliefs must be continuously updated, a key capability for LLMs as world models or agents. We introduce a Bayesian filtering framework to evaluate online inference in LLMs. Our probabilistic probe suite spans multivariate discrete (e.g., dice rolls) and continuous (e.g., Gaussian) distributions, where ground-truth parameters shift over time. We find that while LLMs' belief updates resemble Bayesian posteriors, they are more accurately described by an exponential forgetting filter with a model-specific discount factor γ < 1. This reveals systematic discounting of older evidence, varying significantly across architectures. Although inherent priors are often miscalibrated, the updating mechanism itself is structured and principled. We validate these findings in a simulated agent task and present prompting strategies that effectively recalibrate priors with minimal cost.

📄 Full Content

In-context learning (ICL) represents a remarkable capability of large language models (LLMs) [1,2,3,4], enabling rapid adaptation to novel tasks based solely on a handful of examples provided in their prompts, without explicit gradient updates. While the empirical success of ICL underpins modern prompting techniques, its fundamental mechanism remains largely opaque. A key open question is whether ICL constitutes structured statistical reasoning analogous to Bayesian inference or is merely sophisticated pattern recognition.

A promising perspective frames ICL as implicit Bayesian inference, where models iteratively update latent belief states based on contextual evidence [5,6,7]. However, existing foundational studies predominantly investigate static environments, assuming stationary data-generating distributions. Such an assumption neglects a crucial aspect of real-world intelligence: the necessity to operate effectively within non-stationary environments. In these dynamic contexts, agents must continuously integrate new information while systematically discounting, or "forgetting," outdated evidence whose relevance diminishes over time. This capacity for online adaptation is critical for deploying LLMs as reliable world models or autonomous agents, yet it remains underexplored.

In this work, we propose a new theoretical perspective: we conceptualize ICL in Transformers as online Bayesian filtering characterized by systematic discounting of past evidence. Central to our thesis is the assertion that LLMs, when confronted with sequences of evolving evidence, do not function as ideal Bayesian observers. Instead, we hypothesize that their behavior incorporates an intrinsic forgetting mechanism, likely arising from architectural elements of Transformers [8,9]. To formalize and empirically evaluate this hypothesis, we introduce an analytical framework built around fitting a discount factor γ ∈ (0, 1] that minimizes the Kullback-Leibler (KL) divergence between a model's predictive distribution and a theoretical Bayesian filter's. We investigate this dynamic behavior through a controlled probabilistic probe suite [8,10,11], involving tasks such as biased die rolling and Gaussian mean estimation, where the ground-truth parameters undergo sudden shifts, compelling models to adapt their posterior beliefs.

Through our integrated experimental and analytical approach, we present the following contributions:

  1. We establish a novel framework for interpreting ICL as online Bayesian filtering. Our analysis decomposes predictive error into an update component and a model-misspecification component, and we show that observed performance limitations are predominantly attributable to the latter, which stems from miscalibrated priors and intrinsic discounting, rather than to a flawed updating process itself.

  2. We further investigate the internal mechanisms underlying this discounting behavior, revealing through correlation analysis that inferential quality is decoupled from the raw magnitude of attention scores. This finding points to a complex architectural basis for evidential forgetting that extends beyond straightforward attention allocation.

In-context learning (ICL) in large language models (LLMs) [12] has been extensively studied as a form of implicit statistical inference, particularly through a Bayesian lens. Foundational works, such as those by Xie et al. [8] and Akyürek et al. [13], frame ICL as approximating Bayesian posterior updates or gradient descent on latent functions, enabling few-shot generalization in stationary environments. However, these perspectives assume fixed data distributions, overlooking the challenges of non-stationary settings where beliefs must adapt online to shifting parameters, a gap our discounted filtering framework addresses by introducing systematic evidence forgetting.

Recent efforts have begun exploring dynamic adaptation in LLMs, including continual learning approaches [14] that mitigate catastrophic forgetting via architectural modifications or replay buffers. For instance, methods like self-synthesized rehearsal [15] generate synthetic data to preserve knowledge across tasks. Yet these often focus on task-specific retention rather than principled probabilistic updating under uncertainty. Our work diverges by providing a unified Bayesian filtering model with a fitted discount factor γ < 1, quantifying deviations from ideal inference and linking them to Transformer attention mechanisms, thus bridging static ICL theory with online reasoning.

Our methodology is designed to empirically test if LLM in-context learning emulates online Bayesian inference with evidence discounting. We first define our theoretical model and the non-stationary tasks used for probing, then detail the quantitative analysis, summarized in Algorithm 1.

Discounted Bayesian Filtering. We model the "forgetting" of past evidence using a discounted Bayesian filtering framework [16]. This introduces a discount factor, γ ∈ (0, 1], which tempers the posterior belief p_{t-1}(θ | D_{1:t-1}) before it is updated with new evidence D_t:

p_t(θ | D_{1:t}) ∝ p(D_t | θ) · [ p_{t-1}(θ | D_{1:t-1}) ]^γ

Here, γ = 1 represents a standard Bayesian filter with perfect memory, while γ → 0 approaches a memoryless state.

Probabilistic Probe Suite. To test online adaptation, we designed two non-stationary tasks over a sequence of T = 100 observations, with a parameter changepoint at t = 51:

  • Biased Die (Discrete): A 6-sided die where the dominant face abruptly changes from one face to another.
  • Gaussian Mean (Continuous): Samples from N(µ, 1), where the mean µ shifts from 2.0 to -2.0.

At each timestep t, we elicit the LLM's predictive distribution p_{LLM,t} by providing the history D_{1:t-1} and applying a softmax to the output logits over all possible next outcomes.
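To make the filter concrete, here is a minimal sketch that simulates the biased-die probe and runs a discounted Dirichlet-categorical filter over it. This is an illustration of the tempered update above, not the paper's code: the 80%-dominant bias and all function names are our assumptions. For a uniform Dirichlet(1) prior, the pseudo-count decay shown is exactly equivalent to raising the posterior to the power γ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Biased-die probe: T = 100 rolls, dominant face switches at t = 51.
T, changepoint = 100, 51
p_before = np.full(6, 0.04); p_before[0] = 0.80  # face 1 dominant (assumed bias)
p_after = np.full(6, 0.04); p_after[5] = 0.80    # face 6 dominant (assumed bias)
rolls = np.concatenate([
    rng.choice(6, size=changepoint - 1, p=p_before),
    rng.choice(6, size=T - changepoint + 1, p=p_after),
])

def discounted_filter(observations, gamma, k=6, prior=1.0):
    """Dirichlet-categorical filter with exponential forgetting.

    Decaying the pseudo-counts toward the prior before each update is
    equivalent (for a Dirichlet(1) prior) to tempering the previous
    posterior with the exponent gamma, as in the update rule above.
    """
    alpha = np.full(k, prior)
    preds = []
    for obs in observations:
        alpha = gamma * alpha + (1 - gamma) * prior  # temper the old posterior
        preds.append(alpha / alpha.sum())            # predictive distribution for D_t
        alpha[obs] += 1.0                            # condition on the observed roll
    return np.array(preds)

preds = discounted_filter(rolls, gamma=0.9)
print(np.round(preds[60], 3))  # belief ten steps after the changepoint
```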

Our analysis quantitatively connects the LLM's behavior to our framework; the core procedure is outlined in Algorithm 1.

1. Quantifying Discounting (γ*): We fit an optimal discount factor γ* by minimizing the total KL divergence between the LLM's predictive sequence and that of the discounted filter:

γ* = argmin_{γ ∈ (0,1]} Σ_{t=1}^{T} D_KL( p_{LLM,t} ∥ p_{Bayes,t}(γ) )

We solve this one-dimensional optimization using the L-BFGS-B algorithm. A fitted γ* < 1 provides direct evidence of discounting.

2. Decomposing Predictive Error: We diagnose the source of errors by decomposing the total divergence into two components: the Update Divergence, D_Update = D_KL(p_LLM ∥ p_Bayes(γ*)), which measures the LLM's deviation from its own best-fit discounted filter, and a model-misspecification component that accounts for the remainder. Algorithm 1 takes the LLM M and the task observations D_{1:T} as input and returns γ*, D_Update, and the misspecification term.
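As a sketch of the fitting step, the snippet below (reusing discounted_filter from the previous sketch) minimizes the summed KL divergence over γ with SciPy's L-BFGS-B. The starting point x0, the lower bound, and the synthetic stand-in for the LLM's predictions are our illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def total_kl(gamma, llm_preds, observations):
    """Summed KL(p_LLM,t || p_filter,t(gamma)) over the whole sequence."""
    g = float(np.atleast_1d(gamma)[0])
    filter_preds = discounted_filter(observations, g)
    eps = 1e-12  # guard against log(0)
    return float(np.sum(llm_preds * (np.log(llm_preds + eps)
                                     - np.log(filter_preds + eps))))

def fit_gamma(llm_preds, observations):
    """Fit gamma* by L-BFGS-B on the bounded interval (0, 1]."""
    res = minimize(total_kl, x0=[0.9], args=(llm_preds, observations),
                   method="L-BFGS-B", bounds=[(1e-3, 1.0)])
    return float(res.x[0])

# In the real pipeline, llm_preds is the (T, 6) array of softmax
# distributions elicited from the model at each timestep; here we stand in
# a sequence generated with a known discount factor and recover it.
llm_preds = discounted_filter(rolls, gamma=0.8)
print(fit_gamma(llm_preds, rolls))  # ≈ 0.8
```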

We empirically validate our hypothesis using the setup from Section 3. We evaluated the base and instruction-tuned variants of Llama-3.1-8B [17], Mistral-7B [18], and Gemma-2-2B [19] on the Biased Die [20] and Gaussian Mean [21] probes, fitting an optimal discount factor γ* and decomposing predictive error for each model-task pair (Algorithm 1).
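A compact reading of Algorithm 1's decomposition is sketched below, reusing discounted_filter and fit_gamma from the earlier snippets. Treating the misspecification component as the remainder of the total divergence against the ground-truth predictive sequence is our interpretation; the paper's exact definition of the second term may differ.

```python
def kl(p, q, eps=1e-12):
    """KL divergence between two discrete distributions."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

def decompose_error(llm_preds, observations, true_preds):
    """Sketch of Algorithm 1: fit gamma*, then split the total divergence.

    d_update : deviation of the LLM from its own best-fit discounted filter.
    d_misspec: remaining error against the ground-truth predictive sequence
               (our reading of the model-misspecification component).
    """
    gamma_star = fit_gamma(llm_preds, observations)
    filter_preds = discounted_filter(observations, gamma_star)
    d_update = sum(kl(p, q) for p, q in zip(llm_preds, filter_preds))
    d_total = sum(kl(p, q) for p, q in zip(llm_preds, true_preds))
    return gamma_star, d_update, d_total - d_update

# Ground-truth predictive sequence from the generating parameters.
true_preds = np.vstack([np.tile(p_before, (changepoint - 1, 1)),
                        np.tile(p_after, (T - changepoint + 1, 1))])
print(decompose_error(llm_preds, rolls, true_preds))
```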

Our work builds on, yet diverges from, prior perspectives on ICL [22]. While foundational views frame ICL as implicit Bayesian inference under stationary assumptions (equivalent to γ = 1), others identify systematic deviations from this ideal without proposing an alternative mechanism (Figure 1). Our framework offers a specific mechanistic account for these deviations, testing whether the intrinsic belief updating of a single LLM is captured by a simple, principled discounting process. This provides a more fundamental view of adaptation than active filtering or ensemble gating approaches.

Fitted Discount Factors (γ*): Evidence for Forgetting. The optimal discount factors (Figure 4) provide strong evidence against the classical Bayesian model. For all models, the fitted γ* is significantly below 1, contradicting the perfect-memory assumption and indicating that LLMs inherently discount past evidence. We observe consistent patterns: instruction-tuned models exhibit lower γ* (stronger discounting), and each model family shows a characteristic discounting rate across tasks.

Error Decomposition: Pinpointing the Source of Deviations. The error decomposition in Figure 3 quantifies how well LLM behavior is captured by any simplified model. Crucially, the Update Divergence (D_Update), the deviation from their own best-fit model, is consistently small (typically < 13% of total error). This result is key: it suggests LLMs are not flawed Bayesian updaters but proficient discounted updaters. Their behavior is principled, and the failure of the standard Bayesian model (γ = 1) is its inability to forget outdated evidence after a changepoint, a limitation the fitted γ* overcomes.

Our final analysis investigates the architectural basis for this behavior by examining the link between the Transformer's attention [23] and inferential quality [24]. We test the hypothesis that attention allocated to historical context correlates with the stability of the belief update. For Llama-3.1-8B on the Biased Die task, we plotted the step-wise Update Divergence (E_t) against the Aggregate Attention Score on past evidence (A_t), as shown in Figure 5. The results reveal a strong and highly significant negative correlation (ρ = -0.85, p ≈ 2.6 × 10⁻³⁵). This finding indicates that greater aggregate attention to past evidence is directly associated with more principled and stable belief updates (lower divergence). The attention mechanism thus acts as a primary modulator of inferential fidelity, providing a compelling architectural explanation for the observed discounting behavior.
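For illustration, such a correlation can be computed as below. The arrays are synthetic stand-ins: in practice E_t would come from the per-step KL terms and A_t from the model's attention maps (e.g., summing attention mass over history tokens obtained with output_attentions=True in Hugging Face transformers). The paper does not say whether ρ is Spearman or Pearson; Spearman is assumed here.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(1)

# Synthetic stand-ins: E_t (step-wise Update Divergence) and A_t (aggregate
# attention on past evidence), built to be negatively correlated for the demo.
E_t = rng.random(100) * 0.1
A_t = 1.0 - E_t + rng.random(100) * 0.02

rho, p_value = spearmanr(A_t, E_t)
print(f"rho = {rho:.2f}, p = {p_value:.2e}")  # strongly negative, as in Figure 5
```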

In this work, we introduced a discounted Bayesian filtering perspective to analyze the online inference behavior of large language models. Through a comprehensive suite of dynamic probabilistic probes, we showed that LLMs consistently discount past evidence when updating beliefs, a behavior quantified by a fitted discount factor γ < 1. Our error decomposition further revealed that predictive errors mainly stem from model misspecification rather than faulty updating, underscoring that LLMs function as principled discounted updaters. This characterization clarifies when and why LLM predictions drift under non-stationarity, and offers a principled basis for diagnosing failures and designing adaptive correction or calibration mechanisms. These findings provide both a robust theoretical lens and a practical toolkit for understanding, evaluating, and potentially improving the reasoning of LLMs in uncertain and non-stationary environments.



This content is AI-processed based on open access ArXiv data.
