In-Context Function Learning in Large Language Models


Large language models (LLMs) can learn from a few demonstrations provided at inference time. We study this in-context learning phenomenon through the lens of Gaussian Processes (GPs). We build controlled experiments where models observe sequences of multivariate scalar-valued function samples drawn from known GP priors. We evaluate prediction error in relation to the number of demonstrations and compare against two principled references: (i) an empirical GP-regression learner that gives a lower bound on achievable error, and (ii) the expected error of a 1-nearest-neighbor (1-NN) rule, which gives a data-driven upper bound. Across model sizes, we find that LLM learning curves are strongly influenced by the function-generating kernels and approach the GP lower bound as the number of demonstrations increases. We then study the inductive biases of these models using a likelihood-based analysis. We find that LLM predictions are most likely under less smooth GP kernels. Finally, we explore whether post-training can shift these inductive biases and improve sample-efficiency on functions sampled from GPs with smoother kernels. We find that both reinforcement learning and supervised fine-tuning can effectively shift inductive biases in the direction of the training data. Together, our framework quantifies the extent to which LLMs behave like GP learners and provides tools for steering their inductive biases for continuous function learning tasks.
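The two reference learners from the abstract can be sketched in a few lines. The snippet below is a minimal illustration (not the paper's code): it draws one function from a squared-exponential GP prior, then compares the GP-regression posterior mean (the lower-bound reference) against a 1-nearest-neighbor rule (the upper-bound reference) on held-out query points. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0):
    """Squared-exponential (RBF) kernel matrix between two input sets."""
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-0.5 * d2 / length_scale**2)

rng = np.random.default_rng(0)

# Sample one function from a GP prior at random 1-D inputs.
X = rng.uniform(-1, 1, size=(60, 1))
K = rbf_kernel(X, X) + 1e-8 * np.eye(len(X))  # jitter for numerical stability
y = rng.multivariate_normal(np.zeros(len(X)), K, check_valid="ignore")

# Split into "demonstrations" (context) and query points.
Xc, yc, Xq, yq = X[:50], y[:50], X[50:], y[50:]

# (i) GP-regression reference: posterior mean under the true kernel.
Kc = rbf_kernel(Xc, Xc) + 1e-8 * np.eye(len(Xc))
mean = rbf_kernel(Xq, Xc) @ np.linalg.solve(Kc, yc)
gp_mse = np.mean((mean - yq) ** 2)

# (ii) 1-NN reference: copy the value of the nearest demonstration.
nn = np.argmin(np.abs(Xq - Xc.T), axis=1)
one_nn_mse = np.mean((yc[nn] - yq) ** 2)

print(f"GP posterior MSE: {gp_mse:.6f}, 1-NN MSE: {one_nn_mse:.6f}")
```

In the paper's framework, an LLM's in-context prediction error on the query points would be measured the same way and placed between these two curves as the number of demonstrations grows.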


💡 Research Summary

This paper investigates the in‑context learning (ICL) capabilities of large language models (LLMs) through the lens of Gaussian Processes (GPs), providing a rigorous statistical framework for evaluating how LLMs learn continuous functions from a handful of demonstrations at inference time. The authors construct controlled experiments where LLMs are presented with sequences of input‑output pairs sampled from known GP priors, specifically Matérn (ν = ½, 1.5, 2.5) and Squared‑Exponential kernels with length‑scale values ℓ = 1 and ℓ = 8. For each dimensionality d ∈ {1,2,3,4}, they generate 200 functions per kernel, evaluate each on 50 uniformly drawn inputs in the interval
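The function-generation setup described above can be sketched with the closed-form Matérn kernels for ν = ½, 1.5, 2.5 and the squared-exponential kernel, each at length-scales ℓ = 1 and ℓ = 8. This is an illustrative reconstruction, not the authors' code: the input interval and the dimensionality chosen below are assumptions for demonstration only.

```python
import numpy as np

def matern(r, ell, nu):
    """Matérn kernel values for distances r (closed forms for half-integer nu)."""
    s = r / ell
    if nu == 0.5:
        return np.exp(-s)
    if nu == 1.5:
        return (1 + np.sqrt(3) * s) * np.exp(-np.sqrt(3) * s)
    if nu == 2.5:
        return (1 + np.sqrt(5) * s + 5 * s**2 / 3) * np.exp(-np.sqrt(5) * s)
    raise ValueError("unsupported nu")

def sq_exp(r, ell):
    """Squared-exponential kernel as a function of distance r."""
    return np.exp(-0.5 * (r / ell) ** 2)

rng = np.random.default_rng(1)
d = 2                                   # one of d in {1, 2, 3, 4}
X = rng.uniform(-10, 10, size=(50, d))  # 50 inputs; interval is an assumption here
r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances

# One sampled function per (kernel, length-scale) configuration.
functions = {}
for ell in (1.0, 8.0):
    for nu in (0.5, 1.5, 2.5):
        K = matern(r, ell, nu) + 1e-6 * np.eye(len(X))
        functions[("matern", nu, ell)] = rng.multivariate_normal(
            np.zeros(len(X)), K, check_valid="ignore")
    K = sq_exp(r, ell) + 1e-6 * np.eye(len(X))
    functions[("sq_exp", ell)] = rng.multivariate_normal(
        np.zeros(len(X)), K, check_valid="ignore")

print(f"{len(functions)} sampled functions, each evaluated at {len(X)} inputs")
```

Smaller ν and smaller ℓ give rougher sample paths, which is what makes the kernel family a natural probe for the models' smoothness biases; the paper repeats this sampling 200 times per kernel.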

