Measuring and Analyzing Intelligence via Contextual Uncertainty in Large Language Models using Information-Theoretic Metrics
Large Language Models (LLMs) excel on many task-specific benchmarks, yet the mechanisms that drive this success remain poorly understood. We move from asking what these systems can do to asking how they process information. Our contribution is a task-agnostic method that builds a quantitative Cognitive Profile for any model. The profile is built around the Entropy Decay Curve – a plot of a model’s normalised predictive uncertainty as context length grows. Across several state-of-the-art LLMs and diverse texts, the curves expose distinctive, stable profiles that depend on both model scale and text complexity. We also propose the Information Gain Span (IGS) as a single index that summarises the desirability of a decay pattern. Together, these tools offer a principled way to analyse and compare the internal dynamics of modern AI systems.
💡 Research Summary
The paper introduces a task-agnostic framework for quantifying the "cognitive" behavior of large language models (LLMs) by tracking how predictive uncertainty changes as the context window grows. For a given context length k, the model's full next-token probability distribution p(Y|X) is used to compute the average conditional entropy hₖ = Eₓ[H(p(Y|X=x))], i.e. the model's expected uncertainty about the next token given a length-k context. Plotting the normalised hₖ against k yields the Entropy Decay Curve, and the Information Gain Span (IGS) condenses the shape of that curve into a single index for comparing models and texts.
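The idea behind the Entropy Decay Curve can be sketched with a minimal, self-contained example. The snippet below estimates the normalised average conditional entropy hₖ / log|V| for increasing context lengths k, using empirical character n-gram counts on a toy string as a stand-in for an LLM's next-token distributions; the function name and the toy corpus are illustrative, not from the paper.

```python
import math
from collections import Counter, defaultdict

def entropy_decay_curve(text, max_k=6):
    """Estimate the normalised average conditional entropy h_k / log|V|
    for context lengths k = 0..max_k, using empirical n-gram statistics
    of `text` as a stand-in for a model's next-token distribution."""
    vocab = sorted(set(text))
    log_v = math.log(len(vocab))  # log|V|, the normalising constant
    curve = []
    for k in range(max_k + 1):
        # Count next-symbol frequencies for every length-k context.
        ctx_counts = defaultdict(Counter)
        for i in range(len(text) - k):
            ctx_counts[text[i:i + k]][text[i + k]] += 1
        # h_k = E_X[H(Y | X = x)], weighting each context by its frequency.
        total = sum(sum(c.values()) for c in ctx_counts.values())
        h_k = 0.0
        for counts in ctx_counts.values():
            n = sum(counts.values())
            h_ctx = -sum((c / n) * math.log(c / n) for c in counts.values())
            h_k += (n / total) * h_ctx
        curve.append(h_k / log_v)
    return curve

# On highly structured (periodic) text, uncertainty should fall
# toward zero as the context length grows.
curve = entropy_decay_curve("abracadabra " * 50)
```

With a real LLM, the same curve would be obtained by averaging the entropy of the model's next-token softmax over many length-k contexts and normalising by the log of the vocabulary size.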