State over Tokens: Characterizing the Role of Reasoning Tokens

Reading time: 5 minutes
...

📝 Original Info

  • Title: State over Tokens: Characterizing the Role of Reasoning Tokens
  • ArXiv ID: 2512.12777
  • Date: 2025-12-14
  • Authors: Mosh Levy, Zohar Elyoseph, Shauli Ravfogel, Yoav Goldberg

📝 Abstract

Large Language Models (LLMs) can generate reasoning tokens before their final answer to boost performance on complex tasks. While these sequences look like human thought processes, empirical evidence reveals that they are not a faithful explanation of the model's actual reasoning process. To address this gap between appearance and function, we introduce the State over Tokens (SoT) conceptual framework. SoT reframes reasoning tokens not as a linguistic narrative but as an externalized computational state: the sole persistent information carrier across the model's stateless generation cycles. This explains how the tokens can drive correct reasoning without being a faithful explanation when read as text, and it surfaces previously overlooked research questions about these tokens. We argue that to truly understand the process that LLMs perform, research must move beyond reading the reasoning tokens as text and focus on decoding them as state.

💡 Deep Analysis

[Figure 1]

📄 Full Content

State over Tokens: Characterizing the Role of Reasoning Tokens

Mosh Levy (Bar-Ilan University, moshe0110@gmail.com), Zohar Elyoseph (University of Haifa), Shauli Ravfogel (New York University), Yoav Goldberg (Bar-Ilan University; Allen Institute for AI)

1 Introduction

The assertion that Large Language Models (LLMs) can reason now appears unremarkable (Mitchell, 2025; Maslej et al., 2025). A key factor in achieving this was letting models generate a sequence of tokens before their final answer, which significantly improves performance (Wei et al., 2022; Zelikman et al., 2022; DeepSeek-AI et al., 2025). We refer to this sequence of symbols, which includes phrases such as "therefore", "consider", and "it follows that", as the reasoning tokens, and explicitly distinguish this name from the reasoning text, which is the same tokens when interpreted by a reader according to their English semantics.

The combination of (a) utility in improving the answer and (b) appearance as readable English text may lead to the following inference: the reasoning text is a faithful explanation of the model's reasoning process. This is strengthened by metaphors like "Chain-of-Thought", which imply that the steps in the text are "thoughts" that explain the process. Yet empirical findings contradict this inference (see Section 2.1): the reasoning text is not a faithful explanation of the model's reasoning process. While those findings clarify what the reasoning tokens are not, they leave a conceptual vacuum as to what they are. Our aim in this paper is to help fill that vacuum.

Drawing on the idea that metaphors structure understanding and guide thinking (Lakoff & Johnson, 1980), we believe that adopting more apt descriptions and metaphors can steer researchers and practitioners toward more fruitful directions and surface a new set of questions that are less salient under the prevailing view of the reasoning text as an explanation. To understand reasoning tokens, we must focus on the functional role they play rather than their appearance, which empirical research has found to be deceiving. To this end, we advocate viewing them as representing State over Tokens (SoT), which characterizes the reasoning tokens as a computational device that enables the persistence of a process across separate, stateless computation cycles. We argue that in order to understand the role of the reasoning tokens, we should interpret this sequence of tokens not through their semantics when read as English text, but as the state carriers of a computational process.
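To make the notion of stateless generation cycles concrete, here is a minimal Python sketch of autoregressive decoding. The `next_token` function is a hypothetical stand-in for one full forward pass of the model; the point of the sketch is that the loop carries nothing between iterations except the token sequence itself, which is exactly the sense in which the tokens act as externalized state.

```python
from typing import Callable, List

Token = str

def generate(prompt: List[Token],
             next_token: Callable[[List[Token]], Token],
             stop: Token = "<eos>",
             max_steps: int = 256) -> List[Token]:
    """Autoregressive decoding, written to expose its statelessness.

    `next_token` is a hypothetical stand-in for one full forward pass.
    Nothing survives from one iteration to the next except `tokens`:
    the growing sequence is the sole persistent carrier of information,
    which is the observation the SoT framing builds on.
    """
    tokens = list(prompt)
    for _ in range(max_steps):
        tok = next_token(tokens)  # recomputed from the tokens alone
        tokens.append(tok)        # the only "memory write" available
        if tok == stop:
            break
    return tokens

# Toy usage with a trivial stand-in "model":
print(generate(["2", "+", "2", "="],
               lambda ts: "4" if ts[-1] == "=" else "<eos>"))
```

Any information a cycle wants to pass forward must be written into `tokens`; everything else computed inside `next_token` is discarded, just as in the whiteboard scenario below.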
The Whiteboard Analogy

Consider a hypothetical scenario: you are placed in a room with a problem written on a whiteboard. Your task is to solve it, but under a peculiar constraint: every 10 seconds, your memory is completely wiped and reset to the same state it was in when you first entered the room. Within each interval, you can read what is on the board and add a single word. These rounds repeat until you finally write down the solution.

How might you solve a problem under such constraints? You may write intermediate results on the board (numbers, conclusions, or partial computations) that you can use when you return after being "reset". You might perform several mental calculations before writing down just the result, so the whiteboard may not capture every calculation you did within each cycle. Moreover, you may use an encoding scheme when writing on the board: abbreviations, symbols, or even apparent gibberish that will mean something specific to you when you encounter it in the next cycle. All in all, an outside observer may interpret the whiteboard text incorrectly.

The whiteboard analogy mirrors the model's operation: the words are the reasoning tokens, you are the model, and the ten-second interval represents the model's limited capacity per cycle. Motivated by this intuition, we present the SoT framework (Section 3) and use it to demonstrate two common misconceptions that underlie the belief that the text is a faithful explanation.
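Read as a protocol, the whiteboard scenario is the same loop with the roles renamed. The following toy sketch (all names hypothetical) makes the constraint explicit: the solver's private scratch work vanishes after every call, and only the board persists.

```python
from typing import Callable, List

def solve_on_whiteboard(problem: str,
                        think: Callable[[str, List[str]], str],
                        max_cycles: int = 100) -> List[str]:
    """Toy version of the whiteboard scenario (all names hypothetical).

    `think` models one 10-second interval: it sees the problem and the
    current board, does arbitrary private computation, and returns the
    single word it chooses to write. Its local variables are gone once
    it returns (the memory wipe), so the board is the only state that
    carries over between cycles.
    """
    board: List[str] = []
    for _ in range(max_cycles):
        word = think(problem, board)  # private scratch work is lost after this call
        board.append(word)            # one word may be added per cycle
        if word == "DONE":            # hypothetical stop convention
            break
    return board
```

As with the decoding loop above, an observer who reads only the board sees the written words, not the private computation that produced them, which is why the board (and, by analogy, the reasoning text) need not be a faithful record of the process.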

📸 Image Gallery

cover.png

Reference

This content is AI-processed from open-access ArXiv data.
