Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information

Reading time: 5 minutes
...

📝 Original Info

  • Title: Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information
  • ArXiv ID: 2511.22176
  • Date: 2025-11-27
  • Authors: Lukas Struppek, Dominik Hintersdorf, Hannah Struppek, Daniel Neider, Kristian Kersting

📝 Abstract

Recent large language models achieve strong reasoning performance by generating detailed chain-of-thought traces, but this often leads to excessive token use and high inference latency. Existing efficiency approaches typically focus on model-centric interventions, such as reinforcement learning or supervised fine-tuning, to reduce verbosity. In contrast, we propose a training-free, input-centric approach. Inspired by cognitive psychology, we introduce Focused Chain-of-Thought (F-CoT), which separates information extraction from the reasoning process. F-CoT first organizes the essential information from a query into a concise, structured context and then guides the model to reason exclusively over this context. By preventing attention to irrelevant details, F-CoT naturally produces shorter reasoning paths. On arithmetic word problems, F-CoT reduces generated tokens by 2-3× while maintaining accuracy comparable to standard zero-shot CoT. These results highlight structured input as a simple yet effective lever for more efficient LLM reasoning.

💡 Deep Analysis

📄 Full Content

FOCUSED CHAIN-OF-THOUGHT: EFFICIENT LLM REASONING VIA STRUCTURED INPUT INFORMATION

Lukas Struppek¹* · Dominik Hintersdorf²,³ · Hannah Struppek⁴ · Daniel Neider⁵,⁶ · Kristian Kersting²,³,⁷,⁸

¹FAR.AI, ²German Research Center for Artificial Intelligence (DFKI), ³Technical University of Darmstadt, ⁴University of Kassel, ⁵TU Dortmund University, ⁶TU Center for Trustworthy Data Science and Security, University Alliance Ruhr, ⁷Hessian Center for AI (Hessian.AI), ⁸Centre for Cognitive Science, Technical University of Darmstadt

*Work mainly done at DFKI/Technical University of Darmstadt. Contact: FirstName@far.ai.

1 INTRODUCTION

Large language models (LLMs) are trained to predict the next token given a sequence of previous ones. Scaling model parameters and training data has substantially improved their performance on mathematical reasoning benchmarks, with recent models continuing to push the state of the art. Many LLMs reveal their internal reasoning by producing an explicit chain-of-thought (CoT) (Wei et al., 2022): a step-by-step, natural-language rationale that makes the reasoning traceable to humans. While CoT outputs increase transparency, they also produce long reasoning traces that are costly in time and computation. Moreover, locating errors within a long CoT is challenging, since the entire trace must typically be checked to identify mistakes and verify correctness.

The reasoning processes of LLMs are often compared to human logical thinking. Foundational work in cognitive psychology, such as the Active Control of Thought (ACT) framework (Anderson, 1976), models human problem-solving as a sequential, resource-efficient process that begins with the representation and structuring of information before higher-order reasoning. Modern LLMs exhibit analogous reasoning behavior. However, in LLMs the stages of information extraction and structuring are not clearly separated from the subsequent reasoning phase and are often interwoven with it. We hypothesize that this entanglement blurs the boundary between relevant and irrelevant information, thereby complicating the LLM reasoning process and contributing to the generation of unnecessary tokens.

[Figure 1 contrasts two panels, "Traditional Chain-of-Thought Reasoning" and "Focused Chain-of-Thought Reasoning", on an arithmetic word problem: "Eliza's rate per hour for the first 40 hours she works each week is $10. She also receives an overtime pay of 1.2 times her regular hourly rate. If Eliza worked for 45 hours this week, how much are her earnings for this week?" Standard zero-shot CoT ("Please reason step by step") produces 1,297 output tokens, while F-CoT first runs an information-extraction prompt ("Please extract all critical information and the underlying question from the given sample", 87 output tokens) and then applies zero-shot CoT to the extracted context (644 output tokens).]

Figure 1: Focused Chain-of-Thought reasoning. The model first extracts key information into an XML-like context block and then performs reasoning based on that block. The context can also be pre-defined by the user or generated automatically by a larger LLM. When queried using only the context, large reasoning models produce significantly shorter reasoning traces compared to standard natural-language inputs. In this particular example, Qwen3 14B produces 43% fewer tokens compared to standard CoT prompting. Shown prompts are abbreviated; see Appx. A.1 and A.5 for full prompts.

In this paper, we introduce Focused Chain-of-Thought (F-CoT), a novel …
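To make the two-stage pipeline concrete, the following is a minimal Python sketch of F-CoT as described above: one call that extracts the essential facts into a structured context block, and a second call that reasons only over that block. It assumes an OpenAI-compatible chat endpoint with credentials configured in the environment; the model name, the `chat` and `focused_cot` helpers, and the exact XML tag schema are illustrative assumptions rather than the authors' released code, while the prompt wording follows the abbreviated prompts shown in Figure 1.

```python
# Minimal F-CoT sketch (illustrative; not the authors' released code).
# Assumes an OpenAI-compatible chat endpoint and an API key in the
# environment. Model name, tag schema, and helpers are placeholders.
from openai import OpenAI

client = OpenAI()
MODEL = "qwen3-14b"  # placeholder; the Figure 1 example uses Qwen3 14B

# Stage-1 prompt: abbreviated wording from Figure 1, extended with an
# assumed XML-like output schema.
EXTRACT_PROMPT = (
    "Please extract all critical information and the underlying question "
    "from the given sample. Return a concise, structured context block, "
    "e.g. <context><fact>...</fact><question>...</question></context>."
)
# Stage-2 prompt: standard zero-shot CoT, restricted to the context block.
REASON_PROMPT = (
    "Please reason step by step, using only the information in the "
    "following context block, and state the final answer."
)

def chat(system_prompt: str, user_content: str) -> str:
    """One chat-completion call; returns the assistant's text."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_content},
        ],
    )
    return response.choices[0].message.content

def focused_cot(question: str) -> str:
    # Stage 1: organize the essential information into a structured context.
    context = chat(EXTRACT_PROMPT, question)
    # Stage 2: reason exclusively over the extracted context instead of the
    # original verbose question, which yields shorter reasoning traces.
    return chat(REASON_PROMPT, context)

if __name__ == "__main__":
    question = (
        "Eliza's rate per hour for the first 40 hours she works each week "
        "is $10. She also receives an overtime pay of 1.2 times her regular "
        "hourly rate. If Eliza worked for 45 hours this week, how much are "
        "her earnings for this week?"
    )
    print(focused_cot(question))
```

On the running example, the correct answer is $460: 40 hours at the regular $10 rate give $400, and the remaining 5 overtime hours at 1.2 × $10 = $12 give $60. As the Figure 1 caption notes, the context block need not come from the same model; it can be pre-defined by the user or produced by a different (e.g., larger) LLM, with the reasoning model consuming only the compact block.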

Reference

This content is AI-processed based on open access ArXiv data.
