LoopBench: Discovering Emergent Symmetry Breaking Strategies with LLM Swarms


📝 Original Info

  • Title: LoopBench: Discovering Emergent Symmetry Breaking Strategies with LLM Swarms
  • ArXiv ID: 2512.13713
  • Date: 2025-12-07
  • Authors: Ali Parsaee¹, Yashar Talebirad¹, Csongor Szepesvári¹, Vishwajeet Ohal¹, Eden Redman² (¹ University of Alberta, Edmonton, Canada; ² Network for Applied Technology, Edmonton, Canada)

📝 Abstract

Large Language Models (LLMs) are increasingly being utilized as autonomous agents, yet their ability to coordinate in distributed systems remains poorly understood. We introduce **LoopBench**, a benchmark to evaluate LLM reasoning in distributed symmetry breaking and meta-cognitive thinking. The benchmark focuses on coloring odd cycle graphs (C3, C5, C11) with limited colors, where deterministic, non-communicating agents fail in infinite loops. A strategy passing mechanism is implemented as a form of consistent memory. We show that while standard LLMs and classical heuristics struggle, advanced reasoning models (e.g., O3) devise strategies to escape deadlocks. LoopBench allows the study of emergent distributed algorithms based on language-based reasoning, offering a testbed for collective intelligence.

📄 Full Content

Ali Parsaee¹, Yashar Talebirad¹, Csongor Szepesvári¹, Vishwajeet Ohal¹, and Eden Redman²
¹ University of Alberta, Edmonton, Canada ({parsaee,talebira,csongor,ohal}@ualberta.ca)
² Network for Applied Technology, Edmonton, Canada (eden@nat.ltd)

Keywords: multi-agent systems · collective intelligence · large language models · distributed systems · swarm intelligence · reasoning benchmarks

1 Introduction

Large Language Models (LLMs) are evolving beyond isolated chatbots into the building blocks of autonomous multi-agent systems. The true potential of these "AI swarms" lies not just in individual problem-solving, but in their ability to solve complex problems without a central coordinator. However, coordinating a distributed system of reasoning agents presents a fundamental challenge: can independent LLMs, driven by local prompts and limited observations, autonomously invent the strategies necessary to collaborate? This ability hinges on meta-cognitive reasoning, which we define as the capacity to adopt strategies that go beyond immediate local optimization.
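The deadlock that purely local optimization produces on an over-constrained odd cycle is easy to reproduce outside the benchmark. The following is a minimal sketch of ours, not the benchmark's code: five identical deterministic agents on C5, each synchronously recoloring to whichever of two colors conflicts least with its neighbors, oscillate forever without ever breaking symmetry.

```python
def cycle_edges(n):
    """Edges of the cycle graph C_n."""
    return [(i, (i + 1) % n) for i in range(n)]

def conflicts(colors, edges):
    """Number of monochromatic (conflicting) edges."""
    return sum(colors[u] == colors[v] for u, v in edges)

def greedy_step(colors, num_colors=2):
    """Synchronous deterministic greedy: every node simultaneously picks
    the color that conflicts least with its neighbors' current colors,
    breaking ties toward the lowest color index."""
    n = len(colors)
    return tuple(
        min(range(num_colors),
            key=lambda c: [colors[(i - 1) % n], colors[(i + 1) % n]].count(c))
        for i in range(n)
    )

n = 5
edges = cycle_edges(n)
state = (0,) * n            # symmetric start: every agent sees the same view
history = [state]
for _ in range(6):
    state = greedy_step(state)
    history.append(state)

# Every agent reasons identically, so symmetry is never broken: the swarm
# flips between all-0 and all-1 forever, with 5 conflicts at every step.
assert history[0] == history[2] == (0,) * n
assert history[1] == history[3] == (1,) * n
```

Two colors on C5 are over-constrained (an odd cycle needs three), so at least one conflicting edge is unavoidable; the point is that identical deterministic reasoning leaves the swarm stuck at five conflicts rather than the achievable one.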
We wanted to test for meta-cognitive reasoning by creating a setting where thinking narrowly results in deadlock, but "zooming out" reveals the solution. This led us to the fundamental problem of loop breaking in graphs. Specifically, we focus on over-constrained odd cycles, a scenario where perfect solutions are impossible and simple greedy heuristics lead to endless oscillation loops and fail. This setting forces agents to go beyond immediate conflict resolution and adopt novel strategies that benefit the collective instead.

We introduce LoopBench, a benchmark focused on symmetry breaking in over-constrained odd-cycle graphs (C3, C5, C11). By analyzing quantitative performance (conflict minimization) and qualitative strategy evolution (via feed-forward strategies), we measure the "Reasoning Gap" between models.

We demonstrate that advanced reasoning models like O3 successfully break symmetries by developing strategies such as "waiting" or history-based pseudo-randomness. We interpret this capacity to detect and escape deadlock as a form of meta-cognitive thinking, enabling agents to override immediate greedy incentives for long-term coordination. In contrast, classical heuristics and weaker LLMs often remain stuck. This work explores the following:

1. A framework for evaluating distributed agents on symmetry-breaking tasks.
2. An LLM-agent architecture using consistent memory to enable emergent and evolving strategies.
3. Experiments showing a performance gap between different LLMs as well as algorithmic baselines.
4. Qualitative analysis of emergent strategies showing how meta-cognitive reasoning arises from local observations.

2 Related Work

General multi-agent frameworks like AgentVerse [2] and AgentBench [6] investigate how LLM agents collaborate within structured environments. GPTSwarm [9] takes a graph-optimization approach, refining both agent prompts and their connectivity.
For graph-based tasks specifically, GraphAgent [5] employs a distributed approach where node-specific agents communicate synchronously, achieving high accuracy on large-scale polynomial-time problems. The AgentsNet benchmark [4] evaluates decentralized protocols like leader election and coloring under strict communication constraints.

Self-improvement mechanisms have also proven effective. Reflexion [8] uses verbal reinforcement and episodic memory to help agents correct mistakes in subsequent trials. Research on self-reflection indicates that structured templates, such as prompts for retrying or summarizing solutions, enhance performance across various domains [7]. Additionally, FINEREASON [1] demonstrates that practicing self-correction in puzzles improves mathematical reasoning.

LoopBench aligns with multi-agent reasoning and reflective frameworks, but instead of introducing a new solver, we provide a minimal benchmark designed to test symmetry breaking under severe information constraints. Our feed-forward
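The "waiting" strategy that the introduction reports strong reasoning models discovering can be approximated mechanically: if each agent sometimes holds its color instead of greedily recoloring, the symmetric lockstep breaks. In the sketch below, a coin flip is our illustrative stand-in for the history-based pseudo-randomness the paper describes; it is not the models' actual behavior.

```python
import random

def cycle_edges(n):
    return [(i, (i + 1) % n) for i in range(n)]

def conflicts(colors, edges):
    return sum(colors[u] == colors[v] for u, v in edges)

def waiting_step(colors, rng, num_colors=2, p_act=0.5):
    """Each agent independently either 'waits' (keeps its color) or
    recolors greedily against its neighbors' current colors."""
    n = len(colors)
    new = []
    for i in range(n):
        if rng.random() < p_act:
            nbrs = [colors[(i - 1) % n], colors[(i + 1) % n]]
            new.append(min(range(num_colors), key=nbrs.count))
        else:
            new.append(colors[i])   # wait: fall out of lockstep
    return tuple(new)

rng = random.Random(0)
n, rounds = 5, 200
edges = cycle_edges(n)
state = (0,) * n
conf_history = [conflicts(state, edges)]
for _ in range(rounds):
    state = waiting_step(state, rng)
    conf_history.append(conflicts(state, edges))

# On an odd cycle with 2 colors the number of conflicting edges is
# always odd (color changes around a cycle come in pairs), and 1 is the
# optimum; random waiting almost surely breaks the all-same-color
# symmetry, dropping conflicts below the deadlocked 5.
assert all(c % 2 == 1 for c in conf_history)
assert min(conf_history) <= 3
```

The contrast with the deterministic greedy dynamic is the point: any mechanism that desynchronizes identical agents, whether a coin flip, a learned waiting rule, or history-based pseudo-randomness, lets the swarm escape the oscillation that pure local greed cannot.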

Reference

This content is AI-processed based on open access ArXiv data.
