📝 Original Info
- Title: LoopBench: Discovering Emergent Symmetry Breaking Strategies with LLM Swarms
- ArXiv ID: 2512.13713
- Date: 2025-12-07
- Authors: Ali Parsaee¹, Yashar Talebirad¹, Csongor Szepesvári¹, Vishwajeet Ohal¹, Eden Redman² (¹ University of Alberta, Edmonton, Canada; ² Network for Applied Technology, Edmonton, Canada)
📝 Abstract
Large Language Models (LLMs) are increasingly being utilized as autonomous agents, yet their ability to coordinate in distributed systems remains poorly understood. We introduce **LoopBench**, a benchmark to evaluate LLM reasoning in distributed symmetry breaking and meta-cognitive thinking. The benchmark focuses on coloring odd cycle graphs ($C_3$, $C_5$, $C_{11}$) with limited colors, where deterministic, non-communicating agents fail in infinite loops. A strategy passing mechanism is implemented as a form of consistent memory. We show that while standard LLMs and classical heuristics struggle, advanced reasoning models (e.g., O3) devise strategies to escape deadlocks. LoopBench allows the study of emergent distributed algorithms based on language-based reasoning, offering a testbed for collective intelligence.
📄 Full Content
LoopBench: Discovering Emergent Symmetry Breaking Strategies with LLM Swarms
Ali Parsaee¹, Yashar Talebirad¹, Csongor Szepesvári¹, Vishwajeet Ohal¹, and Eden Redman²
¹ University of Alberta, Edmonton, Canada ({parsaee,talebira,csongor,ohal}@ualberta.ca)
² Network for Applied Technology, Edmonton, Canada (eden@nat.ltd)
Abstract. Large Language Models (LLMs) are increasingly being utilized as autonomous agents, yet their ability to coordinate in distributed systems remains poorly understood. We introduce LoopBench, a benchmark to evaluate LLM reasoning in distributed symmetry breaking and meta-cognitive thinking. The benchmark focuses on coloring odd cycle graphs ($C_3$, $C_5$, $C_{11}$) with limited colors, where deterministic, non-communicating agents fail in infinite loops. A strategy passing mechanism is implemented as a form of consistent memory. We show that while standard LLMs and classical heuristics struggle, advanced reasoning models (e.g., O3) devise strategies to escape deadlocks. LoopBench allows the study of emergent distributed algorithms based on language-based reasoning, offering a testbed for collective intelligence.
Keywords: multi-agent systems · collective intelligence · large language models · distributed systems · swarm intelligence · reasoning benchmarks
1 Introduction
Large Language Models (LLMs) are evolving beyond isolated chatbots into the building blocks of autonomous multi-agent systems. The true potential of these "AI swarms" lies not just in individual problem-solving, but in their ability to solve complex problems without a central coordinator. However, coordinating a distributed system of reasoning agents presents a fundamental challenge: can independent LLMs, driven by local prompts and limited observations, autonomously invent the strategies necessary to collaborate? This ability hinges on meta-cognitive reasoning, which we define as the capacity to adopt strategies that go beyond immediate local optimization.
We test for meta-cognitive reasoning by creating a setting where thinking narrowly results in deadlock, but 'zooming out' reveals the solution. This led us to the fundamental problem of loop breaking in graphs. Specifically, we focus on over-constrained odd cycles, a scenario where perfect solutions are impossible and simple greedy heuristics fall into endless oscillation loops.
This setting forces agents to go beyond immediate conflict resolution and adopt
novel strategies that benefit the collective instead.
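
To make the failure mode concrete, here is a minimal sketch (ours, not the paper's harness) of synchronous greedy best-response agents on $C_3$ with two colors. From a symmetric start, every deterministic agent makes the same move, so the swarm flips between all-0 and all-1 forever:

```python
# Minimal sketch (not the paper's harness): deterministic greedy
# agents on the 3-cycle C_3 with 2 colors, updating synchronously.
N, COLORS = 3, (0, 1)

def conflicts(coloring):
    """Count edges whose endpoints share a color."""
    return sum(coloring[i] == coloring[(i + 1) % N] for i in range(N))

def greedy_step(coloring):
    """Each node picks the color minimizing its local conflicts;
    ties break deterministically toward the lowest color."""
    new = []
    for i in range(N):
        nbrs = (coloring[i - 1], coloring[(i + 1) % N])
        new.append(min(COLORS, key=lambda c: sum(c == b for b in nbrs)))
    return new

state = [0, 0, 0]  # symmetric start: every agent sees the same thing
for step in range(4):
    print(step, state, "conflicts:", conflicts(state))
    state = greedy_step(state)
# Prints [0,0,0] -> [1,1,1] -> [0,0,0] -> ... : an endless oscillation
# loop that no deterministic, non-communicating policy can break.
```

Randomization or externally imposed asymmetry is the textbook escape; LoopBench instead asks whether the agents can reason their way to such a strategy on their own.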
We introduce LoopBench, a benchmark focused on symmetry breaking in over-constrained odd-cycle graphs ($C_3$, $C_5$, $C_{11}$). By analyzing quantitative performance (conflict minimization) and qualitative strategy evolution (via feed-forward strategies), we measure the "Reasoning Gap" between models.
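
As a worked illustration of the quantitative metric (our reading of "conflict minimization"; the exact scoring formula is not given in this excerpt, and we assume the over-constrained two-color variant), note that an odd cycle is not 2-colorable, so the best achievable score is a single conflicting edge:

```python
def cycle_conflicts(coloring):
    """Conflicting (monochromatic) edges of a cycle coloring."""
    n = len(coloring)
    return sum(coloring[i] == coloring[(i + 1) % n] for i in range(n))

# A proper 2-coloring would make the cycle bipartite, which odd
# cycles are not, so at least one conflict is unavoidable:
assert cycle_conflicts([0, 1, 0, 1, 0]) == 1      # optimal on C_5
assert cycle_conflicts([0] * 11) == 11            # worst case on C_11
```

The same parity argument gives a floor of one conflict on each of $C_3$, $C_5$, and $C_{11}$ in the two-color case.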
We demonstrate that advanced reasoning models like O3 successfully break symmetries by developing strategies such as "waiting" or history-based pseudo-randomness. We interpret this capacity to detect and escape deadlock as a form of meta-cognitive thinking, enabling agents to override immediate greedy incentives for long-term coordination. In contrast, classical heuristics and weaker LLMs often remain stuck. This work explores the following:
1. A framework for evaluating distributed agents on symmetry-breaking tasks.
2. An LLM-agent architecture using consistent memory to enable emergent and evolving strategies (a sketch of this loop follows the list).
3. Experiments showing a performance gap between different LLMs as well as algorithmic baselines.
4. Qualitative analysis of emergent strategies showing how meta-cognitive reasoning arises from local observations.
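
As a hedged illustration of contribution 2, the sketch below shows one plausible shape for such an agent loop. The names (`agent_step`, `strategy_note`) are ours, not the paper's API; the loop detection and history-hash tie-break imitate the "waiting" and history-based pseudo-randomness strategies described above, under the added assumption that each agent can see a stable local identifier.

```python
import hashlib

def agent_step(node_id, color, nbr_colors, history, strategy_note):
    """One round for one node: observe neighbors, consult the
    feed-forward strategy note (consistent memory), then act.
    Returns (new_color, new_strategy_note)."""
    history.append((color, tuple(sorted(nbr_colors))))
    if all(c != color for c in nbr_colors):
        return color, "hold: no local conflict"
    # Meta-cognition stand-in: spot a period-2 oscillation loop.
    looping = len(history) >= 3 and history[-1] == history[-3]
    if looping or "loop detected" in strategy_note:
        # History-based pseudo-randomness: hash the private history
        # so symmetric agents diverge without exchanging messages.
        digest = hashlib.sha256(repr((node_id, history)).encode()).digest()
        if digest[0] % 2:
            return color, "loop detected: waiting one round"
    return 1 - color, "flip: greedy local repair (2 colors)"
```

Here the returned note is simply carried into the agent's next round; in LoopBench itself the consistent memory is presumably free-form language the LLM writes for its future self, which is what lets strategies evolve rather than being hard-coded as they are in this sketch.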
2 Related Work
General multi-agent frameworks like AgentVerse [2] and AgentBench [6] investigate how LLM agents collaborate within structured environments. GPTSwarm [9] takes a graph-optimization approach, refining both agent prompts and their connectivity. For graph-based tasks specifically, GraphAgent [5] employs a distributed approach where node-specific agents communicate synchronously, achieving high accuracy on large-scale polynomial-time problems. The AgentsNet benchmark [4] evaluates decentralized protocols like leader election and coloring under strict communication constraints.
Self-improvement mechanisms have also proven effective. Reflexion [8] uses verbal reinforcement and episodic memory to help agents correct mistakes in subsequent trials. Research on self-reflection indicates that structured templates, such as prompts for retrying or summarizing solutions, enhance performance across various domains [7]. Additionally, FINEREASON [1] demonstrates that practicing self-correction in puzzles improves mathematical reasoning.
LoopBench aligns with multi-agent reasoning and reflective frameworks, but instead of introducing a new solver, we provide a minimal benchmark designed to test symmetry breaking under severe information constraints. Our feed-forward