AgentSpawn: Adaptive Multi-Agent Collaboration Through Dynamic Spawning for Long-Horizon Code Generation
Long-horizon code generation requires sustained context and adaptive expertise across domains. Current multi-agent systems use static workflows that cannot adapt when runtime analysis reveals unanticipated complexity. We propose AgentSpawn, an architecture enabling dynamic agent collaboration through: (1) automatic memory transfer during spawning, (2) adaptive spawning policies triggered by runtime complexity metrics, and (3) coherence protocols for concurrent modifications. AgentSpawn addresses five critical gaps in existing research: memory continuity, skill inheritance, task resumption, runtime spawning, and concurrent coherence. Experimental validation demonstrates AgentSpawn achieves 34% higher completion rates than static baselines on benchmarks like SWE-bench while reducing memory overhead by 42% through selective slicing.
💡 Research Summary
AgentSpawn addresses the growing challenge of long‑horizon code generation, where a single LLM must maintain context across dozens of inter‑dependent steps, discover unexpected complexity at runtime, and invoke specialized expertise without losing continuity. Existing approaches fall into two camps: single‑agent systems that stretch the context window but struggle with task decomposition, and static multi‑agent pipelines that pre‑define a fixed set of roles and workflows. Neither can adapt when a task suddenly exceeds the capabilities of the current agent configuration.
The paper first articulates five concrete research gaps: (1) continuous stateful memory across agent lifetimes, (2) dynamic inheritance of skills, (3) seamless resume‑with‑context after a sub‑task, (4) runtime complexity‑driven spawning, and (5) inter‑agent memory coherence when multiple agents edit overlapping code. To fill these gaps, the authors propose the AgentSpawn architecture, composed of five modules: a Memory Manager with automatic slicing, a Skill Library organized as an inheritance graph, a Spawn Controller that monitors runtime metrics, a Resume Coordinator that serializes and deserializes agent state, and a Coherence Manager that resolves concurrent modifications.
Memory slicing is formalized in Algorithm 1. Each memory item is scored by a weighted sum of four relevance components: keyword match, dependency score, temporal recency, and semantic similarity. The weights (α = 0.3, β = 0.3, γ = 0.2, δ = 0.2) and a threshold θ = 0.5 ensure that only about half of the parent’s episodic, semantic, and working memory is transferred to the child. Empirically this reduces token usage from 87 K to 51 K (≈ 42 % reduction) while preserving task success.
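The weighted-sum scoring above can be sketched as follows. This is a minimal illustration, not the paper's Algorithm 1: the four component scores are assumed to already be normalized to [0, 1], and how each is computed (keyword matching, dependency analysis, recency decay, embedding similarity) is left abstract.

```python
from dataclasses import dataclass

# Weights and threshold as reported in the summary.
ALPHA, BETA, GAMMA, DELTA = 0.3, 0.3, 0.2, 0.2
THETA = 0.5

@dataclass
class MemoryItem:
    content: str
    keyword_match: float     # overlap with the child sub-task's keywords
    dependency_score: float  # dependencies shared with the sub-task
    recency: float           # temporal recency, 1.0 = most recent
    semantic_sim: float      # embedding similarity to the sub-task

def relevance(item: MemoryItem) -> float:
    """Weighted sum of the four relevance components."""
    return (ALPHA * item.keyword_match
            + BETA * item.dependency_score
            + GAMMA * item.recency
            + DELTA * item.semantic_sim)

def slice_memory(parent_memory: list[MemoryItem]) -> list[MemoryItem]:
    """Transfer only items whose relevance meets the threshold θ."""
    return [m for m in parent_memory if relevance(m) >= THETA]
```

With θ = 0.5 and the weights summing to 1, an item needs roughly half-strength relevance across the components to be copied to the child, which is consistent with the reported ~50% transfer rate.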
Skill inheritance treats each skill as a parameterized prompt (template p plus context‑specific parameters θ_s). When spawning, the system computes a relevance score between the skill’s embedding and the child sub‑task embedding; skills above a threshold τ are copied to the child. After the child finishes, successful skills are promoted to the parent’s library, enabling continual enrichment without retraining.
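The copy-then-promote cycle can be sketched as below. Cosine similarity as the relevance score and the value of τ are assumptions; the paper's embedding model and promotion criteria are not specified in this summary.

```python
import math

TAU = 0.6  # hypothetical relevance threshold

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def inherit_skills(parent_skills: dict[str, list[float]],
                   subtask_embedding: list[float]) -> dict[str, list[float]]:
    """Copy skills whose embedding is relevant to the child sub-task."""
    return {name: emb for name, emb in parent_skills.items()
            if cosine(emb, subtask_embedding) >= TAU}

def promote_skills(parent_skills: dict[str, list[float]],
                   child_successes: dict[str, list[float]]) -> None:
    """After the child finishes, merge its successful skills back."""
    parent_skills.update(child_successes)
```

Because promotion only merges skills that succeeded in the child, the parent's library grows monotonically through use rather than through retraining.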
Adaptive spawning relies on five runtime metrics: file inter‑dependency count (If), cyclomatic complexity (Cc), test‑failure cascade (Fc), working‑memory overflow (Oc), and agent uncertainty (Uc). Each metric is normalized to
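The summary cuts off mid-sentence here, so the exact combination rule is unknown. A plausible sketch, assuming each metric is min-max normalized to [0, 1] and a child agent is spawned when the mean normalized score crosses a threshold: the bounds and threshold below are illustrative assumptions, not values from the paper.

```python
# Assumed (min, max) bounds per runtime metric; these are illustrative.
METRIC_BOUNDS = {
    "If": (0, 50),     # file inter-dependency count
    "Cc": (1, 30),     # cyclomatic complexity
    "Fc": (0, 10),     # test-failure cascade depth
    "Oc": (0.0, 1.0),  # working-memory overflow ratio
    "Uc": (0.0, 1.0),  # agent uncertainty
}
SPAWN_THRESHOLD = 0.7  # hypothetical trigger level

def normalize(name: str, value: float) -> float:
    """Min-max normalize a raw metric into [0, 1], clipping at the bounds."""
    lo, hi = METRIC_BOUNDS[name]
    return min(max((value - lo) / (hi - lo), 0.0), 1.0)

def should_spawn(raw_metrics: dict[str, float]) -> bool:
    """Spawn a child agent when mean normalized load exceeds the threshold."""
    scores = [normalize(name, value) for name, value in raw_metrics.items()]
    return sum(scores) / len(scores) >= SPAWN_THRESHOLD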