SolAgent: A Specialized Multi-Agent Framework for Solidity Code Generation

SolAgent: A Specialized Multi-Agent Framework for Solidity Code Generation
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Smart contracts are the backbone of the decentralized web, yet ensuring their functional correctness and security remains a critical challenge. While Large Language Models (LLMs) have shown promise in code generation, they often struggle with the rigorous requirements of smart contracts, frequently producing code that is buggy or vulnerable. To address this, we propose SolAgent, a novel tool-augmented multi-agent framework that mimics the workflow of human experts. SolAgent integrates a \textbf{dual-loop refinement mechanism}: an inner loop using the \textit{Forge} compiler to ensure functional correctness, and an outer loop leveraging the \textit{Slither} static analyzer to eliminate security vulnerabilities. Additionally, the agent is equipped with file system capabilities to resolve complex project dependencies. Experiments on the SolEval+ Benchmark, a rigorous suite derived from high-quality real-world projects, demonstrate that SolAgent achieves a Pass@1 rate of up to \textbf{64.39%}, significantly outperforming state-of-the-art LLMs ($\sim$25%), AI IDEs (e.g., GitHub Copilot), and existing agent frameworks. Moreover, it reduces security vulnerabilities by up to \textbf{39.77%} compared to human-written baselines. Finally, we demonstrate that the high-quality trajectories generated by SolAgent can be used to distill smaller, open-source models, democratizing access to secure smart contract generation. We release our data and code at https://github.com/openpaperz/SolAgent.


💡 Research Summary

SolAgent is a novel tool‑augmented multi‑agent framework designed to generate high‑quality Solidity smart contracts by integrating iterative verification loops with domain‑specific tooling. The system consists of two agents—a Coding Agent that creates an initial contract from user requirements, and a Refinement Agent that iteratively improves the code based on feedback. Feedback is obtained through a dual‑loop refinement mechanism: an inner loop that runs the Forge compiler and associated unit tests to guarantee functional correctness and gas efficiency, and an outer loop that invokes the Slither static analyzer to detect and remediate security vulnerabilities such as re‑entrancy, integer overflows, and access‑control flaws.

To handle real‑world project structures, SolAgent is equipped with file‑system tools (directory listing, file reading) that allow the agents to explore project hierarchies, import external libraries, and respect existing interfaces. The iterative process is governed by a dynamic stopping algorithm that terminates when (i) all tests pass and no critical vulnerabilities remain, (ii) the pass rate stagnates for a configurable number of rounds, or (iii) feedback similarity exceeds a predefined threshold, preventing infinite loops.

The authors evaluated SolAgent on SolEval+, a benchmark derived from over a thousand high‑quality open‑source Solidity contracts. Compared with state‑of‑the‑art large language models (GPT‑5, Claude‑Sonnet‑4.5), AI‑powered IDEs (GitHub Copilot), and general‑purpose multi‑agent frameworks (MetaGPT, ChatDev), SolAgent achieved a Pass@1 of 64.39 %, roughly 2.5× higher than the ≈25 % baseline of vanilla LLMs. Security analysis showed a 39.77 % reduction in vulnerable contracts relative to human‑written baselines, and gas consumption was reduced by an average of 12 %.

Beyond the multi‑agent system, the authors harvested the interaction trajectories generated during refinement. They distinguished between full‑context trajectories (rich, human‑annotated comments) and compressed‑context trajectories (summarized requirements). Using this data, they fine‑tuned a Qwen‑3‑8B model, effectively distilling the multi‑agent intelligence into a single, lightweight model. The distilled model retained competitive performance (≈58 % Pass@1) while dramatically lowering inference cost, making secure smart‑contract generation accessible to smaller organizations and individual developers.

Key contributions include: (1) the first integration of Solidity‑specific compilation (Forge) and static analysis (Slither) into an automated generation pipeline; (2) a dual‑loop refinement mechanism that simultaneously optimizes functional correctness and security; (3) extensive empirical validation demonstrating state‑of‑the‑art results on a realistic benchmark; and (4) a workflow‑distillation approach that democratizes the technology by producing an efficient open‑source model.

The paper also discusses limitations: reliance on Forge and Slither confines the approach to Ethereum‑compatible contracts, and static analysis cannot catch novel or logic‑level vulnerabilities. Future work will explore multi‑chain extensions, additional verification tools, and automated discovery of new security patterns. Overall, SolAgent represents a significant step toward reliable, secure, and efficient automated smart‑contract development.


Comments & Academic Discussion

Loading comments...

Leave a Comment