Agent-GSPO: Communication-Efficient Multi-Agent Systems via Group Sequence Policy Optimization

February 22, 2026

Reading time: 1 minute

📝 Original Info

Title: Agent-GSPO: Communication-Efficient Multi-Agent Systems via Group Sequence Policy Optimization
ArXiv ID: 2510.22477
Date: 2025-10-26
Authors: Not provided (the paper does not include author information.)

📝 Abstract

To combat the prohibitive communication costs of "free-for-all" multi-agent systems (MAS), we introduce Agent-GSPO, a framework that directly optimizes for token economy using sequence-level reinforcement learning. Agent-GSPO leverages the stable and memory-efficient Group Sequence Policy Optimization (GSPO) algorithm to train agents on a communication-aware reward that explicitly penalizes verbosity. Across seven reasoning benchmarks, Agent-GSPO not only achieves new state-of-the-art performance but does so with a fraction of the token consumption of existing methods. By fostering emergent strategies like "strategic silence," our approach provides a practical blueprint for developing scalable and economically viable multi-agent systems.
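
The abstract does not give the reward or the objective in closed form, so the following is only a rough sketch of how a verbosity-penalized reward could plug into a GSPO-style update. The linear token penalty, the `token_penalty` and `eps` values, and all function names are assumptions made for illustration, not the authors' formulation. The length-normalized sequence-level ratio and the group-normalized advantage follow the general GSPO recipe.

```python
import math
from statistics import mean, pstdev


def communication_aware_reward(correct, tokens_used, token_penalty=0.001):
    """Hypothetical reward: task success minus a linear verbosity penalty."""
    return (1.0 if correct else 0.0) - token_penalty * tokens_used


def group_advantages(rewards):
    """Standardize rewards within one group of responses to the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0.0:
        # All responses scored the same, so there is no learning signal.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]


def sequence_ratio(new_logprobs, old_logprobs):
    """Sequence-level importance ratio, normalized by response length."""
    length = len(new_logprobs)
    return math.exp((sum(new_logprobs) - sum(old_logprobs)) / length)


def clipped_objective(ratio, advantage, eps=0.2):
    """Clipped surrogate applied once per sequence (eps is illustrative)."""
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)


# One group of three sampled responses: (answer correct?, tokens spent).
group = [(True, 20), (True, 200), (False, 0)]
rewards = [communication_aware_reward(c, t) for c, t in group]
advantages = group_advantages(rewards)
```

Under these assumptions the concise correct response receives the largest advantage, the verbose correct response a smaller one, and the incorrect response the lowest, which is the kind of pressure toward token economy the abstract describes.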
