Prism: Spectral Parameter Sharing for Multi-Agent Reinforcement Learning

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, refer to the original arXiv paper.

Parameter sharing is a key strategy in multi-agent reinforcement learning (MARL) for improving scalability, yet conventional fully shared architectures often collapse into homogeneous behaviors. Recent methods introduce diversity through clustering, pruning, or masking, but typically compromise resource efficiency. We propose Prism, a parameter sharing framework that induces inter-agent diversity by representing shared networks in the spectral domain via singular value decomposition (SVD). All agents share the singular vector directions while learning distinct spectral masks on singular values. This mechanism encourages inter-agent diversity and preserves scalability. Extensive experiments on both homogeneous (LBF, SMACv2) and heterogeneous (MaMuJoCo) benchmarks show that Prism achieves competitive performance with superior resource efficiency.

💡 Research Summary

The paper addresses a central tension in multi‑agent reinforcement learning (MARL): parameter sharing dramatically reduces memory and computation, yet fully shared policies often collapse into homogeneous behaviors that are sub‑optimal when agents need to specialize. Existing remedies—clustering agents, node‑level pruning, or edge‑level masking—introduce diversity but at the cost of additional storage, sensitivity to early data, or reduced scalability.

Prism proposes a fundamentally different approach: instead of sharing raw weight matrices, the shared network is parameterized in the spectral domain using singular value decomposition (SVD). A weight matrix \(W\in\mathbb{R}^{d\times k}\) is expressed as \(W = U\Sigma V^\top\), where the columns of \(U\) (left singular vectors) and \(V\) (right singular vectors) form orthonormal bases shared by all agents, and \(\Sigma\) contains the singular values. Diversity is introduced by giving each agent \(i\) a learnable mask \(m_i\in\mathbb{R}^r\), with one entry per singular value, that modulates the singular values.
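As a minimal sketch of what masking singular values could look like, assume each agent's effective weight takes the form \(W_i = U\,\mathrm{diag}(m_i \odot \sigma)\,V^\top\), where \(\sigma\) is the vector of singular values on the diagonal of \(\Sigma\). This elementwise form, the function names `init_shared_factors` and `agent_weight`, and the random initialization are assumptions made for illustration; the paper's exact masking rule and training procedure may differ.

```python
import numpy as np


def init_shared_factors(d, k, r, seed=0):
    """Return shared factors U (d x r), sigma (r,), V (k x r) from the SVD of a
    random matrix. Random initialization is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((d, k))
    # Reduced SVD: U is (d, min(d, k)), sigma is (min(d, k),), Vt is (min(d, k), k).
    U, sigma, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :r], sigma[:r], Vt[:r, :].T


def agent_weight(U, sigma, V, mask):
    """Assumed masking rule: W_i = U diag(mask * sigma) V^T."""
    # Scaling the columns of U by (mask * sigma) is equivalent to multiplying
    # by the diagonal matrix, without materializing it.
    return (U * (mask * sigma)) @ V.T


d, k, r, n_agents = 8, 6, 6, 3  # hypothetical sizes, with r <= min(d, k)
U, sigma, V = init_shared_factors(d, k, r)

# One mask per agent; in practice these would be learned, here they are random.
masks = np.random.default_rng(1).uniform(0.0, 1.0, size=(n_agents, r))
weights = [agent_weight(U, sigma, V, m) for m in masks]
```

Under this assumed form, every agent reuses the same \(U\) and \(V\), so each additional agent adds only \(r\) mask entries rather than the \(d \times k\) entries of a separate weight matrix.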
