Policy 2026-05-07
Descent-Guided Policy Gradient for Scalable Cooperative Multi-Agent Learning
Source: arXiv cs.AI
arXiv:2602.20078v3 (announce type: replace-cross)
Abstract: Scaling cooperative multi-agent reinforcement learning (MARL) is fundamentally limited by cross-agent noise. When agents share a common reward, each agent's learning signal is computed from a shared return that depends on all agents, so the...
arxiv, papers, agents