Research 2026-05-11
Experience Sharing in Mutual Reinforcement Learning for Heterogeneous Language Models
Source: arXiv cs.AI
arXiv:2605.07244v1 (announce type: cross)
Abstract: We introduce Mutual Reinforcement Learning, a framework for concurrent RL post-training in which heterogeneous LLM policies exchange typed experience while keeping separate parameters, objectives, and tokenizers. The framework combines a Shared...
Tags: arxiv, papers, rl