Research 2026-05-07
MICA: Multi-granularity Intertemporal Credit Assignment for Long-Horizon Emotional Support Dialogue
Source: arXiv cs.AI
arXiv:2603.06194v2 (Announce Type: replace-cross)
Abstract: Reinforcement learning (RL) for large language models (LLMs) has shown strong performance in single-turn tasks, but extending it to multi-turn interaction remains challenging due to sparse rewards and poor per-turn credit assignment. In...