BeClaude
Research 2026-05-07

MICA: Multi-granularity Intertemporal Credit Assignment for Long-Horizon Emotional Support Dialogue

Source: Arxiv CS.AI

arXiv:2603.06194v2 (announce type: replace-cross)

Abstract: Reinforcement learning (RL) for large language models (LLMs) has shown strong performance on single-turn tasks, but extending it to multi-turn interaction remains challenging due to sparse rewards and poor per-turn credit assignment. In...

arxiv, papers