Policy · 2026-05-14

Q-Flow: Stable and Expressive Reinforcement Learning with Flow-Based Policy

Source: Arxiv CS.AI

arXiv:2605.13435v1 (Announce Type: cross)

Abstract: There is growing interest in utilizing flow-based models as decision-making policies in reinforcement learning due to their high expressive capacity. However, effectively leveraging this expressivity for value maximization remains challenging, as...

Tags: arxiv, papers, rl