Policy 2026-05-11
Flow-OPD: On-Policy Distillation for Flow Matching Models
Source: arXiv cs.AI
arXiv:2605.08063v1 (announce type: cross)
Abstract: Existing Flow Matching (FM) text-to-image models suffer from two critical bottlenecks under multi-task alignment: the reward sparsity induced by scalar-valued rewards, and the gradient interference arising from jointly optimizing heterogeneous...