Policy 2026-05-11

Flow-OPD: On-Policy Distillation for Flow Matching Models

arXiv:2605.08063v1 (announce type: cross)

Abstract: Existing Flow Matching (FM) text-to-image models suffer from two critical bottlenecks under multi-task alignment: the reward sparsity induced by scalar-valued rewards, and the gradient interference arising from jointly optimizing heterogeneous...
