Research · 2026-04-22
Diamond Maps: Efficient Reward Alignment via Stochastic Flow Maps
Source: arXiv cs.AI
arXiv:2602.05993v2 (announce type: replace-cross)
Abstract: Flow and diffusion models produce high-quality samples, but adapting them to user preferences or constraints post-training remains costly and brittle, a challenge commonly called reward alignment. We argue that efficient reward alignment...