Research 2026-05-11
$f$-Divergence Regularized RLHF: Two Tales of Sampling and Unified Analyses
Source: arXiv cs.AI
arXiv:2605.06977v1 Announce Type: cross Abstract: Reinforcement Learning from Human Feedback (RLHF) has become a cornerstone technique for post-training large language models. While most existing approaches rely on reverse KL regularization, recent empirical studies have begun exploring...
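The abstract refers to reverse KL regularization, the standard penalty that keeps the fine-tuned policy close to a reference model. As a minimal sketch (not the paper's method), the classic regularized objective for a categorical policy can be written as `E_pi[r] - beta * KL(pi || pi_ref)`; the function names and setup below are illustrative assumptions:

```python
import numpy as np

def reverse_kl(pi, pi_ref):
    """Reverse KL divergence D_KL(pi || pi_ref) for categorical distributions.

    'Reverse' means the expectation is taken under the trained policy pi,
    not under the reference pi_ref (which would be the forward direction).
    """
    pi = np.asarray(pi, dtype=float)
    pi_ref = np.asarray(pi_ref, dtype=float)
    mask = pi > 0  # use the convention 0 * log 0 = 0
    return float(np.sum(pi[mask] * np.log(pi[mask] / pi_ref[mask])))

def regularized_objective(pi, pi_ref, rewards, beta):
    """Standard reverse-KL-regularized RLHF objective:
    expected reward under pi minus a beta-weighted KL penalty."""
    expected_reward = float(np.dot(pi, rewards))
    return expected_reward - beta * reverse_kl(pi, pi_ref)
```

The f-divergence framing in the title generalizes this: reverse KL corresponds to the generator f(t) = t log t, and swapping in a different convex f yields alternative regularizers with different sampling behavior.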