Research2026-05-05
Preference Goal Tuning: Post-Training as Latent Control for Frozen Policies
Source: Arxiv CS.AI
arXiv:2412.02125v2 Announce Type: replace Abstract: Goal-conditioned policies enable decision-making models to execute diverse behaviors based on specified goals, yet their downstream performance is often highly sensitive to the choice of instructions or prompts. To bypass the limitations of...
arxivpapers