BeClaude
Research | 2026-04-24

RIFT: Repurposing Negative Samples via Reward-Informed Fine-Tuning

Source: Arxiv CS.AI

arXiv:2601.09253v2 (announce type: replace-cross)

Abstract: While Supervised Fine-Tuning (SFT) and Rejection Sampling Fine-Tuning (RFT) are standard for LLM alignment, they either rely on costly expert data or discard valuable negative samples, leading to data inefficiency. To address this, we...

Tags: arxiv, papers, fine-tuning