Research · 2026-05-12
Users as Annotators: LLM Preference Learning from Comparison Mode
Source: arXiv cs.AI
arXiv:2510.13830v2 (announce type: replace-cross)
Abstract: Pairwise preference data have played an important role in the alignment of large language models (LLMs). Each sample of such data consists of a prompt, two different responses to the prompt, and a binary label indicating which of the two...
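The sample layout the abstract describes (a prompt, two different responses, a binary label) can be sketched as a small record type. This is only an illustrative sketch, not the paper's code: the class name `PreferenceSample`, its field names, and the 0/1 label convention are all hypothetical, and because the abstract is cut off before it says what the label indicates, the sketch assumes it marks which response is preferred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferenceSample:
    """One pairwise preference record: a prompt, two responses, a binary label.

    All names here are hypothetical; the source only lists the three parts.
    """

    prompt: str
    response_a: str
    response_b: str
    # Assumed convention (not stated in the source):
    # 0 means response_a is preferred, 1 means response_b is preferred.
    label: int

    def __post_init__(self) -> None:
        # The label is binary, so reject anything other than 0 or 1.
        if self.label not in (0, 1):
            raise ValueError("label must be 0 or 1")
        # The abstract specifies two *different* responses.
        if self.response_a == self.response_b:
            raise ValueError("the two responses must differ")

    def preferred(self) -> str:
        """Return the response marked as preferred under the assumed convention."""
        return self.response_b if self.label == 1 else self.response_a

    def rejected(self) -> str:
        """Return the response that was not preferred."""
        return self.response_a if self.label == 1 else self.response_b


sample = PreferenceSample(
    prompt="Explain what a hash map is.",
    response_a="A hash map stores key-value pairs and looks keys up via a hash.",
    response_b="It is a kind of map.",
    label=0,
)
print(sample.preferred())
```

Keeping the record immutable (`frozen=True`) and validating it on construction is a design choice for the sketch, so that a malformed pair fails early instead of silently entering a training set.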
arxiv, papers