BeClaude
Research · 2026-05-08

Same Words, Different Judgments: How Preferences Vary Across Modalities

Source: arXiv cs.AI

arXiv:2602.22710v2
Announce Type: replace-cross

Abstract: Preference-based reinforcement learning (PbRL) is the dominant framework for aligning AI systems to human preferences. However, evaluation protocols for such data were designed for text and have not been validated for speech. We present the...

arxiv, papers