Research | 2026-04-28
Explanation Quality Assessment as Ranking with Listwise Rewards
Source: arXiv cs.AI
arXiv:2604.24176v1 (Announce Type: new)
Abstract: We reformulate explanation quality assessment as a ranking problem rather than a generation problem. Instead of optimizing models to produce a single "best" explanation token-by-token, we train reward models to discriminate among multiple candidate...
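For readers unfamiliar with listwise objectives, here is a minimal sketch of one common choice, the Plackett-Luce negative log-likelihood over a ranked list of candidates. The abstract excerpt does not say which listwise loss the paper uses, so this is an illustrative assumption and not the authors' method; the function name `listwise_nll` and the example scores are hypothetical.

```python
import math


def listwise_nll(scores, ranking):
    """Plackett-Luce negative log-likelihood (a generic listwise loss).

    scores  : one reward-model score per candidate explanation
    ranking : candidate indices ordered from best to worst

    Assumption: this is a textbook listwise objective, not necessarily
    the loss used in the paper.
    """
    nll = 0.0
    remaining = list(ranking)
    # At each step, the top remaining candidate should win a softmax
    # over all candidates that have not yet been placed.
    while len(remaining) > 1:
        top = max(scores[i] for i in remaining)
        # Numerically stable log-sum-exp over the remaining candidates.
        log_z = top + math.log(sum(math.exp(scores[i] - top) for i in remaining))
        nll += log_z - scores[remaining[0]]
        remaining.pop(0)
    return nll


# Hypothetical scores for three candidate explanations.
example_scores = [2.0, 1.0, 0.0]
agrees = listwise_nll(example_scores, [0, 1, 2])     # ranking matches the scores
disagrees = listwise_nll(example_scores, [2, 1, 0])  # ranking reverses the scores
```

The loss is lower when the scores agree with the target ranking, so minimizing it pushes a reward model to order the whole candidate list correctly instead of scoring one candidate in isolation.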