BeClaude
Research, 2026-05-11

Active teacher selection for reward learning

Source: arXiv cs.AI

arXiv:2310.15288v3 (announce type: replace)

Abstract: Reward learning techniques enable machine learning systems to learn objectives from human feedback. A core limitation of these systems is their assumption that all feedback comes from a single human teacher, despite gathering feedback from large...
