Research 2026-05-11
Active teacher selection for reward learning
Source: arXiv cs.AI
arXiv:2310.15288v3 Announce Type: replace
Abstract: Reward learning techniques enable machine learning systems to learn objectives from human feedback. A core limitation of these systems is their assumption that all feedback comes from a single human teacher, despite gathering feedback from large...