BeClaude
Research 2026-05-08

Best Arm Identification in Generalized Linear Bandits via Hybrid Feedback

Source: arXiv cs.AI

arXiv:2605.05745v1. Announce Type: new.

Abstract: We study fixed-confidence best arm identification in generalized linear bandits under a hybrid feedback model: at each round, the learner may query either (i) absolute reward feedback from a single arm or (ii) relative (dueling) feedback from an arm...
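
To make the two query types concrete, here is a minimal sketch of a hybrid feedback environment. It is not the paper's algorithm or its exact model. The logistic link, the Bernoulli outcomes, the use of the feature difference for duels, and every name in it (`HybridFeedbackEnv`, `absolute`, `duel`, `best_arm`) are assumptions made for illustration only.

```python
import math
import random


def sigmoid(z):
    """Logistic link function (an assumed choice of GLM link)."""
    return 1.0 / (1.0 + math.exp(-z))


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


class HybridFeedbackEnv:
    """Toy generalized linear bandit offering two kinds of queries.

    Hypothetical illustration: arms are feature vectors, theta is the
    unknown parameter, and all outcomes are Bernoulli draws.
    """

    def __init__(self, arms, theta, seed=0):
        self.arms = arms
        self.theta = theta
        self.rng = random.Random(seed)

    def absolute(self, i):
        """Query (i): binary reward from single arm i, mean sigmoid(x_i . theta)."""
        p = sigmoid(dot(self.arms[i], self.theta))
        return 1 if self.rng.random() < p else 0

    def duel(self, i, j):
        """Query (ii): 1 if arm i beats arm j, with probability
        sigmoid((x_i - x_j) . theta) (assumed preference model)."""
        diff = [a - b for a, b in zip(self.arms[i], self.arms[j])]
        p = sigmoid(dot(diff, self.theta))
        return 1 if self.rng.random() < p else 0

    def best_arm(self):
        """Index of the arm with the largest linear score x_i . theta."""
        return max(range(len(self.arms)),
                   key=lambda i: dot(self.arms[i], self.theta))


env = HybridFeedbackEnv(
    arms=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
    theta=[2.0, -1.0],
)
reward = env.absolute(0)   # one absolute-feedback query
outcome = env.duel(0, 1)   # one dueling-feedback query
```

A fixed-confidence identification procedure would interleave calls to `absolute` and `duel` and stop once it can name `best_arm()` with the required confidence; how it chooses between the two query types is the part the truncated abstract does not show.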
