BeClaude
Research | 2026-05-12

When (and How) to Trust the Expert: Diagnosing Query-Time Expert-Guided Reinforcement Learning

Source: arXiv cs.AI

arXiv:2605.09109v1. Announce type: new.

Abstract: Many continuous-control problems ship with a competent but suboptimal controller (a tuned PID, a hand-designed gait). A growing family of methods uses such controllers as queryable experts during RL, but each method has been proposed in isolation, on...

arxiv, papers, rl