Research 2026-05-12

When (and How) to Trust the Expert: Diagnosing Query-Time Expert-Guided Reinforcement Learning

arXiv:2605.09109v1. Abstract: Many continuous-control problems ship with a competent but suboptimal controller (a tuned PID, a hand-designed gait). A growing family of methods uses such controllers as queryable experts during RL, but each method has been proposed in isolation, on...

Read Original Article on Arxiv CS.AI

arxiv, papers, rl