Research 2026-04-28
AgentHER: Hindsight Experience Replay for LLM Agent Trajectory Relabeling
Source: arXiv cs.AI
arXiv:2603.21357v3 (announce type: replace)
Abstract: LLM agents fail on the majority of real-world tasks -- GPT-4o succeeds on fewer than 15% of WebArena navigation tasks and below 55% pass@1 on ToolBench (Zhou et al., 2024; Qin et al., 2024) -- yet every failed trajectory is routinely discarded,...
arxiv, papers, agents