BeClaude
Research 2026-05-12

OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces

Source: arXiv cs.AI

arXiv:2605.08904v1
Announce Type: new
Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning and tool use. However, the fundamental cognitive faculties essential for problem solving, including perception, reasoning, and memory, remain the stable core of...

arxiv, papers, agents