Research 2026-05-12

OPT-BENCH: Evaluating the Iterative Self-Optimization of LLM Agents in Large-Scale Search Spaces

arXiv:2605.08904v1. Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in reasoning and tool use. However, the fundamental cognitive faculties essential for problem solving, including perception, reasoning, and memory, remain the stable core of...

Read Original Article on Arxiv CS.AI

arxiv, papers, agents