BeClaude
Research, 2026-05-12

CrackMeBench: Binary Reverse Engineering for Agents

Source: Arxiv CS.AI

arXiv:2605.10597v1 (announce type: cross)

Abstract: Benchmarks for coding agents increasingly measure source-level software repair, and cybersecurity benchmarks increasingly measure broad capture-the-flag performance. Classical binary reverse engineering remains less precisely specified: given only...

arxiv, papers, agents