Research | 2026-05-12
CrackMeBench: Binary Reverse Engineering for Agents
Source: arXiv cs.AI
arXiv:2605.10597v1 | Announce Type: cross
Abstract: Benchmarks for coding agents increasingly measure source-level software repair, and cybersecurity benchmarks increasingly measure broad capture-the-flag performance. Classical binary reverse engineering remains less precisely specified: given only...
arxiv, papers, agents