Research 2026-05-05
ExCyTIn-Bench: Evaluating LLM agents on Cyber Threat Investigation
Source: Arxiv CS.AI
arXiv:2507.14201v3. Announce Type: replace-cross. Abstract: We present ExCyTIn-Bench, the first benchmark to Evaluate an LLM agent X on the task of Cyber Threat Investigation through security questions derived from investigation graphs. Real-world security analysts must sift through a large number of...
arxiv papers agents