Research | 2026-05-06
Can LLMs Compress (and Decompress)? Evaluating Code Understanding and Execution via Invertibility
Source: arXiv cs.AI
arXiv:2601.13398v2 | Announce Type: replace-cross
Abstract: LLMs demonstrate strong performance on code benchmarks, yet consistent reasoning across forward and backward execution remains elusive. We present RoundTripCodeEval (RTCE), a benchmark of four code execution reasoning tasks that evaluates...
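To make the idea of forward and backward consistency concrete, here is a minimal sketch of a round-trip check. It assumes a run-length encoder and decoder as the invertible pair. This is an illustration only: the abstract above is truncated and does not specify RTCE's four tasks, so the functions `rle_encode`, `rle_decode`, and `round_trip_ok` are hypothetical names and not part of the benchmark.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Forward direction: compress a string into (char, run_length) pairs."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Backward direction: expand (char, run_length) pairs into a string."""
    return "".join(ch * count for ch, count in pairs)


def round_trip_ok(text: str) -> bool:
    """True when decoding the encoded input reproduces the input exactly."""
    return rle_decode(rle_encode(text)) == text


# Hypothetical sample inputs, chosen only for illustration.
samples = ["", "a", "aaabbc", "abcabc"]
results = {s: round_trip_ok(s) for s in samples}
```

A model that reasons consistently about such a pair should predict both the encoder's output for a given input and the input that produced a given output. The check above only verifies the reference functions themselves.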