Research 2026-05-12
VeriContest: A Competitive-Programming Benchmark for Verifiable Code Generation
Source: arXiv cs.AI
arXiv:2605.08553v1
Announce Type: cross
Abstract: Large language models can generate useful code from natural language, but their outputs come without correctness guarantees. Verifiable code generation offers a path beyond testing by requiring models to produce not only executable code, but also...
arxiv, papers, benchmark