Research | 2026-04-27

BLAST: Benchmarking LLMs with ASP-based Structured Testing

arXiv:2604.22306v1 (announce type: cross)

Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across a broad spectrum of tasks, including natural language understanding, dialogue systems, and code generation. Despite evident progress, less attention has been paid to their...

Read the original article on arXiv cs.AI

arxiv, papers, benchmark