Research 2026-04-27
BLAST: Benchmarking LLMs with ASP-based Structured Testing
Source: Arxiv CS.AI
arXiv:2604.22306v1 Announce Type: cross
Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across a broad spectrum of tasks, including natural language understanding, dialogue systems, and code generation. Despite evident progress, less attention has been paid to their...
arxiv papers benchmark