Research 2026-04-20
Beyond MCQ: An Open-Ended Arabic Cultural QA Benchmark with Dialect Variants
Source: arXiv cs.AI
arXiv:2510.24328v2
Abstract: Large Language Models (LLMs) are increasingly used to answer everyday questions, yet their performance on culturally grounded and dialectal content remains uneven across languages. We propose a comprehensive method that (i) translates Modern...
arxiv, papers, benchmark