Research 2026-04-28
Evaluating Language Models' Evaluations of Games
Source: arXiv cs.AI
arXiv:2510.10930v2
Announce Type: replace-cross
Abstract: Reasoning is not just about solving problems -- it is also about evaluating which problems are worth solving at all. Evaluations of artificial intelligence (AI) systems primarily focused on problem solving, historically by studying how...
arxivpapers