Research | 2026-04-28

Language Models Might Not Understand You: Evaluating Theory of Mind via Story Prompting

arXiv:2506.19089v5. Abstract: We introduce StorySim, a programmable framework for synthetically generating stories to evaluate the theory of mind (ToM) and world modeling (WM) capabilities of large language models (LLMs). Unlike prior benchmarks that may suffer from...

Read the original article on arXiv cs.AI

arxiv, papers, prompting