Research 2026-05-12
Measure what Matters: Psychometric Evaluation of AI with Situational Judgment Tests
Source: arXiv cs.AI
arXiv:2510.22170v2
Announce Type: replace
Abstract: Persona conditioning is widely used to steer large language model (LLM) behavior, but it is unclear whether it induces stable behavioral structure or superficial variation. We propose a framework to measure consistent behavioral tendencies using...
arxiv papers