Research 2026-05-06
Beyond Scalars: Evaluating and Understanding LLM Reasoning via Geometric Progress and Stability
Source: arXiv cs.AI
arXiv:2603.10384v2. Announce Type: replace. Abstract: Evaluating LLM reliability via scalar probabilities often fails to capture the structural dynamics of reasoning. We introduce TRACED, a framework that assesses reasoning quality through theoretically grounded geometric kinematics. By decomposing...
arxiv papers stability-ai reasoning