Research 2026-05-11
Beyond Confidence: Rethinking Self-Assessments for Performance Prediction in LLMs
Source: arXiv cs.AI
arXiv:2605.07806v1 (announce type: cross)
Abstract: Large Language Models (LLMs) are increasingly used in settings where reliable self-assessment is critical. Assessing model reliability has evolved from using probabilistic correctness estimates to, more recently, eliciting verbalized confidence....