Research | 2026-04-24

ReProbe: Efficient Test-Time Scaling of Multi-Step Reasoning by Probing Internal States of Large Language Models

arXiv:2511.06209v5 (Announce Type: replace)

Abstract: LLMs can solve complex tasks by generating long, multi-step reasoning chains. Test-time scaling (TTS) can further improve performance by sampling multiple variants of intermediate reasoning steps, verifying their correctness, and selecting the...

Read Original Article on Arxiv CS.AI

Tags: arxiv, papers, reasoning