Research 2026-04-28

Learning to Refine: Self-Refinement of Parallel Reasoning in LLMs

arXiv:2509.00084v2 (announce type: replace-cross)

Abstract: Test-time scaling (TTS) has gained widespread attention for enhancing LLM reasoning. Existing approaches such as Best-of-N and majority voting are limited because their performance depends on the quality of the candidate responses, making them unable...
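To make the stated dependence on candidate quality concrete, here is a minimal Python sketch of the two baselines the abstract names. It is not the paper's method or code: the function names, the sample answers, and the toy scorer are all hypothetical, chosen only to show that both strategies can return nothing other than one of the sampled candidates.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(candidates: Sequence[str]) -> str:
    """Return the most frequent candidate (ties go to the one seen first)."""
    return Counter(candidates).most_common(1)[0][0]


def best_of_n(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the candidate that a (hypothetical) scorer rates highest."""
    return max(candidates, key=score)


def toy_score(answer: str) -> float:
    """Toy scorer (an assumption): prefers answers numerically closer to 51."""
    return -abs(int(answer) - 51)


# Hypothetical sampled answers to "17 * 3"; none is the correct answer "51".
samples = ["41", "54", "41", "50"]

voted = majority_vote(samples)           # most common sampled answer
picked = best_of_n(samples, toy_score)   # highest-scoring sampled answer

# Both strategies only select among the samples, so neither can output "51".
print(voted, picked)
```

In this toy setup both selectors return an incorrect answer because no sampled candidate is correct, which is the kind of limitation the abstract attributes to selection-based approaches.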

Source: arXiv cs.AI

arxiv, papers, reasoning