Research 2026-05-12

HyPER: Bridging Exploration and Exploitation for Scalable LLM Reasoning with Hypothesis Path Expansion and Reduction

arXiv:2602.06527v2 (announce type: replace). Abstract: Scaling test-time compute with multi-path chain-of-thought improves reasoning accuracy, but its effectiveness depends critically on the exploration-exploitation trade-off. Existing approaches address this trade-off in rigid ways: tree-structured...

Read the original article on arXiv cs.AI

arxiv, papers, reasoning