Research · 2026-05-12

Less Diverse, Less Safe: The Indirect But Pervasive Risk of Test-Time Scaling in Large Language Models

arXiv:2510.08592v3. Abstract: Test-Time Scaling (TTS) improves LLM reasoning by exploring multiple candidate responses and then operating over this set to find the best output. A tacit premise behind TTS is that sufficiently diverse candidate pools enhance reliability....
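The "sample many candidates, then operate over the set" pattern described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the function names (best_of_n, majority_vote, distinct_ratio), the sampler and scorer callables, and the use of a distinct-answer ratio as a rough diversity proxy are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, List


def best_of_n(sample: Callable[[], str], score: Callable[[str], float], n: int) -> str:
    """Draw n candidates from a (hypothetical) sampler and keep the highest-scoring one."""
    candidates = [sample() for _ in range(n)]
    return max(candidates, key=score)


def majority_vote(candidates: List[str]) -> str:
    """Return the most frequent candidate (a self-consistency style selection)."""
    return Counter(candidates).most_common(1)[0][0]


def distinct_ratio(candidates: List[str]) -> float:
    """Crude diversity proxy (an assumption): fraction of unique candidates in the pool."""
    return len(set(candidates)) / len(candidates)


if __name__ == "__main__":
    # Stand-in for LLM outputs; a real pipeline would call a model here.
    pool = ["4", "5", "4", "4", "7"]
    print(majority_vote(pool))
    print(distinct_ratio(pool))
```

In a sketch like this, a pool with a low distinct_ratio gives the selection step fewer genuinely different options to choose from, which is the kind of reduced candidate diversity the title refers to.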

Read Original Article on Arxiv CS.AI
