Research 2026-05-08
Catch Your Breath: Adaptive Computation for Self-Paced Sequence Production
Source: arXiv cs.AI
arXiv:2510.13879v2. Abstract: Within the landscape of inference-time scaling methods for foundation models, a width-based approach to scaling, which inserts tokens into the input stream to delay model responses, offers a unique advantage by increasing...
arxiv papers