Research · 2026-05-12
Scaling Limits of Long-Context Transformers
Source: arXiv cs.AI
arXiv:2605.08505v1 (Announce Type: cross)
Abstract: We study the long-context limit of softmax self-attention with a fixed query and a random context of $n$ i.i.d. keys on the sphere, viewing the inverse temperature $\beta_n$ as the scaling parameter that decides whether attention degenerates into...
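The abstract describes softmax attention over a random context of $n$ i.i.d. keys on the sphere, with the inverse temperature $\beta_n$ controlling how concentrated the attention weights become as $n$ grows. As a minimal illustrative sketch of that setup (not the paper's analysis), the snippet below samples keys uniformly on the unit sphere, applies softmax attention with a fixed query, and reports the maximum weight and entropy for a few hypothetical choices of $\beta_n$; the dimension, query, and scalings are assumptions for illustration only.

```python
import numpy as np

def sample_sphere(n, d, rng):
    """Draw n i.i.d. keys uniformly from the unit sphere S^{d-1}."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def attention_weights(q, keys, beta):
    """Softmax attention weights for a fixed query q at inverse temperature beta."""
    scores = beta * keys @ q          # beta * <q, k_i>
    scores -= scores.max()            # subtract max for numerical stability
    w = np.exp(scores)
    return w / w.sum()

rng = np.random.default_rng(0)
d, n = 16, 10_000
q = np.zeros(d)
q[0] = 1.0                            # fixed unit-norm query (illustrative choice)

# Hypothetical scalings of beta_n, just to show the qualitative transition
for beta in (1.0, np.sqrt(np.log(n)), np.log(n)):
    w = attention_weights(q, sample_sphere(n, d, rng), beta)
    entropy = -(w * np.log(w)).sum()
    print(f"beta={beta:8.2f}  max weight={w.max():.4f}  entropy={entropy:.3f}")
```

Larger $\beta_n$ pushes the weights toward the few keys most aligned with the query (low entropy), while small $\beta_n$ leaves them nearly uniform; which regime wins as $n \to \infty$ depends on how $\beta_n$ scales with $n$, which is the question the abstract poses.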