BeClaude
Research · 2026-05-12

Kaczmarz Linear Attention

Source: arXiv cs.AI

arXiv:2605.08587v1 (cross-listed)

Abstract: Long-context language modeling remains central to modern sequence modeling, but the quadratic cost of Transformer attention makes scaling computationally prohibitive. Linear recurrent models address this bottleneck by compressing the context into a...
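The abstract is cut off above, so the paper's exact update rule isn't stated here. Below is a minimal NumPy sketch of how a Kaczmarz row-projection step could serve as the state update of a linear-attention recurrence, assuming the fixed-size state S is fit online so that S k_t ≈ v_t for each key/value pair. The function name kaczmarz_linear_attention, the step size beta, and the tensor shapes are illustrative choices, not taken from the paper.

```python
import numpy as np

def kaczmarz_linear_attention(keys, values, queries, beta=1.0):
    """Linear-attention recurrence with a Kaczmarz (row-projection) state update.

    keys:    (T, d_k) array of key vectors
    values:  (T, d_v) array of value vectors
    queries: (T, d_k) array of query vectors
    beta:    relaxation/step size; beta=1 is the exact Kaczmarz projection
    Returns: (T, d_v) array of outputs.
    """
    d_k, d_v = keys.shape[1], values.shape[1]
    S = np.zeros((d_v, d_k))                 # fixed-size state, independent of context length T
    outputs = np.empty((len(queries), d_v))
    for t, (k, v, q) in enumerate(zip(keys, values, queries)):
        # Kaczmarz step: move S toward the hyperplane {S : S @ k = v}
        residual = v - S @ k
        S = S + beta * np.outer(residual, k) / (k @ k + 1e-8)
        outputs[t] = S @ q                   # read out with the current query
    return outputs
```

With beta = 1 each step is the exact Kaczmarz projection onto the constraint S k_t = v_t; smaller values give a damped update. The memory cost stays at d_v × d_k regardless of sequence length, which is the usual motivation for linear recurrent models over quadratic attention.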

arxivpapers