BeClaude Research
2026-05-11

MISA: Mixture of Indexer Sparse Attention for Long-Context LLM Inference

Source: arXiv cs.AI

arXiv:2605.07363v1 (announce type: cross)

Abstract: DeepSeek Sparse Attention (DSA) sets the state of the art for fine-grained inference-time sparse attention by introducing a learned token-wise indexer that scores every prefix token and selects the most relevant ones for the main attention. To...
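The indexer-then-attend pattern the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the paper's method: DSA's indexer is learned, whereas here the indexer scores are a simple dot product between hypothetical low-cost projections (`idx_q`, `idx_K` are assumed names), and all shapes are toy-sized.

```python
import numpy as np

def topk_sparse_attention(q, K, V, idx_q, idx_K, k=4):
    """Sketch of indexer-based sparse attention for one decode step.

    A cheap indexer scores every prefix token, the top-k tokens are
    selected, and full softmax attention runs only over that subset.
    (DSA's indexer is a learned module; this dot-product stand-in is
    an assumption for illustration.)
    """
    # Indexer pass: one relevance score per prefix token.
    scores = idx_K @ idx_q                       # shape (T,)
    sel = np.argsort(scores)[-k:]                # indices of top-k tokens
    # Main attention restricted to the selected tokens.
    logits = (K[sel] @ q) / np.sqrt(q.shape[0])  # shape (k,)
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ V[sel]                            # shape (d,)

rng = np.random.default_rng(0)
T, d = 16, 8
q = rng.normal(size=d)
K = rng.normal(size=(T, d))
V = rng.normal(size=(T, d))
out = topk_sparse_attention(q, K, V, q, K, k=4)
print(out.shape)  # (8,)
```

The point of the split is cost: the indexer touches all T prefix tokens but is much cheaper than attention, while the expensive softmax attention only touches the k selected tokens.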

arxivpapers