Research | 2026-05-12
Priming: Hybrid State Space Models From Pre-trained Transformers
Source: arXiv cs.AI
arXiv:2605.08301v1 Announce Type: cross
Abstract: Hybrid State-Space models combine Attention with recurrent State-Space Model (SSM) layers, balancing eidetic memory from Attention with compressed fading memory from SSMs. This yields smaller Key-Value caches and faster decoding than Transformers, ...
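To make the memory contrast in the abstract concrete, here is a minimal sketch (not from the paper; the decay value, dimensions, and all function names are illustrative) of the two decoding-time memory mechanisms: an attention layer's Key-Value cache grows by one entry per token and recalls history exactly, while a diagonal linear SSM compresses history into a fixed-size state whose older contributions fade geometrically.

```python
import numpy as np

d = 8                               # model width (illustrative)
rng = np.random.default_rng(0)

# --- Attention-style eidetic memory: KV cache grows with every token ---
k_cache, v_cache = [], []

def attend(q, k_new, v_new):
    """Append to the cache, then attend over the full history."""
    k_cache.append(k_new)
    v_cache.append(v_new)
    K = np.stack(k_cache)           # (t, d): one row per past token
    V = np.stack(v_cache)
    scores = K @ q / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V                    # exact recall over all cached tokens

# --- SSM-style fading memory: fixed-size state, no cache ---
a = 0.9 * np.ones(d)   # per-channel decay; |a| < 1 makes old inputs fade
b = np.ones(d)
h = np.zeros(d)        # state size is constant in sequence length

def ssm_step(x):
    """Diagonal linear recurrence: h_t = a * h_{t-1} + b * x_t."""
    global h
    h = a * h + b * x
    return h

for t in range(16):
    x = rng.standard_normal(d)
    attend(x, x, x)    # cache now holds t + 1 key/value rows
    ssm_step(x)        # state is still exactly d floats

print(f"KV cache entries: {len(k_cache)}, SSM state floats: {h.size}")
```

A hybrid stack interleaves both layer types, so only the attention layers pay the growing-cache cost at decode time; this is the source of the smaller caches and faster decoding the abstract claims.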