Research 2026-05-12

TiledAttention: a CUDA Tile SDPA Kernel for PyTorch

arXiv:2603.01960v2 Abstract: TiledAttention is a forward operator for scaled dot-product attention (SDPA), aimed at SDPA research on NVIDIA GPUs. It is implemented in cuTile Python (TileIR) and exposed as a PyTorch-callable function, which makes it easier to modify than low-level CUDA templates...
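For readers unfamiliar with the operation, the following is a minimal reference sketch of what an SDPA forward pass computes, softmax(QK^T / sqrt(d)) V, written in plain Python with only the standard library. It is not the TiledAttention kernel and does not use cuTile or PyTorch; the function name `sdpa_reference` and the list-of-rows layout are assumptions made purely for illustration.

```python
import math


def sdpa_reference(q, k, v):
    """Unbatched, single-head SDPA forward pass (illustrative only).

    q: list of Lq query rows, each of length d
    k: list of Lk key rows, each of length d
    v: list of Lk value rows, each of length dv
    Returns a list of Lq output rows, each of length dv.
    """
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)  # the "scaled" part of scaled dot-product
    out = []
    for q_row in q:
        # Scaled dot product of this query against every key.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        # Numerically stable softmax: subtract the row maximum before exp.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # Output row is the softmax-weighted average of the value rows.
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v)) / total
            for j in range(len(v[0]))
        ])
    return out


if __name__ == "__main__":
    q = [[0.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[1.0, 3.0], [3.0, 5.0]]
    print(sdpa_reference(q, k, v))
```

A reference of this kind is mainly useful as a correctness baseline: a modified GPU kernel can be compared against it on small inputs before any performance work.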

