Research 2026-05-12
TiledAttention: a CUDA Tile SDPA Kernel for PyTorch
Source: arXiv cs.AI
arXiv:2603.01960v2 (announce type: replace-cross)
Abstract: TiledAttention is a scaled dot-product attention (SDPA) forward operator for attention research on NVIDIA GPUs. Implemented in cuTile Python (TileIR) and exposed as a PyTorch-callable function, it is easier to modify than low-level CUDA templates...
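For readers unfamiliar with the operation, the sketch below shows what an SDPA forward pass computes: softmax(QK^T / sqrt(d)) V, evaluated row by row. It is a plain-Python reference for the standard definition only, not the paper's cuTile kernel; the function name `sdpa_forward` and the list-of-lists layout are assumptions made for illustration.

```python
import math


def sdpa_forward(q, k, v):
    """Reference scaled dot-product attention (hypothetical helper).

    q: n_q rows of length d; k: n_k rows of length d; v: n_k rows of length d_v.
    Returns n_q rows of length d_v.
    """
    d = len(q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q_row in q:
        # Scaled scores of this query against every key.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        # Numerically stable softmax: subtract the row max before exponentiating.
        row_max = max(scores)
        exps = [math.exp(s - row_max) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output row is the attention-weighted sum of the value rows.
        out.append(
            [sum(w * v_row[j] for w, v_row in zip(weights, v)) for j in range(len(v[0]))]
        )
    return out
```

A slow reference like this is mainly useful as a correctness check: a faster kernel's output can be compared against it on small inputs within a floating-point tolerance.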