Research 2026-05-12
Transformers Can Implement Preconditioned Richardson Iteration for In-Context Gaussian Kernel Regression
Source: arXiv cs.AI
arXiv:2605.08475v1 (announce type: cross)
Abstract: Mechanistic accounts of in-context learning (ICL) have identified iterative algorithms for linear regression and related linear prediction tasks, often using linear or ReLU attention variants. For nonlinear ICL, prior work has related softmax and...
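For orientation, the technique named in the title can be sketched outside any transformer. The block below is a minimal illustration under stated assumptions, not the paper's construction: it runs preconditioned Richardson iteration on the regularized Gaussian kernel system (K + lam·I)·alpha = y. The row-sum preconditioner, the ridge term `lam`, the `bandwidth`, the step count, and all function names are choices made for this example only.

```python
import math


def gaussian_kernel(x, z, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - z) ** 2) / (2.0 * bandwidth ** 2))


def richardson_kernel_ridge(xs, ys, lam=0.5, bandwidth=1.0, steps=500):
    """Approximately solve (K + lam*I) alpha = y by preconditioned Richardson.

    Update rule: alpha <- alpha + P (y - A alpha), with A = K + lam*I.
    Assumption for this sketch: P is diagonal, holding the inverse row sums
    of A. Every entry of A is positive, so P*A has rows summing to 1 and
    eigenvalues in (0, 1], which makes the iteration converge.
    """
    n = len(xs)
    A = [
        [gaussian_kernel(xs[i], xs[j], bandwidth) + (lam if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    p = [1.0 / sum(row) for row in A]  # diagonal preconditioner
    alpha = [0.0] * n
    for _ in range(steps):
        # Residual of the linear system at the current iterate.
        residual = [
            ys[i] - sum(A[i][j] * alpha[j] for j in range(n))
            for i in range(n)
        ]
        # Preconditioned Richardson step.
        alpha = [alpha[i] + p[i] * residual[i] for i in range(n)]
    return alpha, A


def predict(x_query, xs, alpha, bandwidth=1.0):
    """Kernel regression prediction at a query point."""
    return sum(a * gaussian_kernel(x_query, x, bandwidth)
               for a, x in zip(alpha, xs))


if __name__ == "__main__":
    xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
    ys = [math.sin(x) for x in xs]
    alpha, _ = richardson_kernel_ridge(xs, ys)
    print(predict(0.25, xs, alpha))
```

Normalizing by row sums is used here only because it gives a simple convergence guarantee for a kernel matrix with positive entries. Whether it matches the preconditioner analyzed in the paper cannot be determined from the truncated abstract.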