Research 2026-04-17
Hessian-Enhanced Token Attribution (HETA): Interpreting Autoregressive LLMs
Source: arXiv cs.AI
arXiv:2604.13258v1 (Announce Type: cross)
Abstract: Attribution methods seek to explain language model predictions by quantifying the contribution of input tokens to generated outputs. However, most existing techniques are designed for encoder-based architectures and rely on linear approximations...