Research | 2026-04-17
Weight Patching: Toward Source-Level Mechanistic Localization in LLMs
Source: arXiv cs.AI
arXiv:2604.13694v1. Abstract: Mechanistic interpretability seeks to localize model behavior to the internal components that causally realize it. Prior work has advanced activation-space localization and causal tracing, but modules that appear important in activation space may...