Research 2026-05-01
Do Sparse Autoencoders Capture Concept Manifolds?
Source: arXiv cs.AI
arXiv:2604.28119v1 (announce type: cross)
Abstract: Sparse autoencoders (SAEs) are widely used to extract interpretable features from neural network representations, often under the implicit assumption that concepts correspond to independent linear directions. However, a growing body of evidence...
arxiv, papers