Research 2026-05-01

Do Sparse Autoencoders Capture Concept Manifolds?

arXiv:2604.28119v1 (Announce Type: cross)

Abstract: Sparse autoencoders (SAEs) are widely used to extract interpretable features from neural network representations, often under the implicit assumption that concepts correspond to independent linear directions. However, a growing body of evidence...

Read Original Article on Arxiv CS.AI

arxiv, papers