Research 2026-04-27

Mechanistic Interpretability of Antibody Language Models Using SAEs

arXiv:2512.05794v2. Abstract: Sparse autoencoders (SAEs) are a mechanistic interpretability technique that has been used to provide insight into learned concepts within large protein language models. Here, we employ TopK and Ordered SAEs to investigate autoregressive...

Source: arXiv cs.AI
