Research2026-04-28

AIPsy-Affect: A Keyword-Free Clinical Stimulus Battery for Mechanistic Interpretability of Emotion in Language Models

arXiv:2604.23719v1 Announce Type: cross Abstract: Mechanistic interpretability research on emotion in large language models -- linear probing, activation patching, sparse autoencoder (SAE) feature analysis, causal ablation, steering vector extraction -- depends on stimuli that contain the words for...

Read Original Article on Arxiv CS.AI

arxivpapers