Research | 2026-04-28
Adaptive Multi-Subspace Representation Steering for Attribute Alignment in Large Language Models
Source: arXiv cs.AI
arXiv:2508.10599v4 | Announce Type: replace
Abstract: Activation steering offers a promising approach to controlling the behavior of Large Language Models by directly manipulating their internal activations. However, most existing methods struggle to jointly steer multiple attributes, often resulting...
Tags: arxiv, papers