Research 2026-04-28

Adaptive Multi-Subspace Representation Steering for Attribute Alignment in Large Language Models

arXiv:2508.10599v4 (announce type: replace)

Abstract: Activation steering offers a promising approach to controlling the behavior of Large Language Models by directly manipulating their internal activations. However, most existing methods struggle to jointly steer multiple attributes, often resulting...

Read Original Article on Arxiv CS.AI
