Research · 2026-05-06

Attribution-Guided Pruning for Insight and Control: Circuit Discovery and Targeted Correction in Small-scale LLMs

arXiv:2506.13727v2. Abstract: Large Language Models (LLMs) are widely deployed in real-world applications, yet their internal mechanisms remain difficult to interpret and control, limiting our ability to diagnose and correct undesirable behaviors. Mechanistic...

Read Original Article on Arxiv CS.AI
