Research 2026-05-05

Language models recognize dropout and Gaussian noise applied to their activations

arXiv:2604.17465v2

Abstract: We provide evidence that language models can detect, localize and, to a certain degree, verbalize the difference between perturbations applied to their activations. More precisely, we either (a) mask activations, simulating dropout, or (b) add...
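The two perturbation types named in the title, dropout-style masking and additive Gaussian noise, can be illustrated with a minimal sketch. The abstract is truncated here, so everything below is an assumption made for illustration rather than the paper's procedure: the function names `mask_activations` and `add_gaussian_noise`, the parameters `drop_prob` and `sigma`, the flat list standing in for an activation vector, and the choice not to rescale surviving activations after masking.

```python
import random


def mask_activations(acts, drop_prob, rng):
    """Zero each activation independently with probability drop_prob.

    Hypothetical helper: surviving values are NOT rescaled by 1/(1 - drop_prob),
    which is an assumption, not something stated in the abstract.
    """
    return [0.0 if rng.random() < drop_prob else a for a in acts]


def add_gaussian_noise(acts, sigma, rng):
    """Add independent zero-mean Gaussian noise with std sigma to each activation."""
    return [a + rng.gauss(0.0, sigma) for a in acts]


rng = random.Random(0)                 # fixed seed so runs are repeatable
acts = [0.5, -1.2, 3.0, 0.0, 2.1]      # stand-in for one hidden-state vector

masked = mask_activations(acts, drop_prob=0.5, rng=rng)
noisy = add_gaussian_noise(acts, sigma=0.1, rng=rng)

print("masked:", masked)
print("noisy: ", noisy)
```

In a real model these functions would be applied to hidden states at a chosen layer, for instance through a forward hook. Which layers, noise scales, and masking rates the paper uses is not recoverable from the truncated abstract.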

Read Original Article on Arxiv CS.AI
