BeClaude
Research · 2026-04-30

Option-Order Randomisation Reveals a Distributional Position Attractor in Prompted Sandbagging

Source: arXiv cs.AI

arXiv:2604.26206v1 (announce type: cross)

Abstract: A predecessor pilot (Cacioli, 2026) found that Llama-3-8B implements prompted sandbagging as positional collapse rather than answer avoidance. However, fixed option ordering in MMLU-Pro left open whether this reflected a model-level...
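To make the idea of option-order randomisation concrete, here is a minimal Python sketch. It is not the paper's code: the function names `shuffle_options` and `position_distribution` are hypothetical, and the approach (shuffle each item's options, track where the correct answer lands, then tabulate which positions the model picks) is only an assumption about how such a check could be set up.

```python
import random
from collections import Counter


def shuffle_options(options, answer_index, rng):
    """Shuffle one item's options; return (shuffled options, new answer index).

    Hypothetical helper, not taken from the paper.
    """
    order = list(range(len(options)))
    rng.shuffle(order)
    shuffled = [options[i] for i in order]
    # order[k] is the original index now sitting at position k
    return shuffled, order.index(answer_index)


def position_distribution(chosen_positions, n_options):
    """Fraction of responses that selected each option position."""
    counts = Counter(chosen_positions)
    total = len(chosen_positions)
    return [counts.get(i, 0) / total for i in range(n_options)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the shuffle is reproducible
    shuffled, new_idx = shuffle_options(["a", "b", "c", "d"], 2, rng)
    print(shuffled, "correct answer now at position", new_idx)
    # Illustrative choices only, not real model outputs
    print(position_distribution([0, 0, 1, 3], 4))
```

If the chosen-position distribution stays concentrated on the same positions after shuffling, that would point to a positional preference rather than a preference for particular answer content.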

Tags: arxiv, papers, prompting