Research | 2026-04-30
A Decision-Theoretic Formalisation of Steganography With Applications to LLM Monitoring
Source: arXiv cs.AI
arXiv:2602.23163v3 (Announce Type: replace)
Abstract: Large language models are beginning to show steganographic capabilities. Such capabilities could allow misaligned models to evade oversight mechanisms. Yet principled methods to detect and quantify such behaviours are lacking. Classical...