Research, 2026-06-29

AI Snitches Get Glitches: Towards Evading Agentic Surveillance

Originally published by arXiv cs.AI

arXiv:2606.25836v2. Abstract: To better assist users with completing challenging tasks, AI agents mediate communications, access data, and interact with different APIs. Many employers (and even nation-states) already provide their users with this technology. However,...

The Surveillance Paradox in AI Agent Systems

A new preprint on arXiv (2606.25836v2) examines a growing tension in enterprise AI deployment: the same agentic systems designed to boost productivity are increasingly being weaponized for workplace surveillance. The paper, “AI Snitches Get Glitches,” explores how AI agents that mediate communications, access data, and interact with APIs can be subverted, and why this matters for both employers and employees.

What the Research Reveals

The core finding is straightforward yet significant: AI agents that monitor user behavior for policy compliance or performance tracking are vulnerable to evasion techniques. The researchers demonstrate that these “agentic surveillance” systems, in which AI monitors other AI or human-AI interactions, can be tricked through prompt injection, context manipulation, or by exploiting the agent’s own tool-use capabilities. The very features that make agents useful (autonomous API calls, data access, communication mediation) also create attack surfaces for those seeking to avoid detection.

This is not abstract theory. The paper notes that employers and even nation-states already provide their users with agentic systems, creating a surveillance infrastructure that operates at machine speed and scale. The “glitches” in the title refer to both technical failures and the ethical friction inherent in such systems.

Why This Matters Now

The timing is critical. As Claude, ChatGPT, and other agentic platforms roll out “computer use” features and API-mediated workflows, the line between helpful assistance and intrusive monitoring blurs. Employers can now track not just keystrokes or screen time, but the intent behind AI interactions—what tasks employees delegate, what data they access, and how they prompt the system.

This creates a double-edged reality: the same agent that helps you draft a sensitive memo could also report that you accessed competitor data or worked outside approved hours. The paper’s evasion techniques, while ethically ambiguous, highlight a fundamental design flaw: these systems lack robust privacy boundaries by default.

Implications for AI Practitioners

For developers and deployers, the takeaway is clear: privacy must be architected in, not bolted on. Current agentic systems often treat all user actions as observable by default, with privacy controls added as an afterthought. This approach is both insecure (evasion techniques will proliferate) and ethically precarious (it erodes trust).

Practitioners should consider:

Transparency by design: Agents should clearly signal when they are logging or reporting behavior.
Granular consent: Users need control over what the agent monitors and shares with employers.
Adversarial testing: Red-team agentic surveillance systems as rigorously as you would test for security vulnerabilities.
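The three considerations above can be made concrete with a small sketch. It is not taken from the paper; it is a minimal illustration, assuming a hypothetical reporting layer that sits between the agent and the employer. The names ConsentPolicy, AuditLog, leaked_categories, and the event categories "api_call" and "prompt_text" are all invented for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ConsentPolicy:
    """Event categories the user has agreed to share (hypothetical names)."""
    allowed: Set[str] = field(default_factory=set)


@dataclass
class AuditLog:
    """Hypothetical reporting channel that notifies the user of every record."""
    policy: ConsentPolicy
    reported: List[Dict[str, str]] = field(default_factory=list)
    notices: List[str] = field(default_factory=list)

    def record(self, category: str, detail: str) -> bool:
        """Report an event only if its category is consented.

        Returns True if the event was reported, False if it was dropped.
        """
        if category not in self.policy.allowed:
            # Granular consent: drop the event instead of reporting it.
            return False
        self.reported.append({"category": category, "detail": detail})
        # Transparency by design: tell the user what was just shared.
        self.notices.append(f"Shared with employer: {category}")
        return True


def leaked_categories(log: AuditLog) -> Set[str]:
    """Red-team style check: categories reported without consent."""
    return {event["category"] for event in log.reported} - log.policy.allowed


# Usage: the user consents to sharing API calls but not prompt text.
policy = ConsentPolicy(allowed={"api_call"})
log = AuditLog(policy)
log.record("api_call", "GET /calendar")
log.record("prompt_text", "draft a memo")
print(len(log.reported), leaked_categories(log))
```

In this sketch the consent check and the user-facing notice live in the same code path as the report itself, so nothing can be shared silently. A real adversarial test suite would go further than leaked_categories and probe the monitor with hostile inputs; this check only confirms that the reporting layer respects the stated policy.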

The paper ultimately serves as a warning: agentic surveillance may be technically feasible, but it is not sustainable without user trust. The “glitches” are not bugs—they are the system’s immune response to an unhealthy design.

Key Takeaways

Agentic surveillance systems that monitor user behavior through AI agents are vulnerable to evasion techniques like prompt injection and context manipulation
The same features that make agents productive (API access, communication mediation) create attack surfaces for avoiding detection
Privacy must be designed into agentic systems from the start, not added as an afterthought
AI practitioners should prioritize transparency, granular consent, and adversarial testing to maintain user trust in agentic deployments
