Research · 2026-05-05
Beyond Prompt-Induced Lies: Investigating LLM Deception on Benign Prompts
Source: arXiv cs.AI
arXiv:2508.06361v4 Announce Type: replace-cross

Abstract: Large Language Models (LLMs) are widely deployed in reasoning, planning, and decision-making tasks, making their trustworthiness critical. A significant and underexplored risk is intentional deception, where an LLM deliberately fabricates or...
Tags: arxiv, papers, prompting