Research · 2026-05-05
Beyond Prompt-Induced Lies: Investigating LLM Deception on Benign Prompts
Source: arXiv cs.AI
arXiv:2508.06361v4 Announce Type: replace-cross

Abstract: Large Language Models (LLMs) are widely deployed in reasoning, planning, and decision-making tasks, making their trustworthiness critical. A significant and underexplored risk is intentional deception, where an LLM deliberately fabricates or...
Tags: arxiv, papers, prompting