Research 2026-05-14

History Anchors: How Prior Behavior Steers LLM Decisions Toward Unsafe Actions

arXiv:2605.13825v1. Abstract: Frontier LLMs are increasingly deployed as agents that pick the next action after a long log of prior tool calls produced by the same or a different model. We ask a simple safety question: if a prior step in that log was harmful, will the model...

Read Original Article on Arxiv CS.AI
