Research · 2026-05-12

Not All Turns Matter: Credit Assignment for Multi-Turn Jailbreaking

Source: Arxiv CS.AI

arXiv:2605.08778v1

Abstract: Deploying LLMs in multi-turn dialogues facilitates jailbreak attacks that distribute harmful intent across seemingly benign turns. Recent training-based multi-turn jailbreak methods learn long-horizon attack strategies from interaction feedback, but...

arxivpapers