Research | 2026-04-28
AutoRISE: Agent-Driven Strategy Evolution for Red-Teaming Large Language Models
Source: arXiv cs.AI
arXiv:2604.22871v1 | Announce Type: cross
Abstract: Automated red-teaming methods for large language models typically optimize attack prompts within a fixed, human-designed strategy, leaving the attack strategy itself unchanged. We instead optimize the strategy. We propose AutoRISE, a method that...
Tags: arxiv, papers, agents