
Metis: Learning to Jailbreak LLMs via Self-Evolving Metacognitive Policy Optimization

Source: Arxiv CS.AI

arXiv:2605.10067v1 Announce Type: cross

Abstract: Red teaming is critical for uncovering vulnerabilities in Large Language Models (LLMs). While automated methods have improved scalability, existing approaches often rely on static heuristics or stochastic search, rendering them brittle against...
