Research · 2026-05-05
Jailbreaking Vision-Language Models Through the Visual Modality
Source: Arxiv CS.AI
arXiv:2605.00583v1 (Announce Type: cross)

Abstract: The visual modality of vision-language models (VLMs) is an underexplored attack surface for bypassing safety alignment. We introduce four jailbreak attacks exploiting the vision component: (1) encoding harmful instructions as visual symbol sequences...
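To make attack (1) concrete: "encoding harmful instructions as visual symbol sequences" suggests mapping the characters of an instruction onto look-alike glyphs so the text reads naturally to a vision encoder while evading plain-text filters. The sketch below is a purely illustrative assumption, not the paper's actual method: it substitutes ASCII letters with Unicode circled letters (U+24D0–U+24E9) as one trivial symbol encoding.

```python
# Hypothetical sketch of a symbol-sequence encoding (NOT the paper's method):
# map ASCII lowercase letters to Unicode circled letters, which remain
# visually readable but differ from the original codepoints.

CIRCLED_A = 0x24D0  # CIRCLED LATIN SMALL LETTER A

def to_symbols(text: str) -> str:
    """Replace a-z with their circled-letter counterparts; leave the rest as-is."""
    return "".join(
        chr(CIRCLED_A + ord(c) - ord("a")) if "a" <= c <= "z" else c
        for c in text
    )

def from_symbols(text: str) -> str:
    """Invert to_symbols: map circled letters back to a-z."""
    return "".join(
        chr(ord(c) - CIRCLED_A + ord("a")) if 0x24D0 <= ord(c) <= 0x24E9 else c
        for c in text
    )

print(to_symbols("open the vault"))  # visually legible, but not ASCII text
```

A defense that only scans for ASCII keyword matches would miss the encoded string, while a vision encoder rendering it as an image may still read it as the original instruction.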