BeClaude
Research · 2026-05-06

Omni-NegCLIP: Enhancing CLIP with Front-Layer Contrastive Fine-Tuning for Comprehensive Negation Understanding

Source: Arxiv CS.AI

arXiv:2603.29258v2 (announce type: replace-cross)

Abstract: Vision-Language Models (VLMs) have demonstrated strong capabilities across a wide range of multimodal tasks. However, recent studies have shown that VLMs, such as CLIP, perform poorly in understanding negation expressions, which are common...

Tags: arxiv, papers, fine-tuning