BeClaude
Research2026-04-28

IntentVLM: Open-Vocabulary Intention Recognition through Forward-Inverse Modeling with Video-Language Models

Source: Arxiv CS.AI

arXiv:2604.24002v1 Announce Type: cross Abstract: Improving the effectiveness of human-robot interaction requires social robots to accurately infer human goals through robust intention understanding. This challenge is particularly critical in multimodal settings, where agents must integrate...

arxivpapers