Research, 2026-04-28

IntentVLM: Open-Vocabulary Intention Recognition through Forward-Inverse Modeling with Video-Language Models

arXiv:2604.24002v1, Announce Type: cross

Abstract: Improving the effectiveness of human-robot interaction requires social robots to accurately infer human goals through robust intention understanding. This challenge is particularly critical in multimodal settings, where agents must integrate...

Read Original Article on arXiv cs.AI
