Research | 2026-05-06
Ablation Study of Multimodal Perception, Language Grounding, and Control for Human-Robot Interaction in an Object Detection and Grasping Task
Source: arXiv cs.AI
arXiv:2605.00963v1 | Announce Type: cross
Abstract: This manuscript extends our previous multimodal human-robot interaction system by introducing a controlled ablation study of the three modules that most strongly influence end-to-end performance: the large language model used for action extraction,...
arxiv, papers, multimodal