Research 2026-05-07
RoboAlign-R1: Distilled Multimodal Reward Alignment for Robot Video World Models
Source: arXiv cs.AI
arXiv:2605.03821v1 Announce Type: cross
Abstract: Existing robot video world models are typically trained with low-level objectives such as reconstruction and perceptual similarity, which are poorly aligned with the capabilities that matter most for robot decision making, including instruction...
arxiv, papers, multimodal