BeClaude Research — 2026-05-14

ALAM: Algebraically Consistent Latent Action Model for Vision-Language-Action Models

Source: Arxiv CS.AI

arXiv:2605.10819v2 (Announce Type: replace-cross)

Abstract: Vision-language-action (VLA) models remain constrained by the scarcity of action-labeled robot data, whereas action-free videos provide abundant evidence of how the physical world changes. Latent action models offer a promising way to...

Tags: arxiv, papers, vision