Research | 2026-05-12

AR-VLA: True Autoregressive Action Expert for Vision-Language-Action Models

arXiv:2603.10126v2. Abstract: We propose a standalone autoregressive (AR) Action Expert that generates actions as a continuous causal sequence while conditioning on refreshable vision-language prefixes. In contrast to existing Vision-Language-Action (VLA) models and...

Read the original article on arXiv (cs.AI)

arxiv, papers, vision