Research 2026-05-06
MusicInfuser: Making Video Diffusion Listen and Dance
Source: arXiv cs.AI
arXiv:2503.14505v3 Announce Type: replace-cross
Abstract: We introduce MusicInfuser, an approach that aligns pre-trained text-to-video diffusion models to generate high-quality dance videos synchronized with specified music tracks. Rather than training a multimodal audio-video or audio-motion model...
arxiv, papers, image-generation