Research 2026-04-28
Audio-Omni: Extending Multi-modal Understanding to Versatile Audio Generation and Editing
Source: arXiv cs.AI
arXiv:2604.10708v2
Announce Type: replace-cross
Abstract: Recent progress in multimodal models has spurred rapid advances in audio understanding, generation, and editing. However, these capabilities are typically addressed by specialized models, leaving the development of a truly unified framework...
arxiv, papers