BeClaude
Research | 2026-04-23

MMCORE: MultiModal COnnection with Representation Aligned Latent Embeddings

Source: arXiv cs.AI

arXiv:2604.19902v1
Announce Type: cross

Abstract: We present MMCORE, a unified framework designed for multimodal image generation and editing. MMCORE leverages a pre-trained Vision-Language Model (VLM) to predict semantic visual embeddings via learnable query tokens, which subsequently serve as...

Tags: arxiv, papers, multimodal