BeClaude
Research 2026-05-07

TCM-Serve: Modality-aware Scheduling for Multimodal Large Language Model Inference

Source: arXiv cs.AI

arXiv:2603.26498v2. Abstract: Multimodal Large Language Models (MLLMs) power platforms like ChatGPT, Gemini, and Copilot, enabling richer interactions with text, images, and videos. These heterogeneous workloads introduce additional inference stages, such as vision...

arxiv, papers, multimodal