Research 2026-04-22
ODMA: On-Demand Memory Allocation Strategy for LLM Serving on LPDDR-Class Accelerators
Source: arXiv cs.AI
arXiv:2512.09427v5 Announce Type: replace-cross Abstract: Existing memory management techniques severely hinder efficient Large Language Model serving on accelerators constrained by poor random-access bandwidth. While static pre-allocation preserves memory contiguity, it incurs significant overhead...
arxiv papers