Research · 2026-04-22

ODMA: On-Demand Memory Allocation Strategy for LLM Serving on LPDDR-Class Accelerators

arXiv:2512.09427v5 Announce Type: replace-cross Abstract: Existing memory management techniques severely hinder efficient Large Language Model serving on accelerators constrained by poor random-access bandwidth. While static pre-allocation preserves memory contiguity, it incurs significant overhead...

Read Original Article on Arxiv CS.AI
