BeClaude
Research, 2026-04-20

Taming Asynchronous CPU-GPU Coupling for Frequency-aware Latency Estimation on Mobile Edge

Source: arXiv cs.AI

arXiv:2604.15357v1 (announce type: cross)

Abstract: Precise estimation of model inference latency is crucial for time-critical mobile edge applications, enabling devices to calculate latency margins against deadlines and trade them for enhanced model performance or resource savings. However, the...
