Research 2026-04-20
Taming Asynchronous CPU-GPU Coupling for Frequency-aware Latency Estimation on Mobile Edge
Source: arXiv cs.AI
arXiv:2604.15357v1 Announce Type: cross Abstract: Precise estimation of model inference latency is crucial for time-critical mobile edge applications, enabling devices to calculate latency margins against deadlines and trade them for enhanced model performance or resource savings. However, the...
arxiv papers