Latency-aware edge computing: A survey of optimization techniques and open challenges
Abstract
Edge computing has emerged as a transformative paradigm for meeting the ultra-low latency requirements of modern applications by positioning computation closer to data sources and end users. This survey provides a comprehensive review of 250 research papers published between 2016 and 2025, systematically examining latency optimization strategies across multiple layers of edge computing systems. The literature is organized into six major categories: task offloading and scheduling, resource allocation and orchestration, edge caching and content delivery, application-level optimizations, networking and communication protocols, and security and privacy considerations. Our analysis shows that dynamic offloading strategies, machine learning-driven orchestration, and predictive caching are the most widely adopted approaches to latency reduction. However, open challenges persist, including scalable caching under mobility, privacy-preserving learning with latency guarantees, and integrated solutions for 6G-enabled environments. We conclude that effective latency optimization requires coordinated cross-layer approaches rather than isolated improvements, identify research gaps in scalability, deterministic real-time guarantees, joint energy and performance optimization, and standardization, and outline directions toward robust, latency-optimized edge frameworks for emerging applications such as autonomous driving, extended reality, and industrial IoT.