How Reliable is Your Service at the Extreme Edge? Analytical Modeling of Computational Reliability

Reading time: 5 minutes

📝 Original Info

  • Title: How Reliable is Your Service at the Extreme Edge? Analytical Modeling of Computational Reliability
  • ArXiv ID: 2602.16362
  • Date: 2026-02-18
  • Authors: M. S. Allahham (Queen’s University, Canada), H. S. Hassanein (Queen’s University, Canada) (only the author information stated in the paper is included)

📝 Abstract

Extreme Edge Computing (XEC) distributes streaming workloads across consumer-owned devices, exploiting their proximity to users and ubiquitous availability. Many such workloads are AI-driven, requiring continuous neural network inference for tasks like object detection and video analytics. Distributed Inference (DI), which partitions model execution across multiple edge devices, enables these streaming services to meet strict throughput and latency requirements. Yet consumer devices exhibit volatile computational availability due to competing applications and unpredictable usage patterns. This volatility poses a fundamental challenge: how can we quantify the probability that a device, or ensemble of devices, will maintain the processing rate required by a streaming service? This paper presents an analytical framework for computational reliability in XEC, defined as the probability that instantaneous capacity meets demand at a specified Quality of Service (QoS) threshold. We derive closed-form reliability expressions under two information regimes: Minimal Information (MI), requiring only declared operational bounds, and historical data, which refines estimates via Maximum Likelihood Estimation from past observations. The framework extends to multi-device deployments, providing reliability expressions for series, parallel, and partitioned workload configurations. We derive optimal workload allocation rules and analytical bounds for device selection, equipping orchestrators with tractable tools to evaluate deployment feasibility and configure distributed streaming systems. We validate the framework using real-time object detection with the YOLO11m model as a representative DI streaming workload; experiments on emulated XEC environments demonstrate close agreement between analytical predictions, Monte Carlo sampling, and empirical measurements across diverse capacity and demand configurations.

📄 Full Content

The growth in Internet of Things (IoT) devices and the demand for real-time data processing are reshaping distributed computing. Traditional Cloud Computing (CC) introduces latency that is prohibitive for time-sensitive applications due to centralized processing [1]. Content Delivery Networks (CDNs) cache static content closer to users but are ineffective for dynamic, computation-heavy workloads that require continuous data generation and processing rather than retrieval. Edge Computing (EC) addresses this by shifting computation toward the network periphery, reducing round-trip latency by processing data near its source [2], [3]. However, the server-centric edge model struggles to meet the demands of AI-powered applications such as real-time Augmented Reality (AR) [4], [5], cloud gaming [6], [7], and video analytics [8], [9]. These applications perform deep neural network inference for object detection, semantic segmentation, pose estimation, and scene understanding, imposing Ultra-Reliable Low-Latency Communication (URLLC)-grade demands that require millisecond-scale latencies and sustained computational throughput, which can overwhelm individual edge servers when serving multiple concurrent users. Distributed Inference (DI), where deep neural network execution is distributed across nearby devices, whether by partitioning input data or the model itself, addresses this capacity constraint. For video streaming services, DI enables parallel frame processing across local device pools, sustaining throughput that individual devices cannot achieve alone [10], [11].

(M. S. Allahham and H. S. Hassanein are with the School of Computing, Queen's University, Kingston, ON, Canada; e-mail: 20msa7@queensu.ca, hossam@cs.queensu.ca.)

These limitations have driven the evolution toward Extreme Edge Computing (XEC) [12], [13], [14], which decentralizes computation by utilizing consumer-owned Extreme Edge Devices (XEDs) such as smartphones, tablets, wearables, and IoT sensors [15]. This paradigm shares roots with volunteer computing [16], [17], which demonstrated the viability of harnessing distributed, non-dedicated resources for large-scale computation. Unlike volunteer computing that targets batch scientific workloads, XEC targets latency-sensitive streaming services where devices contribute resources opportunistically to nearby requesters. Related concepts include femtoclouds [18] and device-to-device (D2D) assisted edge computing [19], [20], which leverage proximity for mobile computing. More broadly, Computing Power Networks (CPNs) [21] provide a unified framework for orchestrating heterogeneous computing resources across cloud, edge, and end-device tiers. In CPNs, XEC constitutes the outermost layer of this architecture, where consumer devices contribute computational capacity to the network.

However, XEC introduces challenges when it comes to providing a reliable service. The characteristics of XEDs, including mobility, fluctuating computational capacities, unpredictable usage patterns, and limited observability, create a volatile computational environment [22], [23]. Unlike enterprise-owned edge servers with predictable resource availability, consumer devices experience capacity fluctuations driven by concurrent applications, background processes, battery management, and thermal throttling. This raises the question: how can an orchestrator quantify the probability that an XED will sustain the computational performance required by a streaming service?

Existing research does not provide a formal analytical framework to quantify this computational reliability. Prior work on reliability in edge systems assumes: (a) stable, enterprise-owned resources; (b) discrete task models rather than continuous streaming demands; or (c) reliability definitions based on node/link failure rather than sustained computational performance against a Quality of Service (QoS) threshold. In XEC-enabled streaming, the concern shifts from binary node outages to computational reliability: the probability that an XED sustains processing of a continuous data stream according to specified QoS requirements.

This paper proposes an analytical framework for quantifying computational reliability of streaming services in XEC. We model reliability from a computational perspective: the probability that a worker’s available computational capacity meets or exceeds the service’s instantaneous demand, satisfying the specified QoS requirement. The framework addresses scenarios with varying information availability, from minimal operational bounds to refined historical observations, and scales from individual device analysis to multi-device configurations. The contributions are:
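To make this definition concrete, reliability can be estimated numerically by sampling capacity and demand and counting how often capacity suffices. The sketch below is illustrative only; the uniform distributions and frames-per-second bounds are placeholder assumptions, not the paper's models:

```python
import random

def mc_reliability(cap_sampler, demand_sampler, n=100_000, seed=0):
    """Monte Carlo estimate of R = P(C >= D): the probability that a
    worker's instantaneous capacity C meets the service's demand D."""
    rng = random.Random(seed)
    hits = sum(cap_sampler(rng) >= demand_sampler(rng) for _ in range(n))
    return hits / n

# Placeholder distributions (frames/sec): capacity fluctuates with
# background load inside declared bounds; demand varies with the stream.
R = mc_reliability(lambda r: r.uniform(10, 30),   # capacity bounds
                   lambda r: r.uniform(12, 18))   # demand bounds
print(f"estimated reliability: {R:.3f}")  # analytically, R = 0.75 here
```

For the uniform bounds above the exact value is (30 − E[D]) / (30 − 10) = 0.75, so the estimate doubles as a sanity check against the closed-form analysis.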

• We derive closed-form expressions for single-device computational reliability under two information regimes, enabling tractable reliability assessment: the Minimal Information (MI) model requires only declared capacity and demand bounds, while the historical model refines estimates via Maximum Likelihood Estimation (MLE) from past observations.
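As a rough sketch of what such closed forms look like: if one assumes capacity is uniform over its declared bounds under the MI regime (a maximum-entropy assumption for illustration; the paper's actual expressions may differ), single-device reliability and the series/parallel compositions mentioned in the abstract reduce to a few lines:

```python
def mi_reliability(d, c_min, c_max):
    """R(d) = P(C >= d) assuming capacity C ~ Uniform(c_min, c_max),
    an illustrative stand-in for the paper's MI closed form."""
    if c_max == c_min:
        return 1.0 if c_min >= d else 0.0
    return min(1.0, max(0.0, (c_max - d) / (c_max - c_min)))

def series(rs):
    """All devices must keep up (e.g., pipelined model partitioning)."""
    out = 1.0
    for r in rs:
        out *= r
    return out

def parallel(rs):
    """Service survives if at least one device keeps up (redundancy)."""
    out = 1.0
    for r in rs:
        out *= (1.0 - r)
    return 1.0 - out

rs = [mi_reliability(15, 10, 30), mi_reliability(15, 5, 25)]  # [0.75, 0.5]
print(series(rs), parallel(rs))  # 0.375 0.875
```

The composition rules mirror classical reliability theory: series reliability is the product of per-device reliabilities, while parallel redundancy requires only one device to meet demand.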

