Device Association and Resource Allocation for Hierarchical Split Federated Learning in Space-Air-Ground Integrated Network


6G facilitates the deployment of Federated Learning (FL) in the Space-Air-Ground Integrated Network (SAGIN), yet FL confronts challenges such as constrained resources and unbalanced data distributions. To address these issues, this paper proposes a Hierarchical Split Federated Learning (HSFL) framework and derives an upper bound on its loss function. To minimize the weighted sum of training loss and latency, we formulate a joint optimization problem that integrates device association, model split-layer selection, and resource allocation. We decompose the original problem into several subproblems and propose an iterative optimization algorithm for device association and resource allocation based on a brute-force split-point search. Simulation results demonstrate that the proposed algorithm effectively balances training efficiency and model accuracy for FL in SAGIN.


💡 Research Summary

The paper addresses the challenges of deploying federated learning (FL) over a Space‑Air‑Ground Integrated Network (SAGIN) in the emerging 6G era. In such a network, mobile devices suffer from limited computation and communication resources, while the data they collect are highly heterogeneous across devices. To mitigate these issues, the authors propose a Hierarchical Split Federated Learning (HSFL) framework that extends the concept of Split Federated Learning (SFL) to a three‑tier architecture consisting of low‑Earth‑orbit satellites, unmanned aerial vehicles (UAVs), and ground devices.

In HSFL, a deep neural network (DNN) is partitioned at a chosen layer ℓ. The shallow portion (layers 1…ℓ) is executed locally on each device, whereas the deeper portion (layers ℓ+1…L) is offloaded to the associated UAV, which in turn forwards the intermediate activations to a satellite for global aggregation. This hierarchical split reduces both the local computational burden and the size of uplink transmissions, while still allowing the satellite to coordinate model updates across the entire network.
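The split described above can be sketched in a few lines of illustrative Python (not the paper's code): layers are modeled as plain callables, and the network is cut after layer ℓ into a device-side part and a UAV-side part.

```python
# Illustrative sketch of the HSFL model split: layers 1..ell run on the
# device, layers ell+1..L run on the UAV. Layers are plain callables here.

def split_model(layers, ell):
    """Return (device_part, uav_part) for a split after layer ell."""
    return layers[:ell], layers[ell:]

def run_pipeline(parts, x):
    """Run the parts in sequence; the device part's output is the
    intermediate activation that would be transmitted to the UAV."""
    for part in parts:
        for layer in part:
            x = layer(x)
    return x

# Toy example: a 4-layer "network" of scalar functions, split at ell = 2.
layers = [lambda x: x + 1, lambda x: 2 * x, lambda x: x - 3, lambda x: x * x]
device_part, uav_part = split_model(layers, 2)
y = run_pipeline([device_part, uav_part], 1.0)
```

In an actual deployment the device part's output would be serialized and sent uplink; only the cut position changes between candidate values of ℓ.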

The authors first conduct a rigorous convergence analysis. Assuming each local loss function is β‑Lipschitz smooth and µ‑strongly convex, and bounding the variance of stochastic gradients, they derive an upper bound on the expected global loss after mE local updates (Theorem 1). The bound explicitly depends on the split layer ℓ (deeper splits increase the bound) and on a data heterogeneity term Γ, which captures the discrepancy between local class distributions and the global distribution. This analysis shows that both the choice of split point and the device‑UAV association critically affect convergence speed and final model accuracy.

A detailed communication and computation model follows. The uplink data rate from device n to UAV k is expressed as rₙ,ₖ = aₙ,ₖ lₙ,ₖ B_Uₖ log₂(1 + p_Uₙ 10^{−PLₙ,ₖ}/N₀), where aₙ,ₖ is the binary association variable, lₙ,ₖ the bandwidth allocation ratio, and PLₙ,ₖ the path loss that depends on horizontal distance and elevation angle. The transmission latency for the intermediate feature map of size M_ℓ is t_{ℓ,n,k}=M_ℓ/rₙ,ₖ. Local computation latency is t_{ℓ,n}^{cp}=C_ℓⁿ/fₙ, where C_ℓⁿ denotes the computational load of the shallow sub‑network. UAV‑to‑satellite latency is modeled similarly, accounting for antenna gains, distance, and rain attenuation (Weibull‑distributed). The total training latency per global round is T = t_d + t_u + t_s + N_sw τ_s, where t_d is the maximum device‑UAV latency, t_u the maximum UAV computation latency, t_s the satellite uplink latency, N_sw the number of satellite handovers, and τ_s the handover time.
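The latency model above can be made concrete with a short hedged sketch. The numerical values below are invented for illustration, and the path loss is assumed to be given in dB (hence the 10^{−PL/10} conversion, a common convention; the text writes 10^{−PL}).

```python
import math

# Hedged sketch of the per-round latency model; symbols (M_ell, C_ell,
# f_n, N_sw, tau_s) follow the text, numerical values are invented.

def uplink_rate(a_nk, l_nk, B_k, p_n, PL_dB, N0):
    """r_{n,k} = a_{n,k} * l_{n,k} * B_k * log2(1 + p_n * 10^(-PL/10) / N0).
    PL is taken in dB here (an assumption about the paper's convention)."""
    return a_nk * l_nk * B_k * math.log2(1.0 + p_n * 10 ** (-PL_dB / 10.0) / N0)

def round_latency(t_d, t_u, t_s, n_sw, tau_s):
    """T = t_d + t_u + t_s + N_sw * tau_s (total latency per global round)."""
    return t_d + t_u + t_s + n_sw * tau_s

# Example: transmission latency of an M_ell-bit activation at rate r.
r = uplink_rate(1, 0.5, 20e6, 0.1, 90.0, 1e-13)  # bits/s
t_tx = 2e6 / r                                    # t = M_ell / r
T = round_latency(t_d=0.4, t_u=0.2, t_s=0.1, n_sw=1, tau_s=0.05)
```

The device-UAV term t_d and UAV computation term t_u enter T as maxima over devices and UAVs respectively, since a global round waits for the slowest participant.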

The core optimization problem (P1) seeks to minimize a weighted sum of training loss and latency:  min_{ℓ, aₙ,ₖ, lₙ,ₖ, B_k, fₙ,ₖ} (1−θ) T + θ P, where P = ℓ Z₁² + (L−ℓ) Z₂² + ℓ σ₁² + (L−ℓ) σ₂² + Σ_k Σ_c | Σ_{n∈N_k} aₙ,ₖ pₙ(c) − p(c) |² captures the penalty due to the split point and data heterogeneity. Constraints enforce bandwidth limits, computation capacity, binary association, and feasible split layers, resulting in a mixed‑integer nonlinear program (MINLP).
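The heterogeneity term of P can be sketched as follows. The encoding of associations as a dict UAV → device list, and the normalization of the aggregated distribution by the number of associated devices (so that it remains a probability distribution), are assumptions made for illustration.

```python
# Sketch of the heterogeneity penalty in P1: for each UAV k, compare the
# aggregated class distribution of its associated devices with the global
# distribution p(c). Distributions are dicts mapping class -> probability.

def heterogeneity_penalty(assoc, device_dist, global_dist):
    """Sum over UAVs k and classes c of |aggregated p_n(c) - p(c)|^2.
    `assoc` maps UAV id -> list of associated device ids (illustrative);
    the per-UAV aggregate is averaged over devices (an assumption)."""
    penalty = 0.0
    for k, devices in assoc.items():
        for c, p_c in global_dist.items():
            agg = sum(device_dist[n].get(c, 0.0) for n in devices) / max(len(devices), 1)
            penalty += (agg - p_c) ** 2
    return penalty
```

With two devices holding complementary one-class datasets, associating both to one UAV yields zero penalty, while splitting them across two UAVs leaves each UAV maximally skewed.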

To solve this intractable problem, the authors decompose it into three sub‑problems:

  1. Bandwidth Allocation – For a given association aₙ,ₖ and target latency t_d, Theorem 2 yields closed‑form expressions for the optimal bandwidth ratio lₙ,ₖ* and the total UAV uplink bandwidth B_k*. The optimal t_d* is obtained via a bisection method solving Σ_k Σ_n aₙ,ₖ M_ℓ/[(t_d*−C_ℓⁿ/fₙ) Rₙ,ₖ] = B_U.

  2. Computation Allocation – Applying KKT conditions, the optimal computation share for each device at UAV k is fₙ,ₖ* = (C_ℓ^{n,k} / Σ_{i∈N_k} C_ℓ^{i,k}) f_k, where f_k is the total CPU budget of UAV k.

  3. Device‑UAV Association – With bandwidth and computation fixed, the association problem reduces to minimizing a linearized version of the heterogeneity term using auxiliary variables q_{k,c}. Lagrangian relaxation introduces multipliers λ_{k,c}, μ_{k,c}, ψ, ν, leading to the selection rule:  kₙ* = arg min_k { ψ M_ℓ/[(t_d′−C_ℓⁿ/fₙ) Rₙ,ₖ] + ν Σ_{i∈N_k} C_ℓ^{i,k} t_u′ + Σ_c (λ_{k,c}−μ_{k,c})(pₙ(c)−p(c)) }. This rule simultaneously accounts for communication cost, computation cost, and data heterogeneity.
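The three sub-solutions above can be sketched together in illustrative Python. The symbols R_n (per-Hz spectral efficiency) and the cost callables standing in for the Lagrangian terms are assumptions for demonstration, not the paper's exact expressions.

```python
# Hedged sketches of the three sub-problem solutions.

def bandwidth_demand(t_d, M_ell, C, f, R):
    """Total bandwidth needed for every device to meet deadline t_d:
    Sum_n M_ell / ((t_d - C_n/f_n) * R_n). Monotonically decreasing in t_d."""
    return sum(M_ell / ((t_d - C[n] / f[n]) * R[n]) for n in range(len(C)))

def solve_t_d(M_ell, C, f, R, B_U, tol=1e-9):
    """Bisection for t_d*: the deadline whose bandwidth demand meets budget B_U."""
    lo = max(C[n] / f[n] for n in range(len(C))) + 1e-9  # must exceed local compute time
    hi = lo + 1.0
    while bandwidth_demand(hi, M_ell, C, f, R) > B_U:    # expand until feasible
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bandwidth_demand(mid, M_ell, C, f, R) > B_U:
            lo = mid
        else:
            hi = mid
    return hi

def compute_allocation(workloads, f_k):
    """KKT closed form: each device's CPU share of budget f_k is
    proportional to its offloaded workload C_ell."""
    total = sum(workloads)
    return [c / total * f_k for c in workloads]

def associate(devices, uavs, comm_cost, comp_cost, het_cost):
    """Selection rule: each device picks the UAV minimizing the combined
    communication + computation + heterogeneity score."""
    return {n: min(uavs, key=lambda k: comm_cost(n, k) + comp_cost(n, k) + het_cost(n, k))
            for n in devices}
```

The bisection exploits that bandwidth demand falls as the deadline t_d is relaxed, so the feasibility boundary is a single crossing point.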

The overall algorithm (Algorithm 1) iterates over all possible split layers ℓ = 1…L, solving the three sub‑problems for each ℓ, and finally selects the split point that yields the lowest objective value. The computational complexity is O(L), making it suitable for practical SAGIN deployments.
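The outer loop of Algorithm 1 as described reduces to a brute-force search over candidate split layers. In this sketch, `solve_subproblems` is a placeholder for the bandwidth, computation, and association sub-problems solved at each ℓ.

```python
# Outer loop of Algorithm 1: try every candidate split layer, solve the
# sub-problems for it, and keep the split with the lowest objective value.

def best_split(L, solve_subproblems):
    """Return (ell*, objective*) over split layers ell = 1..L."""
    best_ell, best_obj = None, float("inf")
    for ell in range(1, L + 1):
        obj = solve_subproblems(ell)
        if obj < best_obj:
            best_ell, best_obj = ell, obj
    return best_ell, best_obj
```

Since each candidate ℓ is evaluated once, the search cost grows linearly in the number of layers L, with the per-candidate cost determined by the sub-problem solvers.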

Simulation studies evaluate the proposed scheme under varying numbers of UAVs, degrees of data imbalance, and different values of the weighting factor θ. Compared with baseline approaches (standard FL, FL with static split points, and FL without hierarchical splitting), the HSFL solution achieves up to 15 % lower training loss and up to 20 % reduction in overall latency. The gains are especially pronounced when data distributions are highly non‑IID, confirming that the association optimization effectively mitigates heterogeneity‑induced performance degradation.

In summary, the paper makes three principal contributions: (1) it provides a theoretical convergence bound for hierarchical split federated learning that quantifies the impact of split layer depth and data heterogeneity; (2) it formulates a joint optimization of device association, split‑layer selection, and resource allocation that captures the intertwined nature of communication, computation, and statistical challenges in SAGIN; and (3) it proposes a low‑complexity iterative algorithm with closed‑form sub‑solutions, demonstrating through extensive simulations that the method substantially improves both learning efficiency and model accuracy in realistic 6G SAGIN scenarios.

