SCAFFLSA: Taming Heterogeneity in Federated Linear Stochastic Approximation and TD Learning
In this paper, we analyze the sample and communication complexity of the federated linear stochastic approximation (FedLSA) algorithm. We explicitly quantify the effects of local training with agent heterogeneity. We show that the communication complexity of FedLSA scales polynomially with the inverse of the desired accuracy $\varepsilon$. To overcome this, we propose SCAFFLSA, a new variant of FedLSA that uses control variates to correct for client drift, and establish its sample and communication complexities. We show that for statistically heterogeneous agents, its communication complexity scales logarithmically with the desired accuracy, similar to Scaffnew. An important finding is that, compared to the existing results for Scaffnew, the sample complexity scales with the inverse of the number of agents, a property referred to as linear speed-up. Achieving this linear speed-up requires completely new theoretical arguments. We apply the proposed method to federated temporal difference learning with linear function approximation and analyze the corresponding complexity improvements.
💡 Research Summary
This paper investigates the fundamental trade-offs between sample efficiency and communication cost in federated linear stochastic approximation (FedLSA), a framework that captures many distributed learning problems, including federated temporal-difference (TD) learning with linear function approximation. The authors first provide a precise decomposition of the error dynamics of FedLSA when agents perform $H$ local stochastic updates between each global aggregation. They identify three distinct contributors to the mean-squared error (MSE): (i) a deterministic bias term $\bar\rho_H$ that accumulates because each client solves a slightly different linear system, (ii) a zero-mean fluctuation term arising from stochastic oracle noise, and (iii) a term reflecting the propagation of local updates through the product of random matrices $\Gamma(c,\eta)$. Under standard i.i.d. sampling assumptions, they prove that the bias decays exponentially with $H$ but remains non-negligible for heterogeneous agents, leading to a communication complexity that scales polynomially (roughly $O(1/\varepsilon)$) with the desired accuracy $\varepsilon$.
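The local-steps-then-average scheme described above can be sketched in a few lines of NumPy. Everything here is illustrative, not the paper's implementation: the function name `fedlsa`, the toy two-agent matrices, the step size, and the Gaussian oracle-noise model are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def fedlsa(A_list, b_list, eta=0.1, H=10, rounds=100, noise=0.01):
    """Minimal FedLSA sketch (hypothetical): each agent runs H local
    noisy LSA steps  theta <- theta - eta * (A_c theta - b_c + noise),
    then the server averages the local iterates."""
    d = b_list[0].shape[0]
    theta = np.zeros(d)
    for _ in range(rounds):
        local_iterates = []
        for A_c, b_c in zip(A_list, b_list):
            th = theta.copy()
            for _ in range(H):
                # noisy observation of the local mean field A_c th - b_c
                th -= eta * (A_c @ th - b_c + noise * rng.standard_normal(d))
            local_iterates.append(th)
        theta = np.mean(local_iterates, axis=0)  # global aggregation
    return theta

# Two heterogeneous agents; the averaged system  mean(A) theta = mean(b)
# has solution theta* = [1, 1].
A1, b1 = np.diag([2.0, 1.0]), np.array([2.0, 2.0])
A2, b2 = np.diag([1.0, 2.0]), np.array([1.0, 1.0])
theta = fedlsa([A1, A2], [b1, b2])
```

On this toy problem the agents disagree only in the second coordinate, and the averaged FedLSA iterate settles visibly away from $\theta^\star$ there: this is the deterministic heterogeneity bias the decomposition above isolates, and the effect SCAFFLSA is designed to remove.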
To overcome this limitation, the authors introduce SCAFFLSA, a novel variant that incorporates control variates (also called "server-side correction terms") for each client. During local training, each client subtracts its current control variate $s_c$ from the stochastic gradient, and the server updates these variates after each aggregation to track the discrepancy between the local and global objectives. This mechanism exactly cancels the deterministic drift caused by heterogeneity, allowing clients to perform many more local steps without incurring additional bias. Theoretical analysis shows that SCAFFLSA achieves a logarithmic communication complexity $O(\log(1/\varepsilon))$ — matching the best known rates for variance-reduced federated methods — while preserving a linear speed-up in sample complexity: the total number of stochastic samples needed to reach MSE $\varepsilon$ is $O\big(\frac{1}{N\varepsilon^{2}}\log\frac{1}{\varepsilon}\big)$, where $N$ is the number of agents. This linear scaling is proved using a new stochastic expansion that carefully tracks the interaction between the averaged contraction matrix $\bar\Gamma(\eta)_H$ and the control-variate updates, a technique not previously available for federated stochastic approximation.
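The control-variate mechanism can be sketched by extending the FedLSA loop with a per-client correction term. This is only a Scaffold-style sketch under an assumed update rule — the variate recursion `xi[c] += (theta_new - local) / (eta * H)` is a standard drift-correction heuristic, and the paper's exact recursion may differ; all names and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def scafflsa(A_list, b_list, eta=0.05, H=10, rounds=200, noise=0.01):
    """SCAFFLSA-style sketch (hypothetical) with per-client control
    variates xi[c] subtracted from each local stochastic update."""
    N, d = len(A_list), b_list[0].shape[0]
    theta = np.zeros(d)
    xi = [np.zeros(d) for _ in range(N)]
    for _ in range(rounds):
        local_iterates = []
        for c, (A_c, b_c) in enumerate(zip(A_list, b_list)):
            th = theta.copy()
            for _ in range(H):
                # local step with the control variate subtracted
                th -= eta * (A_c @ th - b_c - xi[c]
                             + noise * rng.standard_normal(d))
            local_iterates.append(th)
        theta_new = np.mean(local_iterates, axis=0)
        for c in range(N):
            # assumed rule: track the gap between global and local progress
            xi[c] += (theta_new - local_iterates[c]) / (eta * H)
        theta = theta_new
    return theta

# Same heterogeneous agents as in the FedLSA sketch; theta* = [1, 1].
A1, b1 = np.diag([2.0, 1.0]), np.array([2.0, 2.0])
A2, b2 = np.diag([1.0, 2.0]), np.array([1.0, 1.0])
theta = scafflsa([A1, A2], [b1, b2])
```

At the fixed point, each variate absorbs its client's drift ($\xi_c = A_c\theta^\star - b_c$, which averages to zero across clients), so the iterate converges to the solution of the averaged system with no heterogeneity bias — the cancellation described above.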
The authors also extend their results to the more realistic Markovian sampling setting, where each client’s data stream follows a geometrically ergodic Markov chain. By incorporating mixing‑time constants into the analysis, they show that the same asymptotic rates hold up to constant factors, confirming that the method is robust to temporal correlations in the data.
A significant portion of the paper is devoted to applying these findings to federated TD(0) learning. The authors cast the projected Bellman equation for each client as a linear system $\bar A_c\theta_c^\star = \bar b_c$, with $\bar A_c$ and $\bar b_c$ defined via feature expectations under the client-specific stationary distribution. The global TD solution $\theta^\star$ is then the solution of the averaged system. By running FedLSA or SCAFFLSA on this formulation, they obtain concrete bounds on the number of trajectories and communication rounds required to achieve a prescribed value-function error. Empirical experiments on synthetic Markov decision processes and standard RL benchmarks demonstrate that SCAFFLSA reduces the number of communication rounds by a factor of five or more compared to FedLSA, while attaining equal or lower MSE.
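The casting of TD(0) with linear function approximation as a linear system can be made concrete on a toy chain. The two-state transition matrix, rewards, and one-hot features below are invented for illustration; the sketch builds $\bar A = \Phi^\top D (\Phi - \gamma P \Phi)$ and $\bar b = \Phi^\top D r$ under the stationary distribution and solves for the TD fixed point.

```python
import numpy as np

gamma = 0.9
P = np.array([[0.7, 0.3],      # toy 2-state transition matrix
              [0.4, 0.6]])
r = np.array([1.0, 0.0])       # per-state rewards
Phi = np.eye(2)                # one-hot (tabular) features

# stationary distribution: left eigenvector of P for eigenvalue 1
evals, evecs = np.linalg.eig(P.T)
mu = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
mu = mu / mu.sum()

# projected Bellman equation as the linear system  A theta = b
D = np.diag(mu)
A = Phi.T @ D @ (Phi - gamma * P @ Phi)
b = Phi.T @ D @ r
theta_star = np.linalg.solve(A, b)   # TD(0) fixed point

# with tabular features this coincides with the true value function
V_true = np.linalg.solve(np.eye(2) - gamma * P, r)
```

In the federated setting, each client $c$ has its own $(\bar A_c, \bar b_c)$ built this way from its own MDP and stationary distribution, and FedLSA/SCAFFLSA target the solution of the averaged system.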
In summary, the paper makes three core contributions: (1) a refined, bias-aware convergence analysis of federated linear stochastic approximation, (2) the design and rigorous analysis of SCAFFLSA, which simultaneously attains logarithmic communication cost and linear-in-$N$ sample speed-up, and (3) a concrete instantiation of these results for federated TD learning, showing practical gains in both communication efficiency and statistical accuracy. The work bridges a gap between the theory of variance-reduced federated optimization and the practice of distributed reinforcement learning, and it opens the door to further research on control-variate techniques for other federated stochastic algorithms.