Federated Hierarchical Reinforcement Learning for Adaptive Traffic Signal Control

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Multi-agent reinforcement learning (MARL) has shown promise for adaptive traffic signal control (ATSC), enabling multiple intersections to coordinate signal timings in real time. However, in large-scale settings, MARL faces constraints due to extensive data sharing and communication requirements. Federated learning (FL) mitigates these challenges by training shared models without directly exchanging raw data, yet traditional FL methods such as FedAvg struggle with highly heterogeneous intersections. Different intersections exhibit varying traffic patterns, demands, and road structures, so performing FedAvg across all agents is inefficient. To address this gap, we propose Hierarchical Federated Reinforcement Learning (HFRL) for ATSC. HFRL employs clustering-based or optimization-based techniques to dynamically group intersections and perform FedAvg independently within groups of intersections with similar characteristics, enabling more effective coordination and scalability than standard FedAvg. Our experiments on synthetic and real-world traffic networks demonstrate that HFRL consistently outperforms decentralized and standard federated RL approaches, and achieves competitive or superior performance compared to centralized RL as network scale and heterogeneity increase, particularly in real-world settings. The method also identifies suitable grouping patterns based on network structure or traffic demand, resulting in a more robust framework for distributed, heterogeneous systems.

💡 Research Summary

The paper addresses the challenge of scaling adaptive traffic signal control (ATSC) to large, heterogeneous urban networks. While multi‑agent reinforcement learning (MARL) enables decentralized agents at intersections to learn coordinated signal policies, it suffers from high communication overhead and privacy concerns when raw traffic data must be shared. Federated learning (FL) mitigates privacy and bandwidth issues by exchanging only model parameters, but the standard FedAvg algorithm assumes homogeneity among clients. In real cities, intersections differ dramatically in traffic volumes, demand patterns, road geometry, and the mix of pedestrians, micromobility, and vehicles, making a single global model sub‑optimal.

To bridge this gap, the authors propose Hierarchical Federated Reinforcement Learning (HFRL), a two‑level framework that dynamically groups intersections into clusters of similar characteristics and performs FedAvg independently within each cluster. Two concrete clustering mechanisms are introduced:

FedClusterLight – a clustering‑based approach that extracts a feature vector for each intersection (traffic counts, average delay, topology descriptors, peak‑hour profiles) and applies K‑means, hierarchical clustering, or EM to partition intersections. The number of clusters K is a hyper‑parameter; clusters are recomputed periodically to adapt to changing traffic conditions.
FedFomoLight – an optimization‑based approach that formulates cluster assignment as an integer programming problem minimizing a global loss (e.g., average travel time). This method directly balances intra‑cluster similarity and overall performance, allowing the system to discover an optimal grouping without exhaustive search.
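
The clustering-based grouping can be illustrated with a minimal sketch. It is not the authors' implementation: the two features (hourly vehicle count and average delay), the intersection names, the min-max scaling, and the farthest-point initialisation are all assumptions chosen to keep the example small and deterministic. It only shows how per-intersection feature vectors might be partitioned with plain k-means.

```python
import math

def min_max_scale(rows):
    """Rescale every feature column to [0, 1] so no single unit dominates distances."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
            for row in rows]

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    # Farthest-point initialisation (an assumption, used here for reproducibility).
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels

# Hypothetical intersections: [vehicles per hour, average delay in seconds].
names = ["A", "B", "C", "D", "E"]
features = [[1800, 45], [1700, 50], [300, 10], [350, 12], [1750, 48]]

labels = kmeans(min_max_scale(features), k=2)
groups = {}
for name, lab in zip(names, labels):
    groups.setdefault(lab, []).append(name)
print(groups)
```

With this toy data the busy intersections (A, B, E) end up in one group and the quiet ones (C, D) in the other. Re-running the same function on fresh features at a fixed interval would correspond to the periodic reclustering described above.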

Within each cluster, local RL agents (implemented with DQN, PPO, or A2C) train on their own intersection data for several local epochs, then send their model weights to the server. The server averages weights only among members of the same cluster, producing a cluster‑specific global model that is broadcast back. Because only cluster‑level models are exchanged, the total communication load scales with the number of clusters rather than the total number of intersections, yielding substantial bandwidth savings.
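
The per-cluster aggregation step might look like the following sketch. It assumes each agent's model is flattened into a list of floats and that averaging is unweighted; the summary does not say whether the paper weights agents by sample count as standard FedAvg does. The function name `clusterwise_fedavg` and the agent identifiers are hypothetical.

```python
from collections import defaultdict

def clusterwise_fedavg(local_weights, cluster_of):
    """Average model weights only among agents that share a cluster.

    local_weights: dict agent_id -> list of floats (flattened model parameters)
    cluster_of:    dict agent_id -> cluster label
    Returns a dict agent_id -> the averaged model of that agent's cluster.
    """
    # Collect the weight vectors of each cluster.
    grouped = defaultdict(list)
    for agent, weights in local_weights.items():
        grouped[cluster_of[agent]].append(weights)

    # Element-wise unweighted mean per cluster (an assumption, see lead-in).
    cluster_models = {
        cluster: [sum(vals) / len(members) for vals in zip(*members)]
        for cluster, members in grouped.items()
    }

    # Broadcast: every agent receives a copy of its own cluster's model.
    return {agent: list(cluster_models[cluster_of[agent]]) for agent in local_weights}

# Toy round: A and B share a cluster, C is alone in its own.
local_weights = {"A": [1.0, 2.0], "B": [3.0, 4.0], "C": [10.0, 20.0]}
cluster_of = {"A": 0, "B": 0, "C": 1}

updated = clusterwise_fedavg(local_weights, cluster_of)
print(updated)
```

In this toy round A and B both receive `[2.0, 3.0]`, while C keeps `[10.0, 20.0]` because it has no cluster peers. Standard FedAvg is the special case in which every agent carries the same cluster label.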

The authors evaluate HFRL on both synthetic grid networks (25–100 intersections with controllable heterogeneity) and a real‑world dataset from New York City (≈50 major intersections). Metrics include average travel time, average waiting time, signal change frequency, and communication cost. Results show:

In synthetic scenarios, HFRL reduces average travel time by 12–18 % compared with standard FedAvg and by about 15 % compared with fully decentralized MARL, especially when traffic heterogeneity is high.
In the NYC case, HFRL matches or slightly outperforms a centralized MARL baseline while cutting communication volume by over 40 %.
The learned clusters correspond to intuitive groupings such as “major arterials vs. side streets” and “peak‑hour heavy vs. low‑volume intersections,” providing interpretability for traffic engineers.
Sensitivity analyses indicate that performance improves with moderate numbers of clusters (K ≈ 5–10 for the NYC network) and that periodic reclustering is essential to maintain gains as traffic patterns evolve.

The paper also discusses limitations: the need to pre‑select or tune the number of clusters, additional computational overhead for clustering/re‑clustering, and the absence of inter‑cluster parameter sharing which could further enhance coordination. Future work is suggested on meta‑learning to automatically determine K, edge‑to‑edge collaborative updates across clusters, and asynchronous federated updates to better handle real‑time dynamics.

In summary, Hierarchical Federated Reinforcement Learning offers a practical, privacy‑preserving, and communication‑efficient solution for large‑scale ATSC. By explicitly handling intersection heterogeneity through dynamic clustering, it achieves superior or comparable traffic performance to centralized methods while retaining the scalability and robustness needed for real‑world deployment.
