Use of Devolved Controllers in Data Center Networks


In a data center network, it is common to use controllers to manage resources in a centralized manner. Centralized control, however, imposes a scalability problem. In this paper, we investigate the use of multiple independent controllers, instead of a single omniscient controller, to manage resources. Each controller looks after only a portion of the network, but together they cover the whole network, which solves the scalability problem. We use flow allocation as an example of how this approach can manage bandwidth use in a distributed manner. The focus is on how to assign components of the network to the controllers so that (1) each controller only needs to look after a small part of the network, but (2) there is at least one controller that can answer any request. We outline a way to configure the controllers to fulfill these requirements as a proof that the use of devolved controllers is possible. We also discuss several issues related to such an implementation.


💡 Research Summary

The paper addresses the scalability bottleneck inherent in centralized SDN controllers for large‑scale data‑center networks. While a single omniscient controller simplifies design, it must store and process massive topology and flow state, leading to increased latency and a single point of failure as the network grows. To overcome this, the authors propose a “devolved controller” architecture in which the network graph G = (V, E) is partitioned among q independent controllers, each managing a subgraph G_i = (V_i, E_i). The key requirements are that (1) each controller’s jurisdiction remains small, and (2) for any source‑destination pair (s, t) at least one controller can supply a valid route.

The authors model the problem as allocating pre‑computed k‑multipaths (sets of k distinct paths for each ordered pair (s, t)) to the q controllers while minimizing the number of unique links each controller must monitor. This allocation problem is shown to be NP‑hard, as the solution space grows according to a Stirling number of the second kind. Consequently, two heuristic algorithms are introduced.
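To see why exhaustive search is hopeless, note that the number of ways to partition n multipaths into q non-empty groups is the Stirling number of the second kind S(n, q), which can be computed from its standard recurrence. A minimal sketch (the function name and the specific values printed are illustrative, not from the paper):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n: int, q: int) -> int:
    """Stirling number of the second kind: the number of ways to
    partition n items into q non-empty, unlabeled groups."""
    if n == q:
        return 1
    if q == 0 or q > n:
        return 0
    # Recurrence: item n either joins one of the q existing groups,
    # or starts a new group on its own.
    return q * stirling2(n - 1, q) + stirling2(n - 1, q - 1)

# Even a modest number of multipaths split among 4 controllers gives an
# enormous solution space, motivating the heuristic algorithms.
print(stirling2(10, 4))  # 34105
print(stirling2(28, 4))
```

The rapid growth of these numbers is what makes the allocation problem intractable to solve exactly and justifies the two heuristics described next.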

Path‑Partition Heuristic: All (s, t) pairs are processed in random order. For each pair, k distinct paths are generated using an iterative Dijkstra approach that inflates link weights after each path (weight increment ω). The resulting multipath M is then assigned to the controller i that minimizes a cost function c_i = α·ν_i(M) + μ_i, where μ_i is the current number of links already monitored by controller i, ν_i(M) is the number of new links introduced by M, and α balances the preference for re‑using existing links versus load balancing. Empirical tuning shows α between 4 and 8 yields good results.
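The two ingredients of this heuristic, iterative Dijkstra with weight inflation and the cost-driven assignment, can be sketched as follows. This is a minimal reconstruction under stated assumptions: the graph is undirected and stored as adjacency lists, and the helper names, the toy topology, and the default α = 6 (within the paper's reported 4–8 range) are illustrative choices, not the authors' code.

```python
import heapq

def dijkstra(adj, w, s, t):
    """Shortest s-t path under the current (possibly inflated) link weights."""
    dist, prev = {s: 0.0}, {}
    pq = [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == t:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v in adj[u]:
            nd = d + w[(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if t != s and t not in prev:
        return None
    path, node = [t], t
    while node != s:
        node = prev[node]
        path.append(node)
    return path[::-1]

def k_multipath(adj, base_w, s, t, k, omega=1.0):
    """k distinct paths: rerun Dijkstra, inflating each used link by omega."""
    w, paths = dict(base_w), []
    for _ in range(k):
        p = dijkstra(adj, w, s, t)
        if p is None:
            break
        if p not in paths:
            paths.append(p)
        for u, v in zip(p, p[1:]):  # discourage reuse of these links
            w[(u, v)] += omega
            w[(v, u)] += omega
    return paths

def assign(multipath, monitored, alpha=6.0):
    """Give the multipath to the controller minimizing c_i = alpha*nu_i + mu_i."""
    links = {frozenset(e) for p in multipath for e in zip(p, p[1:])}
    best, best_cost = None, float("inf")
    for i, mon in enumerate(monitored):
        nu = len(links - mon)   # new links controller i would take on
        mu = len(mon)           # links it already monitors
        if alpha * nu + mu < best_cost:
            best, best_cost = i, alpha * nu + mu
    monitored[best] |= links
    return best

# Toy example: a 4-node ring, 2 controllers, one (s, t) pair.
adj = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
w = {(u, v): 1.0 for u in adj for v in adj[u]}
monitored = [set(), set()]
m = k_multipath(adj, w, "A", "C", k=2)
owner = assign(m, monitored)
```

Larger α pushes multipaths toward controllers that already monitor the needed links; smaller α spreads load more evenly, which is the trade-off the empirical tuning of α resolves.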

Partition‑Path Heuristic: The edge set E is first randomly divided into q subsets E_i, each representing the “preferred” links of controller i. For each (s, t) pair, a k‑multipath is computed independently on each controller using the same Dijkstra‑with‑weight‑inflation method, but with lower initial weights on links belonging to E_i. The same cost function as above is then used to decide which controller actually receives the multipath, and any newly used links are added to the controller’s preferred set. This method tends to reduce the number of links each controller monitors, at the expense of longer paths.
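The distinguishing step here is the initial random split of E and the per-controller weight bias; the rest reuses the same iterative Dijkstra and cost function as Path-Partition. A sketch of that step, where the discount factor is an assumption (the paper only says links in E_i start with lower weight):

```python
import random

def biased_weights(edges, q, discount=0.5, seed=0):
    """Randomly split the edge set into q preferred subsets E_1..E_q and
    build per-controller initial weights: controller i sees its own
    preferred links as cheaper, biasing its Dijkstra runs toward them."""
    rng = random.Random(seed)
    edge_list = list(edges)
    rng.shuffle(edge_list)
    preferred = [set(edge_list[i::q]) for i in range(q)]  # round-robin split
    weights = [
        {e: (discount if e in preferred[i] else 1.0) for e in edge_list}
        for i in range(q)
    ]
    return preferred, weights
```

Each controller then computes its own candidate k-multipath on its biased weights, the cost function picks the winner, and the winner's newly used links are folded into its preferred set, which is why monitored-link counts shrink while paths grow longer.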

The heuristics are evaluated on several realistic topologies taken from the Rocketfuel project (ranging from 28 to 108 nodes and 66 to 456 links) and on a canonical fat‑tree data‑center topology. With q = 4 controllers on a 28‑node, 66‑link network, the Path‑Partition approach results in each controller monitoring about 45–47 links (≈70 % of the total), while the Partition‑Path approach reduces this to 29–31 links per controller (≈45 %). However, the average hop count grows from 2.6 (Path‑Partition) to 3.5 (Partition‑Path), which may be undesirable for latency‑sensitive data‑center traffic.

To benchmark against a more exhaustive optimization, the authors implement a simulated‑annealing scheme that only replaces the allocation loop of the Path‑Partition algorithm. Despite consuming orders of magnitude more runtime (seconds to thousands of seconds), simulated annealing does not produce substantially better allocations; in many cases the simple heuristics achieve equal or slightly superior results.
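A simulated-annealing allocation loop of this kind can be sketched as below. The objective (minimize the largest per-controller monitored-link count), the move (reassign one multipath), and the cooling schedule are all assumptions for illustration; the paper only states that annealing replaces the allocation step.

```python
import math
import random

def sa_allocate(multipath_links, q, iters=5000, t0=5.0, cooling=0.999, seed=0):
    """Anneal an assignment of multipaths (each a set of links) to q
    controllers, minimizing the largest per-controller link count."""
    rng = random.Random(seed)
    alloc = [rng.randrange(q) for _ in multipath_links]

    def cost(a):
        monitored = [set() for _ in range(q)]
        for links, i in zip(multipath_links, a):
            monitored[i] |= links
        return max(len(m) for m in monitored)

    cur, temp = cost(alloc), t0
    for _ in range(iters):
        j = rng.randrange(len(alloc))
        old = alloc[j]
        alloc[j] = rng.randrange(q)  # move one multipath to a random controller
        new = cost(alloc)
        # Accept improvements always; accept uphill moves with Boltzmann probability.
        if new <= cur or rng.random() < math.exp((cur - new) / temp):
            cur = new
        else:
            alloc[j] = old
        temp *= cooling
    return alloc, cur
```

Recomputing the cost from scratch on every move is what makes annealing orders of magnitude slower than the one-pass heuristics, consistent with the runtime gap the authors report.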

A theoretical contribution, Theorem 1, proves that in any devolved‑controller deployment, either a controller must cover the entire node set, or every node is covered by at least two controllers. This establishes that some degree of overlap (redundancy) is unavoidable, and the design must balance redundancy against the overhead of monitoring duplicate links.
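The argument behind this dichotomy is short; a hedged reconstruction (the paper's exact statement and proof may differ in detail) runs as follows:

```latex
\begin{theorem}
Let controllers $1,\dots,q$ cover node sets $V_1,\dots,V_q \subseteq V$, and
suppose every ordered pair $(s,t)$ is served by some controller covering both
$s$ and $t$. Then either $V_i = V$ for some $i$, or every node lies in at
least two of the sets $V_i$.
\end{theorem}
\begin{proof}[Sketch]
Suppose some node $v$ belongs to exactly one set, say $V_1$. For any other
node $u \in V$, the pair $(v,u)$ must be served by a controller covering $v$;
the only candidate is controller $1$, so $u \in V_1$. Hence $V_1 = V$.
\end{proof}
```

In other words, avoiding a single omniscient controller forces every node to be duplicated across at least two jurisdictions, so some monitoring overlap is the price of devolution.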

Further experiments varying q demonstrate a geometric decrease in the maximum per‑controller link coverage as more controllers are added, confirming the intuitive trade‑off: increasing the number of controllers reduces individual load but raises coordination and management overhead.

The discussion highlights practical considerations: (i) real‑time responsiveness is achieved by pre‑computing multipaths, eliminating on‑the‑fly path computation; (ii) the cost‑function‑driven heuristics are lightweight enough for deployment in large data centers; (iii) future work could explore dynamic controller scaling, partial state sharing, fault tolerance, and extensions to QoS, traffic engineering, and network slicing.

In conclusion, the paper demonstrates that devolved controllers can feasibly replace a single centralized controller in data‑center networks. By formulating the allocation of pre‑computed multipaths as a cost‑minimization problem and solving it with simple yet effective heuristics, the authors achieve substantial reductions in per‑controller monitoring responsibilities while preserving full network coverage. The approach offers a practical pathway to scalable, low‑latency SDN control for today’s ever‑growing data‑center infrastructures.
