Bring Your Own Objective: Inter-operability of Network Objectives in Datacenters
Datacenter networks are currently locked in a “tyranny of the single objective”. While modern workloads demand diverse performance goals, ranging from coflow completion times and per-flow fairness to short-flow latencies, existing fabrics are typically hardcoded for a single metric. This rigid coupling ensures peak performance when application and network objectives align, but results in abysmal performance when they diverge. We propose DMart, a decentralized scheduling framework that treats network bandwidth as a competitive marketplace. In DMart, applications independently encode the urgency and importance of their network traffic into autonomous bids, allowing diverse objectives to co-exist natively on the same fabric. To meet the extreme scale and sub-microsecond requirements of modern datacenters, DMart implements distributed, per-link, per-RTT auctions, without relying on ILPs, centralized schedulers, or complex priority queues. We evaluate DMart using packet-level simulations and compare it against network schedulers designed for individual metrics, e.g., pFabric and Sincronia. DMart matches the performance of specialized schedulers on their own “home turf” while simultaneously optimizing secondary metrics. Compared to pFabric and Sincronia, DMart reduces deadline misses by 2x and coflow completion times by 1.6x, respectively, while matching pFabric's short-flow completion times.
💡 Research Summary
The paper identifies a fundamental limitation in modern datacenter networks: they are typically configured to optimize a single performance metric, such as latency, fairness, or deadline satisfaction. While this “single‑objective tyranny” can yield peak performance when the workload’s goals align with the network’s policy, it leads to severe degradation when diverse applications with conflicting objectives share the same fabric. Existing solutions either hard‑code a specific policy (e.g., SRPT for short‑flow latency, EDF for deadlines, SEBF for coflow completion) or rely on coarse‑grained class‑based isolation, both of which are inflexible and require redesign whenever new objectives emerge.
To address this, the authors propose DMart, a decentralized scheduling framework that treats bandwidth as a market commodity. Each application encodes its urgency and importance into a bid expressed in virtual tokens. Switches run per‑link, per‑RTT (round‑trip time) second‑price auctions locally, without a central controller. A flow must win the auction on every link along its path in a given RTT to transmit; otherwise it pauses and re‑bids in the next RTT. Because the same bid is submitted to all traversed links, the mechanism naturally enforces end‑to‑end progress while keeping the decision logic confined to individual switches.
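The per-RTT mechanism described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Flow` type, the function names, the one-winner-per-link capacity, the tie-break by flow name, and the zero price for an unopposed bidder are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    name: str
    bid: float   # the same bid is submitted to every link on the path
    path: tuple  # identifiers of the links the flow traverses


def run_link_auction(bids):
    """Second-price auction on one link: return (winner, price).

    `bids` maps flow name -> bid. The winner pays the runner-up's bid,
    or 0.0 when it is the only bidder (an assumption of this sketch).
    """
    # Highest bid first; ties broken by flow name (also an assumption).
    ranked = sorted(bids.items(), key=lambda kv: (-kv[1], kv[0]))
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price


def schedule_rtt(flows):
    """Return the names of flows that win on every link of their path."""
    per_link = {}
    for f in flows:
        for link in f.path:
            per_link.setdefault(link, {})[f.name] = f.bid
    winners = {link: run_link_auction(b)[0] for link, b in per_link.items()}
    return sorted(
        f.name for f in flows if all(winners[link] == f.name for link in f.path)
    )


flows = [
    Flow("a", bid=5.0, path=("l1", "l2")),
    Flow("b", bid=3.0, path=("l2", "l3")),
    Flow("c", bid=1.0, path=("l3",)),
]
print(schedule_rtt(flows))
```

In this toy round, flow `a` wins both of its links and transmits. Flow `b` wins `l3` but loses `l2`, so it pauses and would re-bid in the next RTT, as would `c`.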
Key design challenges and their solutions are:
- Scalability – Auctions are performed locally at line rate, using a small top‑k heap to select winners, avoiding the need for a datacenter‑wide optimizer.
- Convergence – The system imposes only minimal constraints on bidding behavior (non‑negative bids, bounded change rates); the authors show that, under honest participation, the market converges quickly without oscillating.
- Strategic Behavior – Second‑price pricing discourages over‑bidding; the authors also limit the expressiveness of bids to keep strategies simple and truthful.
- Usability – For developers unwilling to implement bidding logic, DMart offers library support and service tiers (e.g., flat‑rate best‑effort) that act on their behalf using virtual credits.
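The top‑k selection mentioned under Scalability can be sketched with a bounded min‑heap, which keeps only the k highest bids seen so far. This is a software illustration using Python's standard `heapq` module, not the switch data path from the paper. The function name `top_k_bids` and the `(flow_id, bid)` input format are hypothetical.

```python
import heapq


def top_k_bids(bids, k):
    """Return the k highest (bid, flow_id) pairs, highest first.

    `bids` is an iterable of (flow_id, bid) pairs. A min-heap of size k
    keeps the current winners; its root is the weakest of them.
    """
    if k <= 0:
        return []
    heap = []
    for flow_id, bid in bids:
        item = (bid, flow_id)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # New bid beats the weakest current winner: swap it in.
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


sample = [("f1", 2.0), ("f2", 7.5), ("f3", 4.0), ("f4", 1.0)]
print(top_k_bids(sample, 2))
```

Each incoming bid costs at most O(log k) work, and the heap never holds more than k entries, which is why a small k keeps per-link state bounded regardless of how many flows bid.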
Four representative objectives are instantiated as bidding strategies:
- Average Flow Completion Time (Avg‑FCT) – Bids are inversely proportional to remaining bytes divided by remaining RTTs, mimicking SRPT.
- Deadline Satisfaction (Max‑DL) – Bids increase sharply as the slack (deadline minus estimated completion) shrinks, similar to EDF but dynamically adaptive.
- Fairness – Each flow receives a pre‑allocated budget; bids are proportional to the remaining budget fraction, achieving long‑term equitable share.
- Average Coflow Completion Time (Avg‑CCT) – A coflow‑level controller aggregates flow bids and coordinates them to ensure the whole coflow wins consecutively, optimizing the coflow metric.
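To make the first three strategies concrete, the following Python sketch gives one possible shape for each bid function. The exact formulas are not given in this summary, so the function names, the `scale` constants, the clamping floor, and the simplification of the Avg‑FCT bid to depend on remaining bytes only are all assumptions. The coflow strategy is omitted because it requires a coordinating controller.

```python
def srpt_bid(remaining_bytes, scale=1e6):
    """Avg-FCT style: fewer remaining bytes -> larger bid (SRPT-like)."""
    return scale / max(remaining_bytes, 1)


def deadline_bid(slack_rtts, scale=100.0, floor=0.5):
    """Deadline style: bid rises sharply as slack shrinks.

    Slack is clamped at `floor` so the bid stays finite and non-negative
    even when the deadline is about to pass or has already passed.
    """
    return scale / max(slack_rtts, floor)


def fairness_bid(remaining_budget, initial_budget, scale=10.0):
    """Fairness style: bid proportional to the unspent budget fraction."""
    return scale * remaining_budget / initial_budget


print(srpt_bid(1_000), srpt_bid(1_000_000))
print(deadline_bid(1), deadline_bid(10))
print(fairness_bid(5, 10))
```

All three functions return non‑negative values, consistent with the non‑negative bid constraint listed under Convergence. Because every strategy maps to the same token scale, flows using different strategies can compete in the same auction.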
The authors evaluate DMart with packet‑level simulations across mixed‑workload scenarios, comparing against pFabric (designed for short‑flow completion time) and Sincronia (designed for coflow completion time). Results show:
- Deadline miss rate reduced by 2× relative to pFabric.
- Average coflow completion time improved by 1.6× over Sincronia.
- Short‑flow average completion time within 5 % of pFabric’s optimum.
- No significant loss in overall link utilization or throughput.
Importantly, DMart achieves these gains while allowing all four objectives to coexist on the same network, demonstrating true multi‑objective interoperability. The market abstraction also future‑proofs the fabric: new objectives can be supported simply by defining new bidding policies, without hardware changes or redesign of the core scheduler.
In summary, DMart reframes datacenter bandwidth allocation as a fast, distributed market clearing problem. By leveraging per‑RTT second‑price auctions, it provides a common, quantitative language (tokens) for heterogeneous applications to express urgency, ensures stable and convergent operation, and delivers superior performance across a broad spectrum of workloads. This work opens a path toward truly flexible, objective‑agnostic datacenter networks that can evolve alongside emerging application demands.