On Allocation Policies for Power and Performance


With the increasing popularity of Internet-based services and applications, power efficiency is becoming a major concern for data center operators: high electricity consumption not only increases greenhouse gas emissions but also raises the cost of running the server farm itself. In this paper we address the problem of maximizing the revenue of a service provider by means of dynamic allocation policies that run the minimum number of servers necessary to meet users’ performance requirements. The results of several experiments executed using Wikipedia traces are described, showing that the proposed schemes work well even when the workload is non-stationary. Since any resource allocation policy requires the use of forecasting mechanisms, various schemes for compensating errors in the load forecasts are also presented and evaluated.


💡 Research Summary

The paper addresses the growing concern of power consumption in data‑center operations by proposing dynamic server allocation policies that aim to maximize provider revenue while meeting performance requirements. The authors focus on dedicated server farms, where each physical machine can host a fixed number (m) of parallel “virtual” servers (threads). Servers can be switched on or off; powered‑down machines consume no electricity, while idle but powered‑on servers still draw a reduced amount of power.

User requests arrive according to a Poisson process with rate λ, and service times are exponentially distributed with mean 1/μ. Crucially, the model incorporates user impatience: if a request waits longer than a timeout, it abandons the system and generates no revenue. This abandonment behavior is modeled as an exponential distribution with rate θ, leading to an M/M/n+M queue, also known as the Erlang‑A model. Unlike Erlang‑C, Erlang‑A does not require a stability condition because excess jobs may leave the system, allowing the load ρ=λ/μ to exceed the number of active servers without causing an unbounded queue.
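To make the Erlang‑A dynamics concrete, the abandonment fraction of an M/M/n+M queue can be estimated with a small FCFS simulation. This sketch is not from the paper; the function name and parameter values are illustrative, and it exploits the fact that under FCFS a job either starts service at the earliest server-free time or abandons when its patience runs out first.

```python
import heapq
import random

def erlang_a_abandonment(lam, mu, theta, n, num_jobs=200_000, seed=1):
    """Estimate the abandonment fraction of an M/M/n+M (Erlang-A) queue.

    lam: arrival rate, mu: service rate, theta: abandonment rate,
    n: number of active servers.
    """
    rng = random.Random(seed)
    free = [0.0] * n              # times at which each server becomes free
    heapq.heapify(free)
    t = 0.0
    abandoned = 0
    for _ in range(num_jobs):
        t += rng.expovariate(lam)         # Poisson arrivals
        start = max(t, free[0])           # FCFS: wait for earliest free server
        patience = rng.expovariate(theta) # exponential impatience
        if start > t + patience:
            abandoned += 1                # gives up before service begins
        else:
            # server is occupied from `start` until service completion
            heapq.heapreplace(free, start + rng.expovariate(mu))
    return abandoned / num_jobs
```

Note that even with offered load ρ = λ/μ above the number of servers (e.g. λ=100, μ=1, n=90) the simulation remains stable, as excess jobs abandon rather than accumulate; this is the property that distinguishes Erlang‑A from Erlang‑C.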

Two dynamic allocation strategies are presented:

  1. Adaptive Policy – At each decision epoch (the end of an observation window), the system estimates λ and μ (using double exponential smoothing to handle non‑stationary traffic) and then evaluates the expected revenue
    R = c·T – r·P,
    where c is the profit per completed request, T is throughput, r is the electricity price, and P is total power consumption. The policy also accounts for the cost of turning servers on or off: Q = |Δn|·t·(∑di + k·e_max), where Δn is the number of servers changed, t is the window length, k is the average transition time, e_max is power drawn during transition, and di are component‑specific wear costs. By computing R for successive values of n and using a binary search (leveraging the concavity of R with respect to n), the policy stops when revenue no longer improves, thus selecting an (approximately) optimal number of active servers.

  2. QED Heuristic – A simpler rule derived from the Quality‑and‑Efficiency‑Driven (Halfin‑Whitt) regime. It sets
    n = ρ + α·√ρ,
    where α reflects the desired probability that all servers are busy. To cope with uncertainty in λ, the heuristic is extended to
    n = E(ρ) + α·√(E(ρ) + VAR(ρ)),
    where E(ρ) and VAR(ρ) are the estimated mean and variance of the load, respectively (VAR(ρ) = VAR(λ)/μ²). This approach requires far less computation and is well‑suited for real‑time control.
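The Adaptive policy's revenue search can be sketched as follows. This is not the paper's implementation: the throughput here uses a crude fluid approximation T = min(λ, n·μ) in place of the Erlang‑A throughput, transition costs Q are omitted, and a ternary search stands in for the paper's binary search over the concave R(n). A Holt-style double exponential smoother (with illustrative smoothing constants) provides the one-step-ahead forecast of λ.

```python
def double_exp_smooth(series, alpha=0.5, beta=0.3):
    """Double exponential (Holt) smoothing: one-step-ahead forecast that
    tracks both level and trend of a non-stationary series.
    Smoothing constants are illustrative, not taken from the paper."""
    level, trend = series[0], 0.0
    for x in series[1:]:
        prev = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
    return level + trend

def revenue(n, lam, mu, c, r, e_server):
    """R = c*T - r*P with fluid throughput T = min(lam, n*mu) and linear
    power P = n*e_server. A simplified stand-in for the Erlang-A model."""
    return c * min(lam, n * mu) - r * n * e_server

def best_servers(lam, mu, c, r, e_server, n_max=10_000):
    """Exploit concavity of R(n): ternary search, then a brute-force
    sweep over the final small bracket."""
    lo, hi = 1, n_max
    while hi - lo > 2:
        m1 = lo + (hi - lo) // 3
        m2 = hi - (hi - lo) // 3
        if revenue(m1, lam, mu, c, r, e_server) < revenue(m2, lam, mu, c, r, e_server):
            lo = m1 + 1     # maximum lies strictly right of m1
        else:
            hi = m2         # maximum lies at or left of m2
    return max(range(lo, hi + 1), key=lambda n: revenue(n, lam, mu, c, r, e_server))
```

Under the fluid approximation, each additional server is worth c·μ − r·e_server while n·μ < λ and costs r·e_server afterwards, so the search settles near n ≈ λ/μ whenever serving a request is more valuable than powering a server.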
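The QED heuristic is cheap enough to express in a few lines. This sketch assumes the extended rule takes the square-root form n = E(ρ) + α·√(E(ρ) + VAR(ρ)), which reduces to the basic rule n = ρ + α·√ρ when the load is known exactly; the function names and sample values are illustrative.

```python
import math

def load_stats(lam_samples, mu):
    """Estimate E(rho) and VAR(rho) from observed arrival rates.
    Since rho = lam/mu, VAR(rho) = VAR(lam) / mu**2."""
    m = sum(lam_samples) / len(lam_samples)
    v = sum((x - m) ** 2 for x in lam_samples) / len(lam_samples)
    return m / mu, v / mu ** 2

def qed_staffing(rho_mean, rho_var, alpha):
    """Square-root staffing under load uncertainty:
    n = E(rho) + alpha * sqrt(E(rho) + VAR(rho)).
    alpha encodes the target probability that all servers are busy."""
    return math.ceil(rho_mean + alpha * math.sqrt(rho_mean + rho_var))
```

Because the rule is a closed-form expression rather than a search, it can be re-evaluated at every observation window at negligible cost, which is what makes it attractive for real-time control.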

The authors evaluate both policies using real Wikipedia traffic from November 2009, which exhibits strong diurnal and weekly patterns and large spikes. Experiments show that the Adaptive policy yields a modest (2‑3 %) increase in average revenue compared with the QED heuristic, while both achieve substantial power savings of 30‑45 % relative to a static over‑provisioned baseline. The abandonment rate stays below 5 %, indicating that user experience is not severely degraded. The results also demonstrate that incorporating transition costs and hardware wear into the model produces realistic operating decisions.

In summary, the paper contributes a mathematically grounded framework for dynamic server provisioning that balances energy efficiency against performance. By employing the Erlang‑A queue to model impatient users, and by explicitly accounting for power‑up/down costs, the proposed policies are shown to be effective even under highly non‑stationary workloads. The work bridges the gap between theoretical queuing analysis and practical data‑center management, offering actionable strategies for operators seeking to reduce electricity bills while preserving revenue.

