CloudPowerCap: Integrating Power Budget and Resource Management across a Virtualized Server Cluster


In many datacenters, server racks are highly underutilized. Rack slots are left empty to keep the sum of the server nameplate maximum power below the power provisioned to the rack. And the servers that are placed in the rack cannot make full use of available rack power. The root cause of this rack underutilization is that the server nameplate power is often much higher than can be reached in practice. To address rack underutilization, server vendors are shipping support for per-host power caps, which provide a server-enforced limit on the amount of power that the server can draw. Using this feature, datacenter operators can set power caps on the hosts in the rack to ensure that the sum of those caps does not exceed the rack’s provisioned power. While this approach improves rack utilization, it burdens the operator with managing the rack power budget across the hosts and does not lend itself to flexible allocation of power to handle workload usage spikes or to respond to changes in the amount of powered-on server capacity in the rack. In this paper we present CloudPowerCap, a practical and scalable solution for power budget management in a virtualized environment. CloudPowerCap manages the power budget for a cluster of virtualized servers, dynamically adjusting the per-host power caps for hosts in the cluster. We show how CloudPowerCap can provide better use of power than per-host static settings, while respecting virtual machine resource entitlements and constraints.


💡 Research Summary

Data centers often suffer from rack under‑utilization because server name‑plate power ratings are significantly higher than the power that servers actually draw. As a result, many rack slots remain empty to keep the sum of name‑plate powers below the rack’s provisioned power, leading to wasted capital and operational costs. While modern servers support per‑host power caps that limit the maximum power a server can consume, current practice requires operators to manually set static caps for each host. This static approach cannot adapt to workload spikes, changes in the number of powered‑on servers, or the dynamic needs of virtual machines (VMs), and it may conflict with the policies enforced by cloud resource managers such as VMware Distributed Resource Scheduler (DRS).

The paper introduces CloudPowerCap, a system that integrates power‑budget management with virtualized resource management. The core contribution is a power‑to‑CPU mapping model that translates a host’s power cap into an effective CPU capacity. Power consumption is modeled as a linear function of CPU utilization:

 P_consumed = P_idle + (P_peak – P_idle) × U

where P_idle is the power draw when the CPU is idle, P_peak is the power at 100 % CPU utilization, and U is the utilization fraction. Given a power cap P_cap, the corresponding CPU capacity C_cap is derived as:

 C_cap = C_peak × (P_cap – P_idle) / (P_peak – P_idle)

Thus, a power cap directly limits the amount of CPU work a host can provide, allowing the resource scheduler to treat power limits as CPU resource constraints.
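The mapping above is easy to sketch in code. The following is a minimal illustration of the linear power-to-CPU model; the constants used in the example call (100 W idle, 400 W peak, 12,000 MHz peak capacity) are assumed values for illustration, not figures from the paper.

```python
def cpu_capacity(p_cap, p_idle, p_peak, c_peak):
    """Effective CPU capacity (MHz) available under a power cap p_cap (W),
    using the linear model C_cap = C_peak * (P_cap - P_idle) / (P_peak - P_idle)."""
    if p_cap <= p_idle:
        return 0.0          # cap at or below idle draw: no usable CPU capacity
    if p_cap >= p_peak:
        return c_peak       # cap at or above peak draw: full capacity
    return c_peak * (p_cap - p_idle) / (p_peak - p_idle)

# Example: assumed host with 100 W idle, 400 W peak, 12,000 MHz peak capacity.
print(cpu_capacity(250, 100, 400, 12000))  # -> 6000.0 MHz under a 250 W cap
```

With this mapping in hand, the scheduler can treat a host's power cap as just another bound on its schedulable CPU capacity.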

CloudPowerCap communicates with a cloud resource manager through well‑defined interfaces and provides three key capabilities:

  1. Constraint‑satisfaction via power‑cap reallocation – When VM placement or migration violates affinity, reservation, or limit policies, CloudPowerCap can adjust host power caps to create enough CPU capacity for the required VM reservations, eliminating the need for manual cap tuning.

  2. Power‑cap‑based entitlement balancing – By redistributing power caps among hosts, the system can balance load without moving VMs. This reduces migration overhead, network traffic, and storage I/O, while still respecting VM entitlements and fairness.

  3. Power‑aware dynamic host on/off – In power‑saving modes, hosts may be powered down. When the freed power budget becomes available, CloudPowerCap reallocates it to powered‑on hosts, enabling rapid scaling back up without exceeding the rack’s power budget.
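Capability 1 can be sketched as a budget transfer between two hosts: shift just enough power from a donor host to the target so the target's effective CPU capacity covers a VM's reservation. This is a hypothetical illustration built on the linear model above; the `Host` fields, function names, and parameter values are assumptions, not the paper's interfaces.

```python
from dataclasses import dataclass

@dataclass
class Host:
    name: str
    power_cap: float   # current per-host cap (W)
    p_idle: float      # idle power draw (W)
    p_peak: float      # power draw at full utilization (W)
    c_peak: float      # CPU capacity at peak power (MHz)

    def cpu_capacity(self):
        frac = (self.power_cap - self.p_idle) / (self.p_peak - self.p_idle)
        return self.c_peak * max(0.0, min(1.0, frac))

    def power_for_capacity(self, mhz):
        """Power cap needed to provide `mhz` of CPU capacity."""
        return self.p_idle + (self.p_peak - self.p_idle) * mhz / self.c_peak

def reallocate_for_reservation(target, donor, needed_mhz):
    """Move just enough power budget from donor to target to cover a reservation."""
    if needed_mhz <= target.cpu_capacity():
        return True                       # target already has enough capacity
    extra_w = target.power_for_capacity(needed_mhz) - target.power_cap
    if donor.power_cap - extra_w < donor.p_idle:
        return False                      # donor cannot spare that much power
    donor.power_cap -= extra_w
    target.power_cap += extra_w           # total rack budget is unchanged
    return True
```

Because power only moves between caps, the sum of the caps (and hence the rack budget) is preserved by construction.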

The authors evaluate the approach using an industrial‑scale cloud simulator and a rack with an 8 kW power budget. By varying per‑host caps (400 W, 320 W, 285 W, 250 W), they demonstrate trade‑offs between CPU capacity and memory capacity: a lower cap allows more servers (and thus more memory) to be placed, while a higher cap yields more CPU per server. CloudPowerCap dynamically selects the appropriate cap configuration based on real‑time VM demands, achieving better overall power utilization than static caps.
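The trade-off is simple arithmetic under the linear model: a lower cap fits more hosts (and their memory) into the 8 kW budget, while a higher cap yields more CPU per host. The per-host memory size and model parameters below are assumed for illustration; only the 8 kW budget and the four cap levels come from the paper.

```python
RACK_BUDGET_W = 8000
MEM_PER_HOST_GB = 128                      # assumed per-host memory
P_IDLE, P_PEAK, C_PEAK = 100, 400, 12000   # assumed model parameters (W, W, MHz)

for cap in (400, 320, 285, 250):
    hosts = RACK_BUDGET_W // cap           # hosts that fit under the rack budget
    cpu = C_PEAK * (cap - P_IDLE) / (P_PEAK - P_IDLE)
    print(f"{cap:3d} W cap: {hosts:2d} hosts, "
          f"{hosts * MEM_PER_HOST_GB:4d} GB total memory, {cpu:.0f} MHz CPU/host")
```

Under these assumptions, dropping the cap from 400 W to 250 W raises the host count from 20 to 32 (more aggregate memory) at the cost of roughly half the CPU capacity per host.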

Scenario‑based experiments with two hosts illustrate the benefits:

  • Enforcing affinity constraints – With static caps, moving a VM to satisfy an affinity rule may be impossible because the target host lacks sufficient CPU capacity. CloudPowerCap reallocates caps (e.g., 3.6 GHz to Host A and 6 GHz to Host B) so the migration succeeds without violating reservations.

  • Improving robustness to demand bursts – After a migration, one host may have little headroom while another has abundant spare capacity. By redistributing caps, both hosts obtain comparable headroom, reducing the risk of performance bottlenecks during sudden workload spikes.

  • Reducing migration overhead – Before performing entitlement balancing, CloudPowerCap can adjust caps to equalize available CPU capacity, often eliminating the need for VM moves altogether.
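The second and third scenarios both reduce to equalizing headroom: split the total power budget so every host ends up with the same spare CPU capacity above its current demand. A minimal sketch, again using the linear model with assumed parameters (and assuming each host's demand plus its share of spare capacity fits within its peak capacity):

```python
def equalize_headroom(caps, demands, p_idle, p_peak, c_peak):
    """Given per-host power caps (W) and CPU demands (MHz), return new caps
    that split the same total budget so spare capacity is equal per host."""
    def capacity(cap_w):
        return c_peak * (cap_w - p_idle) / (p_peak - p_idle)
    def cap_for(mhz):
        return p_idle + (p_peak - p_idle) * mhz / c_peak
    total_capacity = sum(capacity(c) for c in caps)
    spare = (total_capacity - sum(demands)) / len(caps)  # equal headroom each
    return [cap_for(d + spare) for d in demands]

# Two hosts sharing a 650 W budget: one loaded (6000 MHz), one light (2000 MHz).
print(equalize_headroom([400, 250], [6000, 2000], 100, 400, 12000))
```

The returned caps sum to the original budget, so rebalancing headroom this way needs no VM migrations at all.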

Experimental results show that CloudPowerCap improves power‑budget efficiency by roughly 12 % compared with static caps, cuts the number of VM migrations by about 30 %, and lowers SLA violation rates.

In summary, CloudPowerCap reconceptualizes per‑host power caps from a simple hardware limit into a first‑class resource that is managed jointly with CPU, memory, and policy constraints in a virtualized cloud. By coupling a linear power‑to‑CPU model with existing cloud schedulers, the system simultaneously reduces operational costs, maximizes rack power utilization, and preserves the quality‑of‑service guarantees expected by tenants. Future work will explore multi‑tenant cap policies and predictive, machine‑learning‑driven cap adjustments.

