Reinforcement Learning-Based Co-Design and Operation of Chiller and Thermal Energy Storage for Cost-Optimal HVAC Systems
We study the joint operation and sizing of cooling infrastructure for commercial HVAC systems using reinforcement learning, with the objective of minimizing life-cycle cost over a 30-year horizon. The cooling system consists of a fixed-capacity electric chiller and a thermal energy storage (TES) unit, jointly operated to meet stochastic hourly cooling demands under time-varying electricity prices. The life-cycle cost accounts for both capital expenditure and discounted operating cost, including electricity consumption and maintenance. A key challenge arises from the strong asymmetry in capital costs: increasing chiller capacity by one unit is far more expensive than an equivalent increase in TES capacity. As a result, identifying the right combination of chiller and TES sizes, while ensuring zero loss-of-cooling-load under optimal operation, is a non-trivial co-design problem. To address this, we formulate the chiller operation problem for a fixed infrastructure configuration as a finite-horizon Markov Decision Process (MDP), in which the control action is the chiller part-load ratio (PLR). The MDP is solved using a Deep Q-Network (DQN) with a constrained action space. The learned DQN policy minimizes electricity cost over historical traces of cooling demand and electricity prices. The trained policy is evaluated for each candidate chiller-TES sizing configuration. We then restrict attention to configurations that fully satisfy the cooling demand and minimize life-cycle cost over this feasible set to identify the cost-optimal infrastructure design. Using this approach, we determine the optimal chiller and thermal energy storage capacities to be 700 and 1500, respectively.
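The final selection step described above (keep only configurations with zero loss-of-cooling-load, then pick the cheapest) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `Candidate` record, the `select_design` helper, the tolerance, and every number in the example list are hypothetical placeholders, and the per-candidate unmet load and life-cycle cost are assumed to have been computed beforehand by evaluating the trained policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One chiller-TES sizing configuration and its evaluation results."""
    chiller_capacity: float  # chiller size, in the paper's capacity unit
    tes_capacity: float      # TES size, in the paper's capacity unit
    unmet_load: float        # total loss-of-cooling-load under the trained policy
    life_cycle_cost: float   # CAPEX plus discounted operating cost


def select_design(candidates, tol=1e-9):
    """Return the cheapest candidate that fully meets the cooling demand."""
    feasible = [c for c in candidates if c.unmet_load <= tol]
    if not feasible:
        raise ValueError("no candidate fully satisfies the cooling demand")
    return min(feasible, key=lambda c: c.life_cycle_cost)


# Illustrative numbers only; these are NOT results from the paper.
candidates = [
    Candidate(500.0, 1000.0, unmet_load=12.5, life_cycle_cost=0.9e6),
    Candidate(600.0, 2000.0, unmet_load=0.0, life_cycle_cost=1.3e6),
    Candidate(800.0, 1000.0, unmet_load=0.0, life_cycle_cost=1.1e6),
    Candidate(1000.0, 0.0, unmet_load=0.0, life_cycle_cost=1.4e6),
]

best = select_design(candidates)
print(best.chiller_capacity, best.tes_capacity)
```

Note that the cheapest candidate overall (the first one) is rejected because it leaves some load unmet, which is why the feasibility filter has to run before the cost minimization.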
💡 Research Summary
The paper tackles the joint design and operation of a commercial HVAC cooling plant composed of a fixed-capacity electric chiller and a thermal energy storage (TES) unit. The authors aim to minimize the 30-year life-cycle cost (LCC), which comprises capital expenditure (CAPEX) for the chiller and TES, discounted electricity cost, and maintenance (2% of total CAPEX, growing at 5% inflation). A key economic feature is the strong asymmetry in capital cost: adding 1 kW of chiller capacity costs roughly 4.3 times as much as adding 1 kWh of TES capacity. Consequently, the optimal solution is not simply a chiller sized for peak demand but a balanced combination that charges the TES during low-price electricity periods and discharges it during high-price periods.
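To make the cost structure concrete, the following Python sketch computes an LCC of the kind described above. It is an illustration under stated assumptions rather than the paper's exact formula: the 30-year horizon, the 2% maintenance fraction, the 5% growth, and the 4.3 cost ratio come from the summary, while the function names, the chiller unit cost, the discount rate, a constant annual electricity cost, end-of-year discounting, and applying the discount to maintenance are all assumptions made here.

```python
def capex(chiller_kw, tes_kwh, chiller_cost_per_kw, capex_ratio=4.3):
    """Capital cost; 1 kW of chiller costs `capex_ratio` times 1 kWh of TES."""
    tes_cost_per_kwh = chiller_cost_per_kw / capex_ratio
    return chiller_kw * chiller_cost_per_kw + tes_kwh * tes_cost_per_kwh


def life_cycle_cost(chiller_kw, tes_kwh, annual_electricity_cost,
                    chiller_cost_per_kw, discount_rate,
                    maintenance_frac=0.02, maintenance_growth=0.05,
                    years=30, capex_ratio=4.3):
    """CAPEX plus discounted electricity and maintenance over `years`.

    Assumptions (not from the source): electricity cost is the same every
    year, maintenance starts at `maintenance_frac * CAPEX` in year 1, and
    all yearly costs are discounted at the end of the year.
    """
    total_capex = capex(chiller_kw, tes_kwh, chiller_cost_per_kw, capex_ratio)
    discounted_opex = 0.0
    for year in range(1, years + 1):
        maintenance = (maintenance_frac * total_capex
                       * (1.0 + maintenance_growth) ** (year - 1))
        yearly_cost = annual_electricity_cost + maintenance
        discounted_opex += yearly_cost / (1.0 + discount_rate) ** year
    return total_capex + discounted_opex


# Hypothetical inputs for illustration; none of these values are from the paper.
unit_cost = 1000.0  # assumed chiller cost per kW
more_chiller = life_cycle_cost(800.0, 1000.0, 50_000.0, unit_cost, 0.06)
more_tes = life_cycle_cost(700.0, 1100.0, 50_000.0, unit_cost, 0.06)
print(more_chiller > more_tes)
```

With the electricity cost held equal, trading 100 kW of chiller for 100 kWh of TES lowers the LCC in this sketch purely through the capital-cost asymmetry. In the actual study the electricity cost also changes with the configuration, because it depends on how the learned policy operates each plant.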
Methodologically, the authors formulate the hourly operation for a given (C_ch, E_max) pair as a finite‑horizon Markov Decision Process (MDP). The state vector includes the current cooling load, TES state‑of‑charge, electricity price, time‑of‑day, day index, and a binary vector indicating available electricity sources. The action is the chiller part‑load ratio (PLR) a_k ∈