Preprint

District Cooling System Control for Providing Operating Reserve based on Safe Deep Reinforcement Learning


Abstract

Heating, ventilation, and air conditioning (HVAC) systems are well proven to be capable of providing operating reserve for power systems. As a type of large-capacity, energy-efficient HVAC system (up to 100 MW), the district cooling system (DCS) is emerging in modern cities and has great potential to be regulated as a flexible load. However, strategically controlling a DCS to provide flexibility is challenging, because one DCS serves multiple buildings with complex thermal dynamics and uncertain cooling demands. Improper control may lead to significant thermal discomfort and even deteriorate the power system's operational security. To address these issues, we propose a model-free control strategy based on deep reinforcement learning (DRL) that requires neither an accurate system model nor the uncertainty distributions. To avoid damaging "trial & error" actions that may violate the system's operational security during training, we further propose a safety layer combined with the DRL agent to guarantee the satisfaction of critical constraints, forming a safe-DRL scheme. Moreover, after providing operating reserve, the DCS increases its power to recover all buildings' temperatures back to their set values, which may cause an instantaneous peak-power rebound and impose a secondary impact on the power system. Therefore, we design a self-adaptive reward function within the proposed safe-DRL scheme to constrain the peak power effectively. Numerical studies based on a realistic DCS demonstrate the effectiveness of the proposed methods.
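The peak-power rebound mitigation described in the abstract can be illustrated with a minimal reward sketch. All names and weights here (`w_comfort`, `w_peak`, `peak_limit_kw`) are illustrative assumptions, not the paper's actual self-adaptive formulation:

```python
# Hypothetical sketch of a reward that trades off comfort recovery against
# a peak-power rebound penalty. Names and weights are illustrative
# assumptions, not the paper's actual self-adaptive reward.

def shaped_reward(power_kw, temp_devs, peak_limit_kw,
                  w_comfort=1.0, w_peak=5.0):
    """Reward = -(comfort cost + peak-power penalty).

    power_kw      : current DCS electric power draw
    temp_devs     : list of |T_zone - T_set| deviations (degC)
    peak_limit_kw : allowed peak power during the recovery phase
    """
    comfort_cost = sum(d * d for d in temp_devs) / max(len(temp_devs), 1)
    # The penalty activates only above the limit and grows quadratically,
    # nudging the agent to spread temperature recovery over time instead
    # of drawing a sharp power spike right after the reserve event.
    excess = max(power_kw - peak_limit_kw, 0.0)
    peak_cost = excess * excess
    return -(w_comfort * comfort_cost + w_peak * peak_cost)
```

With both terms at zero the reward is zero; exceeding the limit or deviating from set points makes it negative, so the agent balances recovery speed against rebound magnitude.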


Article
Full-text available
Due to the distributed structure of electric power networks (EPNs) and district heating networks (DHNs), distributed dispatch is favorable for coordination in an integrated electricity and heat system (IEHS). However, the feasibility issue of the dispatch solution remains unresolved; if left unaddressed, it may pose critical threats to system operation, and security cannot be guaranteed. This paper proposes a distributed method for real-time IEHS dispatch where feasibility is strictly ensured in each iteration. First, a network reduction method of DHNs retaining the temperature quasi-dynamics is derived. Based on that, a novel feasibility cut (FC) generation method is devised, which strictly maintains system security during the iterative process. Finally, modified Benders decomposition with guaranteed feasibility (MBD-GF) is proposed to tackle distributed IEHS dispatch. Case studies of two IEHSs validate the effectiveness and efficiency of the proposed method.
Article
Full-text available
It is a crucial yet challenging task to ensure commercial load resilience during high-impact, low-frequency extreme events. In this paper, a novel safe reinforcement learning (SRL)-based resilient proactive scheduling strategy is proposed for commercial buildings (CBs) subject to extreme weather events. It deploys the correlation between different CB components with demand response capabilities to maximize the customer comfort levels while minimizing the energy reserve cost. It also develops an SRL-based algorithm by combining deep-Q-network and conditional-value-at-risk methods to handle the uncertainties in the extreme weather events such that the impact from extreme epochs in the learning process is greatly mitigated. As a result, an optimum control decision can be derived that targets proactive scheduling goals, where exploration and exploitation are considered simultaneously. Extensive simulation results show that the proposed SRL-based proactive scheduling decisions can ensure the resilience of a commercial building while maintaining comprehensive comfort levels for the occupants.
Article
Full-text available
In order to realize an energy efficient and emission-free heat and cold supply in urban areas, 5th Generation District Heating and Cooling (5GDHC) networks are a promising technology. In 5GDHC networks, the control of the network temperature is crucial since it affects the efficiency of connected heat pumps and chillers, the heat losses (or gains) of the network, as well as the integration of waste heat or free cooling. Due to the large number of opposing effects, the optimal control of network temperatures is a challenging task. In this paper, a mixed-integer linear program (MILP) is proposed for short-term optimization of the network temperature in 5GDHC systems. The model comprises an air-source heat pump, compression chiller and thermal storage in a central generation unit as well as heat pumps, chillers, electric boilers and thermal storages in buildings. Furthermore, the model considers the thermal inertia of the water mass in the network which functions as additional thermal storage. The optimization model is real-time capable and designed to be deployed in a model-predictive control. In a case study, the optimization approach leads to cost savings in two of three investigated months (by 10 % and 60 % respectively) compared to a reference operation strategy (free floating network temperature).
Article
Full-text available
Pipeline energy storage in district heating networks (DHNs) has shown to be capable of improving energy efficiency in an integrated electricity and heat system (IEHS). However, most electric power networks (EPNs) and DHNs are managed by different entities, while the incentives inducing such flexibilities from DHNs have been seldom discussed. This paper fills the research gap by investigating price incentives offered by EPNs to encourage DHN operators to fully utilize pipeline energy storage. Individual interests of EPNs and DHNs are addressed via a bi-level model, where the EPN operator determines the best price incentive based on optimal power flow (OPF) in the upper-level, while the lower-level problem describes the optimal response of the DHN operator based on optimal thermal flow (OTF). To preserve the privacy of DHNs in distributed operation, a reduced and accurate OTF model is then proposed where internal states are eliminated and system parameters are not exposed, which also relieves model complexity. Finally, a price-quantity decomposition method along with warm-start strategies are proposed to solve the reduced bi-level model, and the solution obtained is interpreted as the equilibrium of Stackelberg competition between EPNs and DHNs. Case studies of two IEHSs validate that the proposed decomposition method can efficiently reach Stackelberg equilibrium in a distributed setting, while the introduced incentive-based coordination mechanism can effectively improve social welfare by lowering total costs in both EPNs and DHNs.
Article
Full-text available
Coordination efficiency between different energy sectors with privacy preservation has become a technical bottleneck in distributed operation of the integrated electricity and heat systems (IEHS). This paper investigates, for the first time, an equivalent model-based non-iterative solution for this issue. An equivalent model of district heating networks (DHNs) retaining the temperature quasi-dynamics is derived. Based on the equivalence, all the state variables in DHNs are explicitly expressed by the heat power generation, and the feasible region of DHNs is projected on the coupling boundaries of the IEHS. The existence of this equivalent model is proved under a mild sufficient condition. The application of the proposed equivalent model is demonstrated via the distributed unit commitment for IEHS. Case studies of two test systems validate the effectiveness and the efficiency of the solution for distributed IEHS operation.
Article
Full-text available
Peak shaving, demand response, fast fault detection, and emissions and cost reduction are some of the main objectives to meet in advanced district heating and cooling (DHC) systems. As these infrastructures are enhanced, challenges such as supply temperature reduction and load uncertainty are growing alongside the development of new algorithms and technologies. Traditional control strategies and diagnosis approaches cannot achieve these goals. Accordingly, to address these shortcomings, researchers have developed many innovative methods tailored to their applications and features. The main purpose of this paper is to review recent publications that include both hard and soft computing implementations, such as model predictive control and machine learning algorithms, with applications on both fourth and fifth generation district heating and cooling networks. After introducing traditional approaches, the innovative techniques, accomplished results, and an overview of the main strengths and weaknesses are discussed, together with a description of the main capabilities of some commercial platforms.
Article
Full-text available
As the penetration of renewable energy continues to increase, stochastic and intermittent generation resources gradually replace conventional generators, bringing significant challenges in stabilizing power system frequency. Thus, aggregating demand-side resources for frequency regulation attracts attention from both academia and industry. However, in practice, conventional aggregation approaches suffer from random and uncertain behaviors of the users, such as opting out of control signals. The risk-averse multi-armed bandit learning approach is adopted to learn the behaviors of the users, and a novel aggregation strategy is developed for residential heating, ventilation, and air conditioning (HVAC) to provide reliable secondary frequency regulation. Compared with the conventional approach, the simulation results show that the risk-averse multi-armed bandit learning approach performs better in secondary frequency regulation, with fewer users being selected and opting out of the control. In addition, the proposed approach is more robust to random and changing behaviors of the users.
Article
Full-text available
Rapid progress in machine learning and artificial intelligence (AI) has brought renewed attention to its applicability in power systems for modern forms of control that help integrate higher levels of renewable generation and address increasing levels of uncertainty and variability. In this paper we discuss these new applications and shine light on the most relevant new safety risks and considerations that emerge when relying on learning for control purposes in electric grid operations. We build on recent taxonomical work in AI safety and focus on four concrete safety problems. We draw on two case studies, one in frequency regulation and one in distribution system control, to exemplify these problems and show mitigating measures. We then provide general guidelines and literature to help people working on integrating learning capabilities for control purposes to make safety risks a central tenet of design.
Article
Full-text available
In commercial buildings, about 40%–50% of the total electricity consumption is attributed to Heating, Ventilation, and Air Conditioning (HVAC) systems, which places an economic burden on building operators. In this paper, we intend to minimize the energy cost of an HVAC system in a multi-zone commercial building with the consideration of random zone occupancy, thermal comfort, and indoor air quality comfort. Due to the existence of unknown thermal dynamics models, parameter uncertainties (e.g., outdoor temperature, electricity price, and number of occupants), spatially and temporally coupled constraints associated with indoor temperature and CO2 concentration, a large discrete solution space, and a non-convex and non-separable objective function, it is very challenging to achieve the above aim. To this end, the above energy cost minimization problem is reformulated as a Markov game. Then, an HVAC control algorithm is proposed to solve the Markov game based on multi-agent deep reinforcement learning with attention mechanism. The proposed algorithm does not require any prior knowledge of uncertain parameters and can operate without knowing building thermal dynamics models. Simulation results based on real-world traces show the effectiveness, robustness and scalability of the proposed algorithm.
Article
Full-text available
Dynamic distribution network reconfiguration (DNR) algorithms perform hourly status changes of remotely controllable switches to improve distribution system performance. The problem is typically solved by physical model-based control algorithms, which not only rely on accurate network parameters but also lack scalability. To address these limitations, this paper develops a data-driven batch-constrained reinforcement learning (RL) algorithm for the dynamic DNR problem. The proposed RL algorithm learns the network reconfiguration control policy from a finite historical operational dataset without interacting with the distribution network. The numerical study results on three distribution networks show that the proposed algorithm not only outperforms state-of-the-art RL algorithms but also improves the behavior control policy, which generated the historical operational data. The proposed algorithm is also very scalable and can find a desirable network reconfiguration solution in real-time.
Article
Full-text available
Virtual battery (VB) is an innovative method to model flexibility of building loads and effectively coordinate them with other resources at a system level. Unlike a real battery with a dedicated power conversion system for charging control, methods are required for operating building loads to deviate from the baseline to respond to grid signals. This paper presents a VB control for a commercial heating, ventilation, and air conditioning (HVAC) system to follow the desired power consumption in real time by adjusting zonal airflow rates. The proposed method consists of two parts. At the system level, a mixed feedforward and feedback control is used to estimate the desired total airflow rate. At the zone level, two priority-based algorithms are then proposed to distribute the total airflow rate to individual zones. In particular, a zonal airflow limit estimation method is proposed using machine-learning techniques, in contrast to physics-based thermal models in existing studies, to more accurately capture zonal thermal dynamics and improve temperature control performance. An office building on the Pacific Northwest National Laboratory campus is implemented in EnergyPlus, and used to illustrate and validate the proposed control.
Article
Full-text available
This paper proposes a novel framework for home energy management (HEM) based on reinforcement learning for achieving efficient home-based demand response (DR). The hour-ahead energy consumption scheduling problem is formulated as a finite Markov decision process (FMDP) with discrete time steps. To tackle this problem, a data-driven method based on a neural network (NN) and the Q-learning algorithm is developed, which achieves superior performance on cost-effective schedules for the HEM system. Specifically, real data of electricity price and solar photovoltaic (PV) generation are processed for uncertainty prediction by an extreme learning machine (ELM) in rolling time windows. The scheduling decisions of the household appliances and electric vehicles (EVs) can subsequently be obtained through the newly developed framework, whose objective is dual: to minimize the electricity bill as well as the DR-induced dissatisfaction. Simulations are performed on a residential house with multiple home appliances, an EV, and several PV panels. The test results demonstrate the effectiveness of the proposed data-driven HEM framework.
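The Q-learning component described in the abstract above can be sketched in minimal tabular form. The 24-step horizon and the toy environment interface (`env_step`) are illustrative assumptions, not the paper's actual HEM formulation:

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning loop of the kind used for hour-ahead
# appliance scheduling; the environment interface is a toy stand-in.

def q_learning(env_step, states, actions, episodes=200,
               alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)  # (state, action) -> estimated value
    for _ in range(episodes):
        s = states[0]
        for _ in range(24):  # one scheduling horizon (24 hours)
            # Epsilon-greedy action selection.
            a = (rng.choice(actions) if rng.random() < eps
                 else max(actions, key=lambda x: Q[(s, x)]))
            s2, r = env_step(s, a)
            # Standard Q-learning temporal-difference update.
            best_next = max(Q[(s2, x)] for x in actions)
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```

In the paper's setting the state would carry price and PV forecasts from the ELM, and the tabular Q would be replaced by the NN approximator.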
Article
Full-text available
The demand for air conditioning and cooling services is rapidly increasing worldwide. As cooling demand largely coincides with high solar irradiation, the combination of solar thermal energy and cooling appears to be an exciting alternative to traditional electricity-driven cooling systems in which electricity is generated from fossil fuels. Nevertheless, solar-assisted cooling is not yet widely deployed because of many barriers, among them the presumed high investment cost of solar cooling technology. This research aims at making this technology more affordable by providing a holistic optimization design of solar-assisted district cooling systems. Toward this end, a mixed-integer linear programming (MILP) model is proposed that captures the key design and operation variables of a solar-assisted district cooling system. The proposed model aims at finding the optimal system design (i.e., the system's main components along with their optimal capacities) together with the optimal hourly policies for production and storage of hot and cold water while satisfying the expected cooling demand. The model was validated using real data collected from different case studies. The optimal system design of some cases showed that solar collectors covered about 46% of the chiller's heat demand. Moreover, the existence of the cold-water TES in the system depends on the chosen chiller capacity and the cooling demand of the case study. Furthermore, a sensitivity analysis was carried out to study the model robustness. The sensitivity analysis shows that the chiller COP had the highest impact on the annual total system cost: increasing the COP by 20% of its initial value decreases the annual total system cost by 4.4%.
Conference Paper
Full-text available
In recent years, a specific machine learning method called deep learning has gained huge attention, as it has obtained astonishing results in broad applications such as pattern recognition, speech recognition, computer vision, and natural language processing. Recent research has also shown that deep learning techniques can be combined with reinforcement learning methods to learn useful representations for problems with high-dimensional raw data input. This article reviews the recent advances in deep reinforcement learning, with a focus on the most used deep architectures, such as autoencoders, convolutional neural networks, and recurrent neural networks, which have successfully been combined with the reinforcement learning framework.
Article
Full-text available
This paper considers a demand response agent that must find a near-optimal sequence of decisions based on sparse observations of its environment. Extracting a relevant set of features from these observations is a challenging task and may require substantial domain knowledge. One way to tackle this problem is to store sequences of past observations and actions in the state vector, making it high dimensional, and apply techniques from deep learning. This paper investigates the capabilities of different deep learning techniques, such as convolutional neural networks and recurrent neural networks, to extract relevant features for finding near-optimal policies for a residential heating system and electric water heater that are hindered by sparse observations. Our simulation results indicate that in this specific scenario, feeding sequences of time-series to an LSTM network, which is a specific type of recurrent neural network, achieved a higher performance than stacking these time-series in the input of a convolutional neural network or deep neural network.
Article
Full-text available
The purpose of this review is to present the background to the current position of district heating and cooling in the world, with some deeper insights into European conditions. The review structure considers the market, technical, supply, environmental, institutional, and future contexts. The main global conclusions are low utilisation of district heating in buildings, varying implementation rates with respect to countries, moderate commitment to the fundamental idea of district heating, low recognition of possible carbon dioxide emission reductions, and low awareness in general of the district heating and cooling benefits. The cold deliveries from district cooling systems are much smaller than heat deliveries from district heating systems. The European situation can be characterised by higher commitment to the fundamental idea of district heating, lower specific carbon dioxide emissions, and higher awareness of the district heating and cooling benefits. The conclusions obtained from the six contexts analysed show that district heating and cooling systems have strong potential to be viable heat and cold supply options in a future world. However, more efforts are required for the identification, assessment, and implementation of these potentials in order to harvest the global benefits of district heating and cooling.
Article
Full-text available
This paper presents an optimal dispatch model of an ice storage air-conditioning system that enables participants to perform energy saving and demand response quickly and accurately, and to avoid exposure to peak electricity prices. Schedule planning for an ice storage air-conditioning system under demand response mainly transfers energy consumption from the peak load to the partial-peak or off-peak load. Least Squares Regression (LSR) is used to obtain the polynomial function for the cooling capacity and the cost of power consumption of a real ice storage air-conditioning system. Based on dynamic electricity pricing, the requirements of cooling loads, and all technical constraints, the dispatch model of the ice storage air-conditioning system is formulated to minimize the operation cost. The Improved Ripple Bee Swarm Optimization (IRBSO) algorithm is proposed to solve the dispatch model for a daily summer schedule. Simulation results indicate that the proposed approach provides a practical and flexible framework that allows ice storage air-conditioning systems participating in demand response to optimize their energy savings and operational efficiency.
Article
Full-text available
An important application of reinforcement learning (RL) is to finite-state control problems and one of the most difficult problems in learning for control is balancing the exploration/exploitation tradeoff. Existing theoretical results for RL give very little guidance on reasonable ways to perform exploration. In this paper, we examine the convergence of single-step on-policy RL algorithms for control. On-policy algorithms cannot separate exploration from learning and therefore must confront the exploration problem directly. We prove convergence results for several related on-policy algorithms with both decaying exploration and persistent exploration. We also provide examples of exploration strategies that can be followed during learning that result in convergence to both optimal values and optimal policies.
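The decaying-exploration idea analysed in the abstract above (greedy in the limit with infinite exploration, GLIE) can be sketched as a per-state exploration rate that decays with the visit count; the `1/n(s)` schedule used below is one common choice, shown purely for illustration:

```python
import random

# GLIE-style action selection: each state's exploration rate decays with
# its visit count, so the policy becomes greedy asymptotically while
# every action is still tried infinitely often.

def glie_action(Q, state, actions, visits, rng=random):
    visits[state] = visits.get(state, 0) + 1
    eps = 1.0 / visits[state]  # decaying per-state exploration rate
    if rng.random() < eps:
        return rng.choice(actions)          # explore
    return max(actions, key=lambda a: Q.get((state, a), 0.0))  # exploit
```

An on-policy algorithm such as SARSA would call `glie_action` both to act and to form its update target, which is exactly why exploration and learning cannot be separated in that setting.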
Article
Full-text available
We provide a tutorial on the construction and evaluation of Markov decision processes (MDPs), which are powerful analytical tools used for sequential decision making under uncertainty that have been widely used in many industrial and manufacturing applications but are underutilized in medical decision making (MDM). We demonstrate the use of an MDP to solve a sequential clinical treatment problem under uncertainty. Markov decision processes generalize standard Markov models in that a decision process is embedded in the model and multiple decisions are made over time. Furthermore, they have significant advantages over standard decision analysis. We compare MDPs to standard Markov-based simulation models by solving the problem of the optimal timing of living-donor liver transplantation using both methods. Both models result in the same optimal transplantation policy and the same total life expectancies for the same patient and living donor. The computation time for solving the MDP model is significantly smaller than that for solving the Markov model. We briefly describe the growing literature of MDPs applied to medical decisions.
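A finite MDP of the kind the tutorial constructs can be solved by standard value iteration. The sketch below is a generic illustration (the transition model in the test is a two-state toy), not the liver-transplantation model from the paper:

```python
# Generic value iteration for a finite MDP given as explicit tables.

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """P[s][a] = list of (prob, next_state); R[s][a] = immediate reward.

    Returns the optimal value function V and a greedy policy pi.
    """
    states = list(P)
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Bellman optimality backup over all actions of state s.
            q = [R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                 for a in range(len(P[s]))]
            v_new = max(q)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            # Extract the greedy policy w.r.t. the converged values.
            pi = {s: max(range(len(P[s])),
                         key=lambda a: R[s][a] + gamma *
                         sum(p * V[s2] for p, s2 in P[s][a]))
                  for s in states}
            return V, pi
```

Because the decision process is embedded in the model, one sweep of backups per iteration replaces the many separate policy evaluations a standard Markov simulation model would need, which is the computational advantage the abstract reports.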
Conference Paper
The paradigm shift in the electric power grid necessitates a revisit of existing control methods to ensure the grid's security and resilience. In particular, the increased uncertainties and rapidly changing operational conditions in power systems have revealed outstanding issues in terms of either speed, adaptiveness, or scalability of the existing control methods for power systems. On the other hand, the availability of massive real-time data can provide a clearer picture of what is happening in the grid. Recently, deep reinforcement learning (RL) has been regarded and adopted as a promising approach leveraging massive data for fast and adaptive grid control. However, like most existing machine learning (ML)-based control techniques, RL control usually cannot guarantee the safety of the systems under control. In this paper, we introduce a novel method for safe RL-based load shedding of power systems that can enhance the safe voltage recovery of the electric power grid after experiencing faults. Numerical simulations on the 39-bus IEEE benchmark are performed to demonstrate the effectiveness of the proposed safe RL emergency control, as well as its adaptive capability to faults not seen in the training.
Article
Residential heating, ventilation, and air conditioning (HVAC) has been considered as an important demand response resource. However, the optimization of residential HVAC control is no trivial task due to the complexity of the thermal dynamic models of buildings and uncertainty associated with both occupant-driven heat loads and weather forecasts. In this paper, we apply a novel model-free deep reinforcement learning (RL) method, known as the deep deterministic policy gradient (DDPG), to generate an optimal control strategy for a multi-zone residential HVAC system with the goal of minimizing energy consumption cost while maintaining the users’ comfort. The applied deep RL-based method learns through continuous interaction with a simulated building environment and without referring to any prior model knowledge. Simulation results show that compared with the state-of-the-art deep Q-network (DQN), the DDPG-based HVAC control strategy can reduce the energy consumption cost by 15% and reduce the comfort violation by 79%; and when compared with a rule-based HVAC control strategy, the comfort violation can be reduced by 98%. In addition, experiments with different building models and retail price models demonstrate that the well-trained DDPG-based HVAC control strategy has high generalization and adaptability to unseen environments, which indicates its practicability for real-world implementation.
Article
Model-based Volt/VAR optimization is widely used to eliminate voltage violations and reduce network losses. However, the parameters of active distribution networks (ADNs) are not identified onsite, so significant errors may be involved in the model and make the model-based method infeasible. To cope with this critical issue, we propose a novel two-stage deep reinforcement learning (DRL) method to improve the voltage profile by regulating inverter-based energy resources, which consists of an offline stage and an online stage. In the offline stage, a highly efficient adversarial reinforcement learning algorithm is developed to train an offline agent robust to the model mismatch. In the sequential online stage, we transfer the offline agent safely as the online agent to perform continuous learning and control online with significantly improved safety and efficiency. Numerical simulations on IEEE test networks not only demonstrate that the proposed adversarial reinforcement learning algorithm outperforms the state-of-the-art algorithm, but also show that our proposed two-stage method achieves much better performance than the existing DRL-based methods in the online application.
Article
Flexibility in power systems is the ability to provide supply-demand balance, maintain continuity in unexpected situations, and cope with uncertainty on both the supply and demand sides. New methods and management requirements for providing flexibility have emerged from the trend of increasing renewable energy penetration in power systems, with its attendant generation uncertainty and variable availability. In this study, the historical development of the power system flexibility concept, the characteristics of flexible power systems, flexibility sources, and evaluation parameters are presented from the international literature. The impact of variable renewable energy penetration on power system transient stability, small-signal stability, and frequency stability is discussed, and the relevant studies are presented for further research. Moreover, flexibility measurement studies are investigated, and methods of providing flexibility are evaluated.
Article
Buildings, as major energy consumers, can provide great untapped demand response (DR) resources for grid services. However, their participation remains low in real life. One major impediment to popularizing DR in buildings is the lack of cost-effective automation systems that can be widely adopted. Existing optimization-based smart building control algorithms suffer from high costs on both building-specific modeling and on-demand computing resources. To tackle these issues, this paper proposes a cost-effective edge-cloud integrated solution using reinforcement learning (RL). Besides RL's ability to solve sequential optimal decision-making problems, its adaptability to easy-to-obtain building models and its off-line learning feature are likely to reduce the controller's implementation cost. Using a surrogate building model learned automatically from building operation data, an RL agent learns an optimal control policy on cloud infrastructure, and the policy is then distributed to edge devices for execution. Simulation results demonstrate the control efficacy and the learning efficiency in buildings of different sizes. A preliminary cost analysis on a 4-zone commercial building shows the annual cost for optimal policy training is only 2.25% of the DR incentive received. Results of this study show a possible approach with higher return on investment for buildings to participate in DR programs.
Article
Reinforcement learning-based schemes have recently been applied for model-free voltage control in active distribution networks. However, existing reinforcement learning methods face challenges when it comes to problems with continuous state and action spaces or with operation constraints. To address these limitations, this paper proposes an optimal voltage control scheme based on safe deep reinforcement learning. In this scheme, the optimal voltage control problem is formulated as a constrained Markov decision process, in which both state and action spaces are continuous. To solve this problem efficiently, the deep deterministic policy gradient algorithm is utilized to learn the reactive power control policies, which determine the optimal control actions from the states. In contrast to existing reinforcement learning methods, deep deterministic policy gradient is naturally capable of addressing control problems with continuous state and action spaces. This is due to the utilization of deep neural networks to approximate both the value function and the policy. In addition, in order to handle the operation constraints in active distribution networks, a safe exploration approach is proposed to form a safety layer, which is composed directly on top of the deep deterministic policy gradient actor network. This safety layer predicts the change in the constrained states and prevents the violation of active distribution network operation constraints. Numerical simulations on modified IEEE test systems demonstrate that the proposed scheme successfully maintains all bus voltages within the allowed range, and reduces the system loss by 15% compared to the no-control case.
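For a single linearized constraint, a safety layer of the kind the abstract above describes admits a closed-form minimal correction of the actor's action. In the sketch below the linear sensitivity vector `g` is assumed given; in the paper's setting the constrained state's response to the action would be learned:

```python
# Single-constraint safety layer sketch: a linear model predicts the
# constrained state's next value, c_next ~= c + g.a, and the actor's
# action is minimally corrected (L2 projection, closed form) whenever
# the prediction exceeds the limit. `g` is assumed given and nonzero.

def safe_layer(action, c, c_max, g):
    """Project `action` so that the predicted c + g.a <= c_max."""
    pred = c + sum(gi * ai for gi, ai in zip(g, action))
    if pred <= c_max:
        return list(action)  # already safe: pass the action through
    # Minimal correction: a' = a - lam * g, with
    # lam = (pred - c_max) / ||g||^2, so that c + g.a' = c_max exactly.
    lam = (pred - c_max) / sum(gi * gi for gi in g)
    return [ai - lam * gi for ai, gi in zip(action, g)]
```

Because the correction is closed-form, it can sit directly on top of the actor network and stay differentiable, which is what lets training proceed without unsafe exploration steps.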
Article
Volt-VAR control is critical to keeping distribution network voltages within the allowable range, minimizing losses, and reducing wear and tear on voltage regulating devices. To deal with incomplete and inaccurate distribution network models, we propose a safe off-policy deep reinforcement learning algorithm to solve Volt-VAR control problems in a model-free manner. The Volt-VAR control problem is formulated as a constrained Markov decision process with a discrete action space and solved by our proposed constrained soft actor-critic algorithm. Our proposed reinforcement learning algorithm achieves scalability, sample efficiency, and constraint satisfaction by synergistically combining the merits of the maximum-entropy framework, the method of multipliers, a device-decoupled neural network structure, and an ordinal encoding scheme. Comprehensive numerical studies with the IEEE distribution test feeders show that our proposed algorithm outperforms existing reinforcement learning algorithms and conventional optimization-based approaches on a large feeder.
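Two of the named ingredients are simple enough to sketch in isolation: the ordinal (thermometer) encoding of discrete device settings and the dual-ascent multiplier update used to enforce the constraint. The encoding width and step size below are illustrative assumptions, not the paper's values:

```python
def ordinal_encode(level, n_levels):
    """Thermometer-style ordinal code for a discrete device setting:
    level k of n_levels maps to k ones followed by zeros (n_levels - 1
    bits), so adjacent tap positions differ in exactly one bit."""
    return [1 if i < level else 0 for i in range(n_levels - 1)]

def dual_update(lmbda, constraint_cost, limit, step=0.1):
    """Method-of-multipliers style dual ascent: raise the multiplier
    while the measured constraint cost exceeds its limit; the multiplier
    never drops below zero."""
    return max(0.0, lmbda + step * (constraint_cost - limit))
```

The growing multiplier makes constraint violations progressively more expensive in the actor's objective, which is how constraint satisfaction is learned rather than hand-tuned.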
Article
Electric vehicles (EVs) have been widely adopted and deployed over the past few years because they are environmentally friendly. When integrated into smart grids, EVs can operate as flexible loads or energy storage devices to participate in demand response (DR). By taking advantage of time-varying electricity prices in DR, the charging cost can be reduced by optimizing the charging/discharging schedules. However, since there is randomness in an EV's arrival and departure times and in the electricity price, it is difficult to determine optimal charging/discharging schedules that guarantee the EV is fully charged upon departure. To address this issue, we formulate the EV charging/discharging scheduling problem as a constrained Markov decision process (CMDP). The aim is to find a constrained charging/discharging scheduling strategy that minimizes the charging cost while guaranteeing that the EV can be fully charged. To solve the CMDP, a model-free approach based on safe deep reinforcement learning (SDRL) is proposed. The proposed approach does not require any domain knowledge about the randomness; it directly learns to generate the constrained optimal charging/discharging schedules with a deep neural network (DNN). Unlike existing reinforcement learning (RL) or deep RL (DRL) paradigms, the proposed approach does not need a manually designed penalty term or a tuned penalty coefficient. Numerical experiments with real-world electricity prices demonstrate the effectiveness of the proposed approach.
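A toy version of the underlying objective and constraint can be written down directly; the round-trip efficiency and sign convention below are assumptions for illustration, not the paper's battery model:

```python
def ev_schedule_cost(prices, powers, soc0, eff=0.95, dt=1.0):
    """Evaluate a charging/discharging schedule: returns (energy cost,
    final state of charge).  p > 0 charges the battery (energy stored is
    reduced by the efficiency eff); p < 0 discharges it (drawing stored
    energy at 1/eff).  The CMDP constraint is that the final state of
    charge reaches the required level at departure."""
    soc, cost = soc0, 0.0
    for price, p in zip(prices, powers):
        soc += (eff * p if p > 0 else p / eff) * dt
        cost += price * p * dt
    return cost, soc
```

The objective (minimize cost) and the constraint (final state of charge) are kept separate, which is exactly what the CMDP formulation exploits instead of folding both into one penalized reward.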
Conference Paper
The buildings sector is one of the major consumers of energy in the United States. Building HVAC (heating, ventilation, and air conditioning) systems, which maintain thermal comfort and indoor air quality (IAQ), account for almost half of the energy consumed by buildings. Thus, intelligent scheduling of building HVAC systems has the potential for tremendous energy and cost savings while ensuring that the control objectives (thermal comfort, air quality) are satisfied.
Article
District heating and cooling networks have great potential for energy saving, efficient thermal energy distribution, and renewable energy source integration. Currently, heating systems are managed on the basis of operator experience or by using adaptive controllers; however, these solutions are not suitable when boundary conditions vary considerably. In this context, Model Predictive Control is a promising strategy, as it optimizes control based on predictions of the future behavior of system dynamics and disturbances obtained from simplified models. This paper presents the development of a predictive controller, based on a novel Dynamic Programming optimization algorithm, aimed at supplying thermal energy to entire buildings within district heating networks. The controller is used to operate the district heating network of a school complex in a simulation environment (i.e., Model-in-the-Loop). Each branch connected to the network is optimized by a dedicated controller according to a multi-agent strategy. The performance of the innovative controller is compared with the results obtained using a conventional PID controller. Conservative results show that the innovative controller achieves a fuel consumption reduction of up to 7% or more, together with up to 5 h of avoided indoor comfort constraint violations, depending on the season. Overall, the model-based predictive controller fulfills comfort requirements adequately while minimizing energy consumption. Moreover, the multi-agent approach allows these results to be extended to larger networks in future studies.
Article
The operation of an indirect evaporative cooler (IEC) largely depends on the ambient temperature and humidity. To maintain a stable indoor temperature, a proper controller is essential. On-off control is a mature and stable control method used with constant-speed fans; however, large fluctuations in indoor temperature can be observed because of its limited control precision. To achieve better thermal comfort, a proportional-integral (PI) law based variable-speed technology is proposed for accurate temperature control in an IEC system. This technology has proven highly effective in central air-conditioning systems and direct-expansion air conditioners in terms of control precision and energy saving, but its techno-economic feasibility in IEC has not been investigated. In this study, an annual dynamic simulation was conducted for an IEC system based on the IEC model and control algorithm. Results show that the indoor temperature can be controlled within ±0.5 °C of the set point for 81.9% of the time, compared with only 30.5% under on-off control. The PI-based controller adapts well to cooling loads in all seasons, with good control precision, fast response, and small overshoots. The response time of PI control is only 10 min in a disturbance rejection test, much shorter than the 30 min under on-off control. Annually, an IEC with variable-speed fans consumes 50.0% less energy than one with on-off fans. Finally, an economic analysis shows that this technology is economically feasible only when the power of the primary air fan is larger than 1.75 kW.
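A discrete PI law of the kind evaluated here can be sketched as below (a generic textbook controller with output clamping and simple anti-windup; the gains are assumptions, not the paper's tuning):

```python
class PIController:
    """Discrete PI controller for a cooling actuator. The output is
    clamped to [u_min, u_max], and the integrator only accumulates while
    the output is unsaturated (simple anti-windup)."""

    def __init__(self, kp, ki, dt, u_min=0.0, u_max=1.0):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.u_min, self.u_max = u_min, u_max
        self.integral = 0.0

    def step(self, setpoint, measurement):
        error = measurement - setpoint      # positive error -> more cooling
        u = self.kp * error + self.ki * self.integral
        if self.u_min < u < self.u_max:     # freeze integrator when saturated
            self.integral += error * self.dt
        return min(self.u_max, max(self.u_min, u))
```

Driving a variable-speed fan with the returned fraction replaces the coarse on-off duty cycle, which is where the much tighter temperature band comes from.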
Article
Thermal energy storage can be utilized as an effective component in energy systems to maximize cost savings when time-of-use (TOU) pricing or real-time pricing (RTP) is in place. This study proposes a novel approach that can effectively predict the performance and determine the control strategy of thermal energy storage (i.e., ice storage) in a district cooling system. The proposed approach utilizes a neural network (NN) based model predictive control (MPC) strategy coupled with a genetic algorithm (GA) optimizer and examines the effectiveness of using an NN model for a district cooling system with ice storage. The NN offers relatively fast performance estimation of the district cooling system for given external inputs. To simulate the proposed MPC controller, a physics-based model of the district cooling system is first developed and validated to act as a virtual plant that communicates system states to the controller in real time. Next, the NN model of the plant is developed and trained over a cooling period so that the control strategy can be tested under both RTP and TOU pricing. The control schedule is optimized using the GA because of the on/off controls in the district cooling network. Finally, a thermal load prediction algorithm is integrated and tested under both perfect weather inputs and weather forecasts, considering a 1-hour discretization in the MPC scheme. Results indicate that, for the month of August, the optimal control scheme can effectively adapt to varying loads and prices, reducing the operating costs of the district cooling network by approximately 16% under TOU pricing and 13% under RTP.
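The GA layer can be sketched independently of the NN: below, a toy surrogate cost stands in for the trained network, and a small GA searches 24 hourly on/off ice-charging decisions. The tariff, charging power, requirement, and GA hyperparameters are all illustrative assumptions:

```python
import random

PRICES = [0.05] * 8 + [0.20] * 12 + [0.05] * 4   # assumed TOU tariff ($/kWh)
P_CHARGE = 500.0                                  # kW drawn while charging ice
E_REQUIRED = 3000.0                               # kWh to store for next day

def cost(schedule):
    """Surrogate plant: electricity cost of a 24-h on/off charging
    schedule, with a penalty when stored energy falls short."""
    energy = sum(schedule) * P_CHARGE
    bill = sum(b * P_CHARGE * p for b, p in zip(schedule, PRICES))
    return bill + 10.0 * max(0.0, E_REQUIRED - energy)

def ga_optimize(pop_size=40, generations=60, seed=1):
    """Minimal GA: truncation selection, one-point crossover, 1-bit
    mutation, elitist carry-over of the better half."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(24)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=cost)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, 24)            # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(24)] ^= 1         # one-bit mutation
            children.append(child)
        pop = parents + children
    return min(pop, key=cost)
```

In the paper's setup the `cost` function would instead query the trained NN plant model over the MPC horizon; the GA machinery stays the same.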
Article
Load forecasting problems have traditionally been addressed using various statistical methods, among which the autoregressive integrated moving average with exogenous inputs (ARIMAX) has gained the most attention as a classical time-series modeling method. Recently, the rapid development of deep learning techniques has made them promising alternatives to conventional data-driven approaches. While deep learning offers exceptional capability in handling complex non-linear relationships, model complexity and computational efficiency are of concern. A few papers have explored applying deep neural networks to forecast time-series load data, but only at the system level or for single-step building-level forecasting. This study therefore aims to fill the knowledge gap on deep learning-based techniques for day-ahead multi-step load forecasting in commercial buildings. Two classical deep neural network models, namely the recurrent neural network (RNN) and the convolutional neural network (CNN), are proposed and formulated in both recursive and direct multi-step manners. Their performance is compared with the seasonal ARIMAX model with regard to accuracy, computational efficiency, generalizability, and robustness. Among all of the investigated deep learning techniques, the gated 24-h CNN model, applied in a direct multi-step manner, performs best, improving forecasting accuracy by 22.6% compared with the seasonal ARIMAX model.
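The recursive/direct distinction is independent of the network architecture and can be shown with plain callables standing in for trained models (hypothetical stand-ins, not the paper's CNN/RNN):

```python
def recursive_forecast(step_model, history, horizon):
    """Recursive strategy: a single one-step model whose predictions are
    fed back as inputs for the next step (errors can compound)."""
    hist = list(history)
    out = []
    for _ in range(horizon):
        y = step_model(hist)
        out.append(y)
        hist.append(y)
    return out

def direct_forecast(horizon_models, history):
    """Direct strategy: one dedicated model per lead time, each
    predicting its horizon straight from the observed history."""
    return [m(history) for m in horizon_models]
```

The direct strategy trades extra training (one model per lead time) for immunity to the error feedback that degrades long recursive horizons.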
Article
Air-conditioning systems in commercial buildings are usually switched on before office hours to precool the buildings and create an acceptable working environment at the start of office hours in cooling seasons. However, because the cooling demand during the morning start period, particularly in hot seasons, is often much higher than the cooling supply capacity, the feedback control strategies in air-conditioning systems often fail to control this cooling process properly. The imbalanced cooling distribution and large differences in cooling-down speeds among spaces significantly extend the required precooling duration and cause over-speeding of water pumps and fans, leading to serious energy waste and high peak demand. An optimal control strategy is therefore developed to determine the number and schedule of operating chillers and, in particular, to achieve an optimal cooling distribution among individual spaces. Case studies show that the proposed control strategy can shorten the precooling time by about half an hour by equalizing cooling-down speeds among individual zones. The energy consumption of the air-conditioning system during the morning start period is also reduced by over 50%. In addition, the peak demand is reduced significantly, owing to the improved control of secondary pumps and fans.
Conference Paper
This study investigates whether feedforward neural networks with two hidden layers generalise better than those with one. In contrast to the existing literature, a method is proposed which allows these networks to be compared empirically on a hidden-node-by-hidden-node basis. This is applied to ten public domain function approximation datasets. Networks with two hidden layers were found to be better generalisers in nine of the ten cases, although the actual degree of improvement is case dependent. The proposed method can be used to rapidly determine whether it is worth considering two hidden layers for a given problem.
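One practical detail in such comparisons is matching network sizes: with the same total number of hidden nodes, a two-hidden-layer network has fewer trainable parameters. A small helper makes the count explicit:

```python
def n_weights(layer_sizes):
    """Total trainable parameters (weights + biases) of a fully
    connected feedforward network, given its layer sizes from input
    to output."""
    return sum((a + 1) * b for a, b in zip(layer_sizes, layer_sizes[1:]))
```

For example, a 10-8-1 network carries 97 parameters while a 10-4-4-1 network with the same eight hidden nodes carries only 69, which is one reason hidden-node-by-hidden-node comparisons need care.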
Article
In ultra-low-temperature district heating, the supply temperature is lower than that required to heat domestic hot water, and a heat pump is therefore often proposed to raise the temperature. This paper investigates how this heat pump can be utilized for price-based demand response to induce peak reductions and energy cost savings. A model predictive control strategy is proposed and evaluated through co-simulations in which a model predictive controller formulated in MATLAB is connected to an EnergyPlus hot water storage tank. It is demonstrated that the system is capable of reducing the district heating morning peak and the electric grid evening peak, as well as providing energy cost savings for the end-user, without compromising hygiene and comfort.
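For a short horizon and a binary heat pump, the MPC optimization can even be done by exhaustive enumeration. The toy tank dynamics, bounds, and tariff below are assumptions for illustration, not the paper's EnergyPlus model:

```python
from itertools import product

def mpc_schedule(t0, prices, draws, horizon,
                 t_min=50.0, t_max=65.0, heat=4.0, loss=1.0, p_el=3.0):
    """Exhaustive MPC over on/off heat-pump decisions for a hot water
    storage tank.  Toy plant: T[k+1] = T[k] - loss - draw[k] + heat*u[k].
    Returns the cheapest schedule that keeps the tank within bounds."""
    best, best_cost = None, float("inf")
    for u in product((0, 1), repeat=horizon):
        t, run_cost, feasible = t0, 0.0, True
        for k in range(horizon):
            t = t - loss - draws[k] + heat * u[k]
            if not (t_min <= t <= t_max):
                feasible = False
                break
            run_cost += prices[k] * p_el * u[k]
        if feasible and run_cost < best_cost:
            best, best_cost = list(u), run_cost
    return best, best_cost
```

Enumeration of the 2^horizon schedules is only viable for short horizons; the paper's MATLAB controller solves the same receding-horizon problem with a proper optimizer.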
Article
District heating and cooling (DHC) systems are attracting increased interest for their low-carbon potential. However, most DHC systems are not operating at the expected performance level. Optimizing and enhancing DHC networks to (a) reduce fossil fuel consumption, CO2 emissions, and heat losses across the network while (b) increasing return on investment are key challenges faced by decision makers in the fast-developing energy landscape. While the academic literature abounds with research based on field experiments, simulations, optimization strategies, algorithms, etc., it lacks a comprehensive review that addresses the multi-faceted dimensions of optimizing and enhancing DHC systems with a view to promoting the integration of smart grids, energy storage, and an increased share of renewable energy. The paper focuses on four areas: energy generation, energy distribution, heat substations, and terminal users, identifying state-of-the-art methods and solutions while paving the way for future research.
Article
The penetration of renewable energy sources (RES) in power systems is increasing around the world. However, the severe intermittency and variability of RES make operating reserve increasingly important for the electric power system to maintain the balance between supply and demand. Moreover, flexible loads, especially air conditioners (ACs), are growing so rapidly that they account for an increasingly large share of power consumption. With the development of information and communication technologies (ICT), ACs can be monitored and controlled remotely to provide operating reserve and respond actively when needed by power system operation. In this paper, a novel control strategy for an aggregation model of ACs, based on the thermal model of the room, is proposed. By resetting the temperature of each AC, its operating state is adjusted temporarily without affecting customer satisfaction. The operating characteristics of both an individual AC and the aggregation model of ACs are analysed. Furthermore, several indexes are put forward to evaluate operating reserve performance, including reserve capacity, response time, duration time, and ramp rate. The effectiveness of the proposed control strategy is illustrated in numerical studies.
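The core mechanism, a hysteresis thermostat whose upward setpoint reset switches some units off, can be sketched as follows (illustrative parameters; the paper's room thermal model and performance indexes are richer than this):

```python
def ac_state(temp, setpoint, band, prev_on):
    """Hysteresis thermostat: compressor on above setpoint + band/2,
    off below setpoint - band/2, otherwise unchanged."""
    if temp > setpoint + band / 2:
        return True
    if temp < setpoint - band / 2:
        return False
    return prev_on

def reserve_from_reset(temps, states, powers, new_set, band):
    """Instantaneous reserve released when every unit's setpoint is
    reset upward: the total rated power of units that switch off."""
    return sum(p for t, on, p in zip(temps, states, powers)
               if on and not ac_state(t, new_set, band, on))
```

Units whose room temperature already sits inside the new deadband keep running, which is why the aggregate reserve capacity depends on the population's temperature distribution, not just on the number of units.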
Article
In the current energy scenario, system design and operation strategies are paramount, especially for plants fed by renewable sources and/or whose production is strictly connected to the users' demand. Systems optimization must consider the possibility of energy storage and of conversion among different forms of energy. In this paper, a hybrid cogeneration system composed of a cogenerative internal combustion engine, a photovoltaic plant, a boiler, and a pump as turbine is investigated. Different energy storage options are included: a pack of batteries, a water reservoir, and a hot thermal storage. By applying the Particle Swarm Optimization method, the devices' sizes and hourly operation are simultaneously optimized. The minimization of the overall cost is the optimization goal, while the main constraint is the fulfilment of the user request (electricity, heat, and water). Results show that the cogenerative internal combustion engine supplies 87% of the electricity and 19% of the heat during winter, while during summer it supplies 89% of the electricity and all of the heat. The use of a pump as turbine reduces the battery discharge rate and the depth of discharge, with a consequent increase in battery lifetime.
Article
The increased demand associated with restoring service after an outage, known as cold load pick-up (CLPU), can be significantly higher than pre-outage levels, even exceeding the normal distribution feeder peak demand. These high levels of demand can delay restoration efforts and, in extreme cases, damage equipment. The negative impacts of CLPU can be mitigated with strategies that restore the feeder in sections, minimizing the load current. The challenge for utilities is to manage the current level on critical equipment while minimizing the time needed to restore service to all customers. Accurately modeling CLPU events is the first step in developing improved restoration strategies that minimize restoration times. This paper presents a new method for evaluating the magnitude and duration of the CLPU peak using multi-state load models, which allow a more accurate representation of the end-use loads present on residential distribution feeders.
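A common single-equation stand-in for such models is an exponential decay from the undiversified peak back to the diversified level after restoration. The tolerance and parameters below are illustrative assumptions; the paper's multi-state load models are more detailed:

```python
import math

def clpu_demand(t, d_div, d_peak, tau):
    """Exponential CLPU model: demand starts at the undiversified peak
    d_peak at restoration (t = 0) and decays to the diversified level
    d_div with time constant tau."""
    return d_div + (d_peak - d_div) * math.exp(-t / tau)

def clpu_duration(d_div, d_peak, tau, tol=0.05):
    """Time after restoration until demand is within a fraction tol of
    the diversified level (solve clpu_demand(t) = (1 + tol) * d_div)."""
    return tau * math.log((d_peak - d_div) / (tol * d_div))
```

Sectionalized restoration works by keeping the sum of several such decaying transients, staggered in time, below the equipment current limit.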
Article
Traditionally, the planning of operating reserve has been done in terms of capacity and average constant ramping requirements, whereas the newly emerging concept of power system flexibility emphasizes resource maneuverability and accurately capturing the intra-hourly variability and uncertainty resulting from significant penetration of renewable power generation. However, the traditional reserve paradigm is deemed to impede the notion of flexibility, and there is not yet a proper way of defining power system flexibility. To that end, we rethink the fundamental meaning of reserve with respect to the emerging concept of flexibility and present a new flexibility modeling framework. We characterize flexibility provision and flexibility requirements via dynamical envelopes that can reflect the higher-order dynamics of power system resources and those of variability and uncertainty. We assert that flexibility adequacy is directly related to how well the aggregate flexibility envelope formed by flexibility resources encloses the flexibility requirement envelope and its dynamics over operational planning horizons. An optimal flexibility planning problem with envelopes is formulated, followed by examples involving unit commitment and economic dispatch.
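In its simplest interval form, the adequacy notion reduces to an envelope-containment check per period (ignoring the higher-order ramp dynamics the paper also models):

```python
def flexibility_adequate(provision, requirement):
    """Check that the aggregate flexibility envelope encloses the
    requirement envelope at every interval of the planning horizon.
    Each envelope is a list of (lower, upper) power bounds per interval."""
    return all(pl <= rl and ru <= pu
               for (pl, pu), (rl, ru) in zip(provision, requirement))
```

The paper's envelopes additionally carry ramp-rate and higher-order dynamics, so real adequacy checks compare trajectories, not just per-interval bounds.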
Article
The theory of reinforcement learning provides a normative account, deeply rooted in psychological and neuroscientific perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Remarkably, humans and other animals seem to solve this problem through a harmonious combination of reinforcement learning and hierarchical sensory processing systems, the former evidenced by a wealth of neural data revealing notable parallels between the phasic signals emitted by dopaminergic neurons and temporal difference reinforcement learning algorithms. While reinforcement learning agents have achieved some successes in a variety of domains, their applicability has previously been limited to domains in which useful features can be handcrafted, or to domains with fully observed, low-dimensional state spaces. Here we use recent advances in training deep neural networks to develop a novel artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning. We tested this agent on the challenging domain of classic Atari 2600 games. We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters. This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
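The temporal-difference core that the deep Q-network approximates with a neural network can be sketched in tabular form on a toy episodic MDP (the environment and hyperparameters below are illustrative assumptions, not the Atari setup):

```python
import random

def q_learning(n_states, n_actions, step_fn, episodes=2000,
               alpha=0.2, gamma=0.9, eps=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration.
    step_fn(s, a) -> (next_state, reward, done) defines the MDP;
    episodes always start in state 0."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            a = (rng.randrange(n_actions) if rng.random() < eps
                 else max(range(n_actions), key=lambda i: q[s][i]))
            s2, r, done = step_fn(s, a)
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])   # TD update
            s = s2
    return q
```

DQN replaces the table `q` with a convolutional network and stabilizes training with experience replay and a separate target network.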
Article
The smart grid is conceived of as an electric grid that can deliver electricity in a controlled, smart way from points of generation to active consumers. Demand response (DR), by promoting the interaction and responsiveness of the customers, may offer a broad range of potential benefits on system operation and expansion and on market efficiency. Moreover, by improving the reliability of the power system and, in the long term, lowering peak demand, DR reduces overall plant and capital cost investments and postpones the need for network upgrades. In this paper a survey of DR potentials and benefits in smart grids is presented. Innovative enabling technologies and systems, such as smart meters, energy controllers, communication systems, decisive to facilitate the coordination of efficiency and DR in a smart grid, are described and discussed with reference to real industrial case studies and research projects.
Article
This report presents a unified approach to the study of constrained Markov decision processes with a countable state space and unbounded costs. We consider a single controller with several objectives; it is desirable to design a controller that minimizes one cost objective subject to inequality constraints on the other cost objectives. The objectives we study are both the expected average cost and the expected total cost (of which the discounted cost is a special case). We provide two frameworks: the case where costs are bounded below, as well as the contracting framework. We characterize the set of achievable expected occupation measures as well as performance vectors. This allows us to reduce the original dynamic control problem to an infinite linear program. We present a Lagrangian approach that enables us to obtain a sensitivity analysis. In particular, we obtain asymptotic results for the constrained control problem: convergence of both the value and the policy.
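The linear programming reduction can be made concrete in a two-action, single-constraint toy: the decision variable is the occupation probability of one action, and the optimum sits either at a vertex or on the constraint boundary. This is an illustrative finite special case assuming feasibility, not the report's countable-state construction:

```python
def constrained_two_action_policy(costs, risks, budget):
    """Two-action constrained problem as a tiny LP over occupation
    probabilities: choose the probability p of taking action 0 to
    minimize expected cost  p*c0 + (1-p)*c1  subject to the expected
    constraint-cost  p*d0 + (1-p)*d1 <= budget  (assumed feasible)."""
    c0, c1 = costs
    d0, d1 = risks
    p = 1.0 if c0 <= c1 else 0.0              # unconstrained vertex optimum
    if p * d0 + (1 - p) * d1 > budget:        # move to the constraint boundary
        p = (budget - d1) / (d0 - d1)
    return min(1.0, max(0.0, p))
```

A mixed solution also illustrates a hallmark of constrained MDPs: the optimal policy may need to randomize, unlike in the unconstrained case.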
G. Chen, B. Yan, H. Zhang, D. Zhang, and Y. Song, "Time-efficient strategic power dispatch for district cooling systems considering the spatial-temporal evolution of cooling load uncertainties," to appear in CSEE J. Power Energy Syst., 2021. DOI: 10.17775/CSEEJPES.2020.06800.
PJM Interconnection, L.L.C., "PJM Manual 11: Energy & Ancillary Services Market Operations," Revision 115, pp. 83-98, Jun. 1, 2021. [Online]. Available: https://www.pjm.com/-/media/documents/manuals/m11.ashx.