Conference Paper

Model-Free Real-Time Autonomous Energy Management for a Residential Multi-Carrier Energy System: A Deep Reinforcement Learning Approach



The problem of real-time autonomous energy management is an application area that is receiving unprecedented attention from consumers, governments, academia and industry. This paper showcases the first application of deep reinforcement learning (DRL) to real-time autonomous energy management for a multi-carrier energy system. The proposed approach is tailored to align with the nature of the energy management problem by posing it in multi-dimensional continuous state and action spaces, in order to coordinate power flows between different energy devices, and to adequately capture the synergistic effect of couplings between different energy carriers. This fundamental contribution is a significant step forward from earlier approaches that only sought to control the power output of a single device and neglected the demand-supply coupling of different energy carriers. Case studies on a real-world scenario demonstrate that the proposed method significantly outperforms existing DRL methods as well as model-based control approaches in achieving the lowest energy cost and yielding a representation of energy management policies that adapt to system uncertainties.
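The paper's exact formulation is not reproduced on this page. As a rough illustration of what a multi-dimensional continuous action coordinating several devices might look like, the sketch below clips a three-device action vector to its bounds and settles the resulting electricity/gas balance at given prices. All device names, bounds, and efficiencies here are hypothetical assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical device set for a residential multi-carrier system:
# action[0] -> battery charge/discharge power (kW, negative = discharge)
# action[1] -> CHP unit electrical output (kW)
# action[2] -> electric heater input (kW)
ACTION_LOW  = np.array([-3.0, 0.0, 0.0])
ACTION_HIGH = np.array([ 3.0, 2.0, 4.0])

def apply_action(raw_action, elec_price, gas_price, elec_demand, heat_demand):
    """Clip a continuous multi-dimensional action to device limits and
    settle the electricity/gas balance at current prices (illustrative)."""
    a = np.clip(raw_action, ACTION_LOW, ACTION_HIGH)
    batt, chp_e, heater = a
    chp_heat = 1.3 * chp_e            # assumed CHP heat-to-power ratio
    grid_import = elec_demand + batt + heater - chp_e
    gas_use = chp_e / 0.35            # assumed CHP electrical efficiency
    unmet_heat = max(0.0, heat_demand - chp_heat - 0.9 * heater)
    cost = elec_price * max(grid_import, 0.0) + gas_price * gas_use
    return a, cost, unmet_heat
```

Clipping out-of-range actions (rather than rejecting them) is a common way to keep a continuous-action agent's exploration within physical device limits.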


... As indicated before, if the action and state spaces are defined as multi-dimensional continuous intervals, the Q table becomes prohibitively large, making conventional tabular updates impractical. In such situations, as in [27], DQN is an appropriate option, since it approximates the Q values with a function rather than updating a table entry by entry. In [28], DQN and Double DQN are used for smart charging of EVs. ...
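As a simplified stand-in for the idea the excerpt describes — approximating Q values with a parameterized function instead of a table — the sketch below performs semi-gradient Q-learning with a linear approximator. A real DQN would use a neural network, replay buffer, and target network; all sizes and rates here are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FEATURES, N_ACTIONS = 8, 3
W = rng.normal(scale=0.1, size=(N_ACTIONS, N_FEATURES))  # one linear Q per action

def q_values(state):
    """Q(s, a) for every action a, from the shared weight matrix."""
    return W @ state

def td_update(state, action, reward, next_state, alpha=0.05, gamma=0.99):
    """Semi-gradient Q-learning step on the approximator's weights —
    no per-entry table is stored or updated."""
    target = reward + gamma * np.max(q_values(next_state))
    td_error = target - q_values(state)[action]
    W[action] += alpha * td_error * state
    return td_error
```

Repeating the update on a fixed transition drives the TD error toward zero, which is the behaviour the tabular update would also exhibit — but here the memory cost is fixed by the feature dimension, not the state-space size.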
In the future, the load demand due to charging of large numbers of electric vehicles (EVs) will be so high that existing networks in some regions may be unable to accommodate it. Radical changes modernizing the grid will therefore be required to overcome the technical and economic problems, in addition to bureaucratic issues. Amendments to the regulations on electrical energy and new tariff structures can be considered within this scope. Smart charging of EVs is rarely tackled with reinforcement learning (RL), even though it is one of the most effective methods for solving such decision-making problems. Most studies on this topic endeavor to estimate the state and action spaces and to tune the penalty coefficients within the RL models developed. In this paper, we solve the EV charging problem using expected SARSA with a novel rewarding strategy, and we propose a new approach to determine the state and action spaces. The efficacy of the proposed method is demonstrated on the problem of charging a single EV, comparing it with a number of alternatives involving Q-learning and constant-power charging approaches.
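The expected-SARSA update the abstract refers to can be sketched in a few lines: the target averages Q(s', ·) under the epsilon-greedy policy at the next state rather than sampling the next action. The tabular setup and hyperparameters below are illustrative, not the authors' configuration.

```python
import numpy as np

def expected_sarsa_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95, eps=0.1):
    """One expected-SARSA step on a tabular Q: the bootstrap target is the
    expectation of Q(s_next, .) under the epsilon-greedy policy."""
    n_actions = Q.shape[1]
    greedy = np.argmax(Q[s_next])
    probs = np.full(n_actions, eps / n_actions)  # exploration mass
    probs[greedy] += 1.0 - eps                   # greedy mass
    target = r + gamma * np.dot(probs, Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q
```

Averaging over the policy removes the sampling variance that plain SARSA incurs from its randomly drawn next action, which is one reason expected SARSA is attractive for noisy reward signals such as time-varying tariffs.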
... As another example, energy management systems will benefit from the possibility of using reliable real-time information exchanged by software agents as well. Evidence exists that autonomous energy management systems for a smart home equipped with sensors can make use of the various energy consumption and production data to train agents using deep reinforcement learning (Ye et al., 2020). As a result, the agent gradually acquires the most promising energy management strategies by learning from repeated interactions through the process of trial and error. ...
Distributed ledger technology (DLT) enables a wide range of innovative industrial use cases and business models, such as through programmable payments and the seamless exchange of assets, goods, and services. To exploit the full potential of a DLT-based European economy, it is crucial to integrate the euro into DLT networks. In this paper, we propose a framework for developing payment solutions for a DLT-based European economy. To this end, we decompose the digital payments value chain into three pillars: (1) contract execution system, (2) digital payment infrastructure, and (3) monetary unit. Based on this framework, we systematically compare account- and token-based payment solutions, including a bridge solution, e-money tokens, synthetic central bank digital currencies (CBDCs), and a central bank digital currency (CBDC). Taking into account current circumstances, we conclude that no individual payment solution will be sufficient to address all emerging use cases. Instead, a broad array of payment solutions will emerge and co-exist. These solutions will apply to a variety of different use cases and will be launched at different points in time.
Residential buildings are large consumers of energy. They contribute significantly to the demand placed on the grid, particularly during hours of peak demand. Demand‐side management is crucial to reducing this demand placed on the grid and increasing renewable utilisation. This research study presents a multi‐objective tunable deep reinforcement learning algorithm for demand‐side management of household appliances. The proposed tunable Deep Q‐Network (DQN) algorithm learns a single policy that accounts for different preferences for multiple objectives present when scheduling appliances. These include electricity cost, peak demand, and punctuality. The tunable Deep Q‐Network algorithm is compared to two rule‐based approaches for appliance scheduling. When comparing the 1‐month simulation results for the tunable DQN with an electricity cost rule‐based benchmark method, the tunable DQN agent provides a statistically significant improvement of 30%, 18.2%, and 37.3% for the cost, peak power, and punctuality objectives. Moreover, the tunable Deep Q‐Network can produce a range of appliance scheduling policies for different objective preferences without requiring any computationally intensive retraining. This is the key advantage of the proposed tunable Deep Q‐Network algorithm for appliance scheduling.
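One common way to make a single policy "tunable" across objectives, as the abstract above describes, is to scalarise the per-objective rewards with a preference vector (which can also be fed to the network as an extra input so no retraining is needed per trade-off). The helper below is a minimal sketch of that idea, not the authors' implementation; objective ordering and weights are hypothetical.

```python
import numpy as np

def scalarised_reward(objectives, weights):
    """Combine per-objective rewards (e.g. cost, peak power, punctuality)
    with a tunable preference vector. Normalising the weights keeps the
    reward scale comparable across different preference settings."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return float(np.dot(w, np.asarray(objectives, dtype=float)))
```

At deployment, changing `weights` shifts the agent's trade-off (e.g. weighting punctuality over cost) without touching the learned parameters — provided the weights were varied during training.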
MES (multi-energy systems), whereby electricity, heat, cooling, fuels, transport, and so on optimally interact with each other at various levels (for instance, within a district, city, or region), represent an important opportunity to increase technical, economic, and environmental performance relative to "classical" energy systems whose sectors are treated "separately" or "independently". This performance improvement can take place at both the operational and the planning stage. While such systems, and in particular systems with distributed generation of multiple energy vectors (DMG (distributed multi-generation)), can be a key option to decarbonize the energy sector, the approaches needed to model them and the tools required to analyze them are often highly complex. Likewise, it is not straightforward to identify performance metrics capable of properly capturing the costs and benefits relating to various types of MES according to different criteria. The aim of this invited paper is thus to provide the reader with a comprehensive and critical overview of the latest models and assessment techniques currently available to analyze MES, and in particular DMG systems, including for instance concepts such as energy hubs, microgrids, and VPPs (virtual power plants), as well as various approaches and criteria for energy, environmental, and techno-economic assessment.
In this paper, we propose and analyze a class of actor-critic algorithms. These are two-time-scale algorithms in which the critic uses temporal difference (TD) learning with a linearly parameterized approximation architecture, and the actor is updated in an approximate gradient direction based on information provided by the critic. We show that the features for the critic should ideally span a subspace prescribed by the choice of parameterization of the actor. We study actor-critic algorithms for Markov decision processes with general state and action spaces. We state and prove two results regarding their convergence.
To realize the synergy between plug-in electric vehicles (PEVs) and wind power, this paper presents a hierarchical stochastic control scheme for the coordination of PEV charging and wind power in a microgrid. This scheme consists of two layers. Based on the non-Gaussian wind power predictive distributions, an upper-layer stochastic predictive controller coordinates the operation of the PEV aggregator and the wind turbine. The computed power references are sent to the lower-layer PEV and wind controllers for execution. The PEV controller optimally allots the aggregated charging power to individual PEVs. The wind controller regulates the power output of the wind turbine. In this way, a power balance between supply and demand in the microgrid is achieved. The main feature of this scheme is that it incorporates the non-Gaussian uncertainty and partial dispatchability of wind power, as well as the PEV uncertainty. Numerical results show the effectiveness of the proposed scheme.
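The lower-layer task of allotting an aggregator-level power reference to individual vehicles can be illustrated with a simple proportional rule; the paper's actual optimisation is not reproduced here, and this split by remaining energy need is only a hypothetical stand-in.

```python
def allot_charging_power(p_ref, needs):
    """Split an aggregator-level power reference p_ref (kW) across PEVs in
    proportion to each vehicle's remaining energy need, never exceeding the
    total need. Illustrative only — not the paper's optimal allocation."""
    total_need = sum(needs)
    if total_need == 0:
        return [0.0] * len(needs)
    scale = min(1.0, p_ref / total_need)
    return [scale * n for n in needs]
```

When the reference exceeds the fleet's total need, each vehicle simply receives its full need; otherwise every vehicle is scaled back by the same factor.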
Model-Free Real-Time EV Charging Scheduling Based on Deep Reinforcement Learning
[Wan et al., 2019] Zhiqiang Wan, Hepeng Li, Haibo He, and Danil Prokhorov. Model-Free Real-Time EV Charging Scheduling Based on Deep Reinforcement Learning. IEEE Trans. on Smart Grid, 10(5):5246-5257, 2019.
Optimizing Home Energy Management and Electric Vehicle Charging with Reinforcement Learning
[Wei et al., 2017] Tianshu Wei, Yanzhi Wang, and Qi Zhu. Deep Reinforcement Learning for Building HVAC Control. In Proc. DAC, pages 1-7, Austin, USA, 2017.
[Wen et al., 2015] Zheng Wen, Daniel O'Neill, and Hamid Maei. Optimal Demand Response Using Device-Based Reinforcement Learning. IEEE Trans. on Smart Grid, 6(5):2312-2324, 2015.
[Wu et al., 2018] Di Wu, Guillaume Rabusseau, Vincent Francois-Lavet, Doina Precup, and Benoit Boulet. Optimizing Home Energy Management and Electric Vehicle Charging with Reinforcement Learning. In Proc. ALA, pages 1-8, Stockholm, Sweden, 2018.
Factoring flexible demand non-convexities in electricity markets
[Ye et al., 2014] Yujian Ye, Dimitrios Papadaskalopoulos, and Goran Strbac. Factoring flexible demand non-convexities in electricity markets. IEEE Trans. on Power Syst., 30(4):2090-2099, 2014.
Multi-period and multi-spatial equilibrium analysis in imperfect electricity markets: A novel multi-agent deep reinforcement learning approach
[Ye et al., 2019] Yujian Ye, Dawei Qiu, Jing Li, and Goran Strbac. Multi-period and multi-spatial equilibrium analysis in imperfect electricity markets: A novel multi-agent deep reinforcement learning approach. IEEE Access, 7:130515-130529, 2019.
[Ye et al., 2020] Yujian Ye, Dawei Qiu, Mingyang Sun, Dimitrios Papadaskalopoulos, and Goran Strbac. Deep reinforcement learning for strategic bidding in electricity markets. IEEE Trans. on Smart Grid, 11(2):1343-1355, 2020.
[Zhang et al., 2017] Huaguang Zhang, Yushuai Li, David Wenzhong Gao, and Jianguo Zhou. Distributed optimal energy management for energy internet. IEEE Trans. Ind. Informat., 13(6):3081-3097, 2017.