Yutong Nie’s research while affiliated with Zhejiang University and other places


Publications (2)


Fig. 1. Two phases of a residential DR event.
Fig. 2. Regret comparison between the OLS algorithm and the TS-based online learning algorithm without contexts.
Online Residential Demand Response via Contextual Multi-Armed Bandits
Article · Full-text available

June 2020 · 51 Reads · 30 Citations · IEEE Control Systems Letters

Yutong Nie · Na Li

Residential loads have great potential to enhance the efficiency and reliability of electricity systems via demand response (DR) programs. One major challenge in residential DR is how to learn and handle unknown and uncertain customer behaviors. In this paper, we consider the residential DR problem in which the load service entity (LSE) aims to select an optimal subset of customers to optimize a DR performance objective, such as maximizing the expected load reduction under a financial budget or minimizing the expected squared deviation from a target reduction level. To learn the uncertain customer behaviors influenced by various time-varying environmental factors, we formulate residential DR as a contextual multi-armed bandit (MAB) problem and develop an online learning and selection (OLS) algorithm based on Thompson sampling to solve it. This algorithm takes contextual information into consideration and is applicable to complicated DR settings. Numerical simulations demonstrate the learning effectiveness of the proposed algorithm.
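The abstract describes Thompson sampling over a contextual bandit where the LSE selects a subset of customers each DR event. A minimal sketch of that idea, assuming a per-customer Bayesian linear reward model and a simple top-k selection under a fixed budget (the variable names, prior, and noise model here are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

n_customers, d = 5, 3   # customers (arms) and context dimension
budget = 2              # select at most `budget` customers per DR event
sigma2 = 0.25           # assumed observation-noise variance

# Per-customer Bayesian linear model: reduction_i ~ theta_i . context + noise,
# with a Gaussian N(0, I) prior on each theta_i.
B = [np.eye(d) for _ in range(n_customers)]    # posterior precision matrices
f = [np.zeros(d) for _ in range(n_customers)]  # precision-weighted reward sums

true_theta = rng.normal(size=(n_customers, d))  # unknown ground truth (for simulation)

for t in range(200):
    x = rng.normal(size=d)  # shared context (e.g. weather, time of day)

    # Thompson sampling: draw theta_i from each posterior, score each customer
    scores = []
    for i in range(n_customers):
        mean = np.linalg.solve(B[i], f[i])
        cov = sigma2 * np.linalg.inv(B[i])
        theta_sample = rng.multivariate_normal(mean, cov)
        scores.append(theta_sample @ x)

    # Select the `budget` customers with the highest sampled reductions
    chosen = np.argsort(scores)[-budget:]

    # Observe noisy reductions from selected customers, update their posteriors
    for i in chosen:
        r = true_theta[i] @ x + rng.normal(scale=np.sqrt(sigma2))
        B[i] += np.outer(x, x)
        f[i] += r * x
```

Sampling a full parameter vector per arm (rather than using a point estimate) is what drives exploration: customers whose response is still uncertain occasionally draw high scores and get selected, so their posteriors keep tightening.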


Fig. 1. Two phases of a residential DR event.
Fig. 2. Regret comparison between the OLS algorithm and the UCB-based online learning DR algorithm without contexts.
Fig. 3. Cumulative regret under different δ and σ. (Left: fix δ = 0.5, tune σ from 0.1 to 0.4. Right: fix σ = 0.4, tune δ from 0.3 to 0.9.)
Online Residential Demand Response via Contextual Multi-Armed Bandits

March 2020 · 81 Reads

Residential load demands have great potential to enhance the efficiency and reliability of power system operation through demand response (DR) programs. This paper studies strategies for selecting the right customers for residential DR from the perspective of load service entities (LSEs). One of the main challenges in implementing residential DR is that customer responses to incentives are uncertain and unknown, influenced by various personal and environmental factors. To address this challenge, this paper employs the contextual multi-armed bandit (CMAB) method to model the optimal customer selection problem under uncertainty. Based on the Thompson sampling framework, an online learning and decision-making algorithm is proposed to learn customer behaviors and select appropriate customers for load reduction. This algorithm takes contextual information into consideration and is applicable to complicated DR settings. Numerical simulations demonstrate the efficiency and learning effectiveness of the proposed algorithm.

Citations (1)


... Multi-armed bandits [1] is a prominent online decision-making framework for algorithms that make sequential decisions under uncertainty. It has found extensive applications across a broad spectrum of domains [2], such as dynamic pricing [3], demand response [4], [5], clinical trials [6], and recommendation systems [7]. Moreover, various extensions of multi-armed bandits have been developed to further enhance its modeling capabilities. ...

Reference:

Contextual Restless Multi-Armed Bandits with Application to Demand Response Decision-Making
Online Residential Demand Response via Contextual Multi-Armed Bandits · IEEE Control Systems Letters