## About

- Publications: 431
- Reads: 75,415

**How we measure 'reads'**

A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text.

- Citations: 16,616

Introduction

Physics informed machine learning for electric power systems

Additional affiliations

March 2008 - April 2008

March 2007 - April 2007

**Université d'Évry-Val-d'Essonne**

Position

- Invited Professor

January 2003 - present

## Publications

We model the risk posed by a malicious cyber-attacker seeking to induce grid insecurity by means of a load redistribution attack, while explicitly acknowledging that such an actor would plausibly base its decision strategy on imperfect information. More specifically, we introduce a novel formulation for the cyber-attacker’s decision-making problem...

Random forests have been widely used for their ability to provide so-called importance measures, which give insight at a global (per dataset) level on the relevance of input variables to predict a certain output. On the other hand, methods based on Shapley values have been introduced to refine the analysis of feature relevance in tree-based models...

In operation planning, probabilistic reliability assessment consists in evaluating, for various candidate planning decisions, the induced probability of meeting a reliability target and the expected operating cost over a certain future time period. In this paper, we propose to exploit Monte-Carlo simulation and machine learning to predict operation...

This paper considers the integration of grid flexibility in the chance-constrained power system operation planning framework. The particular challenge addressed comes from the discrete nature of the respective controls, such as breaker positions defining the topology of the network. We consider a template short-term operation planning problem state...

Preprint available at: http://iandobson.ece.iastate.edu/PAPERS/zhouBayesRatesPMAPS20.pdf

Despite the important role transmission line outages play in power system reliability analysis, it remains a challenge to estimate individual line outage rates accurately enough from limited data. Recent work using a Bayesian hierarchical model shows how to com...

Transmission line outage rates are fundamental to power system reliability analysis. Line outages are infrequent, occurring only about once a year, so outage data are limited. We propose a Bayesian hierarchical model that leverages line dependencies to better estimate outage rates of individual transmission lines from limited outage data. The Bayes...

This paper presents a probabilistic methodology for assessing power system resilience, motivated by the extreme weather storm experienced in Iceland in December 2019. The methodology is built on the basis of models and data available to the Icelandic transmission system operator in anticipation of the said storm. We study resilience in terms of the...

This article reviews recent work applying machine learning (ML) techniques in the context of energy systems' reliability assessment and control. We showcase both the progress achieved to date and the important directions for further research, while providing an adequate background in the fields of reliability management and of ML. Th...

The deployment of new technologies, e.g., renewable generation and electric vehicles, is rapidly transforming power networks by blurring the previously distinct spatio-temporal scales that many traditional approaches rely on for designing, analyzing and operating power grids. Other energy systems, such as natural gas systems, are undergoing similar...

In many applications of supervised learning, multiple classification or regression outputs have to be predicted jointly. We consider several extensions of gradient boosting to address such problems. We first propose a straightforward adaptation of gradient boosting exploiting multiple output regression trees as base learners. We then argue that thi...

This paper studies an extended formulation of the Security Constrained Optimal Power Flow (SCOPF) problem, which explicitly takes into account the probabilities of contingency events and of potential failures in the operation of post-contingency corrective controls. To manage such threats, we express the requirement that the probability of maintain...

Outage scheduling aims at defining, over a horizon of several months to years, when different components needing maintenance should be taken out of operation. Its objective is to minimize operation-cost expectation while satisfying reliability-related constraints. We propose a distributed scenario-based chance-constrained optimization formulation f...

Dealing with datasets of very high dimension is a major challenge in machine learning. In this paper, we consider the problem of feature selection in applications where the memory is not large enough to contain all features. In this setting, we propose a novel tree-based feature selection approach that builds a sequence of randomized trees on small...

We present in this paper a new generic approach to variable branching in branch and bound for mixed-integer linear problems. Our approach consists in imitating the decisions taken by a good branching strategy, namely strong branching, with a fast approximation. This approximated function is created by a machine learning technique from a set of obser...

We devise the Unit Commitment Nearest Neighbor (UCNN) algorithm to be used as a proxy for quickly approximating outcomes of short-term decisions, to make tractable hierarchical long-term assessment and planning for large power systems. Experimental results on an updated version of IEEE-RTS96 show high accuracy measured on operational cost, achieved...

This paper introduces a probabilistic reliability management approach and describes a pilot test planned by the Icelandic transmission system operator, Landsnet, in early 2017, as part of the EU GARPUR project. The pilot test will assess the viability of the approach and criteria proposed by GARPUR, in the context of real-time system operation. The...

This paper develops a probabilistic approach for power system reliability management in real-time operation where risk is a product of i) the potential occurrence of contingencies, ii) the possible failure of corrective (i.e., post-contingency) control and, iii) the socio-economic impact of service interruptions to end-users. Stressing the spatiote...

Feature selection is often more complicated than identifying a single subset of input variables that would together explain the output. There may be interactions that depend on contextual information, i.e., variables that prove to be relevant only in some specific circumstances. In this setting, the contribution of this paper is to...

This paper investigates the stakes of introducing probabilistic approaches for the management of power system security. In real-time operation, the aim is to arbitrate in a rational way between preventive and corrective control, while taking into account i) the prior probabilities of contingencies, ii) the possible failure modes of corrective con...

Reliability of electrical transmission systems is presently managed by applying the deterministic N-1 criterion, or some variant thereof. This means that transmission systems are designed with at least one level of redundancy, regardless of the cost of doing so, or the severity of the risks they mitigate. In an operational context, the N-1 criterio...

This paper considers the general problem of image classification without using any prior knowledge about image classes. We study variants of a method based on supervised learning whose common steps are the extraction of random subwindows described by raw pixel intensity values and the use of ensembles of extremely randomized trees to directly classi...

Motivation:
Collaborative analysis of massive imaging datasets is essential to enable scientific discoveries.
Results:
We developed Cytomine to foster active and distributed collaboration of multidisciplinary teams for large-scale image-based studies. It uses web development methodologies and machine learning in order to readily organize, explor...

Background:
The purpose of the MaxT algorithm is to provide a significance test algorithm that controls the family-wise error rate (FWER) during simultaneous hypothesis testing. However, the requirements in terms of computing time and memory of this procedure are proportional to the number of investigated hypotheses. The memory issue has been solv...

This paper addresses the problem of computing optimal structured treatment interruption strategies for HIV infected patients. We show that reinforcement learning may be useful to extract such strategies directly from clinical data, without the need of an accurate mathematical model of HIV infection dynamics. To support our claims, we report simulat...

Teleost fish such as zebrafish (Danio rerio) are increasingly used for physiological, genetic and developmental studies. Our understanding of the physiological consequences of altered gravity in an entire organism is still incomplete. We used altered gravity and drug treatment experiments to evaluate their effects specifically on bone formation and...

Fine operating rules for security control and an automatic system for their online discovery were developed to adapt to the development of smart grids. The automatic system uses the real-time system state to determine critical flowgates, and then a continuation power flow-based security analysis is used to compute the initial transfer capability of...

Zebrafish is increasingly used to assess biological properties of chemical substances and thus is becoming a specific tool for toxicological and pharmacological studies. The effects of chemical substances on embryo survival and development are generally evaluated manually through microscopic observation by an expert and documented by several typica...

This paper considers a trajectory-based approach to determine control signals superimposed on those of existing controllers so as to enhance the damping of electromechanical oscillations. This approach is framed as a discrete-time, multi-step optimization problem which can be solved by model-based and/or by learning-based methods. This paper propos...

This paper focuses on reducing generator dispatch cost by means of transmission line switching. The problem is formulated as a mixed-integer nonlinear program (MINLP) optimal power flow (OPF). A scalable heuristic algorithm is proposed to break down the complexity of the problem due to the huge combinatorial space. The algorithm aims at providing...

Power system planning and operation offer multitudinous opportunities for optimization methods. In practice, these problems are generally large-scale, non-linear, subject to uncertainties, and combine both continuous and discrete variables. In recent years, a number of complementary theoretical advances in addressing such problems have been ob...

A general approach to real-time transient stability control is described, yielding various complementary techniques: pure preventive, open-loop emergency, and closed-loop emergency controls. Recent progress in terms of a global transient stability-constrained optimal power flow is presented, yielding a scalable nonlinear programming formulation...

The paper investigates the feasibility of applying Model Predictive Control (MPC) as a viable strategy to damp wide-area electromechanical oscillations in large-scale power systems. First, a fully centralized MPC scheme is considered, and its performance is evaluated in ideal conditions and then by considering state estimation errors and com...

This paper proves the practicality of an iterative algorithm for solving realistic large-scale SCOPF problems. This algorithm is based on the combination of a contingency filtering scheme, used to identify the binding contingencies at the optimum, and a network compression method, used to reduce the complexity of the post-contingency models include...

This paper deals with multi-period active power loss minimization. We formulate this problem as a mixed-integer nonlinear programming (MINLP) problem, including constraints that specifically limit the number of switching actions between two successive anticipated system states. We solve this problem using the Mixed Integer Hybrid Differential Evolu...

Networks are ubiquitous in biology and computational approaches have been
largely investigated for their inference. In particular, supervised machine
learning methods can be used to complete a partially known network by
integrating various measurements. Two main supervised frameworks have been
proposed: the local approach, which trains a separate m...

We adapt the idea of random projections applied to the output space, so as to
enhance tree-based ensemble methods in the context of multi-label
classification. We show how learning time complexity can be reduced without
affecting computational complexity and accuracy of predictions. We also show
that random output space projections may be used in o...

The primary goal of genome-wide association studies (GWAS) is to discover variants that could lead, in isolation or in combination, to a particular trait or disease. Standard approaches to GWAS, however, are usually based on univariate hypothesis tests and therefore can account neither for correlations due to linkage disequilibrium nor for combinat...

We present a novel methodology combining web-based software development practices, machine learning, and spatial databases for computer-aided quantification of regions of interest (ROIs) in large-scale imaging data. We describe our main methodological choices, and then illustrate the benefits of the approach (workload reduction, improved precision,...

Direct policy search (DPS) and look-ahead tree (LT) policies are two widely
used classes of techniques to produce high performance policies for sequential
decision-making problems. To make DPS approaches work well, one crucial issue
is to select an appropriate space of parameterized policies with respect to the
targeted problem. A fundamental issue...

Disordered regions, i.e., regions of proteins that do not adopt a stable three-dimensional structure, have been shown to play various and critical roles in many biological processes. Predicting and understanding their formation is therefore a key sub-problem of protein structure and function inference. A wide range of machine learning approaches ha...

Despite growing interest and practical use in various scientific areas, variable importances derived from tree-based ensemble methods are not well understood from a theoretical point of view. In this work we characterize the Mean Decrease Impurity (MDI) variable importances as measured by an ensemble of totally randomized trees in asymptotic sam...

One of the pressing open problems of computational systems biology is the elucidation of the topology of gene regulatory networks (GRNs). In an attempt to solve this problem, the idea of systems genetics is to exploit the natural variations that exist between the DNA sequences of related individuals and that can represent the randomized and multifa...

This paper deals with day-ahead static security assessment with respect to a postulated set of
contingencies while taking into account uncertainties about the next day system conditions.
We propose a heuristic approach to compute the worst-case under operation uncertainty for a
contingency with respect to overloads. We formulate this problem as a n...

This paper deals with day-ahead security management with respect to a postulated
set of contingencies while taking into account uncertainties about the next day
generation/load scenario. In order to help the system operator in decision making
under uncertainty we aim at ranking these contingencies into four clusters according
to the type of control...

In this paper, we consider the batch mode reinforcement learning setting, where the central problem is to learn from a sample of trajectories a policy that satisfies or optimizes a performance criterion. We focus on the continuous state space case for which usual resolution schemes rely on function approximators either to represent the underlying c...

Background
Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis sc...

Disulfide bridges strongly constrain the native structure of many proteins and predicting their formation is therefore a key sub-problem of protein structure and function inference. Most recently proposed approaches for this prediction problem adopt the following pipeline: first they enrich the primary sequence with structural annotations, second t...

This paper reports extensive results obtained with the Interior-Point Method
(IPM) for nonlinear programs (NLPs) stemming from large-scale and severely constrained
classical Optimal Power Flow (OPF) and Security-Constrained Optimal Power Flow (SCOPF)
problems. The paper discusses transparently the problems encountered such as convergence
reliabilit...

We present a unified framework involving the extraction of random subwindows within images and the induction of ensembles of extremely randomized trees. We discuss the specialization of this framework for solving several general problems in computer vision, ranging from image classification and segmentation to content-based image retrieval and inte...

The problem of learning Markov equivalence classes of Bayesian network
structures may be solved by searching for the maximum of a scoring metric in a
space of these classes. This paper deals with the definition and analysis of
one such search space. We use a theoretically motivated neighbourhood, the
inclusion boundary, and represent equivalence cl...

This paper deals with day-ahead power systems security planning under uncertainties, by posing an optimization problem over a set of power injection scenarios that could show up the next day and modeling the next day's real-time control strategies aiming at ensuring security with respect to contingencies by a combination of preventive and correctiv...

In this paper, we address the problem of computing interpretable solutions to reinforcement learning (RL) problems. To this end, we propose a search algorithm over a space of simple closed-form formulas that are used to rank actions. We formalize the search for a high-performance policy as a multi-armed bandit problem where each arm corresponds to...