# Roland Malhame

Polytechnique Montréal · Department of Electrical Engineering

Ph.D. Georgia Tech

## About

Publications: 183

Reads: 9,942

Citations: 5,579

Education

September 1978 - March 1983

## Publications

Flexibility from demand-side resources is increasingly required in modern power systems to maintain the dynamic balance between demand and supply. This flexibility comes from elastic users managing controllable loads. In this context, controlling Electric Space Heaters (ESHs) is of particular interest because it can leverage building inner thermal...

We develop a strategy, with concepts from Mean Field Games (MFG), to coordinate the charging of a large population of battery electric vehicles (BEVs) in a parking lot powered by solar energy and managed by an aggregator. A yearly parking fee is charged for each BEV irrespective of the amount of energy extracted. The goal is to share the energy ava...

We consider a large group of consumers who decide whether or not to buy a durable good offered by a firm. Without previous experience with the product, consumers rely on the ratings of past purchasers to evaluate the product goodwill and make optimal decisions. The consumers have heterogeneous intrinsic rating behaviors and preferences. For example...

In this paper, we propose an approach for coupling a power network dispatch model, which is part of a long-term multi-energy model, with Wardrop or Mean-Field-Game (MFG) equilibrium models that represent the demand response of a large population of small “prosumers” connected at the various nodes of the electricity network. In a deterministic setti...

Transactive Energy (TE) has brought exciting opportunities for all stakeholders in energy markets by enabling management decentralization. This new paradigm empowers demand-side agents to play a more active role through coordinating, cooperating, and negotiating with other agents. Nevertheless, most of these agents are not accustomed to processing market si...

This paper presents an algorithm for the identification of parameters for a stochastic hot water end-use process that drives a homogeneous population of thermostatically controlled electric water heaters (EWH). Usually, only metered interval consumption data (kWh) is collected and the hot water end-use process is unobservable to utility and aggrega...

Electric thermal loads, such as those for space heaters, water heaters and air conditioners, due to their association with energy storage, are deferrable. Thus, they can become an effective tool to compensate for the mismatches between power generation and power demand induced by renewable sources. Load reduction and load increase may appear to be...

While the storage properties and the anticipation potential of many classes of power system loads (such as thermal loads) can be exploited to mitigate renewable sources variability, the challenge to do so in an optimal and coherent manner is significant. This is due to the sheer number and dynamic diversity of the loads that can be involved in any...

Large-size populations consisting of a continuum of identical and non-cooperative agents with stochastic dynamics are useful in modeling various biological and engineered systems. This paper addresses the stochastic control problem of designing optimal state-feedback controllers which guarantee the closed-loop stability of the stationary density of...

In order to deal with issues caused by the increasing penetration of renewable resources in power systems, this paper proposes a novel distributed frequency control algorithm for each generating unit and controllable load in a transmission network to replace the conventional automatic generation control (AGC). The targets of the proposed control al...

Efficient participation of prosumers in power system management depends on the quality of information they can obtain. Prosumers' actions can be performed by automated agents that are operating in time-changing environments. Therefore, it is essential for them to deal with data stream problems in order to make reliable decisions based on the most...

In this paper, we investigate a class of nonzero-sum dynamic stochastic games, where players have linear dynamics and quadratic cost functions. The players are coupled in both dynamics and cost through a linear regression (weighted average) as well as a quadratic regression (weighted covariance matrix) of the states and actions, where the linear re...

This paper studies a dynamic collective choice model in the presence of an advertiser, where a large number of consumers are choosing between two alternatives. Their choices are influenced by the group’s aggregate choice and an advertising effect. The latter is produced by an advertiser making investments to convince as many consumers as possible t...

Mean field game (MFG) theory studies the existence of Nash equilibria, together with the individual strategies which generate them, in games involving a large number of asymptotically negligible agents modeled by controlled stochastic dynamical systems. This is achieved by exploiting the relationship between the finite and corresponding infinite li...

Pressure on ancillary reserves, i.e., frequency preserving, in power systems has significantly mounted due to the recent generalized increase of the fraction of (highly fluctuating) wind and solar energy sources in grid generation mixes. The energy storage associated with millions of individual customer electric thermal (heating-cooling) loads is con...

We consider a class of dynamic collective choice models with social interactions, whereby a large number of non-uniform agents have to individually settle on one of multiple discrete alternative choices, with the relevance of their would-be choices continuously impacted by noise and the unfolding group behavior. This class of problems is modeled he...

Networked control systems must use communication links between control hubs and distributed components, possibly both to observe component states, and to send control commands. We consider a model of a CDMA based communication and control system where the power sent from the components to the base station, acting as the control hub, is proportional...

This paper deals with a family of dynamic game models that represent schematically the interaction between groups of countries in achieving the necessary limitation of carbon atmospheric emissions in order to control climate change. We start from a situation where m coalitions of countries exist and behave as m players in a game of sharing a global...

Load control has traditionally been viewed as a useful tool for peak load reduction in power systems. With the increasing renewable energy penetration in the grid, load control is also considered as a tool to exploit the storage in dispersed devices naturally present in power systems such as electric water heaters to mitigate generation variability...

Convergence properties of time inhomogeneous Markov chain based discrete and continuous time linear consensus algorithms are analyzed. Provided that a so-called infinite jet-flow property is satisfied by the coupling chain, necessary conditions for both consensus and multiple consensus are established. A recent extension by Sonin of the classical K...

We consider a multi-agent system with linear stochastic individual dynamics, and individual linear quadratic ergodic cost functions. The agents partially observe their own states. Their cost functions and initial statistics are a priori independent but they are coupled through an interference term (the mean of all agent states), entering each of th...

We consider a dynamic collective choice problem where a large number of players are cooperatively choosing between multiple destinations while being influenced by the behavior of the group. For example, in a robotic swarm exploring a new environment, a robot might have to choose between multiple sites to visit, but at the same time it should remain...

Studies of traffic dynamics rely either on macroscopic models considering the traffic as a fluid, or on microscopic models of drivers' behavior. The connection between the microscopic and macroscopic scales is often done via empirical relationships such as the fundamental diagram for macroscopic models, relating traffic flow or average velocity a...

Inspired by successful biological collective decision mechanisms such as honey bees searching for a new colony or the collective navigation of fish schools, we consider a mean field games (MFG)-like scenario where a large number of agents have to make a choice among a set of different potential target destinations. Each individual both influences a...

In this paper, we deal with the limiting behavior of linear consensus systems in both continuous and discrete time. A geometric framework featuring the state transition matrix of the system is introduced to: (i) generalize/rediscover the existing results in the literature about convergence properties of distributed averaging algorithms, (ii) interp...

As part of a system wide optimization problem in smart grids it is required that the mean temperature of a massive number of space heating devices associated with energy storage follow a precomputed target temperature trajectory. The classical control approach is to evaluate the required signal centrally for each device and to send that signal. How...

We consider distributed estimation of a class of large population multi-agent systems where the agents have linear stochastic dynamics and are coupled via their partial observations. The measurement interference model is assumed to depend only on the empirical mean of the agents' states. In addition, a structural assumption is made on the agents' contr...

Inspired by successful biological collective decision mechanisms such as honey bees searching for a new colony or the collective navigation of fish schools, we consider a mean field games (MFG) scenario producing decentralized homing decisions in large multi-agent systems. For our setup, we show that given an initial distribution of the agents, man...

In this paper, we consider linear continuous-time models of evolution of opinions in large-scale dynamical networks. Our focus is on dynamical networks that are defined on general exogenously given time-varying graphs, where the nodes of the underlying graph model individuals (or agents) with first-order linear opinion dynamics. In such a network,...

The aim of this paper is to establish the optimality of a hedging-point control policy in a multistate Markovian failure-prone manufacturing system with a risk-averse criterion that is defined as the conditional value-at-risk (CVaR) of the steady-state instantaneous running cost, where the system is subject to a constant single-product demand rate....

As part of a system wide optimization problem in smart grids it is required that the mean temperature of a massive number of space heating devices associated with energy storage follows a computed target temperature trajectory. The classical control approach is to compute the required signal centrally for each device and to send that signal. Howeve...

We consider a network of evolving opinions. It includes multiple individuals with first-order opinion dynamics defined in continuous time and evolving based on a general exogenously defined time-varying underlying graph. In such a network, for an arbitrary fixed initial time, a subset of individuals forms an eminence grise coalition, abbreviated as...

This technical note presents a continuum approach to a non-Gaussian initial mean consensus problem via Mean Field (MF) stochastic control theory. In this problem formulation: (i) each agent has simple stochastic dynamics with inputs directly controlling its state's rate of change and (ii) each agent seeks to minimize by continuous state feedback it...

An unreliable single-part-type transfer line with fixed inter-machine buffer sizes is considered. In general, imperfect machines operating on imperfect raw material, or partially processed raw material, will result in the production of a mix of conforming and non-conforming parts. The problem of optimal joint assignment of buffer sizes and inspecti...

The optimality of a hedging control policy in a Markovian failure-prone manufacturing system subject to a constant rate of demand for parts is established for a long-run risk-averse criterion, which is the conditional value-at-risk of the steady-state instantaneous running cost. This extends the known classical result of optimality of hedging polic...

The demand response problem is investigated where it is required that the mean temperature of a massive number of electric devices associated with energy storage (e.g. electric water heaters, electric space heaters, etc.) follow a target temperature trajectory computed as part of a system wide optimization problem in smart grids. The classical cont...

We study a class of mean field control problems for a large population of linear Gaussian quadratic agents, coupled via their partial observations. The measurement interference model is assumed to depend only on the empirical mean of the agents' states. In addition, a structural assumption is made on the agents' controls which are constrained to be lin...

Unconditional consensus is the property of a consensus algorithm for multiple agents to produce consensus irrespective of the particular time or state at which the agent states are initialized. Under a weak condition, so-called balanced asymmetry, on the sequence (A_n) of stochastic matrices in the agents' state update algorithm, it is shown that (...

In this paper, we investigate a class of large population stochastic multi-agent systems where the agents have linear stochastic dynamics and are coupled via their measurement equations. Using the state aggregation technique, we propose a distributed estimation and control algorithm that combines the Kalman filtering for state estimation and the li...

Convergence properties of time inhomogeneous Markov chain based discrete and continuous time linear consensus algorithms are analyzed. Provided that a so-called infinite jet flow property is satisfied by the underlying chains, necessary conditions for both consensus and multiple consensus are established. A recent extension by Sonin of the classic...

The purpose of this paper is to synthesize initial mean consensus behavior of a set of agents from the fundamental optimization principles of i) stochastic dynamic games, and ii) optimal control. In the stochastic dynamic game model each agent seeks to minimize its individual quadratic discounted cost function involving the mean of the states of al...

This paper considers inventory models of (Q, s) type with Q the order-quantity and s the order point. In general, an optimal choice of control parameters (Q and s) will depend on the characteristics of replenishment lead time and the demand process, as well as holding and shortage costs. Although many studies have treated lead time as constant, foc...

We study large population leader-follower stochastic multi-agent systems where the agents have linear stochastic dynamics and are coupled via their quadratic cost functions. The cost of each leader is based on a trade-off between moving toward a certain reference trajectory which is unknown to the followers and staying near their own centroid. On t...

We study a class of linear-quadratic-Gaussian (LQG) control problems with N decision makers, where the basic objective is to minimize a social cost as the sum of N individual costs containing mean field coupling. The exact socially optimal solution (determining a particular Pareto optimum) requires centralized information for each agent an...

In a multi-agent system, unconditional (multiple) consensus is the property of reaching (multiple) consensus irrespective of the instant and values at which states are initialized. For linear algorithms, occurrence of unconditional (multiple) consensus turns out to be equivalent to (class-) ergodicity of the transition chain (A_n). For a wide cl...

Unconditional consensus is the property of a consensus algorithm for multiple agents to produce consensus irrespective of the particular time or state at which the agent states are initialized. Under a weak condition, so-called balanced asymmetry, on the sequence (A_n) of stochastic matrices in the agents' state update algorithm, it is shown that...

Multi-agent consensus algorithms with update steps based on so-called balanced asymmetric chains are analyzed. For such algorithms it is shown that (i) the set of accumulation points of states is finite, (ii) the asymptotic unconditional occurrence of single consensus or multiple consensuses is directly related to the property of absolute infinite...

Bielecki and Kumar (1988) established the optimality of a critical inventory policy (hedging policy) in a Markovian failure-prone manufacturing system subject to a constant rate of demand for parts, and for a long-term average cost structure including parts storage and demand backlog costs. Under the same conditions, and if instead of minimizing th...

This paper presents a continuum approach to the initial mean consensus problem via Mean Field (MF) stochastic control theory. In this problem formulation: (i) each agent has simple stochastic dynamics with inputs directly controlling its state's rate of change, and (ii) each agent seeks to minimize its individual cost function involving a mean fiel...

The purpose of this paper is to study an evolution (i.e., forward in time) mean field equation system of a dynamic game initial mean consensus model. In this model: (i) each agent has simple stochastic dynamics with inputs directly controlling its state's rate of change, and (ii) each agent seeks to minimize its individual long run average cost fun...

The limiting behavior of standard linear nonnegative combination based state update algorithms in continuous time, or convex combination based state update algorithms in discrete time, for multi-agent systems is considered. Under fairly weak conditions, it is shown that each agent's opinion converges to an individual limit as time goes to infinity....

Purpose – This paper seeks to address the production control problem of a failure-prone manufacturing system producing a random fraction of defective items. Design/methodology/approach – A fluid model with perfectly mixed good and defective parts has been proposed. This approach combines the descriptive capacities of continuous/discrete event simu...

This study proposes a new technique for the extraction of static non-linearities in wideband transmitters by launching a simple sinusoidal wave within the bandwidth of the system. The transmitter is first characterised under a two-carrier wideband code division multiple access (WCDMA) drive signal and modelled using a memory polynomial model. Two t...

A general convex combination based model in discrete time for the evolution of multi-agent systems is considered. Sufficient conditions under which an individual limit for each agent's state exists are subsequently derived. Moreover, it is shown that under stronger conditions, those individual limits are equal, i.e., consensus occurs. The relatio...

This paper considers joint production control and product specifications decision making in a failure prone manufacturing system. This is with the knowledge that tight process specifications, while leading to a product of more reliable quality and higher market value, are at the same time associated with higher levels of non-conforming parts, a hig...

For a given choice of the maximum allowable total storage parameter, the performance of constant work-in-process (CONWIP) disciplines in unreliable transfer lines subjected to a constant rate of demand for parts is characterized via a tractable approximate mathematical model. For an (n−1)-machine CONWIP loop, the model consists of n multi-state ma...

We consider a leader-follower dynamic game model for large population systems where the agents have linear stochastic dynamics and are coupled via their quadratic cost functions. The cost of each leader is based on a tradeoff between moving towards a certain reference trajectory signal and staying near a weighted average of the members’ states. Fol...

We study large population stochastic dynamic games where each agent assigns individually determined coupling strengths (with possible spatial interpretation) to the states of other agents in its performance function. The mean field methodology yields a set of decentralized controls which generates an ε-Nash equilibrium for the population of size N. A...

We study mean field LQG problems with cost coupling. The cost of each agent is a convex combination of its own cost and the social cost, where the weight assigned to the latter reflects the willingness of the agent in question to contribute to the social objective. We use the Nash Certainty Equivalence (NCE) approach to construct a set of decentral...

The paper addresses the optimal production control problem for an unreliable manufacturing system that produces items that can be regarded as conforming or non-conforming. Two cases are considered: (i) the ratio between conforming and non-conforming parts is fixed at all times, (ii) the production of defective parts is respectively initiated and...

We study the optimality properties of maximum likelihood ratio estimation based Mean Field (Nash Certainty Equivalence) control laws in a leader-follower stochastic collective dynamics model. In this formulation the leaders track a convex combination of their centroid together with a certain reference trajectory which is unknown to the followers, a...

In this paper we study a controlled flocking model, where the state of each agent consists of both its position and its controlled velocity, by use of Mean Field (MF) stochastic control framework. We formulate large population stochastic flocking problem as a dynamic game problem in which the agents have similar dynamics and are coupled via their n...

In this paper we study a controlled Leader-Follower (L-F) flocking model (where the state of each agent consists of both its position and its controlled velocity) by use of the Mean Field (MF) Stochastic Control framework. We formulate the large population stochastic L-F flocking problem as a dynamic game problem. In this model, the agents have sim...