Sebastien Gros
Norwegian University of Science and Technology | NTNU · Department of Engineering Cybernetics

Professor

About

214 Publications
48,905 Reads
2,391 Citations
Introduction
My research focuses on Model Predictive and Optimal Control, Reinforcement Learning (RL), risk-based planning and control, and decision-making. My applications include wind energy, smart grids and smart buildings, electric mobility, traffic control, autonomous driving, autonomous ships, and aircraft control. I have recently developed an experimental smart house, which we use to investigate algorithms for optimal energy management under more realistic conditions than is typical.
Additional affiliations
January 2019 - present
Norwegian University of Science and Technology
Position: Professor
April 2013 - December 2018
Chalmers University of Technology
Position: Associate Professor
April 2011 - April 2013
KU Leuven
Position: Postdoc
Education
September 1996 - December 2007

Publications (214)
Article
The combination of learning methods with Model Predictive Control (MPC) has attracted significant attention in the recent literature. The hope of this combination is to reduce the reliance of MPC schemes on accurate models, and to tap into fast-developing machine learning and reinforcement learning tools to exploit the growing amoun...
Article
Dissipativity theory for economic Model Predictive Control (MPC) is central to discussing the stability of policies resulting from minimizing economic stage costs. In its current form, the dissipativity theory for economic MPC applies to problems based on deterministic dynamics or to very specific classes of stochastic problems, and does not readily ex...
Article
Full-text available
The closed-loop stability of an optimal policy provided by an Economic Nonlinear Model Predictive Control (ENMPC) scheme requires the existence of a storage function satisfying dissipativity conditions. Unfortunately, finding such a storage function is difficult in general. In contrast, tracking NMPC scheme uses a stage cost that is lower-bounded b...
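For context, the dissipativity condition referred to here takes, in its standard strict form from the economic-MPC literature (notation illustrative, not taken from the paper), the following shape:

```latex
% Strict dissipativity for economic MPC (standard textbook form; notation
% illustrative): a storage function \lambda certifies stability of the
% optimal steady state (x_s, u_s) if, for all feasible (x, u),
\[
  \lambda\big(f(x,u)\big) - \lambda(x)
  \;\le\; \ell(x,u) - \ell(x_s,u_s) - \rho\big(\lVert x - x_s \rVert\big),
\]
% where f is the dynamics, \ell the economic stage cost, and \rho a
% class-K function. Tracking MPC is the special case where \ell itself is
% lower-bounded by such a \rho, so that \lambda \equiv 0 works.
```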
Preprint
Full-text available
This paper aims to provide a Dynamic Programming (DP) approach to solve the Mission-Wide Chance-Constrained Optimal Control Problems (MWCC-OCP). The mission-wide chance constraint guarantees that the probability that the entire state trajectory lies within a constraint/safe region is higher than a prescribed level, and is different from the stage-w...
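As a rough illustration of how a mission-wide chance constraint can be handled by dynamic programming on a finite problem, the sketch below computes, by backward recursion, the maximal probability that an entire trajectory stays in a safe set; the toy MDP, its dynamics, and all numbers are invented for illustration and are not the paper's algorithm.

```python
# Toy sketch: backward DP for a mission-wide safety probability on a small
# finite MDP. P_k(x) = max_u sum_x' T[u][x, x'] * P_{k+1}(x') for safe x,
# and 0 otherwise; P is the probability the WHOLE remaining trajectory is safe.
import numpy as np

n_states, horizon = 5, 10
safe = np.array([0, 1, 1, 1, 0], dtype=bool)   # states 1..3 are safe

def transition(bias):
    """Random walk with a controllable drift; rows sum to 1."""
    T = np.zeros((n_states, n_states))
    for x in range(n_states):
        T[x, max(x - 1, 0)] += 0.2 - bias
        T[x, x] += 0.6
        T[x, min(x + 1, n_states - 1)] += 0.2 + bias
    return T

T = [transition(-0.1), transition(+0.1)]       # action 0: push left, 1: push right

P = safe.astype(float)                          # terminal condition
for _ in range(horizon):
    Q = np.stack([Tu @ P for Tu in T])          # shape (actions, states)
    P = np.where(safe, Q.max(axis=0), 0.0)      # maximize over actions; 0 if unsafe

alpha = 0.9                                     # prescribed mission-wide level
print("mission-wide safety prob per start state:", np.round(P, 3))
print("feasible start states:", np.where(P >= alpha)[0])
```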
Article
In this paper, we consider solving discounted Markov Decision Processes (MDPs) under the constraint that the resulting policy is stabilizing. In practice, MDPs are solved based on some form of policy approximation. We leverage recent results proposing to use Model Predictive Control (MPC) as a structured approximator in the context of Reinforce...
Conference Paper
Currently, continuous glucose monitoring sensors are used in the artificial pancreas to monitor blood glucose levels. However, insulin and glucagon concentrations in different parts of the body cannot be measured in real-time, and determining body glucagon sensitivity is not feasible. Estimating these states provides more information about the curr...
Article
Economic Model Predictive Control has recently gained popularity due to its ability to directly optimize a given performance criterion, while enforcing constraint satisfaction for nonlinear systems. Recent research has developed both numerical algorithms and stability analysis for the undiscounted case. The introduction of a discount factor in the...
Preprint
Full-text available
In this paper, we propose a learning-based Model Predictive Control (MPC) approach for polytopic Linear Parameter-Varying (LPV) systems with inexact scheduling parameters (as exogenous signals with inexact bounds), where the Linear Time Invariant (LTI) models (vertices) captured by combinations of the scheduling parameters become inaccurate. We fir...
Conference Paper
Full-text available
Reinforcement learning methods typically use Deep Neural Networks to approximate the value functions and policies underlying a Markov Decision Process. Unfortunately, DNN-based RL suffers from a lack of explainability of the resulting policy. In this paper, we instead approximate the policy and value functions using an optimization problem, taking...
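As a toy illustration of the idea of using an optimization problem, rather than a DNN, as the function approximator, the sketch below parameterizes a policy by a small finite-horizon optimal control problem whose cost weight is tuned from observed closed-loop cost; the system, horizon, and crude finite-difference update are my own illustrative choices, not the paper's method.

```python
# Minimal sketch (assumptions throughout): a parametric finite-horizon OCP is
# the policy; the learnable parameter theta keeps a physical meaning (an input
# penalty), which is the explainability argument in a nutshell.
import numpy as np
from scipy.optimize import minimize

A, B = np.array([[1.0, 0.1], [0.0, 1.0]]), np.array([[0.0], [0.1]])
N = 15                                          # OCP horizon

def mpc_policy(x0, theta):
    """Solve min_u sum ||x_k||^2 + theta*u_k^2 ; return the first input."""
    def cost(u):
        x, J = x0.copy(), 0.0
        for uk in u:
            J += x @ x + theta * uk**2
            x = A @ x + B.flatten() * uk
        return J
    res = minimize(cost, np.zeros(N), method="L-BFGS-B",
                   bounds=[(-1.0, 1.0)] * N)    # input constraints
    return res.x[0]

def rollout_cost(theta, steps=30):
    """True closed-loop cost under the OCP policy (the RL objective)."""
    x, J = np.array([1.0, 0.0]), 0.0
    for _ in range(steps):
        u = mpc_policy(x, theta)
        J += x @ x + 0.01 * u**2                # real stage cost != OCP cost
        x = A @ x + B.flatten() * u
    return J

theta, eps, lr = 1.0, 1e-2, 0.5
for it in range(5):                             # crude finite-difference RL step
    g = (rollout_cost(theta + eps) - rollout_cost(theta - eps)) / (2 * eps)
    theta = max(theta - lr * g, 1e-3)
    print(f"iter {it}: theta = {theta:.3f}")
```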
Preprint
Full-text available
Reinforcement learning methods typically use Deep Neural Networks to approximate the value functions and policies underlying a Markov Decision Process. Unfortunately, DNN-based RL suffers from a lack of explainability of the resulting policy. In this paper, we instead approximate the policy and value functions using an optimization problem, taking...
Preprint
Battery cycle life prediction using early degradation data has many potential applications throughout the battery product life cycle. Various data-driven methods have been proposed for point prediction of battery cycle life with minimum knowledge of the battery degradation mechanisms. However, management of batteries at end-of-life with lower econo...
Conference Paper
Full-text available
In this paper, we propose a learning-based Model Predictive Control (MPC) approach for polytopic Linear Parameter-Varying (LPV) systems with inexact scheduling parameters (as exogenous signals with inexact bounds), where the Linear Time Invariant (LTI) models (vertices) captured by combinations of the scheduling parameters become inaccurate. We fir...
Preprint
Full-text available
This paper discusses the functional stability of closed-loop Markov Chains under optimal policies resulting from a discounted optimality criterion, forming Markov Decision Processes (MDPs). We investigate the stability of MDPs in the sense of probability measures (densities) underlying the state distributions and extend the dissipativity theory of...
Preprint
Full-text available
This paper presents a model-free approximation for the Hessian of the performance of deterministic policies to use in the context of Reinforcement Learning based on Quasi-Newton steps in the policy parameters. We show that the approximate Hessian converges to the exact Hessian at the optimal policy, and allows for a superlinear convergence in the l...
Article
We present a reinforcement learning-based (RL) model predictive control (MPC) method for trajectory tracking of surface vessels. The proposed method uses an MPC controller in order to perform both trajectory tracking and control allocation in real-time, while simultaneously learning to optimize the closed-loop performance by using RL and system ide...
Article
Full-text available
Airborne wind energy (AWE) is a new power generation technology that harvests wind energy at high altitudes using tethered wings. The potentially higher energy yield, combined with expected lower costs compared to traditional wind turbines (WTs), motivates interest in further developing this technology. However, commercial systems are currently una...
Article
In this paper, we propose a learning-based Model Predictive Control (MPC) approach for polytopic Linear Parameter-Varying (LPV) systems with inexact scheduling parameters (as exogenous signals with inexact bounds), where the Linear Time Invariant (LTI) models (vertices) captured by combinations of the scheduling parameters become inaccurate. We fir...
Article
Reinforcement learning methods typically use Deep Neural Networks to approximate the value functions and policies underlying a Markov Decision Process. Unfortunately, DNN-based RL suffers from a lack of explainability of the resulting policy. In this paper, we instead approximate the policy and value functions using an optimization problem, taking...
Article
In this article, we consider the optimal coordination of automated vehicles at intersections under fixed crossing orders. We formulate the problem using direct optimal control and exploit the structure to construct a semidistributed primal-dual interior-point algorithm to solve it by parallelizing most of the computations. In contrast to standard...
Conference Paper
Full-text available
In this work, we propose a Model Predictive Control (MPC)-based Reinforcement Learning (RL) method for Autonomous Surface Vehicles (ASVs). The objective is to find a policy that optimizes the closed-loop performance of a simplified freight mission, including collision-free path following, autonomous docking, and a skillful transition betwe...
Conference Paper
Full-text available
The cost of power distribution infrastructure is driven by the peak power encountered in the system. Distribution network operators therefore consider billing consumers behind a common transformer as a function of their peak demand, leaving it to the consumers to manage their collective costs. This management problem is, however, not t...
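The flavor of this management problem can be sketched as a small convex program (toy data, hypothetical variable names, not the paper's formulation): consumers schedule their flexible load so as to minimize the collective peak that drives the transformer bill.

```python
# Toy peak-shaving sketch: consumers behind a common transformer shift
# flexible energy across the day to flatten the collective peak.
import cvxpy as cp
import numpy as np

T, n = 24, 3                                   # hours, consumers
rng = np.random.default_rng(1)
base = rng.uniform(0.5, 2.0, size=(n, T))      # inflexible demand per consumer
flex_total = np.array([4.0, 6.0, 5.0])         # flexible energy each must consume

shift = cp.Variable((n, T), nonneg=True)       # flexible consumption schedule
peak = cp.Variable()                           # collective peak at the transformer
cons = [cp.sum(shift, axis=1) == flex_total,   # deliver all flexible energy
        cp.sum(base + shift, axis=0) <= peak]  # peak dominates every hour
cp.Problem(cp.Minimize(peak), cons).solve()

naive = base.sum(axis=0)
naive[:4] += flex_total.sum() / 4              # naive: dump flexible load early
print("optimized collective peak:", round(peak.value, 3))
print("naive collective peak:    ", round(naive.max(), 3))
```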
Preprint
Full-text available
In this paper, we consider the optimal coordination of automated vehicles at intersections under fixed crossing orders. We formulate the problem using direct optimal control and exploit the structure to construct a semi-distributed primal-dual interior-point algorithm to solve it by parallelizing most of the computations. In contrast to standard...
Preprint
Full-text available
The aim of this paper is to propose a high-performance control approach for trajectory tracking of Autonomous Underwater Vehicles (AUVs). However, the controller performance can be affected by unknown perturbations, including model uncertainties and external time-varying disturbances in an undersea environment. To address this problem, a Backste...
Preprint
Model predictive control (MPC) is increasingly being considered for control of fast systems and embedded applications. However, MPC poses significant challenges for such systems. Its high computational complexity results in high power consumption from the control algorithm, which could account for a significant share of the energy resources...
Preprint
Full-text available
This paper is concerned with solving chance-constrained finite-horizon optimal control problems, with a particular focus on the recursive feasibility issue of stochastic model predictive control (SMPC) in terms of mission-wide probability of safety (MWPS). MWPS assesses the probability that the entire state trajectory lies within the constraint set...
Conference Paper
Full-text available
In this paper, we present the use of Model Predictive Control (MPC) based on Reinforcement Learning (RL) to find the optimal policy for a multi-agent battery storage system. A time-varying prediction of the power price and production-demand uncertainty are considered. We focus on optimizing an economic objective cost while avoiding very low or very...
Preprint
Full-text available
The cost of power distribution infrastructure is driven by the peak power encountered in the system. Distribution network operators therefore consider billing consumers behind a common transformer as a function of their peak demand, leaving it to the consumers to manage their collective costs. This management problem is, however, not t...
Article
Full-text available
The main contribution of this article is a novel method for planning globally optimal trajectories for dynamical systems subject to polygonal constraints. The proposed method is a hybrid trajectory planning approach, which combines graph search, i.e., a discrete roadmap method, with convex optimization, i.e., a complete path method. Contrary to pas...
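A stripped-down sketch of the two-stage hybrid idea follows (a grid world stands in for polygonal constraints; all names and parameters are illustrative, not the paper's method): a discrete graph search supplies a coarse collision-free path, which a convex program then smooths.

```python
# Rough sketch of the two-stage hybrid planner: Dijkstra on an occupancy grid
# (the discrete roadmap stage), then convex smoothing of the waypoints
# (the complete-path stage).
import heapq
import numpy as np
import cvxpy as cp

grid = np.zeros((10, 10), dtype=bool)
grid[3:8, 4] = True                            # an obstacle wall
start, goal = (0, 0), (9, 9)

def dijkstra(grid, start, goal):
    """4-connected shortest path over the free cells of an occupancy grid."""
    dist, prev, pq = {start: 0.0}, {}, [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            break
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            v = (u[0] + dx, u[1] + dy)
            if 0 <= v[0] < grid.shape[0] and 0 <= v[1] < grid.shape[1] \
                    and not grid[v] and d + 1 < dist.get(v, np.inf):
                dist[v], prev[v] = d + 1, u
                heapq.heappush(pq, (d + 1, v))
    path, u = [goal], goal
    while u != start:                          # walk back to the start
        u = prev[u]
        path.append(u)
    return np.array(path[::-1], dtype=float)

wp = dijkstra(grid, start, goal)               # stage 1: discrete roadmap path
p = cp.Variable(wp.shape)                      # stage 2: convex smoothing
curvature = cp.sum_squares(p[2:] - 2 * p[1:-1] + p[:-2])
cons = [p[0] == wp[0], p[-1] == wp[-1],
        cp.abs(p - wp) <= 0.45]                # stay near the collision-free cells
cp.Problem(cp.Minimize(curvature), cons).solve()
print("smoothed path:\n", np.round(p.value, 2))
```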
Article
Full-text available
The problems of bus bunching mitigation and the energy management of groups of vehicles have traditionally been treated separately in the literature and been formulated in two different frameworks. The present work bridges this gap by formulating the optimal control problem of the bus line eco-driving and regularity control as a smooth, multi-objec...
Conference Paper
Full-text available
In the Economic Nonlinear Model Predictive Control (ENMPC) context, closed-loop stability relates to the existence of a storage function satisfying a dissipation inequality. Finding the storage function is, in general, challenging for nonlinear dynamics and costs, and has recently attracted attention. Q-Learning is a well-known Reinforcement Learning (RL)...
Conference Paper
Full-text available
Tube Model Predictive Control (TMPC) guarantees that a set of prescribed constraints is satisfied for all possible realizations of a bounded disturbance acting on the controlled system. In this paper, we compare two popular TMPC schemes, highlight the many similarities and few differences and close a theoretical gap by proving the existence of a un...
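For readers unfamiliar with the construction, a minimal tube-MPC sketch follows (a generic variant with invented numbers, not either of the two schemes compared in the paper): a nominal trajectory is planned under tightened constraints, and the ancillary feedback u = v + K(x - z) keeps the disturbed state inside a tube around it.

```python
# Minimal tube-MPC sketch: nominal plan z, v under tightened constraints,
# ancillary feedback rejecting the bounded disturbance w.
import numpy as np
import cvxpy as cp

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
K = np.array([[-5.0, -3.0]])                   # assumed stabilizing tube feedback
N, x_max, u_max, margin = 20, 2.0, 1.0, 0.3    # margin: assumed tube bound

z = cp.Variable((2, N + 1))                    # nominal states
v = cp.Variable((1, N))                        # nominal inputs
cost, cons = 0, [z[:, 0] == np.array([1.5, 0.0])]
for k in range(N):
    cons += [z[:, k + 1] == A @ z[:, k] + B @ v[:, k],
             cp.abs(z[:, k + 1]) <= x_max - margin,   # tightened state bounds
             cp.abs(v[:, k]) <= u_max - margin]       # tightened input bounds
    cost += cp.sum_squares(z[:, k]) + 0.1 * cp.sum_squares(v[:, k])
cp.Problem(cp.Minimize(cost), cons).solve()

rng = np.random.default_rng(0)
x = np.array([1.6, 0.05])                      # real state starts off the nominal
for k in range(N):
    u = v.value[:, k] + K @ (x - z.value[:, k])       # ancillary feedback
    w = rng.uniform(-0.02, 0.02, size=2)              # bounded disturbance
    x = A @ x + B @ u + w
    print(f"k={k:2d}  tube error = {np.linalg.norm(x - z.value[:, k + 1]):.3f}")
```

The tightening margin stands in for the tube cross-section bound, which a real scheme would compute from the disturbance set and K rather than assume.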
Conference Paper
Full-text available
In this paper, we discuss the deterministic policy gradient using the Actor-Critic methods based on the linear compatible advantage function approximator, where the input spaces are continuous. When the policy is restricted by hard constraints, the exploration may not be Centred or Isotropic (non-CI). As a result, the policy gradient estimation can...
Conference Paper
Full-text available
We present a Reinforcement Learning-based Robust Nonlinear Model Predictive Control (RL-RNMPC) framework for controlling nonlinear systems in the presence of disturbances and uncertainties. An approximate Robust Nonlinear Model Predictive Control (RNMPC) of low computational complexity is used in which the state trajectory uncertainty is modelled v...
Conference Paper
Full-text available
In this paper, we are interested in optimal control problems with purely economic costs, which often yield optimal policies having a (nearly) bang-bang structure. We focus on policy approximations based on Model Predictive Control (MPC) and the use of the deterministic policy gradient method to optimize the MPC closed-loop performance in the presen...
Preprint
Full-text available
Economic Model Predictive Control has recently gained popularity due to its ability to directly optimize a given performance criterion, while enforcing constraint satisfaction for nonlinear systems. Recent research has developed both numerical algorithms and stability analysis for the undiscounted case. The introduction of a discount factor in the...
Preprint
Full-text available
In this work, we propose a Model Predictive Control (MPC)-based Reinforcement Learning (RL) method for Autonomous Surface Vehicles (ASVs). The objective is to find a policy that optimizes the closed-loop performance of a simplified freight mission, including collision-free path following, autonomous docking, and a skillful transition betwe...
Article
Full-text available
In order to alleviate the range anxiety of electric vehicle users (EVUs), several studies focus on improving the efficiency of fast electric vehicle charging stations (fast-EVCSs) using artificial intelligence (AI). This paper first proposes a fast-EVCS revenue-maximization pricing policy using an AI approach, and we argue that the AI algorit...
Preprint
Full-text available
In this paper, we present the use of Model Predictive Control (MPC) based on Reinforcement Learning (RL) to find the optimal policy for a multi-agent battery storage system. A time-varying prediction of the power price and production-demand uncertainty are considered. We focus on optimizing an economic objective cost while avoiding very low or very...
Conference Paper
Full-text available
In this paper, we present the use of Reinforcement Learning (RL) based on Robust Model Predictive Control (RMPC) for the control of an Autonomous Surface Vehicle (ASV). The RL-MPC strategy is utilized for obstacle avoidance and target (set-point) tracking. A scenario-tree robust MPC is used to handle potential failures of the ship thrusters. Beside...
Conference Paper
Full-text available
This paper proposes an observer-based framework for solving Partially Observable Markov Decision Processes (POMDPs) when an accurate model is not available. We first propose to use a Moving Horizon Estimation-Model Predictive Control (MHE-MPC) scheme in order to provide a policy for the POMDP problem, where the full state of the real process is not...
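The observer-based idea can be caricatured on a scalar system as below; for brevity the sketch pairs a moving-horizon estimator with a simple feedback gain rather than an MPC, and every model, gain, and tuning value in it is an illustrative assumption rather than the paper's scheme.

```python
# Sketch of an MHE-based policy for a partially observed scalar system:
# estimate the state from the last M inputs/outputs, then act on the estimate.
import numpy as np
from scipy.optimize import least_squares

a, b, M = 0.95, 0.5, 8                         # dynamics and MHE window length
rng = np.random.default_rng(2)

def mhe(y_win, u_win):
    """Least-squares fit of initial state + disturbances over the window."""
    def residuals(th):                         # th = [x0, w_0 .. w_{M-2}]
        x, res = th[0], []
        for k in range(M):
            res.append(y_win[k] - x)           # output residuals
            if k < M - 1:
                x = a * x + b * u_win[k] + th[1 + k]
        return np.concatenate([np.array(res), 0.3 * th[1:]])  # penalize w
    th = least_squares(residuals, np.zeros(M)).x
    x = th[0]
    for k in range(M - 1):                     # roll the estimate to "now"
        x = a * x + b * u_win[k] + th[1 + k]
    return x

x, y_win, u_win = 2.0, [], []
for t in range(40):
    y = x + 0.05 * rng.standard_normal()       # noisy measurement only
    y_win.append(y); u_win.append(0.0)
    y_win, u_win = y_win[-M:], u_win[-M:]
    x_hat = mhe(np.array(y_win), np.array(u_win)) if len(y_win) == M else y
    u = -1.2 * x_hat                           # simple feedback (MPC stand-in)
    u_win[-1] = u                              # record the input actually applied
    x = a * x + b * u + 0.02 * rng.standard_normal()
print("final |x|:", abs(round(x, 4)))
```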
Preprint
Full-text available
Dissipativity theory is central to discussing the stability of policies resulting from minimizing economic stage costs. In its current form, the dissipativity theory applies to problems based on deterministic dynamics, and does not readily extend to Markov Decision Processes, where the dynamics are stochastic. In this paper, we clarify the core rea...
Preprint
Full-text available
In this paper, we are interested in optimal control problems with purely economic costs, which often yield optimal policies having a (nearly) bang-bang structure. We focus on policy approximations based on Model Predictive Control (MPC) and the use of the deterministic policy gradient method to optimize the MPC closed-loop performance in the presen...
Preprint
Full-text available
In this paper, we discuss the deterministic policy gradient using the Actor-Critic methods based on the linear compatible advantage function approximator, where the input spaces are continuous. When the policy is restricted by hard constraints, the exploration may not be Centred or Isotropic (non-CI). As a result, the policy gradient estimation can...
Preprint
Full-text available
We present a Reinforcement Learning-based Robust Nonlinear Model Predictive Control (RL-RNMPC) framework for controlling nonlinear systems in the presence of disturbances and uncertainties. An approximate Robust Nonlinear Model Predictive Control (RNMPC) of low computational complexity is used in which the state trajectory uncertainty is modelled v...
Preprint
Full-text available
This paper proposes an observer-based framework for solving Partially Observable Markov Decision Processes (POMDPs) when an accurate model is not available. We first propose to use a Moving Horizon Estimation-Model Predictive Control (MHE-MPC) scheme in order to provide a policy for the POMDP problem, where the full state of the real process is not...
Preprint
Full-text available
In this paper, we present the use of Reinforcement Learning (RL) based on Robust Model Predictive Control (RMPC) for the control of an Autonomous Surface Vehicle (ASV). The RL-MPC strategy is utilized for obstacle avoidance and target (set-point) tracking. A scenario-tree robust MPC is used to handle potential failures of the ship thrusters. Beside...
Preprint
Full-text available
Model predictive control (MPC) is a powerful trajectory optimization control technique capable of controlling complex nonlinear systems while respecting system constraints and ensuring safe operation. The MPC's capabilities come at the cost of a high online computational complexity, the requirement of an accurate model of the system dynamics, and t...
Preprint
Full-text available
In this paper, we consider solving discounted Markov Decision Processes (MDPs) under the constraint that the resulting policy is stabilizing. In practice, MDPs are solved based on some form of policy approximation. We leverage recent results proposing to use Model Predictive Control (MPC) as a structured policy in the context of Reinforcement L...
Article
Full-text available
To be able to recover a fixed-wing unmanned aerial vehicle (UAV) in a small space such as a boat deck or a glade in the forest, a steep and precise descent is needed. One way to reduce the speed of the UAV during landing is by performing a deep-stall landing manoeuvre, where the lift of the UAV is decreased until it is unable to keep the UAV level, at...
Article
As opposed to tracking Model Predictive Control (MPC), economic MPC directly optimizes a given performance objective rather than penalizing the distance from a reference. On the one hand, this typically improves performance. On the other hand, it also poses challenges in terms of stability and computational burden. In order to make econom...
Article
Model predictive control (MPC) is a powerful trajectory optimization control technique capable of controlling complex nonlinear systems while respecting system constraints and ensuring safe operation. The MPC’s capabilities come at the cost of a high online computational complexity, the requirement of an accurate model of the system dynamics, and t...
Article
In control applications there is often a compromise that needs to be made with respect to the complexity and performance of the controller, and the computational resources that are available. For instance, the typical hardware platform in embedded control applications is a microcontroller with limited memory and processing power, and for battery po...
Preprint
Full-text available
Reinforcement Learning offers tools to optimize policies based on the data obtained from the real system subject to the policy. While the potential of Reinforcement Learning is well understood, many critical aspects still need to be tackled. One crucial aspect is the issue of safety and stability. Recent publications suggest the use of Nonlinear Mo...
Article
Full-text available
The placement of electric vehicle charging stations (EVCSs), which encourages the rapid development of electric vehicles (EVs), should be considered not only from an operational perspective, such as minimizing installation costs, but also from a user perspective, so that strategic and competitive charging behaviors can be reflected. This paper proposes a...
Preprint
Full-text available
In control applications there is often a compromise that needs to be made with regard to the complexity and performance of the controller and the computational resources that are available. For instance, the typical hardware platform in embedded control applications is a microcontroller with limited memory and processing power, and for battery pow...
Article
Full-text available
Docking of autonomous surface vehicles (ASVs) involves intricate maneuvering at low speeds under the influence of unknown environmental forces, and is often a challenging operation even for experienced helmsmen. In this paper, we propose an optimization-based trajectory planner for performing automatic docking of a small ASV. The approach formulate...
Preprint
Full-text available
The main contribution of this paper is a novel method for planning globally optimal trajectories for dynamical systems subject to polygonal constraints. The proposed method is a hybrid trajectory planning approach, which combines graph search, i.e. a discrete roadmap method, with convex optimization, i.e. a complete path method. Contrary to past ap...
Article
Full-text available
Reinforcement Learning (RL) has recently impressed the world with stunning results in various applications. While the potential of RL is now well-established, many critical aspects still need to be tackled, including safety and stability issues. These issues, while secondary for the RL community, are central to the control community, which has been...
Article
Full-text available
Airborne Wind Energy (AWE) is a new power technology that harvests wind energy at high altitudes using tethered wings. Studying the power potential of the system at a given location requires evaluating the local power production profile of the AWE system. As the optimal operational AWE system altitude depends on complex trade-offs, a commonly used...
Article
Full-text available
In this paper we propose and compare methods for combining system identification (SYSID) and reinforcement learning (RL) in the context of data-driven model predictive control (MPC). Assuming a known model structure of the controlled system, and considering a parametric MPC, the proposed approach simultaneously: a) Learns the parameters of the MPC...
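A toy sketch of the simultaneous-learning idea follows (scalar system; the recursive-least-squares SYSID and finite-difference RL step are my own illustrative stand-ins for the paper's methods): the model parameter and an MPC cost parameter are updated side by side from closed-loop data.

```python
# Toy SYSID + RL sketch on a scalar system: (a) RLS fits the model parameter
# a_hat from data; (b) a finite-difference step tunes the MPC cost weight theta
# against the true closed-loop performance.
import numpy as np

a_true, b = 0.9, 0.5
a_hat, theta = 0.5, 1.0                        # initial model estimate, cost weight

def policy(x, a_hat, theta):
    """One-step 'MPC': min_u (a_hat x + b u)^2 + theta u^2, in closed form."""
    return -a_hat * b * x / (b**2 + theta)

def closed_loop_cost(a_hat, theta, x0=2.0, steps=25):
    x, J = x0, 0.0
    for _ in range(steps):
        u = policy(x, a_hat, theta)
        J += x**2 + 0.1 * u**2                 # true performance criterion
        x = a_true * x + b * u
    return J

rng = np.random.default_rng(3)
P, x = 1.0, 2.0                                # RLS "covariance", real state
for t in range(60):
    u = policy(x, a_hat, theta)
    x_next = a_true * x + b * u + 0.01 * rng.standard_normal()
    # a) SYSID: recursive least squares on x_next - b u = a x + noise
    e = (x_next - b * u) - a_hat * x
    g = P * x / (1.0 + x * P * x)
    a_hat, P = a_hat + g * e, (1.0 - g * x) * P
    # b) RL: finite-difference step on the MPC cost parameter theta
    eps = 1e-2
    grad = (closed_loop_cost(a_hat, theta + eps)
            - closed_loop_cost(a_hat, theta - eps)) / (2 * eps)
    theta = max(theta - 0.05 * grad, 1e-3)
    x = x_next
print(f"a_hat = {a_hat:.3f} (true {a_true}), theta = {theta:.3f}")
```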
Preprint
Full-text available
Reinforcement Learning (RL) has proven a stunning ability to learn optimal policies from data without any prior knowledge on the process. The main drawback of RL is that it is typically very difficult to guarantee stability and safety. On the other hand, Nonlinear Model Predictive Control (NMPC) is an advanced model-based control technique which do...
Article
Full-text available
In this paper, we analyse the performance of a model predictive controller for the coordination of connected, automated vehicles at intersections. The problem has combinatorial complexity, and we propose to solve it approximately by using a two-stage procedure where (1) the order in which the vehicles cross the intersection is found by...
Article
Full-text available
In this article, we study the optimal coordination of automated vehicles at intersections. The problem can be stated as an optimal control problem (OCP), which can be decomposed as a bi‐level scheme composed by one nonlinear program (NLP) which schedules the access to the intersection and one OCP per vehicle which computes the appropriate vehicle c...
Preprint
Full-text available
In this paper we propose and compare methods for combining system identification (SYSID) and reinforcement learning (RL) in the context of data-driven model predictive control (MPC). Assuming a known model structure of the controlled system, and considering a parametric MPC, the proposed approach simultaneously: a) Learns the parameters of the MPC...
Preprint
Full-text available
Model Predictive Control has been recently proposed as policy approximation for Reinforcement Learning, offering a path towards safe and explainable Reinforcement Learning. This approach has been investigated for Q-learning and actor-critic methods, both in the context of nominal Economic MPC and Robust (N)MPC, showing very promising results. In th...