Jorge Cortés’s research while affiliated with University of California, San Diego and other places

Publications (481)


Anytime Safe Reinforcement Learning
  • Preprint

April 2025 · Arnau Marzabal · Jorge Cortés

This paper considers the problem of solving constrained reinforcement learning problems with anytime guarantees, meaning that the algorithmic solution returns a safe policy regardless of when it is terminated. Drawing inspiration from anytime constrained optimization, we introduce Reinforcement Learning-based Safe Gradient Flow (RL-SGF), an on-policy algorithm which employs estimates of the value functions and their respective gradients associated with the objective and safety constraints for the current policy, and updates the policy parameters by solving a convex quadratically constrained quadratic program. We show that if the estimates are computed with a sufficiently large number of episodes (for which we provide an explicit bound), safe policies are updated to safe policies with a probability higher than a prescribed tolerance. We also show that iterates asymptotically converge to a neighborhood of a KKT point, whose size can be arbitrarily reduced by refining the estimates of the value functions and their gradients. We illustrate the performance of RL-SGF in a navigation example.


Feedback Optimization with State Constraints through Control Barrier Functions
  • Preprint

April 2025

Recently, there has been a surge of research on a class of methods called feedback optimization. These are methods to steer the state of a control system to an equilibrium that arises as the solution of an optimization problem. Despite the growing literature on the topic, the important problem of enforcing state constraints at all times remains unaddressed. In this work, we present the first feedback-optimization method that enforces state constraints. The method combines a class of dynamics called safe gradient flows with high-order control barrier functions. We provide a number of results on our proposed controller, including well-posedness guarantees, anytime constraint-satisfaction guarantees, equivalence between the closed-loop's equilibria and the optimization problem's critical points, and local asymptotic stability of optima.

Gradient sampling algorithm for subsmooth functions

March 2025

This paper considers non-smooth optimization problems where we seek to minimize the pointwise maximum of a continuously parameterized family of functions. Since the objective function is given as the solution to a maximization problem, neither its values nor its gradients are available in closed form, which calls for approximation. Our approach hinges upon extending the so-called gradient sampling algorithm, which approximates the Clarke generalized gradient of the objective function at a point by sampling its derivative at nearby locations. This allows us to select descent directions around points where the function may fail to be differentiable and establish algorithm convergence to a stationary point from any initial condition. Our key contribution is to prove this convergence by alleviating the requirement on continuous differentiability of the objective function on an open set of full measure. We further provide assumptions under which a desired convex subset of the decision space is rendered attractive for the iterates of the algorithm.


Fig. 1: Simulation results for safe control of a 2-link robot arm.
Safe Control of Second-Order Systems with Linear Constraints

March 2025

Control barrier functions (CBFs) offer a powerful tool for enforcing safety specifications in control synthesis. This paper deals with the problem of constructing valid CBFs. Given a second-order system and any desired safety set with linear boundaries in the position space, we construct a provably control-invariant subset of this desired safety set. The constructed subset does not sacrifice any positions allowed by the desired safety set, which can be nonconvex. We show how our construction can also meet safety specifications on the velocity. We then demonstrate that if the system satisfies the standard properties of Euler-Lagrange systems, then our construction can also handle constraints on the allowable control inputs. We finally show the efficacy of the proposed method in a numerical example of keeping a 2D robot arm safe from collision.



Control Barrier Function-Based Safety Filters: Characterization of Undesired Equilibria, Unbounded Trajectories, and Limit Cycles

January 2025

This paper focuses on safety filters designed based on Control Barrier Functions (CBFs): these are modifications of a nominal stabilizing controller typically utilized in safety-critical control applications to render a given subset of states forward invariant. The paper investigates the dynamical properties of the closed-loop systems, with a focus on characterizing undesirable behaviors that may emerge due to the use of CBF-based filters. These undesirable behaviors include unbounded trajectories, limit cycles, and undesired equilibria, which can be locally stable and even form a continuum. Our analysis offers the following contributions: we give (i) conditions under which trajectories remain bounded and (ii) conditions under which limit cycles do not exist; (iii) we show that undesired equilibria can be characterized by solving an algebraic equation; and (iv) we provide examples that show that asymptotically stable undesired equilibria can exist for a large class of nominal controllers and design parameters of the safety filter (even for convex safe sets). Further, for the specific class of planar systems, (v) we provide explicit formulas for the total number of undesired equilibria and the proportion of saddle points and asymptotically stable equilibria, and (vi) in the case of linear planar systems, we present an exhaustive analysis of their global stability properties. Examples throughout the paper illustrate the results.
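As a rough illustration of the kind of undesired equilibrium the abstract describes, the following minimal sketch filters a stabilizing nominal controller through a single CBF constraint. Everything in it is an assumption made for illustration and is not taken from the paper: the single-integrator dynamics x' = u, the nominal controller u = -x, the disk obstacle, the gain `ALPHA`, and all function names. With one affine constraint, the minimum-deviation quadratic program has the closed-form projection used below.

```python
import numpy as np

# Hypothetical setup: single integrator x' = u, disk obstacle to avoid.
CENTER = np.array([2.0, 0.0])   # obstacle center (assumed)
RADIUS = 1.0                    # obstacle radius (assumed)
ALPHA = 1.0                     # linear class-K gain (assumed)


def h(x):
    """Barrier function: nonnegative exactly outside the obstacle."""
    d = x - CENTER
    return float(d @ d) - RADIUS**2


def grad_h(x):
    return 2.0 * (x - CENTER)


def nominal(x):
    """Nominal controller that stabilizes the origin."""
    return -x


def safety_filter(x):
    """Solve min ||u - u_nom||^2 s.t. grad_h(x).u + ALPHA*h(x) >= 0.

    With a single affine constraint the solution is a projection of
    u_nom onto the constraint half-space. grad_h vanishes only at the
    obstacle center, which lies outside the safe set.
    """
    u_nom = nominal(x)
    a = grad_h(x)
    slack = float(a @ u_nom) + ALPHA * h(x)
    if slack >= 0.0:
        return u_nom                      # nominal input is already safe
    return u_nom - (slack / float(a @ a)) * a


# On the far side of the obstacle the filter cancels the nominal input,
# so the closed loop has an equilibrium away from the origin.
u_far = safety_filter(np.array([3.0, 0.0]))
u_free = safety_filter(np.array([0.0, 1.0]))
```

In this sketch the point (3, 0) on the obstacle boundary is an equilibrium of the filtered closed loop even though the nominal controller alone would drive it to the origin, while at (0, 1) the filter leaves the nominal input unchanged.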


Back to Base: Towards Hands-Off Learning via Safe Resets with Reach-Avoid Safety Filters

January 2025 · Azra Begzadić · Nikhil Uday Shinde · Sander Tonkens · [...] · Sylvia Herbert

Designing controllers that accomplish tasks while guaranteeing safety constraints remains a significant challenge. We often want an agent to perform well in a nominal task, such as environment exploration, while ensuring it can avoid unsafe states and return to a desired target by a specific time. In particular, we are motivated by the setting of safe, efficient, hands-off training for reinforcement learning in the real world. By enabling a robot to safely and autonomously reset to a desired region (e.g., charging stations) without human intervention, we can enhance efficiency and facilitate training. Safety filters, such as those based on control barrier functions, decouple safety from nominal control objectives and rigorously guarantee safety. Despite their success, constructing these functions for general nonlinear systems with control constraints and system uncertainties remains an open problem. This paper introduces a safety filter obtained from the value function associated with the reach-avoid problem. The proposed safety filter minimally modifies the nominal controller while avoiding unsafe regions and guiding the system back to the desired target set. By preserving policy performance while allowing safe resetting, we enable efficient hands-off reinforcement learning and advance the feasibility of safe training for real-world robots. We demonstrate our approach using a modified version of soft actor-critic to safely train a swing-up task on a modified cartpole stabilization problem.



FIGURE 1. Vector field of Van der Pol oscillator (10) and its limit cycle.
FIGURE 7. The accuracy hierarchy of subspaces contained in the search space S. Given ϵ ∈ [0, 1], RFB-EDMD captures a member of the hierarchy with the largest index smaller than ϵ. Note that L0 contains all exact Koopman eigenfunctions contained in the original search space S (cf. Theorem V.3) and L1 equals S (cf. Lemma VI.2).
FIGURE 8. Dimension of identified subspaces by RFB-EDMD and T-SSD versus the value of the accuracy parameter ϵ ∈ [0, 1] for system (22).
Recursive Forward-Backward EDMD: Guaranteed Algebraic Search for Koopman Invariant Subspaces

January 2025 · IEEE Access

The implementation of the Koopman operator on digital computers often relies on the approximation of its action on finite-dimensional function spaces. This approximation is generally done by orthogonally projecting on the subspace. Extended Dynamic Mode Decomposition (EDMD) is a popular special case of this projection procedure in a data-driven setting. Importantly, the accuracy of the model obtained by EDMD depends on the quality of the finite-dimensional space, specifically on how close it is to being invariant under the Koopman operator. This paper presents a data-driven algebraic search algorithm, termed Recursive Forward-Backward EDMD, for subspaces close to being invariant under the Koopman operator. Relying on the concept of temporal consistency, which measures the quality of the subspace, our algorithm recursively decomposes the search space into two subspaces with different prediction accuracy levels. The subspace with the lower level of accuracy is removed if it does not reach a satisfactory threshold. The algorithm allows for tuning the level of accuracy depending on the underlying application and is endowed with convergence and accuracy guarantees.
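For readers unfamiliar with the projection step mentioned above, here is a minimal sketch of plain EDMD, not of the Recursive Forward-Backward variant the paper proposes. The scalar map x+ = 0.5 x, the dictionary {x, x^2}, and the function names are assumptions chosen so that the dictionary spans an exactly Koopman-invariant subspace and the least-squares residual vanishes.

```python
import numpy as np


def dictionary(x):
    """Hypothetical dictionary of observables {x, x^2}, one row per sample."""
    return np.column_stack([x, x**2])


def edmd(x, y):
    """Plain EDMD: least-squares fit of K in dictionary(y) ~ dictionary(x) @ K.

    x holds sampled states and y their images under the dynamics.
    The residual norm is a crude indicator of how far the dictionary's
    span is from being invariant on the data.
    """
    dx = dictionary(x)
    dy = dictionary(y)
    k, *_ = np.linalg.lstsq(dx, dy, rcond=None)
    residual = float(np.linalg.norm(dx @ k - dy))
    return k, residual


# Assumed toy dynamics: x+ = 0.5 * x, sampled on a grid.
x_samples = np.linspace(-1.0, 1.0, 21)
y_samples = 0.5 * x_samples
K, res = edmd(x_samples, y_samples)
```

Because x maps to 0.5 x and x^2 maps to 0.25 x^2 under this toy map, the fitted matrix is diag(0.5, 0.25) up to round-off and the residual is numerically zero. A dictionary whose span is not invariant would leave a nonzero residual, which is the situation the abstract's subspace search is meant to address.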


Online Event-Triggered Switching for Frequency Control in Power Grids With Variable Inertia

January 2025 · IEEE Transactions on Power Systems

The increasing integration of renewable energy resources into power grids has led to time-varying system inertia and consequent degradation in frequency dynamics. A promising solution to alleviate performance degradation is using power-electronics-interfaced energy resources, such as renewable generators and battery energy storage, for primary frequency control by adjusting their power output set-points in response to frequency deviations. However, designing a frequency controller under time-varying inertia is challenging. Specifically, the stability or optimality of controllers designed for time-invariant systems can be compromised once applied to a time-varying system. We model the frequency dynamics under time-varying inertia as a nonlinear switching system, where the frequency dynamics under each mode are described by the nonlinear swing equations and different modes represent different inertia levels. We identify a key controller structure, named Neural Proportional-Integral (Neural-PI) controller, that guarantees exponential input-to-state stability for each mode. To further improve performance, we present an online event-triggered switching algorithm to select the most suitable controller from a set of Neural-PI controllers, each optimized for specific inertia levels. Simulations on the IEEE 39-bus system validate the effectiveness of the proposed online switching control method with stability guarantees and optimized performance for frequency control under time-varying inertia.


Citations (46)


... For instance, it is well-known that designing CBF filters for systems with stabilizing nominal controllers can lead to emergence of undesirable equilibrium points or unbounded trajectories. Moreover, some of these undesirable equilibria may even be locally stable (see, e.g., Reis et al. (2021); Cortez and Dimarogonas (2022); Tan and Dimarogonas (2024); Chen et al. (2024b)), and their stability properties cannot be changed by simply changing the CBF (Chen et al. (2024a)). ...

Reference:

Neural Network-assisted Interval Reachability for Systems with Control Barrier Function-Based Safe Controllers
Characterization of the Dynamical Properties of Safety Filters for Linear Planar Systems
  • Citing Conference Paper
  • December 2024

... The gradient ∇h(x) along the boundary ∂C corresponds to the surface normals of the zero-level set. Since C is compact, ∂C is compact and the Gauss map G(x) = ∇h(x)/||∇h(x)|| is surjective [20], [21]. There exists at least a point x ∈ ∂C such that ∇h(x)/||∇h(x)|| is parallel to µ. ...

Continuity and Boundedness of Minimum-Norm CBF-Safe Controllers
  • Citing Article
  • January 2025

IEEE Transactions on Automatic Control

... Our work is related to the recent works that investigate data-driven control in the presence of noise [5], [8]-[14], where the system cannot be uniquely identified even when the data are persistently exciting and it is reasonable and desirable that the noise is bounded in some sense. In these papers, the goal is to asymptotically stabilize all systems consistent with data for different assumptions on the noise (e.g., S-Lemma [9], Petersen's Lemma [11], through updating uncertainties [8], and bounds on measurement errors [12]). ...

Data-driven stabilization of switched and constrained linear systems
  • Citing Article
  • January 2025

Automatica

... This provides an online method for collecting informative data (Lemma 4), which is then utilized to design a controller along with an adaptive law. The online collection of informative data marks a key difference with different types of online approaches [38]-[40] where the conditions on data are assumed a priori rather than imposed by suitable design of the inputs. We also prove that, with the proposed online method, the controller gains will converge to a solution of the matching equations (Theorem 2), and the tracking error is bounded and will converge to zero (Theorem 3). ...

Data-driven mode detection and stabilization of unknown switched linear systems
  • Citing Article
  • January 2024

IEEE Transactions on Automatic Control

... As such, feedback optimization has found a wide range of applications in, e.g., power systems [2], [3], traffic control [4], smart buildings [5], and communication networks [6]. Overall, in recent years, research on feedback optimization has surged [2]-[15]. However, although designs with input constraints are available in the literature, a fundamental problem remains unsolved: enforcing state constraints at all times. ...

Optimal Power Flow Pursuit via Feedback-Based Safe Gradient Flow
  • Citing Article
  • January 2024

IEEE Transactions on Control Systems Technology

... For the same reason, ReLU networks have low computational complexity [40] when performing gradient propagation. One feasible approach to designing controllers for linear-threshold networks is to first identify the system parameters [41] and then design a model-based controller [4]. Instead, we pursue here a direct data-driven approach that bypasses the system identification step to avoid accumulating approximation errors. ...

Efficient Reconstruction of Neural Mass Dynamics Modeled by Linear-Threshold Networks
  • Citing Article
  • January 2024

IEEE Transactions on Automatic Control

... Deep learning technology selects the most appropriate method of solving a given system of differential equations based on the existing experience of solving similar systems [37,19,36,14]. The next generation of artificial intelligence technologies is aimed at automatic development of an efficient and stable numerical method of integration for each given system [13,61]. This raises the question of constructing invariant geometric structures, or tensor invariants, for a given system of differential equations and discrete mappings that preserve these tensor invariants. ...

Symmetry Preservation in Hamiltonian Systems: Simulation and Learning

Journal of Nonlinear Science

... In particular, CBFs are used as safety filters by adjusting a nominal control law to ensure the system satisfies safety constraints (Wabersich et al., 2023), resulting in a quadratic program. However, constructing a CBF and ensuring the feasibility of a CBF-based optimization problem is challenging (Mestres and Cortés, 2024). To tackle feasibility challenges, backup CBFs have been proposed to guarantee feasibility using a predefined backup policy to a predefined safe set (Gurriet et al., 2020; Chen et al., 2021). ...

Feasibility and regularity analysis of safe stabilizing controllers under uncertainty
  • Citing Article
  • September 2024

Automatica

... Reference [18] designed an incremental local Volt/Var controller with a learnable neural network, in which the neural network was trained in a supervised learning manner by using many instances of pre-solved OPF solutions. Reference [19] proposed an unsupervised learning approach to train a local controller, which restricts the objective to voltage deviations and ignores inequality constraints (e.g., the voltage safety limits) in the OPF problems, thus compromising its practicality. Tackling such inequality constraints is necessary but is challenging for existing machine learning algorithms [10], [11], [20]. ...

Unsupervised learning for equitable DER control
  • Citing Article
  • September 2024

Electric Power Systems Research