Santiago Paternain
University of Pennsylvania | UP · Department of Electrical and Systems Engineering

About

60 Publications · 3,531 Reads · 477 Citations

Publications (60)
Preprint
Though learning has become a core technology of modern information processing, there is now ample evidence that it can lead to biased, unsafe, and prejudiced solutions. The need to impose requirements on learning is therefore paramount, especially as it reaches critical applications in social, industrial, and medical domains. However, the non-conve...
Preprint
Safety is a critical feature of controller design for physical systems. When designing control policies, several approaches to guarantee this aspect of autonomy have been proposed, such as robust controllers or control barrier functions. However, these solutions strongly rely on the model of the system being available to the designer. As a parallel...
Preprint
Constrained reinforcement learning involves multiple rewards that must individually accumulate to given thresholds. In this class of problems, we show a simple example in which the desired optimal policy cannot be induced by any linear combination of rewards. Hence, there exist constrained reinforcement learning problems for which neither regulariz...
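A minimal single-state illustration of this phenomenon (our toy construction, not the example from the paper): two actions, two rewards, and thresholds that only a stochastic policy can satisfy, while every linear combination of the rewards is maximized by a deterministic policy.

```python
# Single-state problem: action a1 yields rewards (1, 0), action a2 yields
# (0, 1), and each expected reward must reach the threshold 0.4. The
# stochastic policy p(a1) = 0.5 satisfies both thresholds, yet any linear
# combination w1*r1 + w2*r2 is maximized by a deterministic policy, which
# drives one of the expected rewards to 0 < 0.4.

rewards = {"a1": (1.0, 0.0), "a2": (0.0, 1.0)}
threshold = 0.4

def best_response(w1, w2):
    # A deterministic greedy policy maximizes any scalarized reward.
    return max(rewards, key=lambda a: w1 * rewards[a][0] + w2 * rewards[a][1])

for w1, w2 in [(1.0, 0.0), (0.7, 0.3), (0.5, 0.5), (0.3, 0.7), (0.0, 1.0)]:
    a = best_response(w1, w2)
    feasible = min(rewards[a]) >= threshold
    print((w1, w2), a, feasible)  # feasible is False for every weighting
```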
Preprint
Data driven models of dynamical systems help planners and controllers to provide more precise and accurate motions. Most model learning algorithms will try to minimize a loss function between the observed data and the model's predictions. This can be improved using prior knowledge about the task at hand, which can be encoded in the form of constrai...
Preprint
Prediction credibility measures, in the form of confidence intervals or probability distributions, are fundamental in statistics and machine learning to characterize model robustness, detect out-of-distribution samples (outliers), and protect against adversarial attacks. To be effective, these measures should (i) account for the wide variety of mod...
Preprint
Reinforcement learning considers the problem of finding policies that maximize an expected cumulative reward in a Markov decision process with unknown transition probabilities. In this paper we consider the problem of finding optimal policies assuming that they belong to a reproducing kernel Hilbert space (RKHS). To that end we compute unbiased sto...
Article
Reinforcement learning consists of finding policies that maximize an expected cumulative long-term reward in a Markov decision process with unknown transition probabilities and instantaneous rewards. In this paper, we consider the problem of finding such optimal policies while assuming they are continuous functions belonging to a reproducing kernel...
Article
Optimization underpins many of the challenges that science and technology face on a daily basis. Recent years have witnessed a major shift from traditional optimization paradigms grounded on batch algorithms for medium-scale problems to challenging dynamic, time-varying, and even huge-size settings. This is driven by technological transformations t...
Preprint
Optimization underpins many of the challenges that science and technology face on a daily basis. Recent years have witnessed a major shift from traditional optimization paradigms grounded on batch algorithms for medium-scale problems to challenging dynamic, time-varying, and even huge-size settings. This is driven by technological transformations t...
Article
In this article, we consider groups of agents in a network that select actions in order to satisfy a set of constraints that vary arbitrarily over time and minimize a time-varying function of which they have only local observations. The selection of actions, also called a strategy, is causal and decentralized, i.e., the dynamical system that determ...
Preprint
Full-text available
In optimal control problems, disturbances are typically dealt with using robust solutions, such as H-infinity or tube model predictive control, that plan control actions feasible for the worst-case disturbance. Yet, planning for every contingency can lead to over-conservative, poorly performing solutions or even, in extreme cases, to infeasibility....
Article
Reproducing kernel Hilbert spaces (RKHSs) are key elements of many non-parametric tools successfully used in signal processing, statistics, and machine learning. In this work, we aim to address three issues of the classical RKHS-based techniques. First, they require the RKHS to be known a priori, which is unrealistic in many applications. Furtherm...
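A minimal sketch of the non-parametric setting above (our illustration of plain functional stochastic gradient descent with a Gaussian kernel and made-up constants; the paper goes further, e.g. by not assuming the RKHS is known a priori): each stochastic update appends a weighted kernel centered at the newest sample, so the learned function is a kernel expansion.

```python
import math
import random

# Functional SGD in an RKHS with a Gaussian kernel (illustrative constants).
# The learned function is f(x) = sum_i w_i * k(c_i, x).

def k(a, b, bw=0.5):
    # Gaussian (RBF) kernel with bandwidth bw.
    return math.exp(-((a - b) ** 2) / (2 * bw ** 2))

centers, weights = [], []

def predict(x):
    return sum(w * k(c, x) for c, w in zip(centers, weights))

random.seed(0)
lr = 0.2
for _ in range(500):
    x = random.uniform(-math.pi, math.pi)
    err = predict(x) - math.sin(x)  # noiseless samples of the target sin(x)
    centers.append(x)               # functional gradient step: add a kernel
    weights.append(-lr * err)       # centered at the sample, weighted by error

grid = [-3.0 + 0.1 * i for i in range(61)]
mse = sum((predict(x) - math.sin(x)) ** 2 for x in grid) / len(grid)
print(round(mse, 4))  # small: the expansion tracks sin on [-3, 3]
```

Note how the expansion grows by one term per sample; controlling that memory growth is one of the issues such methods must address.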
Preprint
Full-text available
This paper is concerned with the study of constrained statistical learning problems, the unconstrained version of which is at the core of virtually all of modern information processing. Accounting for constraints, however, is paramount to incorporate prior knowledge and impose desired structural and statistical properties on the solutions. Still,...
Preprint
Full-text available
In recent years, considerable work has been done to tackle the issue of designing control laws based on observations to allow unknown dynamical systems to perform pre-specified tasks. At least as important for autonomy, however, is the issue of learning which tasks can be performed in the first place. This is particularly critical in situations whe...
Preprint
In this work we adapt a prediction-correction algorithm for continuous time-varying convex optimization problems to solve dynamic programs arising from Model Predictive Control. In particular, the prediction step tracks the evolution of the optimal solution of the problem which depends on the current state of the system. The cost of said step is th...
Preprint
Full-text available
In this paper, we study the learning of safe policies in the setting of reinforcement learning problems. That is, we aim to control a Markov Decision Process (MDP) of which we do not know the transition probabilities, but we have access to sample trajectories through experience. We define safety as the agent remaining in a desired safe set with hig...
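The notion of safety used above, remaining in a desired safe set with high probability, can be illustrated with a toy Monte Carlo estimate (our stand-in dynamics, not the paper's method: a random walk in place of an MDP trajectory, with illustrative constants).

```python
import random

# Estimate the probability that a 20-step random walk (a stand-in for an MDP
# trajectory) stays inside the safe set [-3, 3]; "safe" policies would be
# those for which this probability exceeds a desired confidence level.

def stays_safe(horizon=20, bound=3.0, step_std=0.5):
    x = 0.0
    for _ in range(horizon):
        x += random.gauss(0.0, step_std)
        if abs(x) > bound:
            return False  # the trajectory left the safe set
    return True

random.seed(1)
trials = 10000
p_safe = sum(stays_safe() for _ in range(trials)) / trials
print(p_safe)  # empirical probability of remaining in the safe set
```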
Preprint
Full-text available
Autonomous agents must often deal with conflicting requirements, such as completing tasks using the least amount of time/energy, learning multiple tasks, or dealing with multiple opponents. In the context of reinforcement learning~(RL), these problems are addressed by (i)~designing a reward function that simultaneously describes all requirements or...
Preprint
Full-text available
Navigation tasks often cannot be defined in terms of a target, either because global position information is unavailable or unreliable or because target location is not explicitly known a priori. This task is then often defined indirectly as a source seeking problem in which the autonomous agent navigates so as to minimize the convex potential indu...
Preprint
Given a convex quadratic potential whose minimum is the agent's goal and a space populated with ellipsoidal obstacles, one can construct a Rimon-Koditschek artificial potential to navigate. These potentials combine the natural attractive potential, whose minimum is the destination of the agent, with potentials that re...
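A minimal numerical sketch of such a construction (our toy 2-D setup with one circular obstacle and illustrative constants): the Rimon-Koditschek combination f0/(f0^k + beta)^(1/k) is 0 at the goal, 1 on the obstacle boundary, and strictly between the two elsewhere, so its negative gradient attracts the agent to the goal and repels it from the obstacle.

```python
# Toy 2-D instance: quadratic attractive potential with minimum at the origin
# and one circular obstacle of radius 0.3 centered at (1, 0).

goal = (0.0, 0.0)
obs_center, obs_radius = (1.0, 0.0), 0.3
k = 4.0  # tuning exponent of the Rimon-Koditschek construction

def f0(x, y):
    # Natural attractive potential: minimum at the agent's destination.
    return (x - goal[0]) ** 2 + (y - goal[1]) ** 2

def beta(x, y):
    # Obstacle function: zero on the boundary, positive in free space.
    return (x - obs_center[0]) ** 2 + (y - obs_center[1]) ** 2 - obs_radius ** 2

def phi(x, y):
    # Artificial potential combining attraction and repulsion.
    f = f0(x, y)
    return f / (f ** k + beta(x, y)) ** (1.0 / k)

print(phi(0.0, 0.0))               # 0.0 at the goal
print(round(phi(1.3, 0.0), 6))     # 1.0 on the obstacle boundary
print(0.0 < phi(2.0, 0.4) < 1.0)   # True in free space
```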
Preprint
Full-text available
Reproducing kernel Hilbert spaces (RKHSs) are key elements of many non-parametric tools successfully used in signal processing, statistics, and machine learning. In this work, we aim to address three issues of the classical RKHS-based techniques. First, they require the RKHS to be known a priori, which is unrealistic in many applications. Furthermo...
Preprint
In this paper, we consider groups of agents in a network that select actions in order to satisfy a set of constraints that vary arbitrarily over time and minimize a time-varying function of which they have only local observations. The selection of actions, also called a strategy, is causal and decentralized, i.e., the dynamical system that determin...
Preprint
For complex real-world systems, designing controllers is a difficult task. With the advent of neural networks as proxies for complex function approximators, it has become popular to learn the controller directly. However, these controllers are specific to a given task and need to be relearned for a new task. Alternatively, one can learn just the m...
Article
Machine learning problems such as neural network training, tensor decomposition, and matrix factorization require local minimization of a nonconvex function. This local minimization is challenged by the presence of saddle points, of which there can be many and from which descent methods may take an inordinately large number of iterations to escape....
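A toy instance of the difficulty described above (our illustration): for f(x, y) = x^2 - y^2, gradient descent started on the x-axis converges to the saddle at the origin and never leaves it, while an arbitrarily small random perturbation lets the iterate escape along the negative-curvature direction.

```python
import random

def grad(x, y):
    # Gradient of f(x, y) = x**2 - y**2, which has a saddle point at the origin.
    return 2 * x, -2 * y

def descend(x, y, perturb, steps=200, lr=0.1):
    random.seed(0)  # reproducible perturbations
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y - lr * gy
        if perturb:
            y += 1e-3 * random.uniform(-1.0, 1.0)
    return x, y

_, y_plain = descend(1.0, 0.0, perturb=False)
_, y_pert = descend(1.0, 0.0, perturb=True)
print(y_plain == 0.0)     # True: plain descent is stuck at the saddle
print(abs(y_pert) > 1.0)  # True: the perturbed iterate escaped
```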
Preprint
Reinforcement learning consists of finding policies that maximize an expected cumulative long-term reward in a Markov decision process with unknown transition probabilities and instantaneous rewards. In this paper, we consider the problem of finding such optimal policies while assuming they are continuous functions belonging to a reproducing kernel...
Article
Full-text available
We consider multi-agent stochastic optimization problems over reproducing kernel Hilbert spaces (RKHS). In this setting, a network of interconnected agents aims to learn decision functions, i.e., nonlinear statistical models, that are optimal in terms of a global convex functional that aggregates data across the network, with only access to locally...
Article
Full-text available
Machine learning problems such as neural network training, tensor decomposition, and matrix factorization require local minimization of a nonconvex function. This local minimization is challenged by the presence of saddle points, of which there can be many and from which descent methods may take an inordinately large number of iterations to escape. T...
Article
Consider a convex set from which we remove an arbitrary number of disjoint convex sets -- the obstacles -- and a convex function whose minimum is the agent's goal. We consider a local and stochastic approximation of the gradient of a Rimon-Koditschek navigation function where the attractive potential is the convex function that the agent is minimi...
Article
Define an environment as a set of convex constraint functions that vary arbitrarily over time, and consider a cost function that is also convex and arbitrarily varying. Agents that operate in this environment intend to select actions that are feasible at all times while minimizing the cost's time average. Such an action is said to be optimal and can be comp...
Article
Full-text available
In this paper, we develop an interior-point method for solving a class of convex optimization problems with time-varying objective and constraint functions. Using log-barrier penalty functions, we propose a continuous-time dynamical system for tracking the (time-varying) optimal solution with an asymptotically vanishing error. This dynamical system...
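A discrete-time caricature of the idea above (our own construction with illustrative constants, not the paper's continuous-time dynamics): track the minimizer of f(x, t) = (x - sin t)^2 subject to x <= 0.5 by applying a few damped Newton corrections to the log-barrier surrogate at each sampling instant.

```python
import math

c = 100.0  # barrier weight: larger c tracks the constrained optimum more tightly

def newton_step(x, t):
    # One damped Newton step on F(x, t) = (x - sin t)**2 - (1/c)*log(0.5 - x).
    g = 2.0 * (x - math.sin(t)) + (1.0 / c) / (0.5 - x)  # dF/dx
    h = 2.0 + (1.0 / c) / (0.5 - x) ** 2                 # d2F/dx2 > 0
    d = -g / h
    while x + d >= 0.5:  # damp the step to stay in the barrier's domain
        d *= 0.5
    return x + d

x = 0.0
for i in range(200):
    t = 0.05 * i
    for _ in range(3):   # correction: a few Newton steps per sampling instant
        x = newton_step(x, t)
    assert x < 0.5       # the iterate remains strictly feasible throughout

# At the final sample the minimizer sin(t) is feasible, so the barrier
# optimum sits close to it and the iterate tracks it with small error.
print(abs(x - math.sin(0.05 * 199)) < 0.1)  # True
```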
Article
Full-text available
Given a convex potential in a space with convex obstacles, an artificial potential is used to navigate to the minimum of the natural potential while avoiding collisions. The artificial potential combines the given natural potential with potentials that repel the agent from the border of the obstacles. This is a popular approach to navigation proble...
Article
Full-text available
This paper considers a class of convex optimization problems where both the objective function and the constraints have a continuously varying dependence on time. Our goal is to develop an algorithm to track the optimal solution as it continuously changes over time inside or on the boundary of the dynamic feasible set. We develop an interior poin...
Article
An environment is defined as a set of constraint functions that vary arbitrarily over time. An agent wants to select feasible actions that keep all the constraints negative, but must do so causally, i.e., the dynamical system that determines actions is such that only their time derivatives can depend on the current constraints. An environment is sa...
Conference Paper
Full-text available
This paper describes the design and integration of the instrumentation and sensor fusion used to allow the autonomous flight of a quadrotor. A commercial frame is used, a mathematical model for the quadrotor is developed, and its parameters are determined from the characterization of the unit. A 9-degree-of-freedom Inertial Measurement Unit (IMU...
Conference Paper
This paper presents a fast and low-cost way to calibrate different inertial measurement sensors. In particular, the calibration of an accelerometer and a gyroscope using nonlinear least squares is presented. A model of the sensors which includes the main errors that MEMS devices present is used. A calibration method is proposed for estimati...
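To make the nonlinear least-squares idea concrete, here is a toy 2-axis accelerometer version (a sketch under our own assumptions: synthetic data, per-axis scale and bias only, numerical Jacobian; the paper's model covers more of the error terms MEMS devices present): in every static pose the calibrated vector must have norm g, which yields a nonlinear least-squares problem solvable by Gauss-Newton.

```python
import math

# Estimate per-axis scale (sx, sy) and bias (bx, by) from static measurements,
# using the fact that the calibrated vector (sx*(mx-bx), sy*(my-by)) must have
# norm G in every static pose. TRUE holds the illustrative ground truth used
# to synthesize the measurements.

G = 9.81
TRUE = {"sx": 1.05, "sy": 0.95, "bx": 0.20, "by": -0.10}

# Synthetic static measurements in 8 orientations (inverse sensor model).
poses = [i * math.pi / 4 for i in range(8)]
meas = [(G * math.cos(t) / TRUE["sx"] + TRUE["bx"],
         G * math.sin(t) / TRUE["sy"] + TRUE["by"]) for t in poses]

def residuals(p):
    sx, sy, bx, by = p
    return [math.hypot(sx * (mx - bx), sy * (my - by)) - G for mx, my in meas]

def jacobian(p, h=1e-6):
    # Numerical Jacobian of the residual vector (finite differences).
    r0 = residuals(p)
    J = []
    for j in range(4):
        q = list(p)
        q[j] += h
        J.append([(rj - r0i) / h for rj, r0i in zip(residuals(q), r0)])
    return J  # J[j][i] = d r_i / d p_j

def solve4(A, b):
    # Gaussian elimination with partial pivoting for the 4x4 normal equations.
    n = 4
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            M[r] = [u - f * v for u, v in zip(M[r], M[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

p = [1.0, 1.0, 0.0, 0.0]  # initial guess: unit scale, zero bias
for _ in range(20):       # Gauss-Newton: p <- p - (J^T J)^-1 J^T r
    r = residuals(p)
    J = jacobian(p)
    A = [[sum(J[i][k] * J[j][k] for k in range(len(r))) for j in range(4)] for i in range(4)]
    g = [sum(J[i][k] * r[k] for k in range(len(r))) for i in range(4)]
    p = [pi - di for pi, di in zip(p, solve4(A, g))]

print([round(v, 3) for v in p])  # approximately [1.05, 0.95, 0.2, -0.1]
```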