# Ufuk Topcu's research while affiliated with University of Texas System and other places

## Publications (464)

Article
Conic optimization is the minimization of a differentiable convex objective function subject to conic constraints. We propose a novel primal–dual first-order method for conic optimization, named proportional–integral projected gradient method (PIPG). PIPG ensures that both the primal–dual gap and the constraint violation converge to zero at the rat...
Preprint
In an inverse game problem, one needs to infer the cost function of the players in a game such that a desired joint strategy is a Nash equilibrium. We study the inverse game problem for a class of multiplayer games, where the players' cost functions are characterized by a matrix. We guarantee that the desired joint strategy is the unique Nash equil...
Preprint
Full-text available
We develop a learning-based algorithm for the distributed formation control of networked multi-agent systems governed by unknown, nonlinear dynamics. Most existing algorithms either assume certain parametric forms for the unknown dynamic terms or resort to unnecessarily large control inputs in order to provide theoretical guarantees. The proposed a...
Preprint
We propose a method to attack controllers that rely on external timeseries forecasts as task parameters. An adversary can manipulate the costs, states, and actions of the controllers by forging the timeseries, in this case perturbing the real timeseries. Since the controllers often encode safety requirements or energy limits in their costs and cons...
Preprint
Full-text available
Interactions among multiple self-interested agents may not necessarily yield socially desirable behaviors. While static games offer a pragmatic model for such interactions, and modifying the utilities of the agents in such games provides a means toward achieving socially desirable behavior, manipulating the utilities is hard-if not impossible-after...
Conference Paper
Neural ordinary differential equations (NODEs) -- parametrizations of differential equations using neural networks -- have shown tremendous promise in learning models of unknown continuous-time dynamical systems from data. However, every forward evaluation of a NODE requires numerical integration of the neural network used to capture the system dyn...
Article
We study the design of autonomous agents that are capable of deceiving outside observers about their intentions while carrying out tasks in stochastic, complex environments. By modeling the agent's behavior as a Markov decision process, we consider a setting where the agent aims to reach one of multiple potential goals while deceiving outside obser...
Preprint
We study the problem of data-driven, constrained control of unknown nonlinear dynamics from a single ongoing and finite-horizon trajectory. We consider a one-step optimal control problem with a smooth, black-box objective, typically a composition of a known cost function and the unknown dynamics. We investigate an on-the-fly control paradigm, i.e.,...
Conference Paper
Full-text available
In this research, we have identified and surveyed three categories of hazards for advanced air mobility (AAM): (i) adverse weather with a special focus on winds, (ii) eVTOL vehicle and component level faults/degradation, and (iii) AAM corridor incursion by non-cooperative aircraft. While these categories of hazards may be independent of one another...
Preprint
We consider a team of autonomous agents that navigate in an adversarial environment and aim to achieve a task by allocating their resources over a set of target locations. The adversaries in the environment observe the autonomous team's behavior to infer their objective and counter-allocate their own resources to the target locations. In this setti...
Article
We consider a mobile multi-agent network in which the agents locate themselves in an environment through imperfect measurements and aim to transmit a message signal to a far-field base station via collaborative beamforming. The agents imperfect measurements yield localization errors that degrade the quality of service at the base station due to unk...
Article
Full-text available
Contact-based decision and planning methods are becoming increasingly important to endow higher levels of autonomy for legged robots. Formal synthesis methods derived from symbolic systems have great potential for reasoning about high-level locomotion decisions and achieving complex maneuvering behaviors with correctness guarantees. This study take...
Preprint
We study the privacy risks that are associated with training a neural network's weights with self-supervised learning algorithms. Through empirical evidence, we show that the fine-tuning stage, in which the network weights are updated with an informative and often private dataset, is vulnerable to privacy attacks. To address the vulnerabilities, we...
Article
Full-text available
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Preprint
Full-text available
We study the problem of autonomous racing amongst teams composed of cooperative agents subject to realistic safety and fairness rules. We develop a hierarchical controller to solve this problem consisting of two levels, extending prior work where bi-level hierarchical control is applied to head-to-head autonomous racing. A high-level planner constr...
Preprint
Full-text available
We study the problem of reinforcement learning for a task encoded by a reward machine. The task is defined over a set of properties in the environment, called atomic propositions, and represented by Boolean variables. One unrealistic assumption commonly used in the literature is that the truth values of these propositions are accurately known. In r...
Article
Full-text available
We address the problem of inferring descriptions of system behavior using temporal logic from a finite set of positive and negative examples. In this paper, we consider two formalisms of temporal logic that describe linear time properties: Linear Temporal Logic over finite horizon (LTLf\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{...
Preprint
Full-text available
Reinforcement learning (RL) in safety-critical environments requires an agent to avoid decisions with catastrophic consequences. Various approaches addressing the safety of RL exist to mitigate this problem. In particular, so-called shields provide formal safety guarantees on the behavior of RL agents based on (partial) models of the agents' enviro...
Preprint
Autonomous systems are often deployed in complex sociotechnical environments, such as public roads, where they must behave safely and securely. Unlike many traditionally engineered systems, autonomous systems are expected to behave predictably in varying "open world" environmental contexts that cannot be fully specified formally. As a result, assur...
Preprint
Full-text available
Urban Air Mobility is a concept that promotes aerial modes of transport in urban areas. In these areas, the location and capacity of the vertiports--where the travelers embark and disembark the aircraft--not only affect the flight delays of the aircraft, but can also aggravate the congestion of ground vehicles by creating extra ground travel demand...
Article
Full-text available
We study the problem of synthesizing lockdown policies—schedules of maximum capacities for different types of activity sites—to minimize the number of deceased individuals due to a pandemic within a given metropolitan statistical area (MSA) while controlling the severity of the imposed lockdown. To synthesize and evaluate lockdown policies, we deve...
Preprint
Full-text available
Conic optimization is the minimization of a convex quadratic function subject to conic constraints. We introduce a novel first-order method for conic optimization, named \emph{extrapolated proportional-integral projected gradient method (exPIPG)}, that automatically detects infeasibility. The iterates of exPIPG either asymptotically satisfy a set o...
Preprint
Full-text available
Consider a robot operating in an uncertain environment with stochastic, dynamic obstacles. Despite the clear benefits for trajectory optimization, it is often hard to keep track of each obstacle at every time step due to sensing and hardware limitations. We introduce the Safely motion planner, a receding-horizon control framework, that simultaneous...
Preprint
Autonomous systems require the management of several model views to assure properties such as safety and security among others. A crucial issue in autonomous systems design assurance is the notion of emergent behavior; we cannot use their parts in isolation to examine their overall behavior or performance. Compositional verification attempts to com...
Article
We study the problem of synthesizing a controller for an agent with imperfect sensing and a quantitative surveillance objective, that is, an agent is required to maintain knowledge of the location of a moving, possibly adversarial target. We formulate the problem as a one-sided partial-information game with a winning condition expressed as a tempor...
Preprint
Full-text available
We develop a hierarchical controller for multi-agent autonomous racing. A high-level planner approximates the race as a discrete game with simplified dynamics that encodes the complex safety and fairness rules seen in real-life racing and calculates a series of target waypoints. The low-level controller takes the resulting waypoints as a reference...
Preprint
In a Stackelberg game, a leader commits to a randomized strategy, and a follower chooses their best strategy in response. We consider an extension of a standard Stackelberg game, called a discrete-time dynamic Stackelberg game, that has an underlying state space that affects the leader's rewards and available strategies and evolves in a Markovian m...
Preprint
Full-text available
Transformers have made remarkable progress towards modeling long-range dependencies within the medical image analysis domain. However, current transformer-based models suffer from several disadvantages: 1) existing methods fail to capture the important features of the images due to the naive tokenization scheme; 2) the models suffer from informatio...
Preprint
In a cooperative multiagent system, a collection of agents executes a joint policy in order to achieve some common objective. The successful deployment of such systems hinges on the availability of reliable inter-agent communication. However, many sources of potential disruption to communication exist in practice, such as radio interference, hardwa...
Preprint
Full-text available
Neural ordinary differential equations (NODEs) -- parametrizations of differential equations using neural networks -- have shown tremendous promise in learning models of unknown continuous-time dynamical systems from data. However, every forward evaluation of a NODE requires numerical integration of the neural network used to capture the system dyn...
Preprint
We consider parametric Markov decision processes (pMDPs) that are augmented with unknown probability distributions over parameter values. The problem is to compute the probability to satisfy a temporal logic specification within any concrete MDP that corresponds to a sample from these distributions. As this problem is infeasible to solve precisely,...
Preprint
Full-text available
We study the detection problem for a finite set of Markov decision processes (MDPs) where the MDPs have the same state and action spaces but possibly different probabilistic transition functions. Any one of these MDPs could be the model for some underlying controlled stochastic process, but it is unknown a priori which MDP is the ground truth. We i...
Article
Probabilistic model checking aims to prove whether a Markov decision process (MDP) satisfies a temporal logic specification. The underlying methods rely on an often unrealistic assumption that the MDP is precisely known. Consequently, parametric MDPs (pMDPs) extend MDPs with transition probabilities that are functions over unspecified parameters. T...
Preprint
Shared autonomy provides a framework where a human and an automated system, such as a robot, jointly control the system's behavior, enabling an effective solution for various applications, including human-robot interaction. However, a challenging problem in shared autonomy is safety because the human input may be unknown and unpredictable, which af...
Article
Partially observable Markov decision processes (POMDPs) are models for sequential decision-making under uncertainty and incomplete information. Machine learning methods typically train recurrent neural networks (RNN) as effective representations of POMDP policies that can efficiently process sequential data. However, it is hard to verify whether th...
Article
In many scenarios, a principal dynamically interacts with an agent and offers a sequence of incentives to align the agent's behavior with a desired objective. This paper focuses on the problem of synthesizing an incentive sequence that, once offered, induces the desired agent behavior even when the agent's intrinsic motivation is unknown to the pri...
Chapter
Consumption Markov Decision Processes (CMDPs) are probabilistic decision-making models of resource-constrained systems. We introduce FiMDP, a tool for controller synthesis in CMDPs with LTL objectives expressible by deterministic Büchi automata. The tool implements the recent algorithm for polynomial-time controller synthesis in CMDPs, but extends...
Preprint
Full-text available
We develop a learning-based algorithm for the distributed formation control of networked multi-agent systems governed by unknown, nonlinear dynamics. Most existing algorithms either assume certain parametric forms for the unknown dynamic terms or resort to unnecessarily large control inputs in order to provide theoretical guarantees. The proposed a...
Preprint
Full-text available
We study the privacy implications of deploying recurrent neural networks in machine learning. We consider membership inference attacks (MIAs) in which an attacker aims to infer whether a given data record has been used in the training of a learning agent. Using existing MIAs that target feed-forward neural networks, we empirically demonstrate that...
Chapter
We address the problem of inferring descriptions of system behavior using Linear Temporal Logic (LTL) from a finite set of positive and negative examples. Most of the existing approaches for solving such a task rely on predefined templates for guiding the structure of the inferred formula. The approaches that can infer arbitrary LTL formulas, on th...
Article
Reinforcement learning (RL) algorithms have been used to learn how to implement tasks in uncertain and partially unknown environments. In practice, environments are usually uncontrolled and may affect task performance in an adversarial way. In this paper, we model the interaction between an RL agent and its potentially adversarial environment as a...
Preprint
Full-text available
We study the problem of synthesizing implementations from temporal logic specifications that need to work correctly in all environments that can be represented as transducers with a limited number of states. This problem was originally defined and studied by Kupferman, Lustig, Vardi, and Yannakakis. They provide NP and 2-EXPTIME lower and upper bou...
Preprint
We study the design of autonomous agents that are capable of deceiving outside observers about their intentions while carrying out tasks in stochastic, complex environments. By modeling the agent's behavior as a Markov decision process, we consider a setting where the agent aims to reach one of multiple potential goals while deceiving outside obser...
Preprint
Full-text available
Effective inclusion of physics-based knowledge into deep neural network models of dynamical systems can greatly improve data efficiency and generalization. Such a-priori knowledge might arise from physical principles (e.g., conservation laws) or from the system's design (e.g., the Jacobian matrix of a robot), even if large portions of the system dy...
Preprint
Full-text available
Although perception is an increasingly dominant portion of the overall computational cost for autonomous systems, only a fraction of the information perceived is likely to be relevant to the current task. To alleviate these perception costs, we develop a novel simultaneous perception-action design framework wherein an agent senses only the task-rel...
Preprint
A constrained optimization problem is primal infeasible if its constraints cannot be satisfied, and dual infeasible if the constraints of its dual problem cannot be satisfied. We propose a novel iterative method, named proportional-integral projected gradient method (PIPG), for detecting primal and dual infeasiblity in convex optimization with quad...
Preprint
Full-text available
Conic optimization is the minimization of a differentiable convex objective function subject to conic constraints. We propose a novel primal-dual first-order method for conic optimization, named proportional-integral projected gradient method (PIPG). PIPG ensures that both the primal-dual gap and the constraint violation converge to zero at the rat...
Conference Paper
Full-text available
Shared autonomy provides a framework where a human and an automated system, such as a robot, jointly control the system's behavior, enabling an effective solution for various applications, including human-robot interaction. However, a challenging problem in shared autonomy is safety because the human input may be unknown and unpredictable, which af...
Chapter
Despite the fact that deep reinforcement learning (RL) has surpassed human-level performances in various tasks, it still has several fundamental challenges. First, most RL methods require intensive data from the exploration of the environment to achieve satisfactory performance. Second, the use of neural networks in RL renders it hard to interpret...
Preprint
Full-text available
We explore methodologies to improve the robustness of generative adversarial imitation learning (GAIL) algorithms to observation noise. Towards this objective, we study the effect of local Lipschitzness of the discriminator and the generator on the robustness of policies learned by GAIL. In many robotics applications, the learned policies by GAIL t...
Preprint
Probabilistic model checking aims to prove whether a Markov decision process (MDP) satisfies a temporal logic specification. The underlying methods rely on an often unrealistic assumption that the MDP is precisely known. Consequently, parametric MDPs (pMDPs) extend MDPs with transition probabilities that are functions over unspecified parameters. T...
Preprint
Full-text available
We develop a probabilistic control algorithm, $\texttt{GTLProCo}$, for swarms of agents with heterogeneous dynamics and objectives, subject to high-level task specifications. The resulting algorithm not only achieves decentralized control of the swarm but also significantly improves scalability over state-of-the-art existing algorithms. Specificall...
Preprint
Full-text available
We develop a learning-based algorithm for the control of robotic systems governed by unknown, nonlinear dynamics to satisfy tasks expressed as signal temporal logic specifications. Most existing algorithms either assume certain parametric forms for the dynamic terms or resort to unnecessarily large control inputs (e.g., using reciprocal functions)...
Preprint
Full-text available
We develop a learning-based control algorithm for unknown dynamical systems under very severe data limitations. Specifically, the algorithm has access to streaming data only from a single and ongoing trial. Despite the scarcity of data, we show -- through a series of examples -- that the algorithm can provide performance comparable to reinforcement...
Preprint
Full-text available
We study privacy-utility trade-offs where users share privacy-correlated useful information with a service provider to obtain some utility. The service provider is adversarial in the sense that it can infer the users' private information based on the shared useful information. To minimize the privacy leakage while maintaining a desired level of uti...
Preprint
Full-text available
Geometric median (Gm) is a classical method in statistics for achieving a robust estimation of the uncorrupted data; under gross corruption, it achieves the optimal breakdown point of 0.5. However, its computational complexity makes it infeasible for robustifying stochastic gradient descent (SGD) for high-dimensional optimization problems. In this...
Preprint
We propose a novel framework for verifiable and compositional reinforcement learning (RL) in which a collection of RL sub-systems, each of which learns to accomplish a separate sub-task, are composed to achieve an overall task. The framework consists of a high-level model, represented as a parametric Markov decision process (pMDP) which is used to...