Gaurav S. Sukhatme
University of Southern California | USC · Department of Computer Science
PhD
391 Publications
55,733 Reads
14,655 Citations
Publications (391)
With the increasing availability of open-source robotic data, imitation learning has emerged as a viable approach for both robot manipulation and locomotion. Currently, large generalized policies are trained to predict controls or trajectories using diffusion models, which have the desirable property of learning multimodal action distributions. How...
Generalizing vision-based reinforcement learning (RL) agents to novel environments remains a difficult and open challenge. Current trends are to collect large-scale datasets or use data augmentation techniques to prevent overfitting and improve downstream generalization. However, the computational and data collection costs increase exponentially wi...
Placement of electromagnetic signal emitting devices, such as light sources, has important usage for signal coverage tasks. Automatic placement of these devices is challenging because of the complex interaction of the signal and environment due to reflection, refraction and scattering. In this work, we iteratively improve the placement of these dev...
Multi-robot collaboration for target tracking presents significant challenges in hazardous environments, including addressing robot failures, dynamic priority changes, and other unpredictable factors. Moreover, these challenges are increased in adversarial settings if the environment is unknown. In this paper, we propose a resilient and adaptive fr...
Robotic assembly for high-mixture settings requires adaptivity to diverse parts and poses, which is an open challenge. Meanwhile, in other areas of robotics, large models and sim-to-real have led to tremendous progress. Inspired by such work, we present AutoMate, a learning framework and system that consists of 4 parts: 1) a dataset of 100 assembli...
Quality Diversity (QD) has shown great success in discovering high-performing, diverse policies for robot skill learning. While current benchmarks have led to the development of powerful QD methods, we argue that new paradigms must be developed to facilitate open-ended search and generalizability. In particular, many methods focus on learning diver...
Bimanual manipulation is critical to many robotics applications. In contrast to single-arm manipulation, bimanual manipulation tasks are challenging due to higher-dimensional action spaces. Prior works leverage large amounts of data and primitive actions to address this problem, but may suffer from sample inefficiency and limited generalization acr...
We consider a new variant of the multi-robot task allocation problem: Inverse Risk-sensitive Multi-Robot Task Allocation (IR-MRTA). "Forward" MRTA, the process of deciding which robot should perform a task given the reward (cost)-related parameters, is widely studied in the multi-robot literature. In this setting, the reward (cost)-related parame...
When is heterogeneity in the composition of an autonomous robotic team beneficial and when is it detrimental? We investigate and answer this question in the context of a minimally viable model that examines the role of heterogeneous speeds in perimeter defense problems, where defenders share a total allocated speed budget. We consider two distinct...
We consider the setting where a team of robots is tasked with tracking multiple targets with the following property: approaching the targets enables more accurate target position estimation, but also increases the risk of sensor failures. Therefore, it is essential to address the trade-off between tracking quality maximization and risk minimization...
Collision-free navigation in cluttered environments with static and dynamic obstacles is essential for many multi-robot tasks. Dynamic obstacles may also be interactive, i.e., their behavior varies based on the behavior of other entities. We propose a novel representation for interactive behavior of dynamic obstacles and a decentralized real-time m...
In this work we present CppFlow, a novel and performant planner for the Cartesian Path Planning problem, which finds valid trajectories up to 129x faster than current methods, while also succeeding on more difficult problems where others fail. At the core of the proposed algorithm is the use of a learned, generative Inverse Kinematics solver, which...
Some Learning from Demonstrations (LfD) methods handle small mismatches in the action spaces of the teacher and student. Here we address the case where the teacher’s morphology is substantially different from that of the student. Our framework, Morphological Adaptation in Imitation Learning (MAIL), bridges this gap allowing us to train an agent fro...
Complex manipulation tasks often require robots with complementary capabilities to collaborate. We introduce a benchmark for LanguagE-Conditioned Multi-robot MAnipulation (LEMMA) focused on task allocation and long-horizon object manipulation based on human language instructions in a tabletop setting. LEMMA features 8 types of procedura...
Multi-robot collision-free and deadlock-free navigation in cluttered environments with static and dynamic obstacles is a fundamental problem for many applications. We introduce MRNAV, a framework for planning and control to effectively navigate in such environments. Our design utilizes short, medium, and long horizon decision making modules with qu...
The Alexa Prize program has empowered numerous university students to explore, experiment, and showcase their talents in building conversational agents through challenges like the SocialBot Grand Challenge and the TaskBot Challenge. As conversational agents increasingly appear in multimodal and embodied contexts, it is important to explore the affo...
Offline reinforcement learning (RL) methods strike a balance between exploration and exploitation by conservative value estimation, i.e., penalizing values of unseen states and actions. Model-free methods penalize values at all unseen actions, while model-based methods are able to further exploit unseen states via model rollouts. However, such methods...
Complex manipulation tasks often require robots with complementary capabilities to collaborate. We introduce a benchmark for LanguagE-Conditioned Multi-robot MAnipulation (LEMMA) focused on task allocation and long-horizon object manipulation based on human language instructions in a tabletop setting. LEMMA features 8 types of procedurally generate...
Reinforcement learning (RL) has shown promise in creating robust policies for robotics tasks. However, contemporary RL algorithms are data-hungry, often requiring billions of environment transitions to train successful policies. This necessitates the use of fast and highly-parallelizable simulators. In addition to speed, such simulators need to mod...
Granular materials are of critical interest to many robotic tasks in planetary science, construction, and manufacturing. However, the dynamics of granular materials are complex and often computationally very expensive to simulate. We propose a set of methodologies and a system for the fast simulation of granular materials on Graphics Processing Uni...
Recent progress in Quality Diversity Reinforcement Learning (QD-RL) has enabled learning a collection of behaviorally diverse, high performing policies. However, these methods typically involve storing thousands of policies, which results in high space-complexity and poor scaling to additional behaviors. Condensing the archive into a single model w...
Robotic assembly is a longstanding challenge, requiring contact-rich interaction and high precision and accuracy. Many applications also require adaptivity to diverse parts, poses, and environments, as well as low cycle times. In other areas of robotics, simulation is a powerful tool to develop algorithms, generate datasets, and train agents. Howev...
Training generally capable agents that perform well in unseen dynamic environments is a long-term goal of robot learning. Quality Diversity Reinforcement Learning (QD-RL) is an emerging class of reinforcement learning (RL) algorithms that blend insights from Quality Diversity (QD) and RL to produce a collection of high performing and behaviorally d...
We are motivated by quantile estimation of algae concentration in lakes. We find that multirobot teams improve performance in this task over single robots, and communication-enabled teams further over communication-deprived teams; however, real robots are resource-constrained, and communication networks cannot support arbitrary message loads, makin...
We consider the problem of identifying material parameters of a deformable object, such as elastic moduli, by non-destructive robotic manipulation. We assume known geometry and mass, a reliable fixed grasp, and the ability to track the positions of a few points on the object surface. We collect a dataset of grasp pose sequences and corresponding po...
When robots are deployed in the field for environmental monitoring they typically execute pre-programmed motions, such as lawnmower paths, instead of adaptive methods, such as informative path planning. One reason for this is that adaptive methods are dependent on parameter choices that are both critical to set correctly and difficult for the non-s...
Quantiles of a natural phenomenon can provide scientists with an important understanding of typical, extreme, or other spreads of concentrations. When a group has several available robots, or teams of scientists come together to study a particular environment, it may be advantageous to pool robot resources in a collaborative way to improve performan...
We introduce Alexa Arena, a user-centric simulation platform for Embodied AI (EAI) research. Alexa Arena provides a variety of multi-room layouts and interactable objects, for the creation of human-robot interaction (HRI) missions. With user-friendly graphics and control mechanisms, Alexa Arena supports the development of gamified robotic tasks rea...
For service robots to become general-purpose in everyday household environments, they need not only a large library of primitive skills, but also the ability to quickly learn novel tasks specified by users. Fine-tuning neural networks on a variety of downstream tasks has been successful in many vision and language domains, but research is still lim...
We address the problem of ensuring resource availability in a networked multi-robot system performing distributed target tracking. Specifically, we consider a multi-target tracking scenario where the targets are driven by exogenous inputs that are unknown to the robots performing the tracking task. Robots track the positions of targets using a form...
Robots need to be able to adapt to unexpected changes in the environment such that they can autonomously succeed in their tasks. However, hand-designing feedback models for adaptation is tedious, if at all possible, making data-driven methods a promising alternative. In this paper, we introduce a full framework for learning feedback models for reac...
We present a novel Learning from Demonstration (LfD) method, Deformable Manipulation from Demonstrations (DMfD), to solve deformable manipulation tasks using states or images as inputs, given expert demonstrations. Our method uses demonstrations in three different ways, and balances the trade-off between exploring the environment online and using g...
Multi-robot SLAM systems in GPS-denied environments require loop closures to maintain a drift-free centralized map. With an increasing number of robots and size of the environment, checking and computing the transformation for all the loop closure candidates becomes computationally infeasible. In this work, we describe a loop closure module that is...
Scientists interested in studying natural phenomena often take physical specimens from locations in the environment for later analysis. These analysis locations are typically specified by expert heuristics. Instead, we propose to choose locations for scientific analysis by using a robot to perform an informative path planning survey. The survey res...
Public outdoor surveillance cameras often have limited metadata describing their properties. Frequently, a public camera’s precise position, orientation, focal length, and image center are unknown; these attributes are necessary to precisely pinpoint the location of events seen in the camera. In this article, we ask: what is the minimal information...
We consider a scenario where a team of robots with heterogeneous sensors must track a set of targets or hazards which may induce sensory failures on the robots. In particular, the likelihood of failures depends on the proximity between the targets and the robots. We propose a control framework that explicitly addresses the competing objectives of t...
Being able to reproduce physical phenomena ranging from light interaction to contact mechanics, simulators are becoming increasingly useful in more and more application domains where real-world interaction or labeled data are difficult to obtain. Despite recent progress, significant human effort is needed to configure simulators to accurately repro...
Scientists interested in studying natural phenomena often take physical samples for later analysis at locations specified by expert heuristics. Instead, we propose to guide scientists' physical sampling by using a robot to perform an adaptive sampling survey to find locations to suggest that correspond to the quantile values of pre-specified quanti...
Learning complex manipulation tasks in realistic, obstructed environments is a challenging problem due to hard exploration in the presence of obstacles and high-dimensional visual observations. Prior work tackles the exploration problem by integrating motion planning and reinforcement learning. However, the motion planner augmented policy requires...
Machine learning has long since become a keystone technology, accelerating science and applications in a broad range of domains. Consequently, the notion of applying learning methods to a particular problem set has become an established and valuable modus operandi to advance a particular field. In this article we argue that such an approach does no...
In order to be effective general purpose machines in real world environments, robots not only will need to adapt their existing manipulation skills to new circumstances, they will need to acquire entirely new skills on-the-fly. A great promise of continual learning is to endow robots with this ability, by using their accumulated knowledge and exper...
We investigate improving Monte Carlo Tree Search based solvers for Partially Observable Markov Decision Processes (POMDPs), when applied to adaptive sampling problems. We propose improvements in rollout allocation, the action exploration algorithm, and plan commitment. The first allocates a different number of rollouts depending on how many actions...
To accurately reproduce measurements from the real world, simulators need to have an adequate model of the physical system and require the parameters of the model be identified. We address the latter problem of estimating parameters through a Bayesian inference approach that approximates a posterior distribution over simulation parameters given rea...
We consider a scenario where a team of robots with heterogeneous sensors must track a set of hostile targets which induce sensory failures on the robots. In particular, the likelihood of failures depends on the proximity between the targets and the robots. We propose a control framework that implicitly addresses the competing objectives of performa...
Robots are used for collecting samples from natural environments to create models of, for example, temperature or algae fields in the ocean. Adaptive informative sampling is a proven technique for this kind of spatial field modeling. This paper compares the performance of humans versus adaptive informative sampling algorithms for selecting informat...
We propose the α-suboptimal covering number to characterize multi-task control problems where the set of dynamical systems and/or cost functions is infinite, analogous to the cardinality of finite task sets. This notion may help quantify the function class expressiveness needed to represent a good multi-task policy, which is important for le...
Adaptive sampling is important in robotic environmental monitoring, allowing a robot to intelligently select sampling locations to build an informative model of a phenomenon of interest. Most adaptive sampling techniques assume the localization noise does not vary with location, or that this variation is negligible, and thus do not model this behav...
Differentiable simulators provide an avenue for closing the sim-to-real gap by enabling the use of efficient, gradient-based optimization algorithms to find the simulation parameters that best fit the observed sensor readings. Nonetheless, these analytical models can only predict the dynamical behavior of systems for which they have been designed....
Deep reinforcement learning (RL) agents are able to learn contact-rich manipulation tasks by maximizing a reward signal, but require large amounts of experience, especially in environments with many obstacles that complicate exploration. In contrast, motion planners use explicit models of the agent and environment to plan collision-free paths to fa...
Constrained robot motion planning is a widely used technique to solve complex robot tasks. We consider the problem of learning representations of constraints from demonstrations with a deep neural network, which we call Equality Constraint Manifold Neural Network (ECoMaNN). The key idea is to learn a level-set function of the constraint suitable fo...
We present a strategy for simulation-to-real transfer, which builds on recent advances in robot skill decomposition. Rather than focusing on minimizing the simulation–reality gap, we propose a method for increasing the sample efficiency and robustness of existing simulation-to-real approaches which exploits hierarchy and online adaptation. Instead...
We propose a framework for resilience in a networked heterogeneous multi-robot team subject to resource failures. Each robot in the team is equipped with resources that it shares with its neighbors. Additionally, each robot in the team executes a task, whose performance depends on the resources to which it has access. When a resource on a particula...
We present a differentiable simulation architecture for articulated rigid-body dynamics that enables the augmentation of analytical models with neural networks at any point of the computation. Through gradient-based optimization, identification of the simulation parameters and network weights is performed efficiently in preliminary experiments on a...
Robots need to be able to adapt to unexpected changes in the environment such that they can autonomously succeed in their tasks. However, hand-designing feedback models for adaptation is tedious, if at all possible, making data-driven methods a promising alternative. In this paper we introduce a full framework for learning feedback models for react...
Representing the environment is a fundamental task in enabling robots to act autonomously in unknown environments. In this work, we present confidence-rich mapping (CRM), a new algorithm for spatial grid-based mapping of the 3D environment. CRM augments the occupancy level at each voxel by its confidence value. By explicitly storing and evolving co...
Motion planning with constraints is an important part of many real-world robotic systems. In this work, we study manifold learning methods to learn such constraints from data. We explore two methods for learning implicit constraint manifolds from data: Variational Autoencoders (VAE), and a new method, Equality Constraint Manifold Neural Network (EC...
We address the problem of planning robot motions in constrained configuration spaces where the constraints change throughout the motion. A novel problem formulation is introduced that describes a task as a sequence of intersecting manifolds, which the robot needs to traverse in order to solve the task. We specify a class of sequential motion planni...