
Guillaume Sartoretti- PhD. in Robotics
- Professor (Assistant) at National University of Singapore
Guillaume Sartoretti
- PhD. in Robotics
- Professor (Assistant) at National University of Singapore
About
102
Publications
17,318
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
1,239
Citations
Introduction
Hello! My name is Guillaume Sartoretti, and I a new Assistant Professor in the Mechanical Engineering Department at the National University of Singapore (NUS). There, I am the director of the Multi-Agent Robotic Motion (MARMot) Lab. My research interest lies in the distributed/decentralized coordination of numerous agents, at the interface between conventional control and artificial intelligence. Applications of my work range from multi-robot systems, where independent robots need to coordinate their actions to achieve a common goal, to high-DoF articulated robots, where joints need to be carefully coupled during locomotion in rough terrain.
Current institution
Additional affiliations
June 2016 - June 2019
April 2012 - March 2016
September 2010 - March 2012
Education
April 2012 - April 2016
September 2010 - February 2012
September 2007 - June 2010
Publications
Publications (102)
In this paper, we present a distributed control strategy, enabling agents to converge onto and travel along a consensually selected curve among a class of closed planar curves. Individual agents identify the number of neighbors within a finite circular sensing range and obtain information from their neighbors through local communication. The inform...
Inspired by recent advances in single agent reinforcement learning, this paper extends the single-agent asynchronous advantage actor-critic (A3C) algorithm to enable multiple agents to learn a homogeneous, distributed policy, where agents work together toward a common goal without explicitly interacting. Our approach relies on centralized policy an...
Multi-agent path finding (MAPF) is an essential component of many large-scale, real-world robot deployments, from aerial swarms to warehouse automation. However, despite the community's continued efforts, most state-of-the-art MAPF planners still rely on centralized planning and scale poorly past a few hundred agents. Such planning approaches are m...
State-of-the-art distributed algorithms for reinforcement learning rely on multiple independent agents, which simultaneously learn in parallel environments while asynchronously updating a common, shared policy. Moreover, decentralized control architectures (e.g., CPGs) can coordinate spatially distributed portions of an articulated robot to achieve...
Multi-Agent Path Finding (MAPF) is a critical component of logistics and warehouse management, which focuses on planning collision-free paths for a team of robots in a known environment. Recent work introduced a novel MAPF approach, LNS2, which proposed to repair a quickly-obtainable set of infeasible paths via iterative re-planning, by relying on...
Adaptive traffic signal control (ATSC) is crucial in reducing congestion, maximizing throughput, and improving mobility in rapidly growing urban areas. Recent advancements in parameter-sharing multi-agent reinforcement learning (MARL) have greatly enhanced the scalable and adaptive optimization of complex, dynamic flows in large-scale homogeneous n...
In autonomous exploration tasks, robots are required to explore and map unknown environments while efficiently planning in dynamic and uncertain conditions. Given the significant variability of environments, human operators often have specific preference requirements for exploration, such as prioritizing certain areas or optimizing for different as...
Many multi-robot applications require allocating a team of heterogeneous agents (robots) with different abilities to cooperatively complete a given set of spatially distributed tasks as quickly as possible. We focus on tasks that can only be initiated when all required agents are present otherwise arrived agents would be waiting idly. Agents need t...
In multi-robot exploration, a team of mobile robot is tasked with efficiently mapping an unknown environments. While most exploration planners assume omnidirectional sensors like LiDAR, this is impractical for small robots such as drones, where lightweight, directional sensors like cameras may be the only option due to payload constraints. These se...
Despite recent advances in learning-based controllers for legged robots, deployments in human-centric environments remain limited by safety concerns. Most of these approaches use position-based control, where policies output target joint angles that must be processed by a low-level controller (e.g., PD or impedance controllers) to compute joint tor...
The LLM-as-a-Judge paradigm shows promise for evaluating generative content but lacks reliability in reasoning-intensive scenarios, such as programming. Inspired by recent advances in reasoning models and shifts in scaling laws, we pioneer bringing test-time computation into LLM-as-a-Judge, proposing MCTS-Judge, a resource-efficient, System-2 think...
The Multi-Agent Path Finding (MAPF) problem aims to determine the shortest and collision-free paths for multiple agents in a known, potentially obstacle-ridden environment. It is the core challenge for robotic deployments in large-scale logistics and transportation. Decentralized learning-based approaches have shown great potential for addressing t...
Adaptive traffic signal control (ATSC) is crucial in alleviating congestion, maximizing throughput and promoting sustainable mobility in ever-expanding cities. Multi-Agent Reinforcement Learning (MARL) has recently shown significant potential in addressing complex traffic dynamics, but the intricacies of partial observability and coordination in de...
Lifelong Multi-Agent Path Finding (LMAPF) is a variant of MAPF where agents are continually assigned new goals, necessitating frequent re-planning to accommodate these dynamic changes. Recently, this field has embraced learning-based methods, which reactively generate single-step actions based on individual local observations. However, it is still...
Autonomous robot exploration requires a robot to efficiently explore and map unknown environments. Compared to conventional methods that can only optimize paths based on the current robot belief, learning-based methods show the potential to achieve improved performance by drawing on past experiences to reason about unknown areas. In this paper, we...
Information sharing is critical in time-sensitive and realistic multi-robot exploration, especially for smaller robotic teams in large-scale environments where connectivity may be sparse and intermittent. Existing methods often overlook such communication constraints by assuming unrealistic global connectivity. Other works account for communication...
In multi-agent reinforcement learning (MARL), achieving multi-task generalization to diverse agents and objectives presents significant challenges. Existing online MARL algorithms primarily focus on single-task performance, but their lack of multi-task generalization capabilities typically results in substantial computational waste and limited real...
In this paper, we introduce HDPlanner, a deep reinforcement learning (DRL) based framework designed to tackle two core and challenging tasks for mobile robots: autonomous exploration and navigation, where the robot must optimize its trajectory adaptively to achieve the task objective through continuous interactions in unknown environments. Specific...
The Multi-agent Path Finding (MAPF) problem involves finding collision-free paths for a team of agents in a known, static environment, with important applications in warehouse automation, logistics, or last-mile delivery. To meet the needs of these large-scale applications, current learning-based methods often deploy the same fully trained, decentr...
Communication bandwidth is an important consideration in multi-robot exploration, where information exchange among robots is critical. While existing methods typically aim to reduce communication throughput, they either require significant computation or significantly compromise exploration efficiency. In this work, we propose a deep reinforcement...
In recent years, the field of aerial robotics has witnessed significant progress, finding applications in diverse domains, including post-disaster search and rescue operations. Despite these strides, the prohibitive acquisition costs associated with deploying physical multi-UAV systems have posed challenges, impeding their widespread utilization in...
Multi-Agent Path Finding (MAPF) is a critical component of logistics and warehouse management, which focuses on planning collision-free paths for a team of robots in a known environment. Recent work introduced a novel MAPF approach, LNS2, which proposed to repair a quickly-obtainable set of infeasible paths via iterative re-planning, by relying on...
Existing deep reinforcement learning (DRL) methods for multi-objective vehicle routing problems (MOVRPs) always decompose an MOVRP into subproblems with respective preferences and then train policies to solve corresponding subproblems. However, such a paradigm is still less effective in tackling the intricate interactions among subproblems, thus ho...
In this work, we propose a deep reinforcement learning (DRL) based reactive planner to solve large-scale Lidar-based autonomous robot exploration problems in 2D action space. Our DRL-based planner allows the agent to reactively plan its exploration path by making implicit predictions about unknown areas, based on a learned estimation of the underly...
Existing neural heuristics for multiobjective vehicle routing problems (MOVRPs) are primarily conditioned on instance context, which failed to appropriately exploit preference and problem size, thus holding back the performance. To thoroughly unleash the potential, we propose a novel conditional neural heuristic (CNH) that fully leverages the insta...
The multiple traveling salesman problem (mTSP) is a well-known NP-hard problem with numerous real-world applications. In particular, this work addresses MinMax mTSP, where the objective is to minimize the max tour length among all agents. Many robotic deployments require recomputing potentially large mTSP instances frequently, making the natural tr...
This paper presents a novel, sparse sensing motion planning algorithm for autonomous mobile robots in resource limited coverage problems. Optimizing usage of limited resources while effectively exploring an area is vital in scenarios where sensing is expensive, has adverse effects, or is exhaustive. We approach this problem using ergodic search tec...
In this paper, we introduce HDPlanner, a deep reinforcement learning (DRL) based framework designed to tackle two core and challenging tasks for mobile robots: autonomous exploration and navigation, where the robot must optimize its trajectory adaptively to achieve the task objective through continuous interactions in unknown environments. Specific...
Communication in multi-agent systems is a key driver of team-level cooperation, for instance allowing individual agents to augment their knowledge about the world in partially-observable environments. In this paper, we propose two reinforcement learning-based multi-agent models, namely FCMNet and FCMTran. The two models both allow agents to simulta...
Legged robots can have a unique role in manipulating objects in dynamic, human-centric, or otherwise inaccessible environments. Although most legged robotics research to date typically focuses on traversing these challenging environments, many legged platform demonstrations have also included “moving an object” as a way of doing tangible work. Legg...
Many recent works have turned to multi-agent reinforcement learning (MARL) for adaptive traffic signal control to optimize the travel time of vehicles over large urban networks. However, achieving effective and scalable cooperation among junctions (agents) remains an open challenge, as existing methods often rely on extensive, non-generalizable rew...
Legged robots can have a unique role in manipulating objects in dynamic, human-centric, or otherwise inaccessible environments. Although most legged robotics research to date typically focuses on traversing these challenging environments, many legged platform demonstrations have also included "moving an object" as a way of doing tangible work. Legg...
This work focuses on the persistent monitoring problem, where a set of targets moving based on an unknown model must be monitored by an autonomous mobile robot with a limited sensing range. To keep each target's position estimate as accurate as possible, the robot needs to adaptively plan its path to (re-)visit all the targets and update its belief...
In multi-agent informative path planning (MAIPP), agents must collectively construct a global belief map of an underlying distribution of interest (e.g., gas concentration, light intensity, or pollution levels) over a given domain, based on measurements taken along their trajectory. They must frequently replan their path to balance the exploration...
Trading off performance guarantees in favor of scalability, the Multi-Agent Path Finding (MAPF) community has recently started to embrace Multi-Agent Reinforcement Learning (MARL), where agents learn to collaboratively generate individual, collision-free (but often suboptimal) paths. Scalability is usually achieved by assuming a local field of view...
Communication in multi-agent systems is a key driver of team-level cooperation, for instance allowing individual agents to augment their knowledge about the world in partially-observable environments. In this paper, we propose two reinforcement learning-based multi-agent models, namely FCMNet and FCMTran. The two models both allow agents to simulta...
In autonomous robot exploration tasks, a mobile robot needs to actively explore and map an unknown environment as fast as possible. Since the environment is being revealed during exploration, the robot needs to frequently re-plan its path online, as new information is acquired by onboard sensors and used to update its partial map. While state-of-th...
The aim of traffic signal control (TSC) is to optimize vehicle traffic in urban road networks, via the control of traffic lights at intersections.
From insects to larger mammals, legged animals can be seen easily traversing a wide variety of challenging environments, by carefully selecting, reaching, and exploiting high-quality contacts with the terrain. In contrast, existing robotic foothold planning methods remain computationally expensive, often relying on exhaustive search and/or (often o...
Purpose of Review
Recent advances in sensing, actuation, and computation have opened the door to multi-robot systems consisting of hundreds/thousands of robots, with promising applications to automated manufacturing, disaster relief, harvesting, last-mile delivery, port/airport operations, or search and rescue. The community has leveraged model-fre...
Serially connected robots are promising candidates for performing tasks in confined spaces such as search and rescue in large-scale disasters. Such robots are typically limbless, and we hypothesize that the addition of limbs could improve mobility. However, a challenge in designing and controlling such devices lies in the coordination of high-dimen...
Purpose of review: Recent advances in sensing, actuation, and computation have opened the door to multi-robot systems consisting of hundreds/thousands of robots, with promising applications to automated manufacturing, disaster relief, harvesting, last-mile delivery, port/airport operations, or search and rescue. The community has leveraged model-fr...
Multi-agent foraging (MAF) involves distributing a team of agents to search an environment and extract resources from it. Nature provides several examples of highly effective foragers, where individuals within the foraging collective use biological markers (e.g., pheromones) to communicate critical information to others via the environment. In this...
Decentralized cooperation in partially-observable multi-agent systems requires effective communications among agents. To support this effort, this work focuses on the class of problems where global communications are available but may be unreliable, thus precluding differentiable communication learning methods. We introduce FCMNet, a reinforcement...
This paper develops a multi-agent heterogeneous search approach that leverages the sensing and motion capabilities of different agents to improve search performance (i.e., decrease search time and increase coverage efficiency). To do so, we build upon recent results in ergodic coverage methods for homogeneous teams, where the search paths of the ag...
Deploying Unmanned Aerial Vehicles (UAVs) for traffic monitoring has been a hotspot given their flexibility and broader view. However, a UAV is usually constrained by battery capacity due to limited payload. On the other hand, the development of wireless charging technology has allowed UAVs to replenish energy from charging stations. In this paper,...
Serially connected robots are promising candidates for performing tasks in confined spaces such as search-and-rescue in large-scale disasters. Such robots are typically limbless, and we hypothesize that the addition of limbs could improve mobility. However, a challenge in designing and controlling such devices lies in the coordination of high-dimen...
The multiple traveling salesman problem (mTSP) is a well-known NP-hard problem with numerous real-world applications. In particular, this work addresses MinMax mTSP, where the objective is to minimize the max tour length (sum of Euclidean distances) among all agents. The mTSP is normally considered as a combinatorial optimization problem, but due t...
The Flatland competition aimed at finding novel approaches to solve the vehicle re-scheduling problem (VRSP). The VRSP is concerned with scheduling trips in traffic networks and the re-scheduling of vehicles when disruptions occur, for example the breakdown of a vehicle. While solving the VRSP in various settings has been an active area in operatio...
Multi-agent path finding (MAPF) is an indispensable component of large-scale robot deployments in numerous domains ranging from airport management to warehouse automation. In particular, this work addresses lifelong MAPF (LMAPF) – an online variant of the problem where agents are immediately assigned a new goal upon reaching their current one – in...
The Flatland competition aimed at finding novel approaches to solve the vehicle re-scheduling problem (VRSP). The VRSP is concerned with scheduling trips in traffic networks and the re-scheduling of vehicles when disruptions occur, for example the breakdown of a vehicle. While solving the VRSP in various settings has been an active area in operatio...
Many animals generate propulsive forces by coordinating legs, which contact and push against the surroundings, with bending of the body, which can only indirectly influence these forces. Such body–leg coordination is not commonly employed in quadrupedal robotic systems. To elucidate the role of back bending during quadrupedal locomotion, we study a...
Efficient automated scheduling of trains remains a major challenge for modern railway systems. The underlying vehicle rescheduling problem (VRSP) has been a major focus of Operations Research (OR) since decades. Traditional approaches use complex simulators to study VRSP, where experimenting with a broad range of novel ideas is time consuming and h...
Multi-agent path finding (MAPF) is an indispensable component of large-scale robot deployments in numerous domains ranging from airport management to warehouse automation. In particular, this work addresses lifelong MAPF (LMAPF) -- an online variant of the problem where agents are immediately assigned a new goal upon reaching their current one -- i...
Multi-agent foraging (MAF) involves distributing a team of agents to search an environment and extract resources from it. Many foraging algorithms use biologically-inspired signaling mechanisms, such as pheromones, to help agents navigate from resources back to a central nest while relying on local sensing only. However, these approaches often rely...
This work focuses on multi-agent reinforcement learning (RL) with inter-agent communication, in which communication is differentiable and optimized through backpropagation. Such differentiable approaches tend to converge more quickly to higher-quality policies compared to techniques that treat communication as actions in a traditional RL framework....
Decentralized multi-agent reinforcement learning has been demonstrated to be an effective solution to large multi-agent control problems. However, agents typically can only make decisions based on local information, resulting in suboptimal performance in partially-observable settings. The addition of a communication channel overcomes this limitatio...
This work focuses on multi-agent reinforcement learning (RL) with inter-agent communication, in which communication is differentiable and optimized through backpropagation. Such differentiable approaches tend to converge more quickly to higher-quality policies compared to techniques that treat communication as actions in a traditional RL framework....
State-of-the-art distributed algorithms for reinforcement learning rely on multiple independent agents, which simultaneously learn in parallel environments
<sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sup>
while asynchronously updating a common, shared policy. Moreover, decentralized control arch...
Motion planning for mobile robots with many degrees-of-freedom (DoF) is challenging due to their high-dimensional configuration spaces. To manage this curse of di-mensionality, this paper proposes a new hierarchical framework that decomposes the system into subsystems (based on shared capabilities of DoFs), for which we can design and coordinate mo...
In this paper, we focus on the problem of directing the gaze of a vision system mounted to the body of a high-degree-of-freedom (DOF) legged robot for active perception deployments. In particular, we consider the case where the vision system is rigidly attached to the robot's body (i.e., without any additional DOF between the vision system and robo...
This paper extends the state-of-the-art single-agent asynchronous advantage actor-critic (A3C) algorithm to enable multiple agents to learn a homogeneous, distributed policy, where agents work together toward a common goal without explicitly interacting. Our approach relies on centralized policy and critic learning, but decentralized policy executi...
Multi-agent path finding (MAPF) is an essential component of many large-scale, real-world robot deployments, from aerial swarms to warehouse automation to collaborative search-and-rescue. However, despite the community's continued efforts, most state-of-the-art MAPF algorithms still rely on centralized planning, and scale poorly past a few hundred...
Many quadrupedal animals have lateral degrees of freedom in their backs that assist locomotion. This paper seeks to use a robotic model to demonstrate that back bending assists not only forward motion, but also lateral and turning motions. This paper uses geometric mechanics to prescribe gaits that coordinate both leg movements and back bending mot...
Inspired by the ability of animals to rely on propri-oception and vestibular feedback to adapt their gait, we propose a modular framework for autonomous locomotion that relies on force sensing and inertial information. A first controller exploits anti-compliance, a new application of positive force feedback, to quickly react against obstacles upon...
Decentralized control architectures, such as those conventionally defined by central pattern generators, independently coordinate spatially distributed portions of articulated bodies to achieve system-level objectives. State of the art distributed algorithms for reinforcement learning employ a different but conceptually related idea; independent ag...
Inspired by the locomotor nervous system of vertebrates, central pattern generator (CPG) models can be used to design gaits for articulated robots, such as crawling, swimming or legged robots. Incorporating sensory feedback for gait adaptation in these models can improve the locomotive performance of such robots in challenging terrain. However, mos...
We present a distributed control mechanism allowing a swarm of non-holonomic autonomous surface vehicles (ASVs) to synchronously arrange around a rectangular floating object in a grasping formation; the swarm is then able to collaboratively transport the object to a desired final position and orientation. We analytically consider the problem of syn...
The collective dynamic behavior of large groups of interacting autonomous agents (swarms) have inspired much research in both fundamental and engineering sciences.
It is now widely acknowledged that the intrinsic nonlinearities due to mutual interactions can generate highly collective spatio-temporal patterns.
Moreover, the resulting self-organized...
We consider the dynamics of swarms of scalar Brownian agents subject to local imitation mechanisms implemented using mutual rank-based interactions. For appropriate values of the underlying control parameters, the swarm propagates tightly and the distances separating successive agents are iid exponential random variables. Implicitly, the implementa...
We focus on the control of heterogeneous swarms of agents that evolve in a random environment. Control is achieved by introducing special agents: leader and infiltrated (shill) agents. A refined distinction is made between hidden and apparent controlling agents. For each case, we provide an analytically solvable example of swarm dynamics.
This contribution is addressed to the dynamics of heterogeneous interacting agents evolving on the plane. Heterogeneity is due to the presence of an unfiltered externally controllable fellow, a shill, which via mutual interactions ultimately drives (i.e. soft controls) the whole society towards a given goal. We are able to calculate relevant dynami...
We propose a decentralized method for traffic monitoring, fully distributed over the vehicles. An algorithm is provided, specifying which information should be tracked to reconstruct an instantaneous map of traffic flow. We test the accuracy of our method in a simple cellular automata traffic simulation model, for which the traffic condition can be...
We consider a collection of N homogeneous interacting Brownian agents evolving on the plane. The time continuous individual dynamics are jointly driven by mixed canonical-dissipative (MCD) type dynamics and White Gaussian noise sources. Each agent is permanently at the center of a finite size observation disk D ρ. Steadily with time, agents count t...