Guillaume Sartoretti

Verified
Guillaume verified their affiliation via an institutional email.
  • PhD in Robotics
  • Professor (Assistant) at National University of Singapore

About

102
Publications
17,318
Reads
1,239
Citations
Introduction
Hello! My name is Guillaume Sartoretti, and I am a new Assistant Professor in the Mechanical Engineering Department at the National University of Singapore (NUS). There, I am the director of the Multi-Agent Robotic Motion (MARMot) Lab. My research interest lies in the distributed/decentralized coordination of numerous agents, at the interface between conventional control and artificial intelligence. Applications of my work range from multi-robot systems, where independent robots need to coordinate their actions to achieve a common goal, to high-DoF articulated robots, where joints need to be carefully coupled during locomotion in rough terrain.
Current institution
National University of Singapore
Current position
  • Professor (Assistant)
Additional affiliations
June 2016 - June 2019
Carnegie Mellon University
Position
  • PostDoc Position
April 2012 - March 2016
Swiss Federal Institute of Technology in Lausanne
Position
  • PhD Student
September 2010 - March 2012
University of Geneva
Position
  • Master's Student
Education
April 2012 - April 2016
Swiss Federal Institute of Technology in Lausanne
Field of study
  • Robotics and Intelligent Systems
September 2010 - February 2012
University of Geneva
Field of study
  • Mathematics and Computer Science
September 2007 - June 2010
University of Geneva
Field of study
  • Mathematics and Computer Science

Publications

Article
Full-text available
In this paper, we present a distributed control strategy, enabling agents to converge onto and travel along a consensually selected curve among a class of closed planar curves. Individual agents identify the number of neighbors within a finite circular sensing range and obtain information from their neighbors through local communication. The inform...
Conference Paper
Full-text available
Inspired by recent advances in single agent reinforcement learning, this paper extends the single-agent asynchronous advantage actor-critic (A3C) algorithm to enable multiple agents to learn a homogeneous, distributed policy, where agents work together toward a common goal without explicitly interacting. Our approach relies on centralized policy an...
Article
Full-text available
Multi-agent path finding (MAPF) is an essential component of many large-scale, real-world robot deployments, from aerial swarms to warehouse automation. However, despite the community's continued efforts, most state-of-the-art MAPF planners still rely on centralized planning and scale poorly past a few hundred agents. Such planning approaches are m...
Preprint
Full-text available
State-of-the-art distributed algorithms for reinforcement learning rely on multiple independent agents, which simultaneously learn in parallel environments while asynchronously updating a common, shared policy. Moreover, decentralized control architectures (e.g., CPGs) can coordinate spatially distributed portions of an articulated robot to achieve...
Article
Multi-Agent Path Finding (MAPF) is a critical component of logistics and warehouse management, which focuses on planning collision-free paths for a team of robots in a known environment. Recent work introduced a novel MAPF approach, LNS2, which proposed to repair a quickly-obtainable set of infeasible paths via iterative re-planning, by relying on...
Preprint
Adaptive traffic signal control (ATSC) is crucial in reducing congestion, maximizing throughput, and improving mobility in rapidly growing urban areas. Recent advancements in parameter-sharing multi-agent reinforcement learning (MARL) have greatly enhanced the scalable and adaptive optimization of complex, dynamic flows in large-scale homogeneous n...
Preprint
In autonomous exploration tasks, robots are required to explore and map unknown environments while efficiently planning in dynamic and uncertain conditions. Given the significant variability of environments, human operators often have specific preference requirements for exploration, such as prioritizing certain areas or optimizing for different as...
Article
Many multi-robot applications require allocating a team of heterogeneous agents (robots) with different abilities to cooperatively complete a given set of spatially distributed tasks as quickly as possible. We focus on tasks that can only be initiated when all required agents are present; otherwise, arrived agents would be waiting idly. Agents need t...
Preprint
In multi-robot exploration, a team of mobile robots is tasked with efficiently mapping an unknown environment. While most exploration planners assume omnidirectional sensors like LiDAR, this is impractical for small robots such as drones, where lightweight, directional sensors like cameras may be the only option due to payload constraints. These se...
Preprint
Despite recent advances in learning-based controllers for legged robots, deployments in human-centric environments remain limited by safety concerns. Most of these approaches use position-based control, where policies output target joint angles that must be processed by a low-level controller (e.g., PD or impedance controllers) to compute joint tor...
Preprint
The LLM-as-a-Judge paradigm shows promise for evaluating generative content but lacks reliability in reasoning-intensive scenarios, such as programming. Inspired by recent advances in reasoning models and shifts in scaling laws, we pioneer bringing test-time computation into LLM-as-a-Judge, proposing MCTS-Judge, a resource-efficient, System-2 think...
Preprint
The Multi-Agent Path Finding (MAPF) problem aims to determine the shortest and collision-free paths for multiple agents in a known, potentially obstacle-ridden environment. It is the core challenge for robotic deployments in large-scale logistics and transportation. Decentralized learning-based approaches have shown great potential for addressing t...
Article
Adaptive traffic signal control (ATSC) is crucial in alleviating congestion, maximizing throughput and promoting sustainable mobility in ever-expanding cities. Multi-Agent Reinforcement Learning (MARL) has recently shown significant potential in addressing complex traffic dynamics, but the intricacies of partial observability and coordination in de...
Preprint
Full-text available
Lifelong Multi-Agent Path Finding (LMAPF) is a variant of MAPF where agents are continually assigned new goals, necessitating frequent re-planning to accommodate these dynamic changes. Recently, this field has embraced learning-based methods, which reactively generate single-step actions based on individual local observations. However, it is still...
Preprint
Autonomous robot exploration requires a robot to efficiently explore and map unknown environments. Compared to conventional methods that can only optimize paths based on the current robot belief, learning-based methods show the potential to achieve improved performance by drawing on past experiences to reason about unknown areas. In this paper, we...
Preprint
Information sharing is critical in time-sensitive and realistic multi-robot exploration, especially for smaller robotic teams in large-scale environments where connectivity may be sparse and intermittent. Existing methods often overlook such communication constraints by assuming unrealistic global connectivity. Other works account for communication...
Preprint
In multi-agent reinforcement learning (MARL), achieving multi-task generalization to diverse agents and objectives presents significant challenges. Existing online MARL algorithms primarily focus on single-task performance, but their lack of multi-task generalization capabilities typically results in substantial computational waste and limited real...
Preprint
In this paper, we introduce HDPlanner, a deep reinforcement learning (DRL) based framework designed to tackle two core and challenging tasks for mobile robots: autonomous exploration and navigation, where the robot must optimize its trajectory adaptively to achieve the task objective through continuous interactions in unknown environments. Specific...
Preprint
The Multi-agent Path Finding (MAPF) problem involves finding collision-free paths for a team of agents in a known, static environment, with important applications in warehouse automation, logistics, or last-mile delivery. To meet the needs of these large-scale applications, current learning-based methods often deploy the same fully trained, decentr...
Preprint
Communication bandwidth is an important consideration in multi-robot exploration, where information exchange among robots is critical. While existing methods typically aim to reduce communication throughput, they either require significant computation or significantly compromise exploration efficiency. In this work, we propose a deep reinforcement...
Preprint
In recent years, the field of aerial robotics has witnessed significant progress, finding applications in diverse domains, including post-disaster search and rescue operations. Despite these strides, the prohibitive acquisition costs associated with deploying physical multi-UAV systems have posed challenges, impeding their widespread utilization in...
Preprint
Multi-Agent Path Finding (MAPF) is a critical component of logistics and warehouse management, which focuses on planning collision-free paths for a team of robots in a known environment. Recent work introduced a novel MAPF approach, LNS2, which proposed to repair a quickly-obtainable set of infeasible paths via iterative re-planning, by relying on...
Conference Paper
Existing deep reinforcement learning (DRL) methods for multi-objective vehicle routing problems (MOVRPs) always decompose an MOVRP into subproblems with respective preferences and then train policies to solve corresponding subproblems. However, such a paradigm is still less effective in tackling the intricate interactions among subproblems, thus ho...
Article
In this work, we propose a deep reinforcement learning (DRL) based reactive planner to solve large-scale Lidar-based autonomous robot exploration problems in 2D action space. Our DRL-based planner allows the agent to reactively plan its exploration path by making implicit predictions about unknown areas, based on a learned estimation of the underly...
Article
Existing neural heuristics for multiobjective vehicle routing problems (MOVRPs) are primarily conditioned on instance context, which fails to appropriately exploit preference and problem size, thus holding back performance. To thoroughly unleash the potential, we propose a novel conditional neural heuristic (CNH) that fully leverages the insta...
Chapter
The multiple traveling salesman problem (mTSP) is a well-known NP-hard problem with numerous real-world applications. In particular, this work addresses MinMax mTSP, where the objective is to minimize the max tour length among all agents. Many robotic deployments require recomputing potentially large mTSP instances frequently, making the natural tr...
Chapter
This paper presents a novel, sparse sensing motion planning algorithm for autonomous mobile robots in resource limited coverage problems. Optimizing usage of limited resources while effectively exploring an area is vital in scenarios where sensing is expensive, has adverse effects, or is exhaustive. We approach this problem using ergodic search tec...
Article
In this paper, we introduce HDPlanner, a deep reinforcement learning (DRL) based framework designed to tackle two core and challenging tasks for mobile robots: autonomous exploration and navigation, where the robot must optimize its trajectory adaptively to achieve the task objective through continuous interactions in unknown environments. Specific...
Article
Full-text available
Communication in multi-agent systems is a key driver of team-level cooperation, for instance allowing individual agents to augment their knowledge about the world in partially-observable environments. In this paper, we propose two reinforcement learning-based multi-agent models, namely FCMNet and FCMTran. The two models both allow agents to simulta...
Article
Full-text available
Legged robots can have a unique role in manipulating objects in dynamic, human-centric, or otherwise inaccessible environments. Although most legged robotics research to date typically focuses on traversing these challenging environments, many legged platform demonstrations have also included “moving an object” as a way of doing tangible work. Legg...
Preprint
Many recent works have turned to multi-agent reinforcement learning (MARL) for adaptive traffic signal control to optimize the travel time of vehicles over large urban networks. However, achieving effective and scalable cooperation among junctions (agents) remains an open challenge, as existing methods often rely on extensive, non-generalizable rew...
Preprint
Legged robots can have a unique role in manipulating objects in dynamic, human-centric, or otherwise inaccessible environments. Although most legged robotics research to date typically focuses on traversing these challenging environments, many legged platform demonstrations have also included "moving an object" as a way of doing tangible work. Legg...
Preprint
This work focuses on the persistent monitoring problem, where a set of targets moving based on an unknown model must be monitored by an autonomous mobile robot with a limited sensing range. To keep each target's position estimate as accurate as possible, the robot needs to adaptively plan its path to (re-)visit all the targets and update its belief...
Preprint
In multi-agent informative path planning (MAIPP), agents must collectively construct a global belief map of an underlying distribution of interest (e.g., gas concentration, light intensity, or pollution levels) over a given domain, based on measurements taken along their trajectory. They must frequently replan their path to balance the exploration...
Preprint
Trading off performance guarantees in favor of scalability, the Multi-Agent Path Finding (MAPF) community has recently started to embrace Multi-Agent Reinforcement Learning (MARL), where agents learn to collaboratively generate individual, collision-free (but often suboptimal) paths. Scalability is usually achieved by assuming a local field of view...
Preprint
Full-text available
Communication in multi-agent systems is a key driver of team-level cooperation, for instance allowing individual agents to augment their knowledge about the world in partially-observable environments. In this paper, we propose two reinforcement learning-based multi-agent models, namely FCMNet and FCMTran. The two models both allow agents to simulta...
Preprint
Full-text available
In autonomous robot exploration tasks, a mobile robot needs to actively explore and map an unknown environment as fast as possible. Since the environment is being revealed during exploration, the robot needs to frequently re-plan its path online, as new information is acquired by onboard sensors and used to update its partial map. While state-of-th...
Chapter
The aim of traffic signal control (TSC) is to optimize vehicle traffic in urban road networks, via the control of traffic lights at intersections.
Article
From insects to larger mammals, legged animals can be seen easily traversing a wide variety of challenging environments, by carefully selecting, reaching, and exploiting high-quality contacts with the terrain. In contrast, existing robotic foothold planning methods remain computationally expensive, often relying on exhaustive search and/or (often o...
Article
Full-text available
Purpose of Review: Recent advances in sensing, actuation, and computation have opened the door to multi-robot systems consisting of hundreds/thousands of robots, with promising applications to automated manufacturing, disaster relief, harvesting, last-mile delivery, port/airport operations, or search and rescue. The community has leveraged model-fre...
Article
Full-text available
Serially connected robots are promising candidates for performing tasks in confined spaces such as search and rescue in large-scale disasters. Such robots are typically limbless, and we hypothesize that the addition of limbs could improve mobility. However, a challenge in designing and controlling such devices lies in the coordination of high-dimen...
Preprint
Purpose of review: Recent advances in sensing, actuation, and computation have opened the door to multi-robot systems consisting of hundreds/thousands of robots, with promising applications to automated manufacturing, disaster relief, harvesting, last-mile delivery, port/airport operations, or search and rescue. The community has leveraged model-fr...
Article
Multi-agent foraging (MAF) involves distributing a team of agents to search an environment and extract resources from it. Nature provides several examples of highly effective foragers, where individuals within the foraging collective use biological markers (e.g., pheromones) to communicate critical information to others via the environment. In this...
Preprint
Decentralized cooperation in partially-observable multi-agent systems requires effective communications among agents. To support this effort, this work focuses on the class of problems where global communications are available but may be unreliable, thus precluding differentiable communication learning methods. We introduce FCMNet, a reinforcement...
Chapter
This paper develops a multi-agent heterogeneous search approach that leverages the sensing and motion capabilities of different agents to improve search performance (i.e., decrease search time and increase coverage efficiency). To do so, we build upon recent results in ergodic coverage methods for homogeneous teams, where the search paths of the ag...
Article
Deploying Unmanned Aerial Vehicles (UAVs) for traffic monitoring has been a hotspot given their flexibility and broader view. However, a UAV is usually constrained by battery capacity due to limited payload. On the other hand, the development of wireless charging technology has allowed UAVs to replenish energy from charging stations. In this paper,...
Preprint
Full-text available
Serially connected robots are promising candidates for performing tasks in confined spaces such as search-and-rescue in large-scale disasters. Such robots are typically limbless, and we hypothesize that the addition of limbs could improve mobility. However, a challenge in designing and controlling such devices lies in the coordination of high-dimen...
Preprint
Full-text available
The multiple traveling salesman problem (mTSP) is a well-known NP-hard problem with numerous real-world applications. In particular, this work addresses MinMax mTSP, where the objective is to minimize the max tour length (sum of Euclidean distances) among all agents. The mTSP is normally considered as a combinatorial optimization problem, but due t...
Chapter
Full-text available
The Flatland competition aimed at finding novel approaches to solve the vehicle re-scheduling problem (VRSP). The VRSP is concerned with scheduling trips in traffic networks and the re-scheduling of vehicles when disruptions occur, for example the breakdown of a vehicle. While solving the VRSP in various settings has been an active area in operatio...
Article
Full-text available
Multi-agent path finding (MAPF) is an indispensable component of large-scale robot deployments in numerous domains ranging from airport management to warehouse automation. In particular, this work addresses lifelong MAPF (LMAPF) – an online variant of the problem where agents are immediately assigned a new goal upon reaching their current one – in...
Preprint
Full-text available
The Flatland competition aimed at finding novel approaches to solve the vehicle re-scheduling problem (VRSP). The VRSP is concerned with scheduling trips in traffic networks and the re-scheduling of vehicles when disruptions occur, for example the breakdown of a vehicle. While solving the VRSP in various settings has been an active area in operatio...
Article
Many animals generate propulsive forces by coordinating legs, which contact and push against the surroundings, with bending of the body, which can only indirectly influence these forces. Such body–leg coordination is not commonly employed in quadrupedal robotic systems. To elucidate the role of back bending during quadrupedal locomotion, we study a...
Preprint
Full-text available
Efficient automated scheduling of trains remains a major challenge for modern railway systems. The underlying vehicle rescheduling problem (VRSP) has been a major focus of Operations Research (OR) for decades. Traditional approaches use complex simulators to study VRSP, where experimenting with a broad range of novel ideas is time consuming and h...
Preprint
Full-text available
Multi-agent path finding (MAPF) is an indispensable component of large-scale robot deployments in numerous domains ranging from airport management to warehouse automation. In particular, this work addresses lifelong MAPF (LMAPF) -- an online variant of the problem where agents are immediately assigned a new goal upon reaching their current one -- i...
Preprint
Multi-agent foraging (MAF) involves distributing a team of agents to search an environment and extract resources from it. Many foraging algorithms use biologically-inspired signaling mechanisms, such as pheromones, to help agents navigate from resources back to a central nest while relying on local sensing only. However, these approaches often rely...
Article
This work focuses on multi-agent reinforcement learning (RL) with inter-agent communication, in which communication is differentiable and optimized through backpropagation. Such differentiable approaches tend to converge more quickly to higher-quality policies compared to techniques that treat communication as actions in a traditional RL framework....
Article
Full-text available
Decentralized multi-agent reinforcement learning has been demonstrated to be an effective solution to large multi-agent control problems. However, agents typically can only make decisions based on local information, resulting in suboptimal performance in partially-observable settings. The addition of a communication channel overcomes this limitatio...
Preprint
Full-text available
This work focuses on multi-agent reinforcement learning (RL) with inter-agent communication, in which communication is differentiable and optimized through backpropagation. Such differentiable approaches tend to converge more quickly to higher-quality policies compared to techniques that treat communication as actions in a traditional RL framework....
Article
State-of-the-art distributed algorithms for reinforcement learning rely on multiple independent agents, which simultaneously learn in parallel environments while asynchronously updating a common, shared policy. Moreover, decentralized control arch...
Conference Paper
Full-text available
Motion planning for mobile robots with many degrees-of-freedom (DoF) is challenging due to their high-dimensional configuration spaces. To manage this curse of dimensionality, this paper proposes a new hierarchical framework that decomposes the system into subsystems (based on shared capabilities of DoFs), for which we can design and coordinate mo...
Conference Paper
Full-text available
In this paper, we focus on the problem of directing the gaze of a vision system mounted to the body of a high-degree-of-freedom (DOF) legged robot for active perception deployments. In particular, we consider the case where the vision system is rigidly attached to the robot's body (i.e., without any additional DOF between the vision system and robo...
Preprint
Full-text available
This paper extends the state-of-the-art single-agent asynchronous advantage actor-critic (A3C) algorithm to enable multiple agents to learn a homogeneous, distributed policy, where agents work together toward a common goal without explicitly interacting. Our approach relies on centralized policy and critic learning, but decentralized policy executi...
Preprint
Full-text available
Multi-agent path finding (MAPF) is an essential component of many large-scale, real-world robot deployments, from aerial swarms to warehouse automation to collaborative search-and-rescue. However, despite the community's continued efforts, most state-of-the-art MAPF algorithms still rely on centralized planning, and scale poorly past a few hundred...
Preprint
Full-text available
Many quadrupedal animals have lateral degrees of freedom in their backs that assist locomotion. This paper seeks to use a robotic model to demonstrate that back bending assists not only forward motion, but also lateral and turning motions. This paper uses geometric mechanics to prescribe gaits that coordinate both leg movements and back bending mot...
Preprint
Full-text available
Inspired by the ability of animals to rely on proprioception and vestibular feedback to adapt their gait, we propose a modular framework for autonomous locomotion that relies on force sensing and inertial information. A first controller exploits anti-compliance, a new application of positive force feedback, to quickly react against obstacles upon...
Preprint
Full-text available
Decentralized control architectures, such as those conventionally defined by central pattern generators, independently coordinate spatially distributed portions of articulated bodies to achieve system-level objectives. State of the art distributed algorithms for reinforcement learning employ a different but conceptually related idea; independent ag...
Preprint
Full-text available
Inspired by the locomotor nervous system of vertebrates, central pattern generator (CPG) models can be used to design gaits for articulated robots, such as crawling, swimming or legged robots. Incorporating sensory feedback for gait adaptation in these models can improve the locomotive performance of such robots in challenging terrain. However, mos...
Conference Paper
Full-text available
We present a distributed control mechanism allowing a swarm of non-holonomic autonomous surface vehicles (ASVs) to synchronously arrange around a rectangular floating object in a grasping formation; the swarm is then able to collaboratively transport the object to a desired final position and orientation. We analytically consider the problem of syn...
Thesis
Full-text available
The collective dynamic behavior of large groups of interacting autonomous agents (swarms) has inspired much research in both fundamental and engineering sciences. It is now widely acknowledged that the intrinsic nonlinearities due to mutual interactions can generate highly collective spatio-temporal patterns. Moreover, the resulting self-organized...
Article
Full-text available
We consider the dynamics of swarms of scalar Brownian agents subject to local imitation mechanisms implemented using mutual rank-based interactions. For appropriate values of the underlying control parameters, the swarm propagates tightly and the distances separating successive agents are iid exponential random variables. Implicitly, the implementa...
Conference Paper
Full-text available
We focus on the control of heterogeneous swarms of agents that evolve in a random environment. Control is achieved by introducing special agents: leader and infiltrated (shill) agents. A refined distinction is made between hidden and apparent controlling agents. For each case, we provide an analytically solvable example of swarm dynamics.
Conference Paper
Full-text available
This contribution is addressed to the dynamics of heterogeneous interacting agents evolving on the plane. Heterogeneity is due to the presence of an unfiltered externally controllable fellow, a shill, which via mutual interactions ultimately drives (i.e. soft controls) the whole society towards a given goal. We are able to calculate relevant dynami...
Article
Full-text available
We propose a decentralized method for traffic monitoring, fully distributed over the vehicles. An algorithm is provided, specifying which information should be tracked to reconstruct an instantaneous map of traffic flow. We test the accuracy of our method in a simple cellular automata traffic simulation model, for which the traffic condition can be...
Article
Full-text available
We consider a collection of N homogeneous interacting Brownian agents evolving on the plane. The time continuous individual dynamics are jointly driven by mixed canonical-dissipative (MCD) type dynamics and White Gaussian noise sources. Each agent is permanently at the center of a finite size observation disk D_ρ. Steadily with time, agents count t...
