Brian C. Williams’s research while affiliated with Massachusetts Institute of Technology and other places

Publications (296)


Multi-Agent Vulcan: An Information-Driven Multi-Agent Path Finding Approach
  • Preprint

September 2024 · 5 Reads

Jake Olkin · Viraj Parimi · Brian Williams

Scientists often search for phenomena of interest while exploring new environments. Autonomous vehicles are deployed to explore such areas where human-operated vehicles would be costly or dangerous. Online control of autonomous vehicles for information gathering is called adaptive sampling and can be framed as a POMDP that uses information gain as its principal objective. While prior work focuses largely on single-agent scenarios, this paper confronts challenges unique to multi-agent adaptive sampling, such as avoiding redundant observations, preventing vehicle collision, and facilitating path planning under limited communication. We start with Multi-Agent Path Finding (MAPF) methods, which address collision avoidance by decomposing the MAPF problem into a series of single-agent path planning problems. We then present information-driven MAPF, which addresses multi-agent information gain under limited communication. First, we introduce an admissible heuristic that relaxes mutual information gain to an additive function that can be evaluated as a set of independent single-agent path planning problems. Second, we extend our approach to a distributed system that is robust to limited communication. When all agents are in range, the group plans jointly to maximize information. When some agents move out of range, communicating subgroups are formed and the subgroups plan independently. Since redundant observations are less likely when vehicles are far apart, this approach only incurs a small loss in information gain, resulting in an approach that gracefully transitions from full to partial communication. We evaluate our method against other adaptive sampling strategies across various scenarios, including real-world robotic applications. Our method was able to locate up to 200% more unique phenomena in certain scenarios, and each agent located its first unique phenomenon up to 50% faster.
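
The additive relaxation at the heart of the heuristic can be sketched in a few lines. The sketch below is a hedged illustration, not the authors' implementation; plan_single_agent and info_map are hypothetical names, and it assumes each single-agent planner returns a path together with its expected information gain.

```python
# A minimal sketch of the additive relaxation of mutual information gain.
# All names here are hypothetical stand-ins, not the paper's code.
def additive_information_heuristic(agents, info_map, plan_single_agent):
    """Upper-bound the joint mutual information of a team plan by the sum
    of independent single-agent gains. Ignoring observation overlap can
    only overestimate the joint gain, so the heuristic stays admissible."""
    paths, total_gain = {}, 0.0
    for agent in agents:
        # Each agent plans alone against the shared information map.
        path, gain = plan_single_agent(agent, info_map)
        paths[agent] = path
        total_gain += gain  # additive relaxation of mutual information
    return paths, total_gain
```

Because the sum never undercounts the true joint gain, it can serve as an admissible heuristic inside a MAPF-style search over joint plans.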


Task-driven Risk-bounded Hierarchical Reinforcement Learning Based on Iterative Refinement

May 2024 · 1 Read

Proceedings of the AAAI Symposium Series

Deep Reinforcement Learning (DRL) has garnered substantial acclaim for its versatility and widespread applications across diverse domains. Aligned with human-like learning, DRL is grounded in the fundamental principle of learning from interaction, wherein agents dynamically adjust behavior based on environmental feedback in the form of rewards. This iterative trial-and-error process, mirroring human learning, underscores the importance of observation, experimentation, and feedback in shaping understanding and behavior. DRL agents, trained to navigate complex surroundings, refine their knowledge through hierarchical and abstract representations, empowered by deep neural networks. These representations enable efficient handling of long-horizon tasks and flexible adaptation to novel situations, akin to the human ability to construct mental models for comprehending complex concepts and predicting outcomes. Hence, abstract representation building emerges as a critical aspect in the learning processes of both artificial agents and human learners, particularly in long-horizon tasks. Furthermore, human decision-making, deeply rooted in evolutionary history, exhibits a remarkable capacity to balance the tradeoff between risk and cost across various domains. This cognitive process involves assessing potential negative consequences, evaluating factors such as the likelihood of adverse outcomes, severity of potential harm, and overall uncertainty. Humans intuitively gauge inherent risks and adeptly weigh associated costs, extending beyond monetary expenses to include time, effort, and opportunity costs. The nuanced ability of humans to consider the tradeoff between risk and cost highlights the complexity and adaptability of human decision-making, a skill lacking in typical DRL agents. Principles like these, derived from human-like learning, present an avenue for inspiring advancements in DRL, fostering the development of more adaptive and intelligent artificial agents. Motivated by these observations and focusing on practical challenges in robotics, our efforts target the risk-aware stochastic sequential decision-making problem, which is crucial for tasks with extended time frames and varied strategies. A novel integration of model-based conditional planning with DRL is proposed, inspired by hierarchical techniques. This approach breaks down complex tasks into manageable subtasks (motion primitives), ensuring safety constraints and informed decision-making. Unlike existing methods, our approach addresses motion primitive improvement iteratively, employing diverse prioritization functions to guide the search process effectively. This risk-bounded planning algorithm seamlessly integrates conditional planning and motion primitive learning, prioritizing computational effort for enhanced efficiency within specified time limits.
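
As a rough illustration of the iterative-refinement loop described above, the following hedged sketch repeatedly refines the motion primitive ranked highest by a pluggable prioritization function until the conditional plan's failure probability falls under the user's risk bound. All names (plan, risk, priority, refine) are hypothetical stand-ins, not the paper's API.

```python
# Hypothetical sketch of risk-bounded iterative refinement over motion
# primitives; not the authors' implementation.
import heapq

def refine_until_risk_bounded(primitives, plan, risk, priority,
                              budget, risk_bound):
    policy = plan(primitives)  # conditional plan chaining the primitives
    # Max-heap on priority (negated, since heapq is a min-heap); the
    # index i breaks ties so primitives themselves are never compared.
    queue = [(-priority(p, policy), i, p) for i, p in enumerate(primitives)]
    heapq.heapify(queue)
    while budget > 0 and risk(policy) > risk_bound and queue:
        _, i, prim = heapq.heappop(queue)
        primitives[i] = prim.refine()   # improve the chosen primitive
        policy = plan(primitives)       # replan with the refined primitive
        heapq.heappush(queue,
                       (-priority(primitives[i], policy), i, primitives[i]))
        budget -= 1                     # respect the computation budget
    return policy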


Motion Planning Under Uncertainty with Complex Agents and Environments via Hybrid Search (Extended Abstract)

August 2023

·

4 Reads

As autonomous systems tackle more real-world situations, mission success oftentimes cannot be guaranteed and the planner must reason about the probability of failure. Unfortunately, computing a trajectory that satisfies mission goals while constraining the probability of failure is difficult because of the need to reason about complex, multidimensional probability distributions. Recent methods have seen success using chance-constrained, model-based planning. We argue there are two main drawbacks to these approaches. First, current methods suffer from an inability to deal with expressive environment models such as 3D non-convex obstacles. Second, most planners rely on considerable simplifications when computing trajectory risk including approximating the agent's dynamics, geometry, and uncertainty. We apply hybrid search to the risk-bound, goal-directed planning problem. The hybrid search consists of a region planner and a trajectory planner. The region planner makes discrete choices by reasoning about geometric regions that the agent should visit in order to accomplish its mission. In formulating the region planner, we propose landmark regions that help produce obstacle-free paths. The region planner passes paths through the environment to a trajectory planner; the task of the trajectory planner is to optimize trajectories that respect the agent's dynamics and the user's desired risk of mission failure. We discuss three approaches to modeling trajectory risk: a CDF-based approach, a sampling-based collocation method, and an algorithm named Shooting Method Monte Carlo. A variety of 2D and 3D test cases are presented in the full paper including a linear case, a Dubins car model, and an underwater autonomous vehicle. The method is shown to outperform other methods in terms of speed and utility of the solution. Additionally, the models of trajectory risk are shown to better approximate risk in simulation.
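
Of the three risk models, Shooting Method Monte Carlo is the easiest to sketch. The following is a minimal, hypothetical illustration of shooting-style risk estimation, assuming the uncertainty enters through the initial state and noisy dynamics; it is not the paper's implementation, and all names are illustrative.

```python
# Hedged sketch of Monte Carlo "shooting" risk estimation: sample the
# uncertainty, forward-simulate the agent's dynamics along the candidate
# control sequence, and estimate risk as the fraction of failed rollouts.
import numpy as np

def shooting_monte_carlo_risk(x0_sampler, dynamics, controls, in_collision,
                              n_samples=1000, rng=None):
    rng = rng or np.random.default_rng()
    failures = 0
    for _ in range(n_samples):
        x = x0_sampler(rng)          # sample an uncertain initial state
        failed = False
        for u in controls:           # "shoot": roll the dynamics forward
            x = dynamics(x, u, rng)  # rng lets dynamics inject noise
            if in_collision(x):
                failed = True
                break
        failures += failed
    return failures / n_samples      # empirical probability of failure
```

A trajectory optimizer can then constrain this empirical risk to stay below the user's desired probability of mission failure.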


Convex risk-bounded continuous-time trajectory planning and tube design in uncertain nonconvex environments

July 2023 · 8 Reads

The International Journal of Robotics Research

In this paper, we address the trajectory planning problem in uncertain nonconvex static and dynamic environments that contain obstacles with probabilistic location, size, and geometry. To address this problem, we provide a risk-bounded trajectory planning method that looks for continuous-time trajectories with guaranteed bounded risk over the planning time horizon. Risk is defined as the probability of collision with uncertain obstacles. Existing approaches to risk-bounded trajectory planning either are limited to Gaussian uncertainties and convex obstacles or rely on sampling-based methods that need uncertainty samples and time discretization. To address the risk-bounded trajectory planning problem, we leverage the notion of risk contours to transform the risk-bounded planning problem into a deterministic optimization problem. Risk contours are the set of all points in the uncertain environment with guaranteed bounded risk. The obtained deterministic optimization problem is, in general, a nonlinear, nonconvex, time-varying optimization problem. We provide convex methods based on sum-of-squares optimization to efficiently solve this problem and obtain continuous-time risk-bounded trajectories without time discretization. The provided approach deals with arbitrary (and known) probabilistic uncertainties and with nonconvex, nonlinear obstacles, both static and dynamic, and is suitable for online trajectory planning problems. In addition, we provide convex methods based on sum-of-squares optimization to build the maximum-sized tube, with respect to its parameterization along the trajectory, such that any state inside the tube is guaranteed to have bounded risk.
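
The paper computes risk contours exactly, via the theory of moments and sum-of-squares programs, without sampling. Purely to build intuition for what a bounded-risk contour is, the hypothetical sketch below instead estimates pointwise risk by sampling the obstacle's uncertain parameters; sample_obstacle, collides, and delta are illustrative names, not the paper's method.

```python
# Intuition-only sketch: a point belongs to the delta-risk contour if its
# probability of colliding with the uncertain obstacle is at most delta.
# The paper obtains this set exactly via convex (SOS) optimization; this
# sampling version is just a rough approximation for illustration.
import numpy as np

def in_risk_contour(point, sample_obstacle, collides, delta,
                    n=2000, rng=None):
    """Estimate P(point collides with the uncertain obstacle) over n
    draws of the obstacle's uncertain parameters; compare to delta."""
    rng = rng or np.random.default_rng()
    hits = sum(collides(point, sample_obstacle(rng)) for _ in range(n))
    return hits / n <= delta
```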


[Figures from the preprint below; only the captions are reproduced here.]
Figure 1: Example pointed plausibility model with legend.
Figure 2: Plausibility model for nested beliefs on plans. Each world in the plausibility model is a knowledge base that contains the constraints of the task; since H (human) finds w2 more plausible, H believes that constraint C1 does not need to hold.
Figure 3: Intent announcement (left) and resulting state (right). The action has a single event whose precondition is that R (robot) believes that adding the constraint of coffee is satisfiable given its belief of the current feasible plans; as a result, all worlds, including w2 that H believes in, have the coffee constraint added.
Figure 5: Example execution action of e_i with ordering constraints ⟨e_j, e_i, guard(o_j)⟩ and ⟨e_k, e_i, guard(o_k)⟩.
Figure 6: Explanation action (left) and resulting state (right).

Adaptation and Communication in Human-Robot Teaming to Handle Discrepancies in Agents' Beliefs about Plans
  • Preprint
  • File available

July 2023 · 39 Reads

When agents collaborate on a task, it is important that they have some shared mental model of the task routines -- the set of feasible plans towards achieving the goals. However, in reality, situations often arise in which such a shared mental model cannot be guaranteed, such as in ad-hoc teams where agents may follow different conventions or when contingent constraints arise that only some agents are aware of. Previous work on human-robot teaming has assumed that the team has a set of shared routines, an assumption that breaks down in these situations. In this work, we leverage epistemic logic to enable agents to understand the discrepancy in each other's beliefs about feasible plans and dynamically plan their actions to adapt or communicate to resolve the discrepancy. We propose a formalism that extends conditional doxastic logic to describe knowledge bases in order to explicitly represent agents' nested beliefs on the feasible plans and state of execution. We provide an online execution algorithm based on Monte Carlo Tree Search for the agent to plan its actions, including communication actions that explain the feasibility of plans, announce intent, and ask questions. Finally, we evaluate the success rate and scalability of the algorithm and show that our agent is better equipped to work in teams without the guarantee of a shared mental model.
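
A minimal sketch of the plausibility-model bookkeeping suggested by the figure captions above: worlds are knowledge bases (sets of constraints), each agent orders worlds by plausibility, and an agent believes a constraint if it holds in the agent's most plausible world. The class and the Figure 2-style example are a hypothetical rendering for illustration, not the authors' formalism.

```python
# Hedged sketch of a pointed plausibility model over knowledge-base worlds.
from dataclasses import dataclass

@dataclass
class PlausibilityModel:
    worlds: dict   # world id -> set of constraints (a knowledge base)
    order: dict    # agent -> list of world ids, most plausible first
    actual: str    # the designated ("pointed") world

    def believes(self, agent, constraint):
        """An agent believes `constraint` iff it holds in the agent's
        most plausible world (here: the first id in the preorder)."""
        best = self.order[agent][0]
        return constraint in self.worlds[best]

# Example in the spirit of Figure 2: H finds w2 most plausible, where
# constraint C1 is absent, so H does not believe C1 must hold, while R does.
m = PlausibilityModel(
    worlds={"w1": {"C1", "C2"}, "w2": {"C2"}},
    order={"H": ["w2", "w1"], "R": ["w1", "w2"]},
    actual="w1",
)
assert not m.believes("H", "C1") and m.believes("R", "C1")
```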



Citations (41)


... The concept of risk contours was proposed in [18], which represents the risk contour problem as a risk-constraint problem and uses the theory of moments and nonnegative polynomials to obtain a convex optimization in the form of sum-of-squares optimization. In [19], considering non-Gaussian uncertainties, motion planning for stochastic nonlinear systems is performed using offline-constructed discrete-time motion primitives and their corresponding continuous-time tubes. Probabilistic surrogate reliability and risk contours were introduced in [20], with the minimal covering disk calculation for the vehicle model. ...

Reference:

RALTPER: A Risk-Aware Local Trajectory Planner for Complex Environment with Gaussian Uncertainty
Real-Time Tube-Based Non-Gaussian Risk Bounded Motion Planning for Stochastic Nonlinear Systems in Uncertain Environments via Motion Primitives
  • Citing Conference Paper
  • October 2023

... To handle non-Gaussian uncertainty, a scenario-based approach is proposed in [21], but only for problems that can be formulated with convex constraints. Further, the authors in [22] and [23] utilize moment-based methodologies for planning safe trajectories based on chance constraints, without making Gaussian or convexity assumptions. However, they do not consider system dynamics; instead, it is assumed that a behavior prediction system provides the distribution of future positions for the agent over the planning horizon. ...

Risk Contours Map for Risk Bounded Motion Planning under Perception Uncertainties
  • Citing Conference Paper
  • June 2019

... Firstly, after learning a set of motion primitives between landmarks, we utilize conditional planning to derive a policy, ensuring the failure probability stays below the user-specified threshold. Departing from treating this primitive chaining problem as a conventional path planning task, we approach it as a constrained stochastic shortest path problem (C-SSP) (Hong and Williams 2023). This allows us to manage the failure probability associated with the motion primitives effectively. ...

An Anytime Algorithm For Constrained Stochastic Shortest Path Problems With Deterministic Policies
  • Citing Article
  • December 2022

Artificial Intelligence

... [79] employs a diffusion model for joint multi-agent motion prediction and introduces a constrained sampling framework for controlled trajectory sampling, while [80] utilizes a conditional diffusion model to create a controllable and realistic multi-agent traffic model. Finally, [42] and [81] apply deep learning techniques to simulate realistic and reactive agent trajectories. ...

InterSim: Interactive Traffic Simulation via Explicit Relation Modeling
  • Citing Conference Paper
  • October 2022

... Mirchevska et al. [8] used fitted Q-learning for high-level decisionmaking on a busy simulated highway. However, the microscopic traffic flows of these studies are based on rule-based models, such as the Intelligent Driver Model (IDM) [9][10][11] and the Minimize Overall Braking Induced by Lane Change (MOBIL) model. These are mathematical models based on traffic flow theory [12]. ...

TIP: Task-Informed Motion Prediction for Intelligent Vehicles
  • Citing Conference Paper
  • October 2022

... Observe the target graph of task N2 and find that point F on the boundary is closest to the ellipse; the problem is thus transformed into finding the shortest distance from point F to the ellipse, and a nonlinear programming model is established [5]. The solution is computed and visualised using MATLAB, with the results given below: the shortest path length is 7.78, so the total journey length and the optimal route are as follows: ...

An Anytime Algorithm for Chance Constrained Stochastic Shortest Path Problems and Its Application to Aircraft Routing
  • Citing Article
  • January 2021

... In particular, recent advances in motion generation [30,36,55] have shown that factorizing the joint distribution of multi-agent time-series in a social autoregressive manner [1,32,44] can better characterize the evolution of traffic scenarios. On the other hand, some works utilize sequential modeling in the agent dimension for multi-agent motion prediction [33,43]. For example, M2I [43] uses heuristic methods to label influencers and reactors from pairs of agents, followed by predicting the marginal distribution of the influencers and the conditional distribution of the reactors. ...

M2I: From Factored Marginal Trajectory Prediction to Interactive Prediction
  • Citing Conference Paper
  • June 2022

... Several machine learning-based approaches have been developed to address intersection challenges, such as imitation learning [5, 8-10, 13, 14], online planning [6,15,16,42], and offline learning [7,19,21,22,39,44,47]. In imitation learning, policies are learned from human drivers, yet their efficacy falls short in scenarios where the agent encounters states beyond its training data. ...

Online Risk-Bounded Motion Planning for Autonomous Vehicles in Dynamic Environments
  • Citing Article
  • May 2021

Proceedings of the International Conference on Automated Planning and Scheduling

... Reinforcement learning can be utilized in tasks such as imperfect-information games (Liu et al., 2022), (Bertsimas and Paskov, 2022), and imitating animal or human cognition (Mitchener et al., 2022). Motion planning is a domain that models the motion of autonomously operating systems, such as self-driving cars, and can be utilized while developing autonomous vehicles (Strawser and Williams, 2022). ...

Motion Planning Under Uncertainty with Complex Agents and Environments via Hybrid Search
  • Citing Article
  • September 2022

Journal of Artificial Intelligence Research

... Information-theoretic methods to generate intrinsic motivations to control various scenarios have been used for a number of years [11]–[23]. One insight is that the maximal potential mutual information between actions and observations acts as a meaningful signal for learning nontrivial behaviour. ...

An Empowerment-based Solution to Robotic Manipulation Tasks with Sparse Rewards

Autonomous Robots