Mykel John Kochenderfer

Mykel John Kochenderfer
  • Ph.D.
  • Professor (Assistant) at Stanford University

About

533
Publications
65,698
Reads
How we measure 'reads'
A 'read' is counted each time someone views a publication summary (such as the title, abstract, and list of authors), clicks on a figure, or views or downloads the full-text. Learn more
14,847
Citations
Introduction
Current institution
Stanford University
Current position
  • Professor (Assistant)
Additional affiliations
December 2006 - October 2013
Massachusetts Institute of Technology
Position
  • Technical Staff
November 2013 - present
Stanford University
Position
  • Professor (Assistant)
October 2003 - June 2006
University of Edinburgh
Position
  • PhD Student

Publications

Publications (533)
Preprint
The world is entering an unprecedented period of critical mineral demand, driven by the global transition to renewable energy technologies and electric vehicles. This transition presents unique challenges in mineral resource development, particularly due to geological uncertainty-a key characteristic that traditional supply chain optimization appro...
Preprint
We tackle the problem of informative input design for system identification, where we select inputs, observe the corresponding outputs from the true system, and optimize the parameters of our model to best fit the data. We propose a methodology that is compatible with any system and parametric family of models. Our approach only requires input-outp...
Preprint
Full-text available
Importance sampling is a rare event simulation technique used in Monte Carlo simulations to bias the sampling distribution towards the rare event of interest. By assigning appropriate weights to sampled points, importance sampling allows for more efficient estimation of rare events or tails of distributions. However, importance sampling can fail wh...
Preprint
Full-text available
Flight test analysis often requires predefined test points with arbitrarily tight tolerances, leading to extensive and resource-intensive experimental campaigns. To address this challenge, we propose a novel approach to flight test analysis using Gaussian processes (GPs) with physics-informed mean functions to estimate aerodynamic quantities from a...
Preprint
Full-text available
We introduce Gradient Agreement Filtering (GAF) to improve on gradient averaging in distributed deep learning optimization. Traditional distributed data-parallel stochastic gradient descent involves averaging gradients of microbatches to calculate a macrobatch gradient that is then used to update model parameters. We find that gradients across micr...
Preprint
Full-text available
Decentralized partially observable Markov decision processes with communication (Dec-POMDP-Com) provide a framework for multiagent decision making under uncertainty, but the NEXP-complete complexity renders solutions intractable in general. While sharing actions and observations can reduce the complexity to PSPACE-complete, we propose an approach t...
Preprint
Optimal control problems with state distribution constraints have attracted interest for their expressivity, but solutions rely on linear approximations. We approach the problem of driving the state of a dynamical system in distribution from a sequential decision-making perspective. We formulate the optimal control problem as an appropriate Markov...
Preprint
Full-text available
Public AI benchmark results are widely broadcast by model developers as indicators of model quality within a growing and competitive market. However, these advertised scores do not necessarily reflect the traits of interest to those who will ultimately apply AI models. In this paper, we seek to understand if and how AI benchmarks are used to inform...
Preprint
Full-text available
Estimating the probability of failure is a critical step in developing safety-critical autonomous systems. Direct estimation methods such as Monte Carlo sampling are often impractical due to the rarity of failures in these systems. Existing importance sampling approaches do not scale to sequential decision-making systems with large state spaces and...
Preprint
Full-text available
AI models are increasingly prevalent in high-stakes environments, necessitating thorough assessment of their capabilities and risks. Benchmarks are popular for measuring these attributes and for comparing model performance, tracking progress, and identifying weaknesses in foundation and non-foundation models. They can inform model selection for dow...
Preprint
This paper addresses the problem of task planning for robots that must comply with operational manuals in real-world settings. Task planning under these constraints is essential for enabling autonomous robot operation in domains that require adherence to domain-specific knowledge. Current methods for generating robot goals and plans rely on common...
Preprint
Full-text available
While working within the spatial domain can pose problems associated with ill-conditioned scores caused by power-law decay, recent advances in diffusion-based generative models have shown that transitioning to the wavelet domain offers a promising alternative. However, within the wavelet domain, we encounter unique challenges, especially the sparse...
Preprint
Deciding on appropriate mechanical ventilator management strategies significantly impacts the health outcomes for patients with respiratory diseases. Acute Respiratory Distress Syndrome (ARDS) is one such disease that requires careful ventilator operation to be effectively treated. In this work, we frame the management of ventilators for patients w...
Preprint
To achieve autonomy in complex real-world exploration missions, we consider deployment strategies for a team of robots with heterogeneous autonomy capabilities. In this work, we formulate a multi-robot exploration mission and compute an operation policy to maintain robot team productivity and maximize mission rewards. The environment description, r...
Preprint
Full-text available
Responsible AI (RAI) has emerged as a major focus across industry, policymaking, and academia, aiming to mitigate the risks and maximize the benefits of AI, both on an organizational and societal level. This study explores the global state of RAI through one of the most extensive surveys to date on the topic, surveying 1000 organizations across 20...
Preprint
The transfer of patients between two aircraft using an underway watercraft increases medical evacuation reach and flexibility in maritime environments. The selection of any one of multiple underway watercraft for patient exchange is complicated by participating aircraft utilization history and a participating watercraft position and velocity. The s...
Preprint
Full-text available
This paper presents a solution to the problem of optimal ground station selection for low-Earth orbiting (LEO) space missions that enables mission operators to precisely design their ground segment performance and costs. Space mission operators are increasingly turning to Ground-Station-as-a-Service (GSaaS) providers to supply the terrestrial commu...
Preprint
Full-text available
Efficiently estimating system dynamics from data is essential for minimizing data collection costs and improving model performance. This work addresses the challenge of designing future control inputs to maximize information gain, thereby improving the efficiency of the system identification process. We propose an approach that integrates informati...
Preprint
Testing controllers in safety-critical systems is vital for ensuring their safety and preventing failures. In this paper, we address the falsification problem within learning-based closed-loop control systems through simulation. This problem involves the identification of counterexamples that violate system safety requirements and can be formulated...
Preprint
Ambulance exchange points are preidentified sites where patients are transferred between evacuation platforms while en route to enhanced medical care. We propose a new capability for maritime medical evacuation, which involves co-opting underway watercraft as overwater ambulance exchange points to transfer patients between medical evacuation aircra...
Chapter
In the absence of a comprehensive stroke policy, stroke patients are typically routed to the nearest hospital, a method that fails to account for the varying capabilities of hospitals to treat strokes of different severities. These patients, when routed to a facility incapable of providing necessary and timely treatment, often experience critical d...
Conference Paper
To plan safely in uncertain environments, agents must balance utility with safety constraints. Safe planning problems can be modeled as a chance-constrained partially observable Markov decision process (CC-POMDP) and solutions often use expensive rollouts or heuristics to estimate the optimal value and action-selection policy. This work introduces...
Conference Paper
Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems. However, POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid, which is often the case for physical systems. While recent online sampling-based POMDP...
Article
Full-text available
Introduction Sexual victimization (SV) is common among college women, with approximately half of those who have experienced SV meeting criteria for posttraumatic stress disorder (PTSD) within a year. Both SV and PTSD are associated with alcohol misuse among college women, often explained by the self-medication hypothesis. Existing literature focuse...
Preprint
Environment prediction frameworks are critical for the safe navigation of autonomous vehicles (AVs) in dynamic settings. LiDAR-generated occupancy grid maps (L-OGMs) offer a robust bird's-eye view for the scene representation, enabling self-supervised joint scene predictions while exhibiting resilience to partial observability and perception detect...
Preprint
Full-text available
This paper addresses the challenge of probabilistic parameter estimation given measurement uncertainty in real-time. We provide a general formulation and apply this to pose estimation for an autonomous visual landing system. We present three probabilistic parameter estimators: a least-squares sampling approach, a linear approximation method, and a...
Preprint
Training intelligent agents to navigate highly interactive environments presents significant challenges. While guided meta reinforcement learning (RL) approach that first trains a guiding policy to train the ego agent has proven effective in improving generalizability across various levels of interaction, the state-of-the-art method tends to be ove...
Preprint
Full-text available
AI progress is creating a growing range of risks and opportunities, but it is often unclear how they should be navigated. In many cases, the barriers and uncertainties faced are at least partly technical. Technical AI governance, referring to technical analysis and tools for supporting the effective governance of AI, seeks to address such challenge...
Preprint
Full-text available
Typical schemes for automated red-teaming large language models (LLMs) focus on discovering prompts that trigger a frozen language model (the defender) to generate toxic text. This often results in the prompting model (the adversary) producing text that is unintelligible and unlikely to arise. Here, we propose a reinforcement learning formulation o...
Article
The integration of manned and unmanned aircraft can help improve wildfire response. Wildfire containment failures occur when the resources available to first responders, who execute the initial stages of wildfire management referred to as the initial attack, are ineffective or insufficient. Initial attack surveillance and suppression models have li...
Article
Min-max problems are important in multi-agent sequential decision-making because they improve the performance of the worst-performing agent in the network. However, solving the multi-agent min-max problem is challenging. We propose a modular, distributed, online planning-based algorithm that is able to approximate the solution of the min-max object...
Preprint
The integration of manned and unmanned aircraft can help improve wildfire response. Wildfire containment failures occur when resources available to first responders, who execute the initial stages of wildfire management referred to as the initial attack, are ineffective or insufficient. Initial attack surveillance and suppression models have linked...
Preprint
Full-text available
Validating safety-critical autonomous systems in high-dimensional domains such as robotics presents a significant challenge. Existing black-box approaches based on Markov chain Monte Carlo may require an enormous number of samples, while methods based on importance sampling often rely on simple parametric families that may struggle to represent the...
Article
Lagrangian-guided Monte Carlo tree search with global dual ascent has been applied to solve large constrained partially observable Markov decision processes (CPOMDPs) online. In this work, we demonstrate that these global dual parameters can lead to myopic action selection during exploration, ultimately leading to suboptimal decision making. To add...
Preprint
Min-max problems are important in multi-agent sequential decision-making because they improve the performance of the worst-performing agent in the network. However, solving the multi-agent min-max problem is challenging. We propose a modular, distributed, online planning-based algorithm that is able to approximate the solution of the min-max object...
Article
Estimating the probability of failure is an important step in the certification of safety-critical systems. Efficient estimation methods are often needed due to the challenges posed by high-dimensional input spaces, risky test scenarios, and computationally expensive simulators. This work frames the problem of black-box safety validation as a Bayes...
Preprint
This paper addresses the problem of object-goal navigation in autonomous inspections in real-world environments. Object-goal navigation is crucial to enable effective inspections in various settings, often requiring the robot to identify the target object within a large search space. Current object inspection methods fall short of human efficiency...
Article
Full-text available
Low-enthalpy geothermal energy can make a major contribution towards reducing CO2 emissions. However, the development of geothermal reservoirs is costly and time intensive. In particular, high capital expenditures, data acquisition costs, and long periods of time from identifying a geothermal resource to geothermal heat extraction make geothermal f...
Article
Full-text available
Given the high energy demands for heating and cooling, and the currently limited use of renewable sources, there is a pressing need for efficient and economically viable geothermal development. However, the success of new geothermal reservoir projects hinges on numerous uncertain geological and economic factors. Prior to development, the project un...
Preprint
Full-text available
This paper explores the integration of optimal transport (OT) theory with multi-agent reinforcement learning (MARL). This integration uses OT to handle distributions and transportation problems to enhance the efficiency, coordination, and adaptability of MARL. There are five key areas where OT can impact MARL: (1) policy alignment, where OT's Wasse...
Article
Deep reinforcement learning (DRL) provides a promising way for intelligent agents (e.g., autonomous vehicles) to learn to navigate complex scenarios. However, DRL with neural networks as function approximators is typically considered a black box with little explainability and often suffers from suboptimal performance, especially for autonomous navi...
Article
This paper presents an appendix to the original NeBula autonomy solution [Agha et al., 2021] developed by the TEAM CoSTAR (Collaborative SubTerranean Autonomous Robots), participating in the DARPA Subterranean Challenge. Specifically, this paper presents extensions to NeBula’s hardware, software, and algorithmic components that focus on increasing...
Article
Training intelligent agents to navigate highly interactive environments presents significant challenges. While guided meta reinforcement learning (RL) approach that first trains a guiding policy to train the ego agent has proven effective in improving generalizability across scenarios with various levels of interaction, the state-of-the-art method...
Article
Full-text available
Designing reward functions is a difficult task in AI and robotics. The complex task of directly specifying all the desirable behaviors a robot needs to optimize often proves challenging for humans. A popular solution is to learn reward functions using expert demonstrations. This approach, however, is fraught with many challenges. Some methods requi...
Article
Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems. However, POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid, which is often the case for physical systems. While recent online sampling-based POMDP...
Preprint
Full-text available
As machine learning (ML) systems increasingly permeate high-stakes settings such as healthcare, transportation, military, and national security, concerns regarding their reliability have emerged. Despite notable progress, the performance of these systems can significantly diminish due to adversarial attacks or environmental changes, leading to over...
Preprint
Although deep reinforcement learning (DRL) has shown promising results for autonomous navigation in interactive traffic scenarios, existing work typically adopts a fixed behavior policy to control social vehicles in the training environment. This may cause the learned driving policy to overfit the environment, making it difficult to interact well w...
Preprint
Perception systems operate as a subcomponent of the general autonomy stack, and perception system designers often need to optimize performance characteristics while maintaining safety with respect to the overall closed-loop system. For this reason, it is useful to distill high-level safety requirements into component-level requirements on the perce...
Article
Rather than augmenting rewards with penalties for undesired behavior, Constrained Partially Observable Markov Decision Processes (CPOMDPs) plan safely by imposing inviolable hard constraint value budgets. Previous work performing online planning for CPOMDPs has only been applied to discrete action and observation spaces. In this work, we propose al...
Preprint
Safe and reliable state estimation techniques are a critical component of next-generation robotic systems. Agents in such systems must be able to reason about the intentions and trajectories of other agents for safe and efficient motion planning. However, classical state estimation techniques such as Gaussian filters often lack the expressive power...
Preprint
Designing robust machine learning systems remains an open problem, and there is a need for benchmark problems that cover both environmental changes and evaluation on a downstream task. In this work, we introduce AVOIDDS, a realistic object detection benchmark for the vision-based aircraft detect-and-avoid problem. We provide a labeled dataset consi...
Article
For the safe deployment of multirobot systems, accurate position estimation is necessary. We present a multirobot navigation algorithm that accomplishes the dual objective of reaching designated goal positions and maintaining low position uncertainty. We pose the problem as a [Formula: see text]-partially observable Markov decision process ([Formul...
Preprint
Full-text available
Real-world planning problems$\unicode{x2014}$including autonomous driving and sustainable energy applications like carbon storage and resource exploration$\unicode{x2014}$have recently been modeled as partially observable Markov decision processes (POMDPs) and solved using approximate methods. To solve high-dimensional POMDPs in practice, state-of-...
Preprint
One of the bottlenecks of training autonomous vehicle (AV) agents is the variability of training environments. Since learning optimal policies for unseen environments is often very costly and requires substantial data collection, it becomes computationally intractable to train the agent on every possible environment or task the AV may encounter. Th...
Preprint
Full-text available
Estimating the distribution over failures is a key step in validating autonomous systems. Existing approaches focus on finding failures for a small range of initial conditions or make restrictive assumptions about the properties of the system under test. We frame estimating the distribution over failure trajectories for sequential systems as Bayesi...
Preprint
Full-text available
Safety validation is a crucial component in the development and deployment of autonomous systems, such as self-driving vehicles and robotic systems. Ensuring safe operation necessitates extensive testing and verification of control policies, typically conducted in simulation environments. High-fidelity simulators accurately model real-world dynamic...
Preprint
Full-text available
Accurately estimating the probability of failure for safety-critical systems is important for certification. Estimation is often challenging due to high-dimensional input spaces, dangerous test scenarios, and computationally expensive simulators; thus, efficient estimation techniques are important to study. This work reframes the problem of black-b...
Preprint
Full-text available
To combat global warming and mitigate the risks associated with climate change, carbon capture and storage (CCS) has emerged as a crucial technology. However, safely sequestering CO2 in geological formations for long-term storage presents several challenges. In this study, we address these issues by modeling the decision-making process for carbon s...
Preprint
Full-text available
To achieve autonomy in unknown and unstructured environments, we propose a method for semantic-based planning under perceptual uncertainty. This capability is crucial for safe and efficient robot navigation in environment with mobility-stressing elements that require terrain-specific locomotion policies. We propose the Semantic Belief Graph (SBG),...
Preprint
Realistic aircraft trajectory models are useful in the design and validation of air traffic management (ATM) systems. Models of aircraft operated under instrument flight rules (IFR) require capturing the variability inherent in how aircraft follow standard flight procedures. The variability in aircraft behavior varies among flight stages. In this p...
Preprint
Full-text available
Robots that can effectively understand human intentions from actions are crucial for successful human-robot collaboration. In this work, we address the challenge of a robot navigating towards an unknown goal while also accounting for a human's preference for a particular path in the presence of obstacles. This problem is particularly challenging wh...
Preprint
Full-text available
We present a method for solving the coverage problem with the objective of autonomously exploring an unknown environment under mission time constraints. Here, the robot is tasked with planning a path over a horizon such that the accumulated area swept out by its sensor footprint is maximized. Because this problem exhibits a diminishing returns prop...
Preprint
Full-text available
In active source seeking, a robot takes repeated measurements in order to locate a signal source in a cluttered and unknown environment. A key component of an active source seeking robot planner is a model that can produce estimates of the signal at unknown locations with uncertainty quantification. This model allows the robot to plan for future me...
Article
Full-text available
As neural networks (NNs) become more prevalent in safety-critical applications such as control of vehicles, there is a growing need to certify that systems with NN components are safe. This paper presents a set of backward reachability approaches for safety certification of neural feedback loops (NFLs), i.e., closed-loop systems with NN control pol...
Article
Effective interaction modeling and behavior prediction of dynamic agents play a significant role in interactive motion planning for autonomous robots. Although existing methods have improved prediction accuracy, few research efforts have been devoted to enhancing prediction model interpretability and out-of-distribution (OOD) generalizability. This...
Preprint
Full-text available
Simulation-based falsification is a practical testing method to increase confidence that the system will meet safety requirements. Because full-fidelity simulations can be computationally demanding, we investigate the use of simulators with different levels of fidelity. As a first step, we express the overall safety specification in terms of enviro...
Preprint
Rather than augmenting rewards with penalties for undesired behavior, Constrained Partially Observable Markov Decision Processes (CPOMDPs) plan safely by imposing inviolable hard constraint value budgets. Previous work performing online planning for CPOMDPs has only been applied to discrete action and observation spaces. In this work, we propose al...
Preprint
Full-text available
An important step in the design of autonomous systems is to evaluate the probability that a failure will occur. In safety-critical domains, the failure probability is extremely small so that the evaluation of a policy through Monte Carlo sampling is inefficient. Adaptive importance sampling approaches have been developed for rare event estimation b...

Network

Cited By