Article
PDF Available

Abstract

This manual describes the syntax of PDDL, the Planning Domain Definition Language, the problem-specification language for the AIPS-98 planning competition. The language has roughly the expressiveness of Pednault's ADL [10] for propositions, and roughly the expressiveness of UMCP [6] for actions. Our hope is to encourage empirical evaluation of planner performance, and development of standard sets of problems all in comparable notations. 1 Introduction This manual describes the syntax, and, less formally, the semantics, of the Planning Domain Definition Language (PDDL). The language supports the following syntactic features: basic STRIPS-style actions; conditional effects; universal quantification over dynamic universes (i.e., object creation and destruction); domain axioms over stratified theories; specification of safety constraints; specification of hierarchical actions composed of subactions and subgoals; and management of multiple problems in mul...
... We introduce TheoryCoder, an agent that generalizes quickly to new problem domains by leveraging prior knowledge in the form of general-purpose, high-level abstractions (expressed in the Planning Domain Definition Language, or PDDL; Ghallab et al., 1998; McDermott, 2000). TheoryCoder grounds these abstractions in particular environments by synthesizing a low-level transition model using LLM queries. ...
... The abstract states are referred to as predicates, and the abstract actions are referred to as operators. Planning problems can be expressed in PDDL (Ghallab et al., 1998) using these operators and predicates. For a given planning problem, the domain and problem files represent the actions available in a state and the initial state of the task. ...
... Our system's theory language is composed of two programming languages: PDDL 1.2 (Ghallab et al., 1998) and Python, which together represent the domain hierarchically. PDDL represents the domain using high-level abstractions and handles the high-level planning for TheoryCoder. ...
Preprint
Full-text available
Modern reinforcement learning (RL) systems have demonstrated remarkable capabilities in complex environments, such as video games. However, they still fall short of achieving human-like sample efficiency and adaptability when learning new domains. Theory-based reinforcement learning (TBRL) is an algorithmic framework specifically designed to address this gap. Modeled on cognitive theories, TBRL leverages structured, causal world models - "theories" - as forward simulators for use in planning, generalization and exploration. Although current TBRL systems provide compelling explanations of how humans learn to play video games, they face several technical limitations: their theory languages are restrictive, and their planning algorithms are not scalable. To address these challenges, we introduce TheoryCoder, an instantiation of TBRL that exploits hierarchical representations of theories and efficient program synthesis methods for more powerful learning and planning. TheoryCoder equips agents with general-purpose abstractions (e.g., "move to"), which are then grounded in a particular environment by learning a low-level transition model (a Python program synthesized from observations by a large language model). A bilevel planning algorithm can exploit this hierarchical structure to solve large domains. We demonstrate that this approach can be successfully applied to diverse and challenging grid-world games, where approaches based on directly synthesizing a policy perform poorly. Ablation studies demonstrate the benefits of using hierarchical abstractions.
... We prompt existing LLMs for high-level task anticipation, use the Planning Domain Definition Language (PDDL) [6] as the action language, and use the Fast Downward (FD) solver [7] to generate fine-granularity plans for any given task. We evaluate our framework's abilities in VirtualHome, a realistic simulation environment [8], and in complex household scenarios involving multiple tasks, rooms, objects, and actions. ...
... Integrating LLMs with PDDL: Given the existing literature on using PDDL to encode prior knowledge for planning [6], recent papers have emphasized the need for such planning in combination with LLMs in complex domains [18], [25]. LLMs have been used to generate (or translate prior knowledge to) goal states to be achieved by a classical (PDDL-based) planner [26], [27]. ...
... The tasks anticipated by the LLM are considered as goals, and our framework uses a classical planner to compute the sequence of finer-granularity actions to be executed to jointly achieve these goals. As stated earlier, the planner uses domain-specific knowledge in the form of a theory of actions; we use the STRIPS [29] subset of PDDL [6] enriched with types, negative preconditions, and action costs as the action language to describe this theory. We also focus on goal-based problems and discrete actions with deterministic effects; other action languages can be used to represent durative actions [30] or non-determinism [31]. ...
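As an illustrative sketch of the action language described above (the STRIPS subset of PDDL with types, negative preconditions, and action costs), a hypothetical household action might be written as follows; the domain, type, and predicate names are invented for illustration and are not taken from the paper:

```pddl
;; Hypothetical domain fragment: STRIPS plus :typing,
;; :negative-preconditions, and :action-costs.
(define (domain household)
  (:requirements :strips :typing :negative-preconditions :action-costs)
  (:types robot room object)
  (:functions (total-cost))
  (:predicates (at ?r - robot ?l - room)
               (holding ?r - robot ?o - object)
               (in ?o - object ?l - room))
  (:action pick-up
    :parameters (?r - robot ?o - object ?l - room)
    :precondition (and (at ?r ?l) (in ?o ?l)
                       (not (holding ?r ?o)))   ; negative precondition
    :effect (and (holding ?r ?o) (not (in ?o ?l))
                 (increase (total-cost) 1))))   ; action cost
```

A planner such as Fast Downward can then minimize total-cost over plans that jointly achieve the anticipated goals.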
Preprint
Full-text available
Assistive agents performing household tasks such as making the bed or cooking breakfast often compute and execute actions that accomplish one task at a time. However, efficiency can be improved by anticipating upcoming tasks and computing an action sequence that jointly achieves these tasks. State-of-the-art methods for task anticipation use data-driven deep networks and Large Language Models (LLMs), but they do so at the level of high-level tasks and/or require many training examples. Our framework leverages the generic knowledge of LLMs through a small number of prompts to perform high-level task anticipation, using the anticipated tasks as goals in a classical planning system to compute a sequence of finer-granularity actions that jointly achieve these goals. We ground and evaluate our framework's abilities in realistic scenarios in the VirtualHome environment and demonstrate a 31% reduction in execution time compared with a system that does not consider upcoming tasks.
... Domain Description Language or PDDL [25]) to discretise the robot's environment and possible states. Afterwards, the task planner (e.g., Fast ...
... Fast Downward can efficiently solve general propositional planning tasks specified with the PDDL language [25]. It exploits the hierarchical structure inherent in causal graphs. ...
... Planning) to find the discrete sequence of symbolic actions and geometric values to fulfil a high-level goal. The task planner uses a language like PDDL [25] to discretise the robot's environment and possible states. Then, the task planner (e.g., Fast Downward [51]) translates this representation into a graph in which every node corresponds to a possible robot's state. ...
Thesis
Full-text available
The world population is ageing, leading to an increasing number of older adults who will suffer from geriatric syndromes such as frailty, delirium, or falls, limiting their ability to perform activities of daily living (ADLs) independently. These individuals will require assistance from family or professional caregivers, who can experience burnout due to caregiving's physical and emotional toll. Service robots offer a potential solution to alleviate the caregiver’s workload by assisting older adults with ADLs. These robots must be capable of dual-arm manipulation, as many ADLs require using both hands to manipulate objects. Service robots exhibit limited dual-arm manipulation abilities owing to several drawbacks in their task and motion planners. Specifically, existing task planners: 1) Spend significant time finding task plans due to combinatorial explosion; 2) Rely on pre-discretisation and programming of the robot's environment by human experts, which limits their ability to learn; 3) Focus on unimanual manipulation rather than dual-arm manipulation. This thesis proposes a novel learning-based efficient task planner using a bio-inspired action context-free grammar, paired with a motion planner, to enable service robots to achieve dual-arm manipulation in household environments. Combining the novel learning-based efficient task planner with the motion planner through a task plan execution framework forms this thesis’s efficient learning-based task and motion planner. This research accomplishes three scientific objectives: 1) Collecting a dataset of human dual-arm manipulation actions by asking subjects to perform three ADLs. A camera records the hands' movements, and an expert annotates the videos following the rules of a bio-inspired action context-free grammar; 2) Training an action prediction model using the Long Short-Term Memory network (LSTM). The resulting model (i.e., the task planner) infers a task plan to realise a high-level goal.
The LSTM is coupled with a motion planner that derives the motor control parameters; 3) Integrating this thesis’s task and motion planner into a service robot prototype to achieve dual-arm manipulation of objects. This thesis makes two contributions: 1) BiCap, a novel dataset that includes task plans annotated using the bio-inspired action context-free grammar to develop learning-based robotic dual-arm manipulation methods; 2) A unique, efficient learning-based task planner that couples the LSTM network with the bio-inspired action context-free grammar to mitigate combinatorial explosion. Four experiments were conducted to compare the efficiency of the novel task planner with Fast Downward, a state-of-the-art method. The results showed that the novel task planner significantly outperformed Fast Downward, with average task planning times of 40.22ms compared to 17,020ms for Fast Downward. Additionally, the novel planner demonstrated an ability to mitigate combinatorial explosion, maintaining consistently lower task planning times even as the complexity of the planning domain increased, with more objects and symbolic locations. The proposed planner’s average times ranged from 2.37ms to 3.92ms, while Fast Downward's ranged from 173.75ms to 707ms. The practicality of this thesis’s task and motion planner was validated by integrating it into a simulated and physical dual-arm robot prototype, which performed three ADLs: “pouring,” “passing,” and “opening.”
... Task inference: The world model instantiated with objects keeps track of symbolic information, such as whether the manipulator is holding an object or whether a bottle contains water. The task inference module identifies which tasks are currently possible, based on the state of the objects in the world model and the symbolic task preconditions, represented in the Planning Domain Definition Language (PDDL) [63]. Based on the EE distance to the target of the task and the task constraints, the inference additionally keeps track of how likely each task is to be the user's current desired task. ...
Article
Full-text available
Mobile manipulation aids aim at enabling people with motor impairments to physically interact with their environment. To facilitate the operation of such systems, a variety of components, such as suitable user interfaces and intuitive control of the system, play a crucial role. In this article, we validate our highly integrated assistive robot EDAN, operated by an interface based on bioelectrical signals, combined with shared control and a whole-body coordination of the entire system, through a case study involving people with motor impairments to accomplish real-world activities. Three individuals with amyotrophia were able to perform a range of everyday tasks, including pouring a drink, opening and driving through a door, and opening a drawer. Rather than considering these tasks in isolation, our study focuses on the continuous execution of long sequences of realistic everyday tasks.
... We examined the natural language descriptions our participants provided to identify recurring semantic components, which we then mapped onto elements in our DSL, iterating between translating more programs and updating the DSL grammar. We began by attempting to translate directly into PDDL [25], which offers a basic representation for specifying planning problems, but deviated from it as we encountered game elements our participants specified with no clear PDDL analogues. We assume that the translation process is not lossless, as there are probably multiple natural language descriptions for each underlying set of game semantics and multiple programmatic encodings of vague natural language descriptions; however, we aimed to develop representations that capture the core semantics of the rich, generative and creative structure in goals. ...
Article
Full-text available
People are remarkably capable of generating their own goals, beginning with child’s play and continuing into adulthood. Despite considerable empirical and computational work on goals and goal-oriented behaviour, models are still far from capturing the richness of everyday human goals. Here we bridge this gap by collecting a dataset of human-generated playful goals (in the form of scorable, single-player games), modelling them as reward-producing programs and generating novel human-like goals through program synthesis. Reward-producing programs capture the rich semantics of goals through symbolic operations that compose, add temporal constraints and allow program execution on behavioural traces to evaluate progress. To build a generative model of goals, we learn a fitness function over the infinite set of possible goal programs and sample novel goals with a quality-diversity algorithm. Human evaluators found that model-generated goals, when sampled from partitions of program space occupied by human examples, were indistinguishable from human-created games. We also discovered that our model’s internal fitness scores predict games that are evaluated as more fun to play and more human-like.
... The robotic environment set up for experimentation is detailed in Hmedan et al. (2022; cf. Figure 5). PDDL (Ghallab et al., 1998) is utilized for continuous adaptation of robot behavior in response to perception and human interaction. The planning solution is managed by the state-of-the-art solver PDDL4J (Pellier & Fiorino, 2018), employing hierarchical planning to determine the optimal sequence of tasks for the robot to execute, thereby minimizing human exposure. ...
Article
Full-text available
Objective: This study is a proof of concept that aims to measure the impacts of a human/cobot collaboration on the human and his task during a simulated chemistry assembly. Background: The 5th industrial revolution calls for refocusing work on the human operator, placing him or her at the center of the system. Thus, cobotic systems are increasingly implemented to support human work. In this research, we study the impact of a real-life cobot on the performance (e.g., number of errors, completion time), workload, risk exposure and acceptability of participants realizing an industrial-like assembly task. Method: Participants had to reproduce an assembly model with Duplos in collaboration with a cobot in a laboratory setting. The effects of human expertise on the task (acquired prior to the collaboration) and of the level of cobot adaptation to human safety constraints on task performance and on the operator were tested. Results: The main results report that expert participants made fewer mistakes and were less exposed to risks than non-experts. However, both of them succeeded in the task thanks to the cobot adaptation. Also, the cobot was able to adapt to human safety constraints. This adaptation led participants to expose themselves to fewer risks. Also, contrary to previous findings, experts had an acceptability score similar to that of non-experts. Conclusion: This laboratory experiment is a proof of concept demonstrating that using a cobotic solution could potentially assist humans in supporting high-risk work operations. Application: Cobotic system designers and work designers could benefit from this research's exploratory results when supporting the design of constraints in workstations for high-risk work operations.
... While there are various approaches to representing planning problems, the current standard planning language is the Planning Domain Definition Language (PDDL) [24]. PDDL is a general-purpose planning language based on the idea of separating the definition of a problem into two files: ...
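The two-file separation mentioned above can be sketched with a minimal, hypothetical Blocks World example (all names invented for illustration):

```pddl
;; domain.pddl -- what actions and predicates exist
(define (domain blocks)
  (:requirements :strips)
  (:predicates (on-table ?x) (clear ?x) (holding ?x))
  (:action pickup
    :parameters (?x)
    :precondition (and (on-table ?x) (clear ?x))
    :effect (and (holding ?x) (not (on-table ?x)) (not (clear ?x)))))

;; problem.pddl -- a specific instance: objects, initial state, goal
(define (problem blocks-1)
  (:domain blocks)
  (:objects a b)
  (:init (on-table a) (clear a) (on-table b) (clear b))
  (:goal (holding a)))
```

The domain file is reusable across instances; only the problem file changes from task to task.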
Preprint
In domains requiring intelligent agents to emulate plausible human-like behaviour, such as formative simulations, traditional techniques like behaviour trees encounter significant challenges. Large Language Models (LLMs), despite not always yielding optimal solutions, usually offer plausible and human-like responses to a given problem. In this paper, we exploit this capability and propose a novel architecture that integrates an LLM for decision-making with a classical automated planner that can generate sound plans for that decision. The combination aims to equip an agent with the ability to make decisions in various situations, even if they were not anticipated during the design phase.
Article
Symbolic task representation is a powerful tool for encoding human instructions and domain knowledge. Such instructions guide robots to accomplish diverse objectives and meet constraints through reinforcement learning (RL). Most existing methods are based on fixed mappings from environmental states to symbols. However, in inspection tasks, where equipment conditions must be evaluated from multiple perspectives to avoid errors of oversight, robots must fulfill the same symbol from different states. To help robots respond to flexible symbol mapping, we propose representing symbols and their mapping specifications separately within an RL policy. This approach requires the RL policy to learn combinations of symbolic instructions and mapping specifications, which demands an efficient learning framework. To cope with this issue, we introduce an approach for learning flexible policies called Symbolic Instructions with Adjustable Mapping Specifications (SIAMS). This paper represents symbolic instructions using linear temporal logic (LTL), a formal language that can be easily integrated into RL. Our method addresses the diversified completion patterns of instructions by (1) a specification-aware state modulation, which embeds differences in mapping specifications in state features, and (2) a symbol-number-based task curriculum, which gradually provides tasks as learning progresses. Evaluations in 3D simulations with discrete and continuous action spaces demonstrate that our method outperforms context-aware multi-task RL comparisons.
Article
Full-text available
PRODIGY is a general-purpose problem-solving architecture that serves as a basis for research in planning, machine learning, apprentice-type knowledge-refinement interfaces, and expert systems. This document is a manual for the latest version of the PRODIGY system, PRODIGY4.0, and includes descriptions of the PRODIGY representation language, control structure, user interface, abstraction module, and other features. The tutorial style is meant to provide the reader with the ability to run PRODIGY and make use of all the basic features, as well as to gradually learn the more esoteric aspects of PRODIGY4.0.
Conference Paper
Full-text available
A query evaluation process for a logic data base comprising a set of clauses is described. It is essentially a Horn clause theorem prover augmented with a special inference rule for dealing with negation. This is the negation as failure inference rule whereby ~P can be inferred if every possible proof of P fails. The chief advantage of the query evaluator described is the efficiency with which it can be implemented. Moreover, we show that the negation as failure rule only allows us to conclude negated facts that could be inferred from the axioms of the completed data base, a data base of relation definitions and equality schemas that we consider is implicitly given by the data base of clauses. We also show that when the clause data base and the queries satisfy certain constraints, which still leaves us with a data base more general than a conventional relational data base, the query evaluation process will find every answer that is a logical consequence of the completed data base.
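The negation-as-failure rule can be sketched in a few lines of Python for the propositional case. This is a toy reconstruction under the assumption of a terminating (e.g., stratified) rule base, not the paper's query evaluator; the rule-base representation is an illustrative choice:

```python
# rules: dict mapping a proposition to a list of clause bodies;
# each body is a list of literals, where a negative literal is
# written ("not", p). Assumes the rule base terminates (no loops).
def prove(goal, rules):
    """True if `goal` has at least one derivation from `rules`."""
    for body in rules.get(goal, []):           # try each clause for goal
        if all(holds(lit, rules) for lit in body):
            return True
    return False                               # every proof attempt failed

def holds(lit, rules):
    if isinstance(lit, tuple) and lit[0] == "not":
        return not prove(lit[1], rules)        # negation as failure
    return prove(lit, rules)

# p holds because q is provable and r has no clauses at all,
# so ~r succeeds by failure -- the completed-database reading.
rules = {"p": [["q", ("not", "r")]], "q": [[]]}
```

Here `prove("p", rules)` succeeds while `prove("r", rules)` fails, matching the completed data base in which r is defined to be false.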
Article
Full-text available
The last twenty years of AI planning research has discovered a wide variety of planning techniques such as state-space search, hierarchical planning, case-based planning and reactive planning. These techniques have been implemented in numerous planning systems (e.g., [12, 8, 9, 10, 11]). Initially, a number of simple toy domains have been devised to assist in the analysis and evaluation of planning systems and techniques. The most well known examples are "Blocks World" and "Towers of Hanoi." As planning systems grow in sophistication and capabilities, however, there is a clear need for planning benchmarks with matching complexity to evaluate those new features and capabilities. UM Translog is a planning domain designed specifically for this purpose. UM Translog was inspired by the CMU Transport Logistics domain developed by Manuela Veloso. UM Translog is an order of magnitude larger in size (41 actions versus 6), number of features, and types of interactions. It provides a rich set of enti...
Article
Full-text available
Means-ends analysis is a seemingly well understood search technique, which can be described, using planning terminology, as: keep adding actions that are feasible and achieve pieces of the goal. Unfortunately, it is often the case that no action is both feasible and relevant in this sense. The traditional answer is to make subgoals out of the preconditions of relevant but infeasible actions. These subgoals become part of the search state. An alternative, surprisingly good, idea is to recompute the entire subgoal hierarchy after every action. This hierarchy is represented by a greedy regression-match graph. The actions near the leaves of this graph are feasible and relevant to sub...subgoals of the original goal. Furthermore, each subgoal is assigned an estimate of the number of actions required to achieve it. This number can be shown in practice to be a useful heuristic estimator for domains that are otherwise intractable. Keywords: planning, search, means-ends analysis Reinven...
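The action-count estimate described above can be sketched in Python: for each (sub)goal literal, regress through the actions that add it and sum estimates over their preconditions. This is a simplified, additive reconstruction with invented names, not the paper's exact greedy regression-match graph:

```python
# Estimate the number of actions needed to achieve `goal` from
# `state`, by backward chaining through actions that add the goal.
# Additive over preconditions, so admissibility is not claimed.
def estimate(goal, state, actions, seen=frozenset()):
    if goal in state:
        return 0                      # already satisfied: zero actions
    if goal in seen:
        return float("inf")           # cycle in the regression: give up
    best = float("inf")
    for act in actions:               # consider actions achieving goal
        if goal in act["add"]:
            cost = 1 + sum(estimate(p, state, actions, seen | {goal})
                           for p in act["pre"])
            best = min(best, cost)
    return best

# Toy chain: at-a --(step1)--> at-b --(step2)--> at-c
state = {"at-a"}
actions = [
    {"pre": ["at-a"], "add": ["at-b"]},
    {"pre": ["at-b"], "add": ["at-c"]},
]
```

With this chain, achieving at-c is estimated at two actions; unreachable literals come back as infinity, signalling an infeasible subgoal.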
Article
This paper presents a method of solving planning problems that involve actions whose effects change according to the situations in which they are performed. The approach is an extension of the conventional planning methodology in which plans are constructed through an iterative process of scanning for goals that are not yet satisfied, inserting actions to achieve them, and introducing subgoals to these actions. This methodology was originally developed under the assumption that one would be dealing exclusively with actions that produce the same effects in every situation. The extension involves introducing additional subgoals to actions above and beyond the preconditions of execution normally introduced. These additional subgoals, called secondary preconditions, ensure that the actions are performed in contexts conducive to producing the effects we desire. This paper defines and analyzes secondary preconditions from a mathematically rigorous standpoint and demonstrates how they can be derived from regression operators.
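The idea of secondary preconditions can be sketched in Python: regressing a goal through an action whose effects are conditional adds each relevant effect's condition as an extra subgoal. This is a simplified sufficient-condition sketch with invented names, not the paper's rigorous derivation via regression operators:

```python
# action: {"pre": set_of_literals,
#          "effects": [(condition_literals, added_literal), ...]}
def regress(goal, action):
    """A set of literals that suffices before `action` so that
    every literal in `goal` holds after it (sufficient, not minimal)."""
    produced = {eff for _, eff in action["effects"]}
    sub = set(action["pre"])                 # ordinary preconditions
    for g in goal:
        if g in produced:
            for cond, eff in action["effects"]:
                if eff == g:
                    sub.update(cond)         # secondary preconditions
        else:
            sub.add(g)                       # untouched: must already hold
    return sub

# Hypothetical `paint` action: it makes block a red only if the
# brush is loaded, so regressing {"red-a"} surfaces "loaded"
# as a secondary precondition alongside "has-brush".
paint = {"pre": {"has-brush"}, "effects": [({"loaded"}, "red-a")]}
```

Regressing `{"red-a"}` through `paint` yields `{"has-brush", "loaded"}`: the ordinary precondition plus the context the conditional effect requires.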
Article
The ucpop partial order planning algorithm handles a subset of the KRSL action representation [1] that corresponds roughly with Pednault's ADL language [12] and prodigy's PDL language [10]. In particular, ucpop operates with actions defined using conditionals and nested quantification. This manual describes a Common Lisp implementation of ucpop, explains how to set up and run the planner, details the action representation syntax, and outlines the main data structures and how they are manipulated. Features include: universal quantification over dynamic universes (i.e., object creation and destruction); domain axioms over stratified theories; "facts", i.e., predicates expanding to Lisp code; specification of safety constraints in domain definition; advanced CLIM-based graphic plan-space browser with multiple views; declarative specification of search control rules; large set of domain theories & search functions for testing; expanded users manual an...
Article
Even before the advent of Artificial Intelligence, science fiction writer Isaac Asimov recognized that an agent must place the protection of humans from harm at a higher priority than obeying human orders. Inspired by Asimov, we pose the following fundamental questions: (1) How should one formalize the rich, but informal, notion of "harm"? (2) How can an agent avoid performing harmful actions, and do so in a computationally tractable manner? (3) How should an agent resolve conflict between its goals and the need to avoid harm? (4) When should an agent prevent a human from harming herself? While we address some of these questions in technical detail, the primary goal of this paper is to focus attention on Asimov's concern: society will reject autonomous agents unless we have some credible means of making them safe! The Three Laws of Robotics: 1. A robot may not injure a human being, or, through inaction, allow a human being to come to harm. 2. A robot must obey orders given it by human ...
A. Barrett, D. Christianson, M. Friedman, K. Golden, C. Kwok, J.S. Penberthy, Y. Sun, and D. Weld. UCPOP user's manual, (version 4.0). Technical Report 93-09-06d, University of Washington, Department of Computer Science and Engineering, November 1995. Available via FTP from pub/ai/ at ftp.cs.washington.edu.