Conference Paper

Goal Recognition in Latent Space

... The most probable goal that can explain the observation sequence is deduced by comparing the observation sequence and the planning sequence [10,18]. Conversely, learning-based methods primarily use historical or interactive data to acquire knowledge about the observed agent's domain [19,20]. ...
... Model-based goal recognition focuses on learning the action model and domain theory of the recognizer. Amir et al. [20,29-31] employed various learning methods to study behavior models, but did not establish a link between these models and the recognizer's strategy. Zeng et al. [32] used inverse reinforcement learning to learn the recognizer's reward and implemented a Markov-based goal recognition algorithm. ...
... However, for goal recognition, it is not necessary to learn the reward for the transition between all actions. To extract useful information from image-based domains and perform goal recognition, Amado et al. [20] used a pre-trained encoder and an LSTM network to represent and analyze observed state sequences, rather than relying on actions. Additionally, by training an LSTM-based system to recognize missing observations about states, Amado et al. [19] improved the performance of learning-based, model-based goal recognition. ...
Article
Full-text available
The problem of goal recognition involves inferring the high-level task goals of an agent based on observations of its behavior in an environment. Current methods for achieving this task rely on offline comparison inference of observed behavior in discrete environments, which presents several challenges. First, accurately modeling the behavior of the observed agent requires significant computational resources. Second, continuous simulation environments cannot be accurately recognized using existing methods. Finally, real-time computing power is required to infer the likelihood of each potential goal. In this paper, we propose an advanced and efficient real-time online goal recognition algorithm based on deep reinforcement learning in continuous domains. By leveraging the offline modeling of the observed agent’s behavior with deep reinforcement learning, our algorithm achieves real-time goal recognition. We evaluate the algorithm’s online goal recognition accuracy and stability in continuous simulation environments under communication constraints.
... where N_O is a set of operator nodes, N_A is a set of action nodes, and E is a set of edges. Operator nodes are of type DEP, UNORDERED/AND, OR, and ORDERED/AND, i.e. ...
... symbolic) methods [47,67]. Data-driven approaches train a recognition model from a large dataset [1,3,33,57,67]. The main disadvantages of this method are that often a large amount of labelled training data is required and the produced models often only work on data similar to the training set [51,66]. Since our work belongs to the category of knowledge-driven methods, data-driven methods are not further discussed. ...
... As developing the PDDL can be time-consuming and challenging, researchers have attempted to replace this manual process with deep learning methods [1,2]. We will explore the potential of learning the Action Graph structure from pairs of images and then converting the Action Graph into a PDDL-defined domain. ...
Article
Full-text available
Goal recognisers attempt to infer an agent’s intentions from a sequence of observed actions. This is an important component of intelligent systems that aim to assist or thwart actors; however, there are many challenges to overcome. For example, the initial state of the environment could be partially unknown, and agents can act suboptimally and observations could be missing. Approaches that adapt classical planning techniques to goal recognition have previously been proposed, but, generally, they assume the initial world state is accurately defined. In this paper, a state is inaccurate if any fluent’s value is unknown or incorrect. Our aim is to develop a goal recognition approach that is as accurate as the current state-of-the-art algorithms and whose accuracy does not deteriorate when the initial state is inaccurately defined. To cope with this complication, we propose solving goal recognition problems by means of an Action Graph. An Action Graph models the dependencies, i.e. order constraints, between all actions rather than just actions within a plan. Leaf nodes correspond to actions and are connected to their dependencies via operator nodes. After generating an Action Graph, the graph’s nodes are labelled with their distance from each hypothesis goal. This distance is based on the number and type of nodes traversed to reach the node in question from an action node that results in the goal state being reached. For each observation, the goal probabilities are then updated based on either the distance the observed action’s node is from each goal or the change in distance. Our experimental results, for 15 different domains, demonstrate that our approach is robust to inaccuracies within the defined initial state.
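The distance-based probability update described in the abstract above can be illustrated with a small sketch (not the authors' implementation): each goal's probability is re-weighted by how close the observed action's node is to that goal. The goal names, distances, and the exponential weighting are hypothetical stand-ins.

```python
import math

def update_goal_probs(priors, distances):
    """Re-weight each goal's probability by the observed action node's
    distance to that goal (smaller distance -> more likely), then normalize."""
    scores = {g: priors[g] * math.exp(-distances[g]) for g in priors}
    total = sum(scores.values())
    return {g: s / total for g, s in scores.items()}

priors = {"goal_a": 0.5, "goal_b": 0.5}
# The observed action's node is 1 step from goal_a and 4 steps from goal_b.
probs = update_goal_probs(priors, {"goal_a": 1, "goal_b": 4})
assert probs["goal_a"] > probs["goal_b"]
```

The update can be applied once per observation, so probabilities sharpen as evidence accumulates.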
... However, recent advances use classical planning instead of plan libraries, showing that automated planning techniques can efficiently recognize goals and plans [18]. Although much effort has been focused on improving the recognition algorithms themselves [18], recent research has focused on the quality of the domain models used to drive such algorithms [2,1,17]. Unlike most approaches, which assume that a human domain engineer can provide an accurate and complete domain model for the plan recognition algorithm, recent work on goal recognition uses the latent space [2,1] to overcome this limitation. These approaches build planning domain knowledge from raw data using a latent representation of the input data. ...
... We encode these transitions in a binary vector containing all the possible relationships that appear in the training data. This approach is similar to the one performed by Amado et al. [1]. However, instead of deriving binary vectors from an autoencoder, we consider a binary vector composed of all the relationships contained in the training set. ...
Chapter
Goal and plan recognition of daily living activities has attracted much interest due to its applicability to ambient assisted living. Such applications require the automatic recognition of high-level activities based on multiple steps performed by human beings in an environment. In this work, we address the problem of plan and goal recognition of human activities in an indoor environment. Unlike existing approaches that use only actions to identify the goal, we use objects and their relations to identify the plan and goal that the subject in the video is pursuing. Our approach combines state-of-the-art object and relationship detection to analyze raw video data with a goal recognition algorithm to identify the subject’s ultimate goal in the video. Experiments show that our approach identifies cooking activities in a kitchen scenario.
... Goal Recognition is the problem of discerning the intentions of autonomous agents or humans, given a sequence of observations as evidence of their behavior in an environment, and a domain model describing how the observed agents generate such behavior to achieve their goals [86]. Recognizing goals is important in several applications, especially for monitoring and anticipating agent behavior in an environment, including crime detection and prevention [26], monitoring activities in elder-care [25], recognizing plans in educational environments [91] and exploratory domains [52], and traffic monitoring [76], among others [26,31,53,3]. ...
... States x_k, controls u_k, and disturbances w_k are required to be part of spaces S ⊂ R^d, C ⊂ R^p, and D ⊆ R^(d+p). Controls u_k are further required to belong to the set U(x_k) ⊂ C, for each state (referred to as target regions in Control Theory). ...
... Importantly, the ablation study shows that overlooked landmarks contribute substantially to the accuracy of our approach. As future work, we envision such techniques to be instrumental in using learned planning models [5] for goal recognition [3]. ...
Preprint
Goal recognition is the problem of recognizing the intended goal of autonomous agents or humans by observing their behavior in an environment. Over the past years, most existing approaches to goal and plan recognition have ignored the need to deal with imperfections in the domain model that formalizes the environment where autonomous agents behave. In this thesis, we introduce the problem of goal recognition over imperfect domain models, and develop solution approaches that explicitly deal with two distinct types of imperfect domain models: (1) incomplete discrete domain models that have possible, rather than known, preconditions and effects in action descriptions; and (2) approximate continuous domain models, where the transition function is approximated from past observations and not well-defined. We develop novel goal recognition approaches over imperfect domain models by leveraging and adapting existing recognition approaches from the literature. Experiments and evaluation over these two types of imperfect domain models show that our novel goal recognition approaches are accurate in comparison to baseline approaches from the literature, at several levels of observability and imperfections.
... Most goal and plan recognition approaches (Avrahami-Zilberbrand and Kaminka, 2005;Geib and Goldman, 2009;Amir and Gal, 2013;Mirsky et al., 2017c) employ plan libraries to represent agent behavior, i.e., a plan library with plans for achieving goals, resulting in approaches to recognize plans that are analogous to parsing. Such techniques have been used in applications including crime detection and prevention (Geib and Goldman, 2001), monitoring activities in elder-care (Geib, 2002), recognizing plans in educational environments (Uzan et al., 2015) and exploratory domains (Mirsky et al., 2017a), and traffic monitoring (Pynadath and Wellman, 2013), among others (Geib and Goldman, 2001;Granada et al., 2017;Mirsky et al., 2017b;Amado et al., 2018). Existing work (Ramírez and Geffner, 2009;Ramírez and Geffner, 2010;Pattison and Long, 2010;Keren et al., 2014;E-Martín et al., 2015;Sohrabi et al., 2016;Pereira and Meneguzzi, 2016;Pereira et al., 2017b;Masters and Sardiña, 2017) use a planning domain definition (a domain theory) to represent potential agent behavior, bringing goal and plan recognition closer to planning algorithms. ...
... Tables A.5 and A.6 show comparative results of our heuristics and previous approaches for the second set of domains (from Kitchen to Zeno-Travel). From these tables, it is possible to see that our heuristics are both faster and more accurate than R&G 2009, R&G 2010, FGR 2015, IBM 2016, and M+L 2018. Note that the resulting combination of our filtering method and the approaches of R&G (2009 and 2010) get a substantial speedup and often accuracy improvements. ...
... For each approach, we plot its recognition results for all domains into a cloud of points, which represents (in general) how well each approach recognizes the correct hidden goal from missing and full observations. The ROC space plots show that our heuristics are not only competitive against the other five approaches (R&G 2009, R&G 2010, FGR 2015, IBM 2016, and M+L 2018) for all variations of observability, but also surpass these approaches in a substantial number of domains and problems. Figure 8a shows a comparison of our two heuristics against R&G 2009, and it is possible to see that all three compared approaches have most points in the upper left corner, showing that these approaches have similar results (R&G 2009 being slightly worse than our heuristics) regarding true and false positive results. ...
Preprint
The task of recognizing goals and plans from missing and full observations can be done efficiently by using automated planning techniques. In many applications, it is important to recognize goals and plans not only accurately, but also quickly. To address this challenge, we develop novel goal recognition approaches based on planning techniques that rely on planning landmarks. In automated planning, landmarks are properties (or actions) that cannot be avoided to achieve a goal. We show the applicability of a number of planning techniques with an emphasis on landmarks for goal and plan recognition tasks in two settings: (1) we use the concept of landmarks to develop goal recognition heuristics; and (2) we develop a landmark-based filtering method to refine existing planning-based goal and plan recognition approaches. These recognition approaches are empirically evaluated in experiments over several classical planning domains. We show that our goal recognition approaches yield not only accuracy comparable to (and often higher than) other state-of-the-art techniques, but also substantially faster recognition time over such techniques.
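The landmark-based heuristic idea in the abstract above can be sketched as follows: score each candidate goal by the fraction of its landmarks already achieved in the observations. This is a toy illustration, not the authors' code; the goal names, landmark sets, and observed facts are made up.

```python
def landmark_ratio(achieved_facts, landmarks_per_goal):
    """For each goal, return the fraction of its landmarks (properties that
    cannot be avoided to achieve the goal) already seen in the observations."""
    scores = {}
    for goal, landmarks in landmarks_per_goal.items():
        hit = sum(1 for lm in landmarks if lm in achieved_facts)
        scores[goal] = hit / len(landmarks) if landmarks else 0.0
    return scores

landmarks = {
    "cook_pasta": {"has_pot", "water_boiling", "pasta_in_pot"},
    "make_salad": {"has_bowl", "vegetables_chopped"},
}
observed = {"has_pot", "water_boiling"}
scores = landmark_ratio(observed, landmarks)
assert scores["cook_pasta"] > scores["make_salad"]
```

The same ratio can also act as a filter, discarding goals whose score falls below a threshold before running a heavier planning-based recognizer.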
... It builds a set of propositional state representations from the raw observations (e.g. images) of the environment, which can be used for classical planning as well as goal recognition (Amado et al. 2018). However, Latplan still leaves much room for improvement in terms of the interpretability and scalability that are trivially available in symbolic systems. ...
... While there are several learning-based AMA methods that approximate AMA_1 (e.g. AMA_2 (Asai and Fukunaga 2018) and Action Learner (Amado et al. 2018)), there is information loss between the learned action model and the original search space generated by FOSAE, which makes them unsuitable for our purpose of testing the feasibility of the representation. ...
... Extending the existing Action Model Acquisition methods (e.g. AMA_1, AMA_2, Action Learner (Amado et al. 2018)) or leveraging the existing work on model acquisition (Yang, Wu, and Jiang 2007; Mourão et al. 2012; Cresswell, McCluskey, and West 2013) is a promising direction. ...
Preprint
Recently, there has been increasing interest in obtaining the relational structure of the environment in the Reinforcement Learning community. However, the resulting "relations" are not the discrete, logical predicates compatible with symbolic reasoning such as classical planning or goal recognition. Meanwhile, Latplan (Asai and Fukunaga 2018) bridged the gap between deep-learning perceptual systems and symbolic classical planners. One key component of the system is a Neural Network called State AutoEncoder (SAE), which encodes an image-based input into a propositional representation compatible with classical planning. To get the best of both worlds, we propose First-Order State AutoEncoder, an unsupervised architecture for grounding first-order logic predicates and facts. Each predicate models a relationship between objects by taking interpretable arguments and returning a propositional value. In experiments using 8-Puzzle and a photo-realistic Blocksworld environment, we show that (1) the resulting predicates capture interpretable relations (e.g. spatial), (2) they help obtain a compact, abstract model of the environment, and finally, (3) the resulting model is compatible with symbolic classical planning.
... Subsequent approaches have gradually relaxed such requirements using expressive planning and plan-library-based formalisms [Avrahami-Zilberbrand and Kaminka, 2005;Geib and Steedman, 2007;Meneguzzi and Oh, 2010;Fagundes et al., 2014] as well as achieving different levels of accuracy and amount of information available in observations required to recognize goals [Martín et al., 2015;Pereira and Meneguzzi, 2016;Pereira et al., 2017]. Recent work on goal recognition in latent space [Amado et al., 2018] overcomes this limitation by building planning domain knowledge from raw data and using such domain knowledge on traditional goal recognition techniques [Pereira et al., 2017] to infer goals from image data. However, to build this domain knowledge, their approach requires a substantial amount of training data to create a complete PDDL domain. ...
... Goal recognition in latent space is a technique for applying classical goal recognition algorithms to raw data (such as images) by converting it into a latent representation [Amado et al., 2018]. In Figure 1, we provide an example of the goal recognition problem in image domains. ...
... To recognize goals in image-based domains, Amado et al. [2018] proposed four steps. First, we must develop an autoencoder capable of creating a latent representation of a state in such an image domain. ...
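The first steps of the pipeline above rely on comparing latent encodings of observed and goal states. A toy sketch of that comparison (with a stand-in for the trained autoencoder): states are binary latent vectors, and the goal whose latent vector is closest in Hamming distance to the last observed state wins. The vectors and goal names are invented for illustration.

```python
def hamming(a, b):
    """Number of positions where two binary latent vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def recognize(observed_latents, goal_latents):
    """Pick the goal hypothesis whose latent encoding is closest to the
    most recent observed latent state."""
    last = observed_latents[-1]
    return min(goal_latents, key=lambda g: hamming(goal_latents[g], last))

# Binary latent vectors, standing in for autoencoder outputs over images.
goal_latents = {"g1": (1, 0, 1, 1), "g2": (0, 1, 0, 0)}
trace = [(0, 0, 0, 0), (1, 0, 0, 0), (1, 0, 1, 0)]
assert recognize(trace, goal_latents) == "g1"
```

In the actual approach, the binary vectors come from a trained State AutoEncoder and the comparison is done by a planning-based recognizer rather than a nearest-neighbor rule.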
Preprint
Full-text available
Approaches to goal recognition have progressively relaxed the requirements about the amount of domain knowledge and available observations, yielding accurate and efficient algorithms capable of recognizing goals. However, to recognize goals in raw data, recent approaches require either human engineered domain knowledge, or samples of behavior that account for almost all actions being observed to infer possible goals. This is clearly too strong a requirement for real-world applications of goal recognition, and we develop an approach that leverages advances in recurrent neural networks to perform goal recognition as a classification task, using encoded plan traces for training. We empirically evaluate our approach against the state-of-the-art in goal recognition with image-based domains, and discuss under which conditions our approach is superior to previous ones.
... Although automated planning is a key and current research field in artificial intelligence [8,9,10], deep learning has had a limited impact on it. In fact, it is mostly used for predicting heuristics [11], for processing sensor data to create a symbolic representation of a planning problem [12], for goal recognition tasks [13,14], or in specific applications [15]. ...
... This representation can be further exploited for other purposes, such as Goal Recognition, which is defined as the task of recognising the goal that an agent is trying to achieve from observations of the agent's behaviour in the environment [28]. Using only image-based domains such as MNIST or 8-Puzzle, the work in [13] first computes a latent representation using the tool proposed by [12] and then analyses the sequence of actions using an LSTM Neural Network [29] to predict the goal. ...
Conference Paper
Despite some similarities that have been pointed out in the literature, the parallelism between automated planning and natural language processing has not been fully analysed yet. However, the success of Transformer-based models and, more generally, of deep learning techniques for NLP could open interesting research lines for automated planning as well. Therefore, in this work, we investigate whether these impressive results can be transferred to planning. In particular, we study how a BERT model trained on plans computed for three well-known planning domains is able to understand how a domain works, its actions, and how they are related to each other. To do so, we designed a variation of the typical masked language modeling task used for training BERT, and two additional experiments in which, given a sequence of consecutive actions, the model has to predict what the agent did previously (Previous Action Prediction) and what it is going to do next (Next Action Prediction).
... We extend the definition of goal recognition in latent space (i.e., image-based domains) proposed by Amado et al. (2018) to also formalize the task of plan recognition in latent space. Here, an image-based plan (alternatively, goal) recognition problem Π^I_Ω,π (alternatively Π^I_Ω,G) is a tuple ⟨Ξ_I, I_I, G_I, Ω_I⟩, where Ξ_I is domain knowledge inferred from a set of images, I_I is an image representation of an initial state, G_I is a set of image representations of goal hypotheses, which includes a correct goal G*_I (unknown to the observer), and Ω_I is a sequence of image observations. ...
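The tuple just described can be captured in a minimal container, sketched here for illustration only; the field types are hypothetical stand-ins for actual image data.

```python
from dataclasses import dataclass

@dataclass
class ImageRecognitionProblem:
    """Image-based recognition problem: <Xi_I, I_I, G_I, Omega_I>."""
    domain_knowledge: dict   # Xi_I: domain knowledge inferred from images
    initial_state: bytes     # I_I: image of the initial state
    goal_hypotheses: list    # G_I: candidate goal images (one is correct)
    observations: list       # Omega_I: sequence of observed images

problem = ImageRecognitionProblem({}, b"", [b"goal1", b"goal2"], [b"obs1"])
assert len(problem.goal_hypotheses) == 2
```

A plan-recognition variant would differ only in what the recognizer outputs (a plan rather than a goal), not in this problem structure.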
... Latent Space Datasets: To evaluate our approaches in real-world data, we generated a set of image-based datasets based on existing recognition problems (Amado et al. 2018). We select two domains from (Asai and Fukunaga 2018): MNIST 8-puzzle and Lights-Out Digital (LODIGITAL). ...
Article
Goal Recognition is the task of discerning the intended goal of an agent given a sequence of observations, whereas Plan Recognition consists of identifying the plan to achieve such an intended goal. Regardless of the underlying techniques, most recognition approaches are directly affected by the quality of the available observations. In this paper, we develop neuro-symbolic recognition approaches that combine learning and planning techniques, compensating for noise and missing observations using prior data. We evaluate our approaches in standard human-designed planning domains as well as domain models automatically learned from real-world data. Empirical experimentation shows that our approaches reliably infer goals and compute correct plans in the experimental datasets. An ablation study shows that they outperform approaches that rely exclusively on the domain model, or exclusively on machine learning, in problems with both noisy observations and low observability.
... We focus the first instance of this framework on tabular domains to enable an evaluation against existing GR baselines. Planning-based GR algorithms use PDDL as their domain descriptions, which can be easily translated into tabular representations (Ramírez and Geffner 2009;Amado et al. 2018). Figure 2 illustrates the specific components that require implementation in italics. ...
... By directly using RL, we skip this stage and learn utility functions or policies based on past actor experiences towards achieving specific goals. Amado et al. (2018) learn domain theories for GR using autoencoders. However, they require observation of all possible transitions of a domain in order to infer its encoding, whereas we need only a small sample of transitions to learn a utility function informative enough to carry out GR effectively. ...
Article
Most approaches for goal recognition rely on specifications of the possible dynamics of the actor in the environment when pursuing a goal. These specifications suffer from two key issues. First, encoding these dynamics requires careful design by a domain expert, which is often not robust to noise at recognition time. Second, existing approaches often need costly real-time computations to reason about the likelihood of each potential goal. In this paper, we develop a framework that combines model-free reinforcement learning and goal recognition to alleviate the need for careful, manual domain design, and the need for costly online executions. This framework consists of two main stages: Offline learning of policies or utility functions for each potential goal, and online inference. We provide a first instance of this framework using tabular Q-learning for the learning stage, as well as three measures that can be used to perform the inference stage. The resulting instantiation achieves state-of-the-art performance against goal recognizers on standard evaluation domains and superior performance in noisy environments.
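The two-stage framework in the abstract above (offline per-goal Q-learning, then online inference) can be sketched with one of the possible inference measures: score each goal by how well the observed state-action pairs match that goal's greedy policy. The Q-values below are hand-written stand-ins for tables learned offline, and this particular measure is an illustrative choice, not necessarily the authors'.

```python
def goal_scores(q_tables, observations):
    """Score each candidate goal: 0 means every observed action was greedy
    under that goal's Q-table; more negative means a worse match."""
    scores = {}
    for goal, q in q_tables.items():
        total = 0.0
        for state, action in observations:
            best = max(q[state].values())
            # Penalize observed actions that fall short of the greedy choice.
            total += q[state][action] - best
        scores[goal] = total
    return scores

# One hypothetical Q-table per goal, learned offline in stage one.
q_tables = {
    "left":  {"s0": {"l": 1.0, "r": 0.1}},
    "right": {"s0": {"l": 0.1, "r": 1.0}},
}
scores = goal_scores(q_tables, [("s0", "l")])
assert scores["left"] > scores["right"]
```

Because all Q-tables are learned offline, the online step is a cheap table lookup per observation, which is what avoids the costly real-time computations mentioned above.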
... They encode either an execution of an action in the model, i.e., an occurrence of a transition that describes a move of the agent in the grid, or, again, a skip '≫'. For example, the first step in γ_1 represents the occurrence of transition t_1 in the net from Figure 4, which describes the move from cell (5,0) to cell (5,1). Steps at positions one, four, and seven in γ_1 are, respectively, examples of synchronous, asynchronous (on model), and asynchronous (on trace) steps. ...
... There exists work in (statistical) learning that proposes to build GR from previous behavior data. For example, [1] and [30] learn the underlying domain transition model that can then support planningbased GR, [5] learns the decision-making model of the observed agent when executing an HTN-style plan-library (which is known a priori), while [6] and [26] are cases of end-to-end learning from observed behavior to the intended goal. Like our work, their overarching objective is to ease the traditional requirement of having the observed agent model at hand. ...
Conference Paper
Full-text available
The problem of probabilistic goal recognition consists of automatically inferring a probability distribution over a range of possible goals of an autonomous agent based on observations of its behavior. The state-of-the-art approaches for probabilistic goal recognition assume full knowledge of the world the agent operates in and of the agent's possible operations in this world. In this paper, we propose a framework for solving the probabilistic goal recognition problem using process mining techniques for discovering models that describe the observed behavior and diagnosing deviations between the discovered models and observations. The framework imitates the principles of observational learning, one of the core mechanisms of social learning exhibited by humans, and relaxes the above assumptions. It has been implemented in a publicly available tool. The reported experimental results confirm the effectiveness and efficiency of the approach for both rational and irrational agent behaviors.
... In contrast, there have so far been fewer research efforts to apply deep learning to infer goals behind long sequences of actions. The rare existing approaches use conventional deep learning architectures such as convolutional or long short-term memory networks, which have difficulty generalizing across domains (Min et al. 2016; Amado et al. 2018). For example, while it is possible to predict the goals of others walking around in one particular scenario, the learned model does not apply to a completely different one, let alone a new one with fewer samples. ...
... Using data collected from in-game interactions, an LSTM was trained to predict the player's goal from their sequence of interactions, with reliable performance. Amado et al. (2018) introduced a three-step pipeline to recognize the goal achieved by a player in different simple games (such as the 8-puzzle and the Tower of Hanoi) from constructed images of the game state. First, they convert inputs into a latent space (which is a representation of state features) using a dense auto-encoder network previously introduced in Asai and Fukunaga (2018). ...
Preprint
Full-text available
The ability to infer the intentions of others, predict their goals, and deduce their plans are critical features for intelligent agents. For a long time, several approaches investigated the use of symbolic representations and inferences with limited success, principally because it is difficult to capture the cognitive knowledge behind human decisions explicitly. The trend, nowadays, is increasingly focusing on learning to infer intentions directly from data, using deep learning in particular. We are now observing interesting applications of intent classification in natural language processing, visual activity recognition, and emerging approaches in other domains. This paper discusses a novel approach combining few-shot and transfer learning with cross-domain features, to learn to infer the intent of an agent navigating in physical environments, executing arbitrary long sequences of actions to achieve their goals. Experiments in synthetic environments demonstrate improved performance in terms of learning from few samples and generalizing to unseen configurations, compared to a deep-learning baseline approach.
... Thus the system grounds two kinds of symbols, propositional symbols and action symbols, and opens a promising direction for applying a variety of symbolic methods to the real world. The search space generated by Latplan was shown to be compatible with a symbolic Goal Recognition system (Amado et al. 2018a). Another approach, replacing SAE/AMA_2 with InfoGAN, was also proposed recently (Kurutach et al. 2018). ...
... The search space generated by Latplan was shown to be compatible with an existing Goal Recognition system (Amado et al. 2018a; 2018b). Another recent approach, replacing SAE/AMA_2 with InfoGAN (Kurutach et al. 2018), has no explicit mechanism for improving the stability of the binary representation. ...
Preprint
While classical planning has been an active branch of AI, its applicability is limited to tasks precisely modeled by humans. Fully automated high-level agents should instead be able to find a symbolic representation of an unknown environment without supervision; otherwise they exhibit the knowledge acquisition bottleneck. Meanwhile, Latplan (Asai and Fukunaga 2018) partially resolves the bottleneck with a neural network called State AutoEncoder (SAE). SAE obtains a propositional representation of image-based puzzle domains with unsupervised learning, generates a state space, and performs classical planning. In this paper, we identify the problematic, stochastic behavior of the SAE-produced propositions as a new sub-problem of the symbol grounding problem: the symbol stability problem. Informally, symbols are stable when their referents (e.g. propositional values) do not change under small perturbations of the observation, and unstable symbols are harmful for symbolic reasoning. We analyze the problem in Latplan both formally and empirically, and propose "Zero-Suppressed SAE", an enhancement that stabilizes the propositions using the idea of the closed-world assumption as a prior for NN optimization. We show that it finds more stable propositions and more compact representations, resulting in an improved success rate of Latplan. It is robust against various hyperparameters, eases the tuning effort, and also provides a weight pruning capability as a side effect.
... Recently, some ideas have appeared in the literature proposing methodologies for integrating sub-symbolic and symbolic approaches, or more generally, low-level and high-level modules [53]. On the one hand, some works tried to reconcile deep learning with planning [54,55], goal recognition [56] and the synthesis of a symbolic representation of the domain [57,58]. On the other hand, the integration has also been performed through a specific algorithm designed to produce an automated symbolic abstraction of the low-level information acquired by an exploring agent [59] in terms of a high-level planning representation such as the PDDL formalism [60], which explicitly describes the context necessary to execute an action on the current state (i.e., the preconditions and the effects) making use of symbols. ...
Preprint
Full-text available
Recently, AI systems have made remarkable progress in various tasks. Deep Reinforcement Learning (DRL) is an effective tool for agents to learn policies in low-level state spaces to solve highly complex tasks. Researchers have introduced Intrinsic Motivation (IM) to the RL mechanism, which simulates the agent's curiosity, encouraging agents to explore interesting areas of the environment. This new feature has proved vital in enabling agents to learn policies without being given specific goals. However, even though DRL intelligence emerges through a sub-symbolic model, there is still a need for some form of abstraction to understand the knowledge collected by the agent. To this end, the classical planning formalism has been used in recent research to explicitly represent the knowledge an autonomous agent acquires and to effectively reach extrinsic goals. Although classical planning usually presents limited expressive capabilities, PPDDL has demonstrated its usefulness in reviewing the knowledge gathered by an autonomous system, making causal correlations explicit, and can be exploited to find a plan to reach any state the agent faces during its experience. This work presents a new architecture implementing an open-ended learning system able to synthesize its experience into a PPDDL representation from scratch and update it over time. Without a predefined set of goals and tasks, the system integrates intrinsic motivations to explore the environment in a self-directed way, exploiting the high-level knowledge acquired during its experience. The system explores the environment and iteratively: (a) discovers options, (b) explores the environment using options, (c) abstracts the knowledge collected, and (d) plans. This paper proposes an alternative approach to implementing open-ended learning architectures, exploiting low-level and high-level representations to extend its knowledge in a virtuous loop.
... This task is relevant in many application domains like crime detection [6], pervasive computing [19,5], or traffic monitoring [12]. Previous goal recognition systems often rely on the principle of Plan Recognition As Planning (PRAP) and, hence, utilize concepts and algorithms from the classical planning community to solve the goal recognition problem [13,14,17,1]. Nevertheless, a fundamental limitation of many systems from this area, which require computing entire plans to solve a goal recognition problem, is their computational complexity. ...
Preprint
Full-text available
We present a new approach to goal recognition that involves comparing observed facts with their expected probabilities. These probabilities depend on a specified goal g and initial state s0. Our method maps these probabilities and observed facts into a real vector space to compute heuristic values for potential goals. These values estimate the likelihood of a given goal being the true objective of the observed agent. As obtaining exact expected probabilities for observed facts in an observation sequence is often practically infeasible, we propose and empirically validate a method for approximating these probabilities. Our empirical results show that the proposed approach offers improved goal recognition precision compared to state-of-the-art techniques while reducing computational complexity.
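The heuristic described above can be sketched roughly as follows. This is a minimal illustration, not the paper's exact formulation: the fixed fact vocabulary, the Euclidean distance, and all names are assumptions made for the example.

```python
import math

# Hypothetical sketch: each candidate goal g has a vector of expected
# probabilities for a fixed set of facts (given goal g and initial
# state s0); an observation sequence yields a vector of which facts
# were actually observed.
def goal_heuristic(observed, expected_by_goal):
    """Rank candidate goals by how closely the expected fact
    probabilities under each goal match the observed fact vector
    (smaller distance = more likely goal)."""
    def dist(exp):
        return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, exp)))
    return sorted(expected_by_goal, key=lambda g: dist(expected_by_goal[g]))

observed = [1.0, 0.0, 1.0, 1.0]          # facts seen in the trace
expected = {
    "goal_A": [0.9, 0.1, 0.8, 0.7],      # assumed P(fact | s0, goal_A)
    "goal_B": [0.2, 0.9, 0.3, 0.1],
}
ranking = goal_heuristic(observed, expected)
```

Here `goal_A` ranks first because its expected fact probabilities lie closest to what was actually observed.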
... A Model-Free GR problem (MFGR) (Geffner, 2018) is one in which the recognizer does not have access to the underlying model that describes the properties and dynamics of the environment. While some approaches learn the model dynamics and then employ MBGR methods (Asai & Fukunaga, 2018;Amado et al., 2018), other approaches perform GR directly, without learning the model of the world. Among these approaches, the main techniques include leveraging RL (Amado et al., 2022) or employing deep neural networks to perform GR using a classification network (Min et al., 2014;Borrajo et al., 2020;Chiari et al., 2023). ...
Preprint
Full-text available
Traditionally, Reinforcement Learning (RL) problems are aimed at optimization of the behavior of an agent. This paper proposes a novel take on RL, which is used to learn the policy of another agent, to allow real-time recognition of that agent's goals. Goal Recognition (GR) has traditionally been framed as a planning problem where one must recognize an agent's objectives based on its observed actions. Recent approaches have shown how reinforcement learning can be used as part of the GR pipeline, but are limited to recognizing predefined goals and lack scalability in domains with a large goal space. This paper formulates a novel problem, "Online Dynamic Goal Recognition" (ODGR), as a first step to address these limitations. Contributions include introducing the concept of dynamic goals into the standard GR problem definition, revisiting common approaches by reformulating them using ODGR, and demonstrating the feasibility of solving ODGR in a navigation domain using transfer learning. These novel formulations open the door for future extensions of existing transfer learning-based GR methods, which will be robust to changing and expansive real-time environments.
... GRNet is more general, as it can be applied to any domain for which the sets of fluents and actions (F and A) are known. In order to extract useful information from image-based domains and perform goal recognition, Amado et al. (2018) used a pre-trained encoder and an LSTM network for representing and analysing a sequence of observed states, rather than actions as in our approach. Amado et al. (2020) trained an LSTM-based system to identify missing observations about states in order to derive a more complete sequence of states by which a MBGR system can obtain better performance. ...
Article
Recognising the goal of an agent from a trace of observations is an important task with many applications. The state-of-the-art approach to goal recognition (GR) relies on the application of automated planning techniques. We study an alternative approach, called GRNet, where GR is formulated as a classification task addressed by machine learning. GRNet is primarily aimed at solving GR instances more accurately and more quickly by learning how to solve them in a given domain, which is specified by a set of propositions and a set of action names. The goal classification instances in the domain are solved by a Recurrent Neural Network (RNN). The only information required as input of the trained RNN is a trace of action labels, each one indicating just the name of an observed action. A run of the RNN processes a trace of observed actions to compute how likely it is that each domain proposition is part of the agent's goal, for the problem instance under consideration. These predictions are then aggregated to choose one of the candidate goals. An experimental analysis confirms that GRNet achieves good performance in terms of both goal classification accuracy and runtime, obtaining better results than a state-of-the-art GR system over the considered benchmarks. Moreover, such a state-of-the-art system and GRNet can be combined, achieving higher performance than either of the two integrated systems alone.
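GRNet's final aggregation step, in which per-proposition probabilities are combined to choose a candidate goal, might look roughly like this. The RNN that produces `prop_probs` is omitted, averaging is an assumed aggregation rule, and the proposition names are invented for illustration.

```python
# Sketch of an aggregation step over the RNN's per-proposition
# outputs. `prop_probs` stands in for the trained network's
# predictions; the mean is an assumed scoring rule.
def select_goal(prop_probs, candidate_goals):
    """Score each candidate goal by the mean predicted probability
    of its propositions and return the best-scoring goal."""
    def score(goal):
        return sum(prop_probs[p] for p in goal) / len(goal)
    return max(candidate_goals, key=score)

prop_probs = {"on_a_b": 0.9, "clear_c": 0.8, "on_b_c": 0.2, "holding_d": 0.1}
candidates = [frozenset({"on_a_b", "clear_c"}),
              frozenset({"on_b_c", "holding_d"})]
best = select_goal(prop_probs, candidates)
```

With these hypothetical predictions, the goal {on_a_b, clear_c} is selected because its propositions have the highest average predicted probability.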
... This task is relevant in many real-world application domains like crime detection [5], pervasive computing [19], [4], or traffic monitoring [12]. State-of-the-art goal recognition systems often rely on the principle of Plan Recognition As Planning (PRAP) and hence, utilize classical planning systems to solve the goal recognition problem [13], [14], [17], [1]. However, many of these approaches require quite large amounts of computation time to calculate a solution. ...
Preprint
Full-text available
Goal recognition is an important problem in many application domains (e.g., pervasive computing, intrusion detection, computer games, etc.). In many application scenarios, it is important that goal recognition algorithms can recognize the goals of an observed agent as fast as possible. However, many early approaches in the area of Plan Recognition As Planning require quite large amounts of computation time to calculate a solution. Mainly to address this issue, Pereira et al. recently developed an approach that is based on planning landmarks and is much more computationally efficient than previous approaches. However, the approach, as proposed by Pereira et al., also uses trivial landmarks (i.e., facts that are part of the initial state or goal description, which are landmarks by definition). In this paper, we show that it does not provide any benefit to use landmarks that are part of the initial state in a planning-landmark-based goal recognition approach. The empirical results show that omitting initial-state landmarks improves goal recognition performance.
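The core observation, that initial-state landmarks are trivially satisfied for every goal candidate and therefore carry no discriminative evidence, can be sketched as a simple filtering step. The data structures here are hypothetical.

```python
# Sketch: landmarks that already hold in the initial state are
# achieved regardless of which goal the agent pursues, so dropping
# them cannot lose goal-relevant evidence.
def useful_landmarks(landmarks_by_goal, initial_state):
    """Remove trivial landmarks (those true in the initial state)
    from each candidate goal's landmark set."""
    return {g: {l for l in lms if l not in initial_state}
            for g, lms in landmarks_by_goal.items()}

landmarks = {"g1": {"at_start", "has_key"}, "g2": {"at_start", "at_vault"}}
filtered = useful_landmarks(landmarks, initial_state={"at_start"})
```

After filtering, only the landmarks that distinguish the goals (`has_key` vs. `at_vault`) remain.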
... Concerning GR systems using neural networks, some works use them for specific applications, such as game playing (Min et al. 2016). GRNet is more general, as it can be applied to any domain for which the sets F and A are known. In order to extract useful information from image-based domains and perform goal recognition, Amado et al. (2018) used a pre-trained encoder and an LSTM network for representing and analysing a sequence of observed states, rather than actions as in our approach. Amado et al. (2020) trained an LSTM-based system to identify missing observations about states in order to derive a more complete sequence of states by which a MBGR system can obtain better performance. ...
Preprint
Full-text available
In automated planning, recognising the goal of an agent from a trace of observations is an important task with many applications. The state-of-the-art approaches to goal recognition rely on the application of planning techniques, which requires a model of the domain actions and of the initial domain state (written, e.g., in PDDL). We study an alternative approach where goal recognition is formulated as a classification task addressed by machine learning. Our approach, called GRNet, is primarily aimed at making goal recognition more accurate as well as faster by learning how to solve it in a given domain. Given a planning domain specified by a set of propositions and a set of action names, the goal classification instances in the domain are solved by a Recurrent Neural Network (RNN). A run of the RNN processes a trace of observed actions to compute how likely it is that each domain proposition is part of the agent's goal, for the problem instance under consideration. These predictions are then aggregated to choose one of the candidate goals. The only information required as input of the trained RNN is a trace of action labels, each one indicating just the name of an observed action. An experimental analysis confirms that GRNet achieves good performance in terms of both goal classification accuracy and runtime, obtaining better performance than a state-of-the-art goal recognition system over the considered benchmarks.
... Previous work in goal and plan recognition has typically relied on rich domain knowledge (e.g., Kautz and Allen 1986;Ramírez and Geffner 2011), thus limiting the applicability of this body of work. To leverage the existence of large datasets and machine learning techniques, some approaches to goal recognition eschew assumptions about domain knowledge and instead propose to learn models from data and use these models to predict an agent's goal given a sequence of observations (e.g., Geib and Kantharaju 2018;Amado et al. 2018;Polyvyanyy et al. 2020). Such approaches either learn models of the dynamics that govern the environment which are then used in goal recognition, or directly learn classifiers that are given a sequence of observations and predict the goal. ...
Article
Sequence classification is the task of predicting a class label given a sequence of observations. In many applications such as healthcare monitoring or intrusion detection, early classification is crucial to prompt intervention. In this work, we learn sequence classifiers that favour early classification from an evolving observation trace. While many state-of-the-art sequence classifiers are neural networks, and in particular LSTMs, our classifiers take the form of finite state automata and are learned via discrete optimization. Our automata-based classifiers are interpretable---supporting explanation, counterfactual reasoning, and human-in-the-loop modification---and have strong empirical performance. Experiments over a suite of goal recognition and behaviour classification datasets show our learned automata-based classifiers to have comparable test performance to LSTM-based classifiers, with the added advantage of being interpretable.
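A toy version of such an automaton-based classifier can illustrate the idea. The transition table below is hand-written, whereas in the paper it is learned by discrete optimization, and the domain symbols are invented for illustration.

```python
# Toy automaton-based sequence classifier: a DFA consumes the
# observation trace symbol by symbol, and the label attached to the
# final state is the prediction.
def classify(trace, delta, start, label):
    """Run the DFA over the trace; unknown (state, symbol) pairs
    self-loop, and the label of the final state is returned."""
    state = start
    for symbol in trace:
        state = delta.get((state, symbol), state)
    return label[state]

delta = {(0, "login"): 1, (1, "read"): 2, (1, "delete"): 3}
label = {0: "unknown", 1: "unknown", 2: "benign", 3: "intrusion"}
prediction = classify(["login", "delete"], delta, 0, label)
```

Because the automaton can be queried after every prefix of the trace, it naturally supports the early-classification setting the paper targets, and its states and transitions are directly inspectable.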
... Precondition extraction: To extract the preconditions of an action a, we propose a simple ad-hoc method that is similar to (Wang, 1994) and (Amado et al., 2018). It looks for bits that always have the same value when a is used. ...
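The precondition-extraction idea described in this context, finding latent bits that always have the same value whenever an action is used, can be sketched as follows. The data format (bit-tuple states paired with action labels) is an assumption for illustration.

```python
from collections import defaultdict

# Sketch: for each action, collect every state in which it was used
# and keep the bit positions whose value never varies; those
# invariant bits are candidate preconditions.
def extract_preconditions(transitions):
    """transitions: iterable of (state, action) pairs, where state
    is a tuple of 0/1 bits. Returns {action: {bit_index: value}}."""
    seen = defaultdict(list)
    for state, action in transitions:
        seen[action].append(state)
    preconds = {}
    for action, states in seen.items():
        first = states[0]
        preconds[action] = {i: v for i, v in enumerate(first)
                            if all(s[i] == v for s in states)}
    return preconds

transitions = [((1, 0, 1), "a"), ((1, 1, 1), "a"), ((0, 0, 0), "b")]
preconds_a = extract_preconditions(transitions)["a"]
```

For action `a`, bits 0 and 2 are always 1 while bit 1 varies, so only bits 0 and 2 are extracted as preconditions.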
Article
Current domain-independent, classical planners require symbolic models of the problem domain and instance as input, resulting in a knowledge acquisition bottleneck. Meanwhile, although deep learning has achieved significant success in many fields, the knowledge is encoded in a subsymbolic representation which is incompatible with symbolic systems such as planners. We propose Latplan, an unsupervised architecture combining deep learning and classical planning. Given only an unlabeled set of image pairs showing a subset of transitions allowed in the environment (training inputs), Latplan learns a complete propositional PDDL action model of the environment. Later, when a pair of images representing the initial and the goal states (planning inputs) is given, Latplan finds a plan to the goal state in a symbolic latent space and returns a visualized plan execution. We evaluate Latplan using image-based versions of six planning domains: 8-Puzzle, 15-Puzzle, Blocksworld, Sokoban, and two variations of LightsOut.
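Latplan's symbolic search step, finding a plan between binary latent states, can be illustrated with a toy breadth-first search. The successor function below is hand-written (flip one bit), whereas Latplan learns the action model from image pairs, and the encoder/decoder stages are omitted entirely.

```python
from collections import deque

# Toy sketch of planning in a binary latent space: states are bit
# tuples and the "action model" is a stand-in successor function.
def plan_in_latent_space(init, goal, successors):
    """Breadth-first search from init to goal; returns the state
    sequence of a shortest plan, or None if unreachable."""
    frontier, parents = deque([init]), {init: None}
    while frontier:
        s = frontier.popleft()
        if s == goal:
            path = []
            while s is not None:
                path.append(s)
                s = parents[s]
            return path[::-1]
        for t in successors(s):
            if t not in parents:
                parents[t] = s
                frontier.append(t)
    return None

def flip_one_bit(state):          # stand-in for the learned action model
    return [tuple(b ^ (i == j) for j, b in enumerate(state))
            for i in range(len(state))]

plan = plan_in_latent_space((0, 0, 0), (1, 1, 0), flip_one_bit)
```

In the full system, the initial and goal images would first be encoded into such latent vectors, and the resulting state sequence would be decoded back into images for visualization.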
... Pereira et al. (2019) combine deep learning with planning techniques to recognize goals with continuous action spaces. Amado et al. (2018) also use deep learning in an unsupervised fashion to lessen the need for domain expertise in goal recognition approaches; Polyvyanyy et al. (2020) take a similar approach, but using process mining techniques. To learn these models, existing data of agents' behaviors is required to learn these models. ...
Article
Full-text available
Goal or intent recognition, where one agent recognizes the goals or intentions of another, can be a powerful tool for effective teamwork and improving interaction between agents. Such reasoning can be challenging to perform, however, because observations of an agent can be unreliable and, often, an agent does not have access to the reasoning processes and mental models of the other agent. Despite this difficulty, recent work has made great strides in addressing these challenges. In particular, two Artificial Intelligence (AI)-based approaches to goal recognition have recently been shown to perform well: goal recognition as planning, which reduces a goal recognition problem to the problem of plan generation; and Combinatory Categorial Grammars (CCGs), which treat goal recognition as a parsing problem. Additionally, new advances in cognitive science with respect to Theory of Mind reasoning have yielded an approach to goal recognition that leverages analogy in its decision making. However, there is still much unknown about the potential and limitations of these approaches, especially with respect to one another. Here, we present an extension of the analogical approach to a novel algorithm, Refinement via Analogy for Goal Reasoning (RAGeR). We compare RAGeR to two state-of-the-art approaches which use planning and CCGs for goal recognition, respectively, along two different axes: reliability of observations and inspectability of the other agent's mental model. Overall, we show that no approach dominates across all cases and discuss the relative strengths and weaknesses of these approaches. Scientists interested in goal recognition problems can use this knowledge as a guide to select the correct starting point for their specific domains and tasks.
... Amado et al. (2018b) applied state-of-the-art goal recognition methods to binary latent states and an action model obtained by a simple clustering-based algorithm. Amado, Aires, Pereira, Magnaguagno, Granada, and Meneguzzi (2018a) enhanced the approach through the use of LSTMs. ...
Preprint
Current domain-independent, classical planners require symbolic models of the problem domain and instance as input, resulting in a knowledge acquisition bottleneck. Meanwhile, although deep learning has achieved significant success in many fields, the knowledge is encoded in a subsymbolic representation which is incompatible with symbolic systems such as planners. We propose Latplan, an unsupervised architecture combining deep learning and classical planning. Given only an unlabeled set of image pairs showing a subset of transitions allowed in the environment (training inputs), Latplan learns a complete propositional PDDL action model of the environment. Later, when a pair of images representing the initial and the goal states (planning inputs) is given, Latplan finds a plan to the goal state in a symbolic latent space and returns a visualized plan execution. We evaluate Latplan using image-based versions of six planning domains: 8-Puzzle, 15-Puzzle, Blocksworld, Sokoban, and two variations of LightsOut.
... In the case of activity recognition, recurrent neural networks have been demonstrated to be very useful at classifying activities that are short in duration but have a natural ordering, thanks to their ability to take the context into account (Hammerla et al., 2016). Amado et al. (2018a) proposed the use of long short-term memory networks for a goal recognition task dealing with sensory input data, requiring much less manual introduction of domain knowledge than other state-of-the-art goal recognition approaches. Ordóñez and Roggen (2016) combined convolutional neural networks with long short-term memory networks for the task of activity recognition. ...
Article
Full-text available
Recognizing the actions, plans, and goals of a person in an unconstrained environment is a key feature that future robotic systems will need in order to achieve a natural human-machine interaction. Indeed, we humans are constantly understanding and predicting the actions and goals of others, which allows us to interact in intuitive and safe ways. While action and plan recognition are tasks that humans perform naturally and with little effort, they are still an unresolved problem from the point of view of artificial intelligence. The immense variety of possible actions and plans that may be encountered in an unconstrained environment leaves current approaches far from human-like performance. In addition, while very different types of algorithms have been proposed to tackle the problem of activity, plan, and goal (intention) recognition, these tend to focus on only one part of the problem (e.g., action recognition), and techniques that address the problem as a whole have not been so thoroughly explored. This review is meant to provide a general view of the problem of activity, plan, and goal recognition as a whole. It presents a description of the problem, both from the human perspective and from the computational perspective, and proposes a classification of the main types of approaches that have been proposed to address it (logic-based, classical machine learning, deep learning, and brain-inspired), together with a description and comparison of the classes. This general view of the problem can help in the identification of research gaps, and may also provide inspiration for the development of new approaches that address the problem in a unified way.
... Previous work in goal and plan recognition has typically relied on rich domain knowledge (e.g., (Kautz and Allen 1986;Geib and Goldman 2001;Ramírez and Geffner 2011;Pereira, Oren, and Meneguzzi 2017)), thus limiting the applicability of this body of work. To leverage the existence of large datasets and machine learning techniques, some approaches to goal recognition eschew assumptions about domain knowledge and instead propose to learn models from data and use the learned models to predict an agent's goal given a sequence of observations (e.g., (Geib and Kantharaju 2018;Amado et al. 2018;Polyvyanyy et al. 2020)). Our work partially shares its motivation with this body of work and proposes to learn models from data that offer a set of interpretability services, are optimized for early prediction, and demonstrate a capacity to generalize in noisy sequence classification settings. ...
Preprint
Sequence classification is the task of predicting a class label given a sequence of observations. In many applications such as healthcare monitoring or intrusion detection, early classification is crucial to prompt intervention. In this work, we learn sequence classifiers that favour early classification from an evolving observation trace. While many state-of-the-art sequence classifiers are neural networks, and in particular LSTMs, our classifiers take the form of finite state automata and are learned via discrete optimization. Our automata-based classifiers are interpretable---supporting explanation, counterfactual reasoning, and human-in-the-loop modification---and have strong empirical performance. Experiments over a suite of goal recognition and behaviour classification datasets show our learned automata-based classifiers to have comparable test performance to LSTM-based classifiers, with the added advantage of being interpretable.
... We then decode the recognized goal, obtaining its image representation using the decoder. We illustrate this process in Figure 1(c), and detail the process in Amado et al. 2018. ...
Article
Recent approaches to goal recognition have progressively relaxed the requirements about the amount of domain knowledge and available observations, yielding accurate and efficient algorithms. These approaches, however, assume that there is a domain expert capable of building complete and correct domain knowledge to successfully recognize an agent's goal. This is too strong for most real-world applications. We overcome these limitations by combining goal recognition techniques from automated planning, and deep autoencoders to carry out unsupervised learning to generate domain theories from data streams and use the resulting domain theories to deal with incomplete and noisy observations. Moving forward, we aim to develop a new data-driven goal recognition technique that infers the domain model using the same set of observations used in recognition itself.
... Some works explored LSTM networks trained on observed data for the task of goal recognition (Min et al. 2016;Amado et al. 2018). However, these networks were trained and applied in single environment domains, and it is not realistic to expect they could generalize to multiple environments. ...
Preprint
Full-text available
Being able to infer the goal of people we observe, interact with, or read stories about is one of the hallmarks of human intelligence. A prominent idea in current goal-recognition research is to infer the likelihood of an agent's goal from the estimations of the costs of plans to the different goals the agent might have. Different approaches implement this idea by relying only on handcrafted symbolic representations. Their application to real-world settings is, however, quite limited, mainly because extracting rules for the factors that influence goal-oriented behaviors remains a complicated task. In this paper, we introduce a novel idea of using a symbolic planner to compute plan-cost insights, which augment a deep neural network with an imagination capability, leading to improved goal recognition accuracy in real and synthetic domains compared to a symbolic recognizer or a deep-learning goal recognizer alone.
... Methods for intention recognition can be broadly categorised as data-driven and knowledge-driven methods [28,38]. Data-driven approaches train a recognition model from a large dataset [1,3,33,38]. The main disadvantages of this method are that a large amount of labelled training data is often required and that the produced models often only work on data similar to the training set [32,37]. ...
Article
Full-text available
Smart environments can already observe the actions of a human through pervasive sensors. Based on these observations, our work aims to predict the actions a human is likely to perform next. Predictions can enable a robot to proactively assist humans by autonomously executing an action on their behalf. In this paper, Action Graphs are introduced to model the order constraints between actions. Action Graphs are derived from a problem defined in Planning Domain Definition Language (PDDL). When an action is observed, the node values are updated and next actions predicted. Subsequently, a robot executes one of the predicted actions if it does not impact the flow of the human by obstructing or delaying them. Our Action Graph approach is applied to a kitchen domain.
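A minimal sketch of the Action Graph idea, assuming nodes are actions and edges are ordering constraints derived from the PDDL problem. The exact node-value update used in the paper may differ; here, observing an action marks it done, and the predicted next actions are those whose predecessors are all done.

```python
# Hypothetical Action Graph sketch: edges (a, b) mean action a must
# precede action b. Observations update the graph; prediction returns
# the actions that have become executable.
class ActionGraph:
    def __init__(self, edges):
        self.preds = {}
        for a, b in edges:                      # a must precede b
            self.preds.setdefault(b, set()).add(a)
            self.preds.setdefault(a, set())
        self.done = set()

    def observe(self, action):
        """Record that the human performed this action."""
        self.done.add(action)

    def predict_next(self):
        """Actions not yet done whose predecessors are all done."""
        return {a for a in self.preds
                if a not in self.done and self.preds[a] <= self.done}

g = ActionGraph([("get_cup", "pour_coffee"), ("boil_water", "pour_coffee")])
g.observe("get_cup")
g.observe("boil_water")
```

After both preparatory actions are observed, `pour_coffee` is the predicted next action, which a robot could then execute proactively on the human's behalf.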
... 4) The decoding module maps the propositional plan back to an image sequence. The proposed framework opened a promising direction for applying a variety of symbolic methods to the real world. For example, the search space generated by Latplan was shown to be compatible with a symbolic Goal Recognition system (Amado et al. 2018a; 2018b). Several variations replacing the state encoding modules have also been proposed: Causal InfoGAN (Kurutach et al. 2018) uses a GAN-based framework, First-Order SAE (Asai 2019) learns the First Order Logic symbols (instead of the propositional ones), and Zero-Suppressed SAE (Asai ...). [Figure 1: an example DSAMA compilation result for the first action, a PDDL action a0 whose conditional effects set the latent propositions z0, z1, ....] ...
Preprint
Recent work on Neural-Symbolic systems that learn the discrete planning model from images has opened a promising direction for expanding the scope of Automated Planning and Scheduling to raw, noisy data. However, previous work only partially addressed this problem, utilizing the black-box neural model as the successor generator. In this work, we propose Double-Stage Action Model Acquisition (DSAMA), a system that obtains a descriptive PDDL action model with explicit preconditions and effects over propositional variables learned from images without supervision. DSAMA trains a set of Random Forest rule-based classifiers and compiles them into logical formulae in PDDL. While we obtained a competitively accurate PDDL model compared to a black-box model, we observed that the resulting PDDL is too large and complex for state-of-the-art standard planners such as Fast Downward, primarily due to the PDDL-SAS+ translator bottleneck. From this negative result, we argue that this translator bottleneck cannot be addressed just by using a different, existing rule-based learning method, and we point to potential future directions.
... Methods for GR can be broadly categorised as data-driven and knowledge-driven methods [4,28]. Data-driven approaches train a recognition model from a large dataset [4,29-31]. The main disadvantages of this method are that a large amount of labelled training data is often required, and the produced models often only work on data similar to the training set [32,33]. ...
Article
Full-text available
Goal recognition is an important component of many context-aware and smart environment services; however, a person’s goal often cannot be determined until their plan nears completion. Therefore, by modifying the state of the environment, our work aims to reduce the number of observations required to recognise a human’s goal. These modifications result either in actions in the available plans being replaced with more distinctive actions, or in the possibility of performing some actions being removed, so that humans are forced to take an alternative (more distinctive) plan. In our solution, a symbolic representation of actions and the world state is transformed into an Action Graph, which is then traversed to discover the non-distinctive plan prefixes. These prefixes are processed to determine which actions should be replaced or removed. For action replacement, we developed an exhaustive approach and an approach that shrinks the plans and then reduces the non-distinctive plan prefixes, namely Shrink–Reduce. Exhaustive is guaranteed to find the minimal distinctiveness but is more computationally expensive than Shrink–Reduce. These approaches are compared using a test domain with varying numbers of goals, variables and values, and a realistic kitchen domain. Our action removal method is shown to increase the distinctiveness of various grid-based navigation problems, with a width/height ranging from 4 to 16 and between 2 and 14 randomly selected goals, by an average of 3.27 actions in an average time of 4.69 s, whereas a state-of-the-art approach often breaches a 10 min time limit.
... While there are several learning-based AMA methods that approximate AMA 1 (e.g. AMA 2 (Asai & Fukunaga, 2018) and Action Learner (Amado, Pereira, Aires, Magnaguagno, Granada, & Meneguzzi, 2018b;Amado, Aires, Pereira, Magnaguagno, Granada, & Meneguzzi, 2018a)), there is information loss between the learned action model and the original search space generated. ...
Preprint
In this report, we introduce an artificial dataset generator for the Photo-realistic Blocksworld domain. Blocksworld is one of the oldest high-level task planning domains; it is well defined but contains sufficient complexity, e.g., conflicting subgoals and decomposability into subproblems. We aim to make this dataset a benchmark for Neural-Symbolic integrated systems and to accelerate research in this area. The key advantage of such systems is the ability to obtain a symbolic model from real-world input and to apply fast, systematic, and complete symbolic reasoning algorithms, without any supervision or reward signal from the environment.
Chapter
Goal recognition is an important problem in many application domains (e.g., pervasive computing, intrusion detection, computer games, etc.). In many application scenarios, it is important that goal recognition algorithms can recognize the goals of an observed agent as fast as possible. However, many early approaches in the area of Plan Recognition As Planning require quite large amounts of computation time to calculate a solution. Mainly to address this issue, Pereira et al. recently developed an approach that is based on planning landmarks and is much more computationally efficient than previous approaches. However, the approach, as proposed by Pereira et al., considers trivial landmarks (i.e., facts that are part of the initial state or goal description, which are landmarks by definition) for goal recognition. In this paper, we show that it does not provide any benefit to use landmarks that are part of the initial state in a planning-landmark-based goal recognition approach. The empirical results show that omitting initial-state landmarks improves goal recognition performance.
Article
Multi-agent systems is an evolving discipline that encompasses many different branches of research. The long-standing Agents at Aberdeen (A³) group undertakes research across several areas of multi-agent systems, focusing in particular on aspects related to resilience, reliability, and coordination. In this article we introduce the group and highlight past research successes in those themes, building a picture of the strengths within the group. We close the paper by outlining the future direction of the group and identifying key open challenges and our vision towards solving them.
Article
There are many scenarios in which a mobile agent may not want its path to be predictable. Examples include preserving privacy or confusing an adversary. However, this desire for deception can conflict with the need for a low path cost. Optimal plans such as those produced by RRT* may have low path cost, but their optimality makes them predictable. Similarly, a deceptive path that features numerous zig-zags may take too long to reach the goal. We address this trade-off by drawing inspiration from adversarial machine learning. We propose a new planning algorithm, which we title Adversarial RRT*. Adversarial RRT* attempts to deceive machine learning classifiers by incorporating a predicted measure of deception into the planner cost function. Adversarial RRT* considers both path cost and a measure of predicted deceptiveness in order to produce a trajectory with low path cost that still has deceptive properties. We demonstrate the performance of Adversarial RRT*, with two measures of deception, using a simulated Dubins vehicle. We show how Adversarial RRT* can decrease cumulative RNN accuracy across paths to 10%, compared to 46% cumulative accuracy on near-optimal RRT* paths, while keeping path length within 16% of optimal. We also present an example demonstration where the Adversarial RRT* planner attempts to safely deliver a high value package while an adversary observes the path and tries to intercept the package.
Article
Recognizing goals and plans from complete or partial observations can be efficiently achieved through automated planning techniques. In many applications, it is important to recognize goals and plans not only accurately, but also quickly. To address this challenge, we develop novel goal recognition approaches based on planning techniques that rely on planning landmarks. In automated planning, landmarks are properties (or actions) that cannot be avoided to achieve a goal. We show the applicability of a number of planning techniques with an emphasis on landmarks for goal recognition tasks in two settings: (1) we use the concept of landmarks to develop goal recognition heuristics; and (2) we develop a landmark-based filtering method to refine existing planning-based goal and plan recognition approaches. These recognition approaches are empirically evaluated in experiments over several classical planning domains. We show that our goal recognition approaches yield not only accuracy comparable to (and often higher than) other state-of-the-art techniques, but also result in substantially faster recognition time over existing techniques.
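The landmark-based goal recognition heuristic described above can be sketched as a completion ratio: the fraction of each candidate goal's landmarks already achieved in the observations. This is a simplification of the paper's heuristics, and the goal names and landmark sets are hypothetical.

```python
# Sketch of a landmark completion-ratio heuristic: goals whose
# unavoidable properties (landmarks) have mostly been observed are
# ranked as more likely.
def landmark_completion(observations, landmarks_by_goal):
    """Return, per goal, the fraction of its landmarks already
    achieved in the observed facts; higher means more likely."""
    achieved = set(observations)
    return {g: len(lms & achieved) / len(lms)
            for g, lms in landmarks_by_goal.items()}

landmarks = {"G1": {"at_door", "has_key"}, "G2": {"at_window", "has_rope"}}
scores = landmark_completion(["has_key", "at_door"], landmarks)
```

Because it only counts observed landmarks rather than computing full plans, this style of heuristic is what makes the approach substantially faster than plan-generation-based recognition.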
Article
Plan recognition aims to discover target plans (i.e., sequences of actions) behind observed actions, with history plan libraries or domain models in hand. Previous approaches either discover plans by maximally "matching" observed actions to plan libraries, assuming target plans are from plan libraries, or infer plans by executing domain models to best explain the observed actions, assuming that complete domain models are available. In real world applications, however, target plans are often not from plan libraries, and complete domain models are often not available, since building complete sets of plans and complete domain models are often difficult or expensive. In this paper we view plan libraries as corpora and learn vector representations of actions using the corpora, we then discover target plans based on the vector representations. Specifically, we propose two approaches, DUP and RNNPlanner, to discover target plans based on vector representations of actions. DUP explores the EM-style framework to capture local contexts of actions and discover target plans by optimizing the probability of target plans, while RNNPlanner aims to leverage long short-term contexts of actions using an RNN (recurrent neural network) framework to help recognize target plans. In the experiments, we empirically show that our approaches are capable of discovering underlying plans that are not from plan libraries, without requiring domain models provided. We demonstrate the effectiveness of our approaches by comparing their performance to traditional plan recognition approaches in three planning domains. We also compare DUP and RNNPlanner to see their advantages and disadvantages.
Conference Paper
Full-text available
Computer-based human activity recognition of daily living has recently attracted much interest due to its applicability to ambient assisted living. Such applications require the automatic recognition of high-level activities composed of multiple actions performed by human beings in an environment. In this work, we address the problem of activity recognition in an indoor environment, focusing on a kitchen scenario. Unlike existing approaches that identify single actions from video sequences, we also identify the goal that the subject of the video is pursuing. Our hybrid approach combines a deep learning architecture to analyze raw video data and identify individual actions which are then processed by a goal recognition algorithm that uses a plan library describing possible overarching activities to identify the ultimate goal of the subject in the video. Experiments show that our approach achieves the state-of-the-art for identifying cooking activities in a kitchen scenario.
Article
Full-text available
Plan recognition algorithms infer agents' plans from their observed actions. Due to imperfect knowledge about the agent's behavior and the environment, it is often the case that there are multiple hypotheses about an agent's plans that are consistent with the observations, though only one of these hypotheses is correct. This paper addresses the problem of how to disambiguate between hypotheses, by querying the acting agent about whether a candidate plan in one of the hypotheses matches its intentions. This process is performed sequentially and used to update the set of possible hypotheses during the recognition process. The paper defines the sequential plan recognition process (SPRP), which seeks to reduce the number of hypotheses using a minimal number of queries. We propose a number of policies for the SPRP which use maximum likelihood and information gain to choose which plan to query. We show this approach works well in practice on two domains from the literature, significantly reducing the number of hypotheses using fewer queries than a baseline approach. Our results can inform the design of future plan recognition systems that interleave the recognition process with intelligent interventions of their users.
Article
Full-text available
Categorical variables are a natural choice for representing discrete structure in the world. However, stochastic neural networks rarely use categorical latent variables due to the inability to backpropagate through samples. In this work, we present an efficient gradient estimator that replaces the non-differentiable sample from a categorical distribution with a differentiable sample from a novel Gumbel-Softmax distribution. This distribution has the essential property that it can be smoothly annealed into a categorical distribution. We show that our Gumbel-Softmax estimator outperforms state-of-the-art gradient estimators on structured output prediction and unsupervised generative modeling tasks with categorical latent variables, and enables large speedups on semi-supervised classification.
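The Gumbel-Softmax trick described in this abstract can be sketched in a few lines. The NumPy version below is an illustration written for this summary (the function name and interface are my own, not the authors' code):

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature, rng):
    """Relaxed one-hot sample from a categorical distribution.

    Adds Gumbel(0, 1) noise to the logits and applies a temperature-scaled
    softmax; as temperature -> 0 the sample approaches a hard one-hot vector,
    while remaining differentiable w.r.t. the logits for any temperature > 0.
    """
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))              # Gumbel(0, 1) via inverse transform
    y = (logits + g) / temperature
    y = np.exp(y - y.max())              # numerically stable softmax
    return y / y.sum()

rng = np.random.default_rng(0)
soft_sample = gumbel_softmax_sample(np.array([1.0, 2.0, 0.5]), 0.5, rng)
```

Annealing `temperature` during training moves the estimator smoothly from this soft relaxation toward discrete categorical samples, which is the "smoothly annealed into a categorical distribution" property the abstract refers to.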
Article
Full-text available
Recognition of goals and plans using incomplete evidence from action execution can be done efficiently by using planning techniques. In many applications it is important to recognize goals and plans not only accurately, but also quickly. In this paper, we develop a heuristic approach for recognizing plans based on planning techniques that rely on ordering constraints to filter candidate goals from observations. In the planning literature, these ordering constraints are called landmarks, which are facts or actions that cannot be avoided to achieve a goal. We show the applicability of planning landmarks in two settings: first, we use it directly to develop a heuristic-based plan recognition approach; second, we refine an existing planning-based plan recognition approach. Our empirical evaluation shows that our approach is not only substantially more accurate than the state-of-the-art in all available datasets, it is also an order of magnitude faster.
Article
Full-text available
Recent discoveries in automated planning are broadening the scope of planners, from toy problems to real applications. However, applying automated planners to real-world problems is far from simple. On the one hand, the definition of accurate action models for planning is still a bottleneck. On the other hand, off-the-shelf planners fail to scale-up and to provide good solutions in many domains. In these problematic domains, planners can exploit domain-specific control knowledge to improve their performance in terms of both speed and quality of the solutions. However, manual definition of control knowledge is quite difficult. This paper reviews recent techniques in machine learning for the automatic definition of planning knowledge. It has been organized according to the target of the learning process: automatic definition of planning action models and automatic definition of planning control knowledge. In addition, the paper reviews the advances in the related field of reinforcement learning.
Conference Paper
Full-text available
Plan recognition is the problem of inferring the goals and plans of an agent after observing its behavior. Recently, it has been shown that this problem can be solved efficiently, without the need of a plan library, using slightly modified planning algorithms. In this work, we extend this approach to the more general problem of probabilistic plan recognition where a probability distribution over the set of goals is sought under the assumptions that actions have deterministic effects and both agent and observer have complete information about the initial state. We show that this problem can be solved efficiently using classical planners provided that the probability of a partially observed execution given a goal is defined in terms of the cost difference of achieving the goal under two conditions: complying with the observations, and not complying with them. This cost, and hence the posterior goal probabilities, are computed by means of two calls to a classical planner that no longer has to be modified in any way. A number of examples is considered to illustrate the quality, flexibility, and scalability of the approach. Copyright © 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
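The cost-difference formulation in this abstract can be turned into goal posteriors with a few lines of arithmetic. The sketch below assumes the two planner calls per goal have already been made and uses a Boltzmann-style likelihood; the goal names, cost values, and the `beta` parameter are illustrative:

```python
import math

def goal_posterior(costs, beta=1.0, priors=None):
    """Posterior over candidate goals from per-goal cost pairs.

    `costs` maps each goal to (c_obs, c_not_obs): the optimal cost of
    achieving the goal while complying with the observations, and while
    not complying with them. A large positive difference means complying
    with the observations is cheap, making them likely under that goal.
    """
    goals = list(costs)
    if priors is None:
        priors = {g: 1.0 / len(goals) for g in goals}
    likelihood = {}
    for g, (c_obs, c_not_obs) in costs.items():
        delta = c_not_obs - c_obs                      # cost difference
        likelihood[g] = 1.0 / (1.0 + math.exp(-beta * delta))
    unnorm = {g: likelihood[g] * priors[g] for g in goals}
    z = sum(unnorm.values())
    return {g: p / z for g, p in unnorm.items()}

# Observations are cheap to comply with for goal_a, costly for goal_b.
posterior = goal_posterior({"goal_a": (4, 9), "goal_b": (7, 5)})
```

Only two planner calls per candidate goal are needed to fill in `costs`, which is what lets the approach reuse an unmodified classical planner.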
Conference Paper
Full-text available
In this paper, we present a software assistant agent that can proactively manage information on behalf of cognitively overloaded users. We develop an agent architecture, known here as ANTicipatory Information and Planning Agent (ANTIPA), to provide the user with relevant information in a timely manner. In order both to recognize user plans unobtrusively and to reason about time constraints, ANTIPA integrates probabilistic plan recognition with constraint-based information gathering. This paper focuses on our probabilistic plan prediction algorithm, inspired by the decision-theoretic view that human users make decisions based on long-term outcomes. A proof of concept user study shows a promising result.
Conference Paper
Full-text available
Previous work has shown that the difficulties in learning deep generative or discriminative models can be overcome by an initial unsupervised learning step that maps inputs to useful intermediate representations. We introduce and motivate a new training principle for unsupervised learning of a representation based on the idea of making the learned representations robust to partial corruption of the input pattern. This approach can be used to train autoencoders, and these denoising autoencoders can be stacked to initialize deep architectures. The algorithm can be motivated from a manifold learning and information theoretic perspective or from a generative model perspective. Comparative experiments clearly show the surprising advantage of corrupting the input of autoencoders on a pattern classification benchmark suite.
Conference Paper
Full-text available
In this paper we describe a software assistant agent that can proactively assist human users situated in a time-constrained environment to perform normative reasoning (reasoning about prohibitions and obligations) so that the user can focus on her planning objectives. In order to provide proactive assistance, the agent must be able to 1) recognize the user's planned activities, 2) reason about potential needs of assistance associated with those predicted activities, and 3) plan to provide appropriate assistance suitable for newly identified user needs. To address these specific requirements, we develop an agent architecture that integrates user intention recognition, normative reasoning over a user's intention, and planning, execution and replanning for assistive actions. This paper presents the agent architecture and discusses practical applications of this approach.
Conference Paper
Full-text available
In this work we aim to narrow the gap between plan recognition and planning by exploiting the power and generality of recent planning algorithms for recognizing the set G of goals that explain a sequence of observations given a domain theory. After providing a crisp definition of this set, we show by means of a suitable problem transformation that a goal G belongs to this set if there is an action sequence that is an optimal plan for both the goal G and the goal G extended with extra goals representing the observations. Exploiting this result, we show how the set G can be computed exactly and approximately by minor modifications of existing optimal and suboptimal planning algorithms, and existing polynomial heuristics. Experiments over several domains show that the suboptimal planning algorithms and the polynomial heuristics provide good approximations of the optimal goal set G while scaling up as well as state-of-the-art planning algorithms and heuristics.
Article
Full-text available
In recent years research in the planning community has moved increasingly towards application of planners to realistic problems involving both time and many types of resources. For example, interest in planning demonstrated by the space research community has inspired work in observation scheduling, planetary rover exploration and spacecraft control domains. Other temporal and resource-intensive domains including logistics planning, plant control and manufacturing have also helped to focus the community on the modelling and reasoning issues that must be confronted to make planning technology meet the challenges of application. The international planning competitions have acted as an important motivating force behind the progress that has been made in planning since 1998. The third competition (held in 2002) set the planning community the challenge of handling time and numeric resources. This necessitated the development of a modelling language capable of expressing temporal and numeric properties of planning domains. In this paper we describe the language, PDDL2.1, that was used in the competition. We describe the syntax of the language, its formal semantics and the validation of concurrent plans. We observe that PDDL2.1 has considerable modelling power, exceeding the capabilities of current planning technology, and presents a number of important challenges to the research community.
Conference Paper
Full-text available
The research areas of plan recognition and natural language parsing share many common features and even algorithms. However, the dialog between these two disciplines has not been effective. Specifically, significant recent results in parsing mildly context sensitive grammars have not been leveraged in the state of the art plan recognition systems. This paper will outline the relations between natural language processing (NLP) and plan recognition (PR), argue that each of them can effectively inform the other, and then focus on key recent research results in NLP and argue for their applicability to PR.
Article
Full-text available
Many known planning tasks have inherent constraints concerning the best order in which to achieve the goals. A number of research efforts have been made to detect such constraints and to use them for guiding search, in the hope of speeding up the planning process. We go beyond the previous approaches by considering ordering constraints not only over the (top-level) goals, but also over the sub-goals that will necessarily arise during planning. Landmarks are facts that must be true at some point in every valid solution plan. We extend Koehler and Hoffmann's definition of reasonable orders between top level goals to the more general case of landmarks. We show how landmarks can be found, how their reasonable orders can be approximated, and how this information can be used to decompose a given planning task into several smaller sub-tasks. Our methodology is completely domain- and planner-independent. The implementation demonstrates that the approach can yield significant runtime performance improvements when used as a control loop around state-of-the-art sub-optimal planning systems, as exemplified by FF and LPG.
Article
We propose a new problem we refer to as goal recognition design (grd), in which we take a domain theory and a set of goals and ask the following questions: to what extent do the actions performed by an agent within the model reveal its objective, and what is the best way to modify a model so that any agent acting in the model reveals its objective as early as possible. Our contribution is the introduction of a new measure we call worst case distinctiveness (wcd) with which we assess a grd model. The wcd represents the maximal length of a prefix of an optimal path an agent may take within a system before it becomes clear at which goal it is aiming. To model and solve the grd problem we choose to use the models and tools from the closely related field of automated planning. We present two methods for calculating the wcd of a grd model, one of which is based on a novel compilation to a classical planning problem. We then propose a way to reduce the wcd of a model by limiting the set of available actions an agent can perform and provide a method for calculating the optimal set of actions to be removed from the model. Our empirical evaluation shows the proposed solution to be effective in computing and minimizing wcd.
Article
Plan recognition is the problem of inferring the goals and plans of an agent after observing its behavior. Recently, it has been shown that this problem can be solved efficiently, without the need of a plan library, using slightly modified planning algorithms. In this work, we extend this approach to the more general problem of probabilistic plan recognition where a probability distribution over the set of goals is sought under the assumptions that actions have deterministic effects and both agent and observer have complete information about the initial state. We show that this problem can be solved efficiently using classical planners provided that the probability of a partially observed execution given a goal is defined in terms of the cost difference of achieving the goal under two conditions: complying with the observations, and not complying with them. This cost, and hence the posterior goal probabilities, are computed by means of two calls to a classical planner that no longer has to be modified in any way. A number of examples is considered to illustrate the quality, flexibility, and scalability of the approach.
Article
Current domain-independent, classical planners require symbolic models of the problem domain and instance as input, resulting in a knowledge acquisition bottleneck. Meanwhile, although deep learning has achieved significant success in many fields, the knowledge is encoded in a subsymbolic representation which is incompatible with symbolic systems such as planners. We propose Latplan, an unsupervised architecture combining deep learning and classical planning. Given only an unlabeled set of image pairs showing a subset of transitions allowed in the environment (training inputs), Latplan learns a complete propositional PDDL action model of the environment. Later, when a pair of images representing the initial and the goal states (planning inputs) is given, Latplan finds a plan to the goal state in a symbolic latent space and returns a visualized plan execution. We evaluate Latplan using image-based versions of six planning domains: 8-Puzzle, 15-Puzzle, Blocksworld, Sokoban, and two variations of LightsOut.
Article
Automated planning can be used to efficiently recognize goals and plans from partial or full observed action sequences. In this paper, we propose goal recognition heuristics that rely on information from planning landmarks - facts or actions that must occur if a plan is to achieve a goal when starting from some initial state. We develop two such heuristics: the first estimates goal completion by considering the ratio between achieved and extracted landmarks of a candidate goal, while the second takes into account how unique each landmark is among landmarks for all candidate goals. We empirically evaluate these heuristics over both standard goal/plan recognition problems, and a set of very large problems. We show that our heuristics can recognize goals more accurately, and run orders of magnitude faster, than the current state-of-the-art.
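The first heuristic in this abstract (the ratio of achieved to extracted landmarks) is simple enough to sketch directly. The goal and landmark names below are invented for illustration:

```python
def landmark_completion(achieved, landmarks_per_goal):
    """Score each candidate goal by the fraction of its landmarks that
    have already been observed; a higher ratio suggests the agent is
    closer to completing that goal."""
    scores = {}
    for goal, landmarks in landmarks_per_goal.items():
        hit = sum(1 for lm in landmarks if lm in achieved)
        scores[goal] = hit / len(landmarks) if landmarks else 0.0
    return scores

scores = landmark_completion(
    achieved={"at-kitchen", "holding-cup"},
    landmarks_per_goal={
        "make-coffee": ["at-kitchen", "holding-cup", "coffee-brewed"],
        "watch-tv": ["at-livingroom", "tv-on"],
    },
)
```

The second heuristic in the abstract would additionally weight each landmark by how unique it is among the candidate goals, so that landmarks shared by every goal contribute less evidence.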
Article
While logistic sigmoid neurons are more biologically plausible than hyperbolic tangent neurons, the latter work better for training multi-layer neural networks. This paper shows that rectifying neurons are an even better model of biological neurons and yield equal or better performance than hyperbolic tangent networks in spite of the hard non-linearity and non-differentiability at zero, creating sparse representations with true zeros, which seem remarkably suitable for naturally sparse data. Even though they can take advantage of semi-supervised setups with extra unlabelled data, deep rectifier networks can reach their best performance without requiring any unsupervised pre-training on purely supervised tasks with large labelled data sets. Hence, these results can be seen as a new milestone in the attempts at understanding the difficulty in training deep but purely supervised neural networks, and closing the performance gap between neural networks learnt with and without unsupervised pre-training.
Conference Paper
How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets? We introduce a stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case. Our contribution is two-fold. First, we show that a reparameterization of the variational lower bound yields a lower bound estimator that can be straightforwardly optimized using standard stochastic gradient methods. Second, we show that for i.i.d. datasets with continuous latent variables per datapoint, posterior inference can be made especially efficient by fitting an approximate inference model (also called a recognition model) to the intractable posterior using the proposed lower bound estimator. Theoretical advantages are reflected in experimental results.
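The reparameterization mentioned in this abstract is often illustrated for a Gaussian posterior: the sample is rewritten as a deterministic function of the distribution parameters plus parameter-free noise, so standard stochastic gradient methods apply. A minimal NumPy sketch (the function name and interface are mine, not from the paper):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Draw z ~ N(mu, diag(exp(log_var))) as z = mu + sigma * eps,
    where eps ~ N(0, I) carries all the randomness. Because the sample
    is now a deterministic function of mu and log_var, gradients with
    respect to both parameters flow through it unobstructed."""
    eps = rng.standard_normal(np.shape(mu))
    sigma = np.exp(0.5 * np.asarray(log_var))
    return np.asarray(mu) + sigma * eps

rng = np.random.default_rng(0)
z = reparameterize(np.zeros(4), np.zeros(4), rng)             # N(0, I) draw
z_tight = reparameterize(np.ones(4), np.full(4, -50.0), rng)  # sigma ~ 0
```

As the variance shrinks toward zero the sample collapses onto the mean, which is what makes the estimator low-variance compared to score-function methods.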
Conference Paper
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called dropout that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
Conference Paper
The ever-increasing size of modern data sets combined with the difficulty of obtaining label information has made semi-supervised learning one of the problems of significant practical importance in modern data analysis. We revisit the approach to semi-supervised learning with generative models and develop new models that allow for effective generalisation from small labelled data sets to large unlabelled ones. Generative approaches have thus far been either inflexible, inefficient or non-scalable. We show that deep generative models and approximate Bayesian inference exploiting recent advances in variational methods can be used to provide significant improvements, making generative approaches highly competitive for semi-supervised learning.
Conference Paper
Categorical variables are a natural choice for representing discrete structure in the world. However, stochastic neural networks rarely use categorical latent variables due to the inability to backpropagate through samples. In this work, we present an efficient gradient estimator that replaces the non-differentiable sample from a categorical distribution with a differentiable sample from a novel Gumbel-Softmax distribution. This distribution has the essential property that it can be smoothly annealed into a categorical distribution. We show that our Gumbel-Softmax estimator outperforms state-of-the-art gradient estimators on structured output prediction and unsupervised generative modeling tasks with categorical latent variables, and enables large speedups on semi-supervised classification.
Article
Planning is the model-based approach to autonomous behavior where the agent behavior is derived automatically from a model of the actions, sensors, and goals. The main challenges in planning are computational as all models, whether featuring uncertainty and feedback or not, are intractable in the worst case when represented in compact form. In this book, we look at a variety of models used in AI planning, and at the methods that have been developed for solving them. The goal is to provide a modern and coherent view of planning that is precise, concise, and mostly self-contained, without being shallow. For this, we make no attempt at covering the whole variety of planning approaches, ideas, and applications, and focus on the essentials. The target audience of the book are students and researchers interested in autonomous behavior and planning from an AI, engineering, or cognitive science perspective.
Article
Lights Out! is an electrical game played on a 5×5-grid where each cell has a button and an indicator light. Pressing the button will change the light of the cell and the lights of its rectilinear adjacent neighbors. Given an initial configuration of lights, some on and some off, the goal of the game is to switch all lights off. The game can be generalized to arbitrary graphs instead of a grid. Lights Out! has been studied independently by three different communities, graph theoreticians, gamers, and algorithmicists. In this paper, we survey the game and present the results in a unified framework.
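The algorithmic view of the game surveyed above reduces to linear algebra over GF(2): each button press is a 0/1 variable, presses commute, and pressing twice cancels, so a winning press set must satisfy A x = b (mod 2), where A encodes which lights each press toggles and b is the initial configuration. A small self-contained solver, written for this summary rather than taken from the paper:

```python
def solve_lights_out(n, initial):
    """Find presses turning all lights off on an n x n grid, by Gaussian
    elimination over GF(2). Rows are stored as Python-int bitmasks with
    the right-hand side in bit `size`. Returns None if unsolvable."""
    size = n * n
    rows = []
    for i in range(n):
        for j in range(n):
            mask = 0
            for di, dj in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < n and 0 <= jj < n:
                    mask |= 1 << (ii * n + jj)
            if initial[i * n + j]:
                mask |= 1 << size
            rows.append(mask)
    pivots, r = [], 0
    for col in range(size):
        p = next((k for k in range(r, size) if rows[k] >> col & 1), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for k in range(size):
            if k != r and rows[k] >> col & 1:
                rows[k] ^= rows[r]        # XOR is addition mod 2
        pivots.append(col)
        r += 1
    if any(rows[k] >> size & 1 for k in range(r, size)):
        return None                       # inconsistent system
    presses = [0] * size
    for k, col in enumerate(pivots):
        presses[col] = rows[k] >> size & 1
    return presses

def apply_presses(n, lights, presses):
    """Simulate the presses; used to check the solver's output."""
    lights = list(lights)
    for k, pressed in enumerate(presses):
        if pressed:
            i, j = divmod(k, n)
            for di, dj in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < n and 0 <= jj < n:
                    lights[ii * n + jj] ^= 1
    return lights

solution = solve_lights_out(3, [1] * 9)   # 3x3 board, all lights on
```

On the 3x3 board the toggle matrix is invertible over GF(2), so every configuration is solvable; on some larger boards (e.g., 5x5) the matrix has a nontrivial nullspace and the solver can return None.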
Article
The ever-increasing size of modern data sets combined with the difficulty of obtaining label information has made semi-supervised learning one of the problems of significant practical importance in modern data analysis. We revisit the approach to semi-supervised learning with generative models and develop new models that allow for effective generalisation from small labelled data sets to large unlabelled ones. Generative approaches have thus far been either inflexible, inefficient or non-scalable. We show that deep generative models and approximate Bayesian inference exploiting recent advances in variational methods can be used to provide significant improvements, making generative approaches highly competitive for semi-supervised learning.
Conference Paper
Plan recognition has been widely used in agents that need to infer which plans are being executed or which activities are being performed by others. In many applications reasoning and acting in response to plan recognition requires time. In such systems, plan recognition is expected to be made not only with precision, but also in a timely fashion. When recognition cannot be made in time, the plan recognition agent can interact with the observed agents to disambiguate multiple hypotheses, however, such an intrusive behavior is either not possible, very costly, or undesirable. In this paper, we focus on the problem of deciding when to interact with the observed agents in order to determine their plans under execution. To tackle this problem, we develop a plan recognizer that, on the one hand is the least intrusive possible, and on the other hand, attempts to recognize the plans of the observed agents with precision as soon as possible and no later than it is viable to respond to the recognized plan.
Article
Can we efficiently learn the parameters of directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and in case of large datasets? We introduce a novel learning and approximate inference method that works efficiently, under some mild conditions, even in the on-line and intractable case. The method involves optimization of a stochastic objective function that can be straightforwardly optimized w.r.t. all parameters, using standard gradient-based optimization methods. The method does not require the typically expensive sampling loops per datapoint required for Monte Carlo EM, and all parameter updates correspond to optimization of the variational lower bound of the marginal likelihood, unlike the wake-sleep algorithm. These theoretical advantages are reflected in experimental results.
Article
We present the PHATT algorithm for plan recognition. Unlike previous approaches to plan recognition, PHATT is based on a model of plan execution. We show that this clarifies several difficult issues in plan recognition including the execution of multiple interleaved root goals, partially ordered plans, and failing to observe actions. We present the PHATT algorithm's theoretical basis, and an implementation based on tree structures. We also investigate the algorithm's complexity, both analytically and empirically. Finally, we present PHATT's integrated constraint reasoning for parametrized actions and temporal constraints.
Article
We describe a new problem solver called STRIPS that attempts to find a sequence of operators in a space of world models to transform a given initial world model into one in which a given goal formula can be proven to be true. STRIPS represents a world model as an arbitrary collection of first-order predicate calculus formulas and is designed to work with models consisting of large numbers of formulas. It employs a resolution theorem prover to answer questions about particular models and uses means-ends analysis to guide it to the desired goal-satisfying model.
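The operator representation this abstract introduced is commonly modelled today with precondition, add, and delete lists over a set of ground literals. The toy operator below is an illustration in that style (the literal names are invented), not code from the original system:

```python
def applicable(state, op):
    """A STRIPS operator applies when its preconditions all hold."""
    return op["pre"] <= state

def apply_op(state, op):
    """Progress the world model: remove the delete list, add the add list."""
    return (state - op["del"]) | op["add"]

# A blocks-world style pickup operator for a single block "a".
pickup_a = {
    "pre": frozenset({"clear-a", "on-table-a", "hand-empty"}),
    "add": frozenset({"holding-a"}),
    "del": frozenset({"clear-a", "on-table-a", "hand-empty"}),
}

s0 = frozenset({"clear-a", "on-table-a", "hand-empty"})
s1 = apply_op(s0, pickup_a)
```

Chaining `applicable` and `apply_op` over a sequence of operators is exactly the progression search that planners such as Fast Downward, discussed below, perform over much richer representations.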
Conference Paper
Recent applications of plan recognition face several open challenges: (i) matching observations to the plan library is costly, especially with complex multi-featured observations; (ii) computing recognition hypotheses is expensive. We present techniques for addressing these challenges. First, we show a novel application of machine-learned decision trees to efficiently map multi-featured observations to matching plan steps. Second, we provide efficient lazy-commitment recognition algorithms that avoid enumerating hypotheses with every observation, instead only carrying out bookkeeping incrementally. The algorithms answer queries as to the current state of the agent, as well as its history of selected states. We provide empirical results demonstrating their efficiency and capabilities.
Article
Fast Downward is a classical planning system based on heuristic search. It can deal with general deterministic planning problems encoded in the propositional fragment of PDDL2.2, including advanced features like ADL conditions and effects and derived predicates (axioms). Like other well-known planners such as HSP and FF, Fast Downward is a progression planner, searching the space of world states of a planning task in the forward direction. However, unlike other PDDL planning systems, Fast Downward does not use the propositional PDDL representation of a planning task directly. Instead, the input is first translated into an alternative representation called multi-valued planning tasks, which makes many of the implicit constraints of a propositional planning task explicit. Exploiting this alternative representation, Fast Downward uses hierarchical decompositions of planning tasks for computing its heuristic function, called the causal graph heuristic, which is very different from traditional HSP-like heuristics based on ignoring negative interactions of operators. In this article, we give a full account of Fast Downward's approach to solving multi-valued planning tasks. We extend our earlier discussion of the causal graph heuristic to tasks involving axioms and conditional effects and present some novel techniques for search control that are used within Fast Downward's best-first search algorithm: preferred operators transfer the idea of helpful actions from local search to global best-first search, deferred evaluation of heuristic functions mitigates the negative effect of large branching factors on search performance, and multi-heuristic best-first search combines several heuristic evaluation functions within a single search algorithm in an orthogonal way.
We also describe efficient data structures for fast state expansion (successor generators and axiom evaluators) and present a new non-heuristic search algorithm called focused iterative-broadening search, which utilizes the information encoded in causal graphs in a novel way. Fast Downward has proven remarkably successful: It won the "classical" (i.e., propositional, non-optimising) track of the 4th International Planning Competition at ICAPS 2004, following in the footsteps of planners such as FF and LPG. Our experiments show that it also performs very well on the benchmarks of the earlier planning competitions and provide some insights about the usefulness of the new search enhancements.
Article
High-dimensional data can be converted to low-dimensional codes by training a multilayer neural network with a small central layer to reconstruct high-dimensional input vectors. Gradient descent can be used for fine-tuning the weights in such "autoencoder" networks, but this works well only if the initial weights are close to a good solution. We describe an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.
Monitoring plan optimality using landmarks and domain-independent heuristics
  • R F Pereira
  • N Oren
  • F Meneguzzi
A Fast Goal Recognition Technique Based on Interaction Estimates
  • Y E Martín
  • M D R Moreno
  • D E Smith
Plan Recognition as Planning Revisited
  • S Sohrabi
  • A V Riabov
  • O Udrea
Goal and plan recognition datasets using classical planning domains
  • Pereira