Conference Paper

Cost-Based Goal Recognition for the Path-Planning Domain

Authors: Peta Masters and Sebastian Sardina

Abstract

"Plan recognition as planning" uses an off-the-shelf planner to perform goal recognition. In this paper, we apply the technique to path-planning. We show that a simpler formula provides an identical result in all but one set of conditions and, further, that identical ranking of goals by probability can be achieved without using any observations other than the agent's start location and where she is "now".


... These calls, however few, can still be arbitrarily expensive, making this approach unsuitable for fast recognition under certain conditions. Recent research on goal recognition substantially improves efficiency for both domains [12,17,16]. However, these approaches formulate the problem in a discrete space using a planning formalism (typically STRIPS) or rely on a discretization of continuous space [9]. ...
... Most work on online goal recognition for continuous-domain applications considers only the path-planning problem, disregarding the dynamic characteristics of the agents, so the plan consists only of a geometric collision-free path [12,9]. However, trajectory planning is a complex continuous motion-planning problem in which computing a collision-free path requires all of the agent's dynamics and constraints. ...
... Our contribution is threefold. First, we develop an online inference method to compute the probability distribution over the goal hypotheses, based on the work of Ramírez et al. [19] and Masters et al. [12], that obviates the need for calls to a planner at run-time. We base our inference method on the Euclidean distance between a pre-computed sub-optimal trajectory and observations of the real agent. ...
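The inference idea in this excerpt can be illustrated with a short sketch. This is an illustrative reading only, with hypothetical names, not the authors' code: each candidate goal has one pre-computed trajectory, and at run-time the latest observed position is scored by its Euclidean distance to the nearest waypoint on each trajectory, avoiding any planner call.

    import numpy as np

    def online_goal_scores(position, trajectories, beta=1.0):
        # trajectories: goal -> (N, d) array of pre-computed waypoints
        # position: length-d array, the agent's latest observed location
        scores = {}
        for goal, traj in trajectories.items():
            d = np.linalg.norm(traj - position, axis=1).min()  # nearest waypoint
            scores[goal] = np.exp(-beta * d)
        z = sum(scores.values())
        return {goal: s / z for goal, s in scores.items()}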
Preprint
While recent work on online goal recognition efficiently infers goals under low observability, comparatively less work focuses on online goal recognition that works in both discrete and continuous domains. Online goal recognition approaches often rely on repeated calls to the planner at each new observation, incurring high computational costs. Recognizing goals online in continuous space quickly and reliably is critical for any trajectory planning problem since the real physical world is fast-moving, e.g. robot applications. We develop an efficient method for goal recognition that relies either on a single call to the planner for each possible goal in discrete domains or a simplified motion model that reduces the computational burden in continuous ones. The resulting approach performs the online component of recognition orders of magnitude faster than the current state of the art, making it the first online method effectively usable for robotics applications that require sub-second recognition.
... There are at least two conventional approaches to model-based goal recognition. The first approach uses the ontic actions (physical actions that change the state of the world) of agents to estimate the probabilities of candidate goals (Ramírez & Geffner, 2010;Masters & Sardina, 2017a;Pereira et al., 2017). ...
... Goal recognition can be classified into three categories (Carberry, 2001): keyhole recognition, in which an agent does not intend to change its behaviour; intended recognition, whereby the agent attempts to reveal its real goal; and adversarial recognition when an agent hides its real goal. There are many keyhole goal recognition algorithms for honest, natural and rational behaviours (Ramírez & Geffner, 2009;Masters & Sardina, 2017a;Vered, Kaminka, & Biham, 2016;Singh et al., 2018), as well as for adversarial settings, such as (Keren, Gal, & Karpas, 2015;Masters & Sardina, 2019). ...
... is the likelihood of an intention i, defined in Equation 2 with β = 1 (Masters & Sardina, 2017a), where κ is the normalising constant. P(O_a | i) is the probability of observing actions O_a if i was the intention of the agent, calculated using any method for model-based goal recognition (Vered et al., 2016). ...
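The equation referenced in this excerpt is truncated. A plausible reconstruction, consistent with the Boltzmann-style formulation of Masters and Sardina (2017a) that the excerpt cites, is:

    P(i \mid O_a) \;=\; \kappa\, e^{-\beta\,\mathrm{costdif}(i,\,O_a)}, \qquad \beta = 1,

where costdif(i, O_a) is the difference between the optimal cost of reaching intention i while embedding the observations O_a and the optimal cost of reaching i unconstrained, and κ normalises over all candidate intentions.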
Article
Eye gaze has the potential to provide insight into the minds of individuals, and this idea has been used in prior research to improve human goal recognition by combining a human's actions and gaze. However, most existing research assumes that people are rational and honest. In adversarial scenarios, people may deliberately alter their actions and gaze, which presents a challenge to goal recognition systems. In this paper, we present new models for goal recognition under deception using a combination of gaze behaviour and observed movements of the agent. These models aim to detect when a person is deceiving by analysing their gaze patterns, and use this information to adjust the goal recognition. We evaluated our models in two human-subject studies: (1) using data collected from 30 individuals playing a navigation game inspired by an existing deception study and (2) using data collected from 40 individuals playing a competitive game (Ticket To Ride). We found that one of our models (Modulated Deception Gaze+Ontic) offers promising results compared to the previous state-of-the-art model in both studies. Our work complements existing adversarial goal recognition systems by equipping these systems with the ability to tackle ambiguous gaze behaviours.
... Firstly, we proposed a method to solve DPP problems under specific time constraints based on count-based Q-learning (DPP_Q), which is applicable to discrete path-planning domains. DPP_Q is based on a traditional cost-difference-based goal recognition model [8], proposed to solve goal recognition problems in discrete domains. Because this goal recognition model uses the full observation sequence, its posterior probability calculation for each goal is precise. ...
... Assuming agents are rational, it is possible to evaluate the types of goals they set. Building on this idea, Masters et al. proposed the traditional cost-difference-based goal recognition model [8], establishing the probability of each candidate goal from the difference between the optimal cost that matches the current observations and the optimal cost that ignores the observation sequence entirely. ...
Article
Full-text available
Deceptive path planning (DPP) aims to find a path that minimizes the probability of the observer identifying the real goal of the observed agent before it is reached. It is important for addressing issues such as public safety, strategic path planning, and logistics route privacy protection. Existing traditional methods often rely on "dissimulation"—hiding the truth—to obscure paths while ignoring time constraints. Building upon the theory of probabilistic goal recognition based on cost difference, we proposed a DPP method, DPP_Q, based on count-based Q-learning for solving DPP problems in discrete path-planning domains under specific time constraints. Furthermore, to extend this method to continuous domains, we proposed a new model of probabilistic goal recognition called the Approximate Goal Recognition Model (AGRM) and verified its feasibility in discrete path-planning domains. Finally, we also proposed a DPP method based on proximal policy optimization for continuous path-planning domains under specific time constraints, called DPP_PPO. DPP methods like DPP_Q and DPP_PPO represent a line of research that has not yet been explored in the field of path planning. Experimental results show that, in discrete domains, DPP_Q enhances the average deceptiveness of paths more effectively than traditional methods (improving on them by 12.53% on average). In continuous domains, DPP_PPO shows significant advantages over random-walk methods. Both DPP_Q and DPP_PPO demonstrate good applicability in path-planning domains with uncomplicated obstacles.
... Therefore, the goal that best explains the observed path is more likely to be the real goal. This can be used to generate a probability distribution over the candidate goal-set (Masters and Sardina 2017a). Inverse reinforcement learning (IRL) is the task of deducing a reward function given traces of the agent's behaviour (Ng and Russell 2000). ...
... Measures. For deceptiveness, we measure the real goal probability at each time-step, using Masters and Sardina's (2017a) intention recognition model. We also measure the number of steps taken after the last deceptive point (LDP). ...
Article
This paper investigates deceptive reinforcement learning for privacy preservation in model-free and continuous action space domains. In reinforcement learning, the reward function defines the agent's objective. In adversarial scenarios, an agent may need to both maximise rewards and keep its reward function private from observers. Recent research presented the ambiguity model (AM), which selects actions that are ambiguous over a set of possible reward functions, via pre-trained Q-functions. Despite promising results in model-based domains, our investigation shows that AM is ineffective in model-free domains due to misdirected state space exploration. It is also inefficient to train and inapplicable in continuous action spaces. We propose the deceptive exploration ambiguity model (DEAM), which learns using the deceptive policy during training, leading to targeted exploration of the state space. DEAM is also applicable in continuous action spaces. We evaluate DEAM in discrete and continuous action space path planning environments. DEAM achieves similar performance to an optimal model-based version of AM and outperforms a model-free version of AM in terms of path cost, deceptiveness and training efficiency. These results extend to the continuous domain.
... It outputs either a sequence of future steps or a hierarchical plan (Schmidt et al., 1978;Kautz, 1987;Blaylock and Allen, 2006;Wiseman and Shieber, 2014;Chakraborti et al., 2017). Recent advancements have applied plan recognition to a variety of real-world domains, including education (Amir and Gal, 2013;Uzan et al., 2015), cyber security (Geib and Goldman, 2001;Bisson et al., 2011;Mirsky et al., 2017b) and more (Masters and Sardina, 2017;Vered and Kaminka, 2017). Although all of these domains have a lot in common in terms of the problem being solved and the components of a recognition problem, there is no single standard representation that allows for a comparison of these works, as they use different models to represent the possible plans an actor can take in the environment (Carberry, 2001;Sukthankar et al., 2014;Mirsky et al., 2021). ...
... There are several approaches to representing a domain in plan recognition. Much recent work on plan recognition as planning takes as input a planning domain, usually described in STRIPS (Stanford Research Institute Problem Solver) (Fikes and Nilsson, 1971), and a set of possible goals, and selects one of the goals (Ramírez and Geffner, 2010;Sohrabi et al., 2016;Freedman and Zilberstein, 2017;Masters and Sardina, 2017;Pereira et al., 2017;Shvo et al., 2017;Vered and Kaminka, 2017). In this work, we focus on PL-based plan recognition (Blaylock and Allen, 2006;Sukthankar and Sycara, 2008;Kabanza et al., 2013;Geib, 2017;Barták et al., 2018). ...
Article
Full-text available
Plan recognition deals with reasoning about the goals and execution process of an actor, given observations of its actions. It is one of the fundamental problems of AI, applicable to many domains, from user interfaces to cyber-security. Despite the prevalence of these approaches, they lack a standard representation, and have not been compared using a common testbed. This paper provides a first step towards bridging this gap by providing a standard plan library representation that can be used by hierarchical, discrete-space plan recognition and evaluation criteria to consider when comparing plan recognition algorithms. This representation is comprehensive enough to describe a variety of known plan recognition problems and can be easily used by existing algorithms in this class. We use this common representation to thoroughly compare two known approaches, represented by two algorithms, SBR and Probabilistic Hostile Agent Task Tracker (PHATT). We provide meaningful insights about the differences and abilities of these algorithms, and evaluate these insights both theoretically and empirically. We show a tradeoff between expressiveness and efficiency: SBR is usually superior to PHATT in terms of computation time and space, but at the expense of functionality and representational compactness. We also show how different properties of the plan library affect the complexity of the recognition process, regardless of the concrete algorithm used. Lastly, we show how these insights can be used to form a new algorithm that outperforms existing approaches both in terms of expressiveness and efficiency.
... Several works have followed, elaborating or extending Ramirez and Geffner's set-up, or grounding it to specific interesting settings (e.g., navigation). Here, we shall adopt the most recent elaboration by Masters and Sardina [23][24][25], which refined the original set-up to achieve a simpler and computationally less demanding GR account, and one able to handle irrational agent behavior parsimoniously without counter-intuitive outcomes. Concretely, taking optc(S, τ, G) to denote the optimal cost of reaching goal G from state S by embedding the sequence of observations τ, we first define the cost difference of reaching G from S via observations τ as follows: ...
... So, considering ω(τ, α_G) to be an alignment weight (defined below) of τ against the model α_G learned for goal G, we follow Masters and Sardina [23] in using a true Boltzmann distribution instead of a sigmoidal one, and re-write Equation (1) as follows: ...
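The two equations referenced in these excerpts are truncated. A hedged reconstruction from the surrounding definitions (the originals are not shown in full) is:

    \mathrm{costdif}(S, \tau, G) \;=\; \mathrm{optc}(S, \tau, G) \;-\; \mathrm{optc}(S, G)

    P(G \mid \tau) \;=\; \frac{e^{-\beta\,\omega(\tau,\,\alpha_G)}}{\sum_{G' \in \mathcal{G}} e^{-\beta\,\omega(\tau,\,\alpha_{G'})}}

so goals whose learned models align better with the observed trace (lower ω) receive proportionally higher probability.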
Conference Paper
Full-text available
The problem of probabilistic goal recognition consists of automatically inferring a probability distribution over a range of possible goals of an autonomous agent based on observations of its behavior. The state-of-the-art approaches to probabilistic goal recognition assume full knowledge of the world the agent operates in and of the agent's possible operations in that world. In this paper, we propose a framework for solving the probabilistic goal recognition problem using process mining techniques for discovering models that describe the observed behavior and diagnosing deviations between the discovered models and observations. The framework imitates the principles of observational learning, one of the core mechanisms of social learning exhibited by humans, and relaxes the above assumptions. It has been implemented in a publicly available tool. The reported experimental results confirm the effectiveness and efficiency of the approach for both rational and irrational agent behaviors.
... In this paper, the authors show that, for some domains, the use of multiple high-quality plans along with this novel probabilistic framework yields better results than using only one plan [78]. Masters and Sardiña [48,49] propose a fast and accurate goal recognition approach that works strictly in the context of path-planning, providing a novel probabilistic framework for goal recognition in path planning, essentially a revised and improved version of the probabilistic framework of Ramírez and Geffner [78]. This framework shows that the probability distribution over goals can be computed far more simply and quickly than in Ramírez and Geffner's formulation [78], considering only a single observation, namely the current state. ...
... with Euclidean space, such as the one introduced by Masters and Sardiña in [48,49]. An interesting extension for this work would be addressing another imperfect aspect in this type of model, such as approximate cost functions. ...
Preprint
Goal recognition is the problem of recognizing the intended goal of autonomous agents or humans by observing their behavior in an environment. Over the past years, most existing approaches to goal and plan recognition have ignored the need to deal with imperfections in the domain model that formalizes the environment where autonomous agents behave. In this thesis, we introduce the problem of goal recognition over imperfect domain models, and develop solution approaches that explicitly deal with two distinct types of imperfect domain models: (1) incomplete discrete domain models that have possible, rather than known, preconditions and effects in action descriptions; and (2) approximate continuous domain models, where the transition function is approximated from past observations and not well-defined. We develop novel goal recognition approaches over imperfect domain models by leveraging and adapting existing recognition approaches from the literature. Experiments and evaluation over these two types of imperfect domain models show that our novel goal recognition approaches are accurate in comparison to baseline approaches from the literature, at several levels of observability and imperfections.
... Goal recognition evaluates the posterior P(G|O) over the possible goal set G given observation O; P(G|O) is in turn used to quantify different forms of deception defined over each individual node. Further, we exploit the recent finding stated in [8], which has also been applied in [5], that goal recognition in path-planning need not refer to any historical observations beyond the observed agent's starting point and its current location. This means that, at any node, the posterior probability distribution over all possible goals, along with the associated deception values, can be precalculated, and is independent of the path taken to reach that node. ...
... As discussed in the previous section, Masters and Sardina [8] proved that, in the path-planning domain, goal recognition need not refer to any historical observations beyond the observed agent's source and current location. In other words, only the final observation in the sequence matters:
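The formula itself is elided in this excerpt. A hedged reconstruction from the definitions in the surrounding excerpts, with s the agent's source and n its current location, is:

    \mathrm{costdif}(s, n, G) \;=\; \bigl(\mathrm{optc}(s, n) + \mathrm{optc}(n, G)\bigr) \;-\; \mathrm{optc}(s, G)

so the posterior over goals depends on the observation sequence only through its endpoints s and n, and can therefore be precomputed for every node of the map.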
Article
Full-text available
Deceptive path-planning is the task of finding a path so as to minimize the probability of an observer (or a defender) identifying the observed agent's final goal before the goal has been reached. It is one of the important approaches to solving real-world challenges such as public security, strategic transportation, and logistics. Existing methods either cannot make full use of the environment's information or lack the flexibility to balance a path's deceptivity against the available movement resources. In this work, building on recent developments in probabilistic goal recognition, we formalize a magnitude-based deceptive path-planning problem with a single real goal, together with a mixed-integer-programming-based method for maximizing deceptivity and generating paths. The model helps to establish a computable foundation for the further imposition of different deception concepts or strategies, and broadens its applicability to many scenarios. Experimental results showed the effectiveness of our methods in deceptive path-planning compared to the existing method.
... Other investigations of PRP exist. [Masters and Sardina, 2017] provided a simpler formula than that of [Ramírez and Geffner, 2010], achieving identical results in half the time, still in discrete environments. Sohrabi et. ...
... Each plan is likewise a trajectory in the same space, as plans are modeled by their effects. Thus, generating a plan m_g that perfectly matches the observations is done by composing it from two parts: a plan prefix (denoted m_g^-), built by concatenating all observations in O into a single trajectory ([Masters and Sardina, 2017] have shown that the same plan prefix may be generated for all possible trajectories); and a plan suffix (denoted m_g^+), generated by calling the planner to produce a trajectory from the last observed point in the prefix m_g^- (the ending point of the last observation in O) to the goal g. ...
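As a concrete reading of this prefix/suffix construction, here is a minimal sketch with hypothetical names; plan(a, b) is assumed to return the optimal trajectory cost between two points, e.g. from a motion planner:

    def rank_goals_by_mirroring(observations, goals, plan):
        # plan prefix m_g^-: the observations concatenated into one trajectory
        prefix = sum(plan(a, b) for a, b in zip(observations, observations[1:]))
        start, last = observations[0], observations[-1]
        scores = {}
        for g in goals:
            mirrored = prefix + plan(last, g)   # m_g = prefix + suffix to g
            # ratio of unconstrained optimal cost to mirrored cost: a value
            # near 1 means the observations are consistent with heading
            # optimally towards g
            scores[g] = plan(start, g) / mirrored
        return scores

The prefix cost is shared by all goals, so only one suffix planner call per goal is needed at each update.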
Article
Goal recognition is the problem of inferring the goal of an agent, based on its observed actions. An inspiring approach - plan recognition by planning (PRP) - uses off-the-shelf planners to dynamically generate plans for given goals, eliminating the need for the traditional plan library. However, existing PRP formulation is inherently inefficient in online recognition, and cannot be used with motion planners for continuous spaces. In this paper, we utilize a different PRP formulation which allows for online goal recognition, and for application in continuous spaces. We present an online recognition algorithm, where two heuristic decision points may be used to improve run-time significantly over existing work. We specify heuristics for continuous domains, prove guarantees on their use, and empirically evaluate the algorithm over hundreds of experiments in both a 3D navigational environment and a cooperative robotic team task.
... Assuming agent rationality allows for the evaluation of goal types. Masters et al. [24] proposed a traditional cost-difference-based goal recognition model that estimates goal probabilities based on cost disparities between optimal paths matching current observations and those that ignore specific nodes in the observation sequence. ...
Article
Full-text available
Deceptive path planning (DPP) aims to find routes that reduce the chances of observers discovering the real goal before its attainment, which is essential for addressing public safety, strategic path planning, and preserving the confidentiality of logistics routes. Currently, no single metric is available to comprehensively evaluate the performance of deceptive paths. This paper introduces two new metrics, termed “Average Deception Degree” (ADD) and “Average Deception Intensity” (ADI) to measure the overall performance of a path. Unlike traditional methods that focus solely on planning paths from the start point to the endpoint, we propose a reverse planning approach in which paths are considered from the endpoint back to the start point. Inverting the path from the endpoint back to the start point yields a feasible DPP solution. Based on this concept, we extend the existing πd1~4 method to propose a new approach, e_πd1~4, and introduce two novel methods, Endpoint DPP_Q and LDP DPP_Q, based on the existing DPP_Q method. Experimental results demonstrate that e_πd1~4 achieves significant improvements over πd1~4 (an overall average improvement of 8.07%). Furthermore, Endpoint DPP_Q and LDP DPP_Q effectively address the issue of local optima encountered by DPP_Q. Specifically, in scenarios where the real and false goals have distinctive distributions, Endpoint DPP_Q and LDP DPP_Q show notable enhancements over DPP_Q (approximately a 2.71% improvement observed in batch experiments on 10 × 10 maps). Finally, tests on larger maps from Moving-AI demonstrate that these improvements become more pronounced as the map size increases. The introduction of ADD, ADI and the three new methods significantly expand the applicability of πd1~4 and DPP_Q in more complex scenarios.
... Several MBGR approaches focus on shortening the Inference Time by using pre-processing. For example, in metric-based goal recognition, distances between potential states are computed in the Domain Learning Time (Smith et al., 2015) or in the Goals Adaptation Time (Masters & Sardina, 2017;Vered & Kaminka, 2017;Mirsky et al., 2019). ...
Preprint
Full-text available
Traditionally, Reinforcement Learning (RL) problems are aimed at optimization of the behavior of an agent. This paper proposes a novel take on RL, which is used to learn the policy of another agent, to allow real-time recognition of that agent's goals. Goal Recognition (GR) has traditionally been framed as a planning problem where one must recognize an agent's objectives based on its observed actions. Recent approaches have shown how reinforcement learning can be used as part of the GR pipeline, but are limited to recognizing predefined goals and lack scalability in domains with a large goal space. This paper formulates a novel problem, "Online Dynamic Goal Recognition" (ODGR), as a first step to address these limitations. Contributions include introducing the concept of dynamic goals into the standard GR problem definition, revisiting common approaches by reformulating them using ODGR, and demonstrating the feasibility of solving ODGR in a navigation domain using transfer learning. These novel formulations open the door for future extensions of existing transfer learning-based GR methods, which will be robust to changing and expansive real-time environments.
... Following the method by Masters and Sardiña [33], the computed alignment weights are used to compute the posterior probabilities of the agent pursuing each candidate goal. Given an observed trace τ, the probability of the agent pursuing goal G, from a set of candidate goals G, that is Pr(G | τ), is obtained as shown in Eq. (2). ...
... The costs c(O, g) and c(¬O, g) can be calculated using classical planning systems. Many studies have utilized automated planning techniques to conduct research on goal recognition, building on the foundational computational principles outlined above [10,18,22]. In the context of the domain hypothesis, Ramírez and Geffner (2011) and Oh et al. (2011) [23] investigated the stochastic variability of planning domains D_p, utilizing Markov models to represent this variability. ...
Article
Full-text available
The problem of goal recognition involves inferring the high-level task goals of an agent based on observations of its behavior in an environment. Current methods for this task rely on offline comparison-based inference over observed behavior in discrete environments, which presents several challenges. First, accurately modeling the behavior of the observed agent requires significant computational resources. Second, goals in continuous simulation environments cannot be accurately recognized using existing methods. Finally, real-time computing power is required to infer the likelihood of each potential goal. In this paper, we propose an advanced and efficient real-time online goal recognition algorithm based on deep reinforcement learning in continuous domains. By leveraging offline modeling of the observed agent's behavior with deep reinforcement learning, our algorithm achieves real-time goal recognition. We evaluate the algorithm's online goal recognition accuracy and stability in continuous simulation environments under communication constraints.
... Since the idea of Plan Recognition as Planning was introduced by [13], many approaches have adopted this paradigm [14], [15], [20], [18], [17], [9], [11], [3]. It was recognized relatively soon that the initial PRAP approaches are computationally demanding, as they require computing entire plans. ...
Preprint
Full-text available
Goal recognition is an important problem in many application domains (e.g., pervasive computing, intrusion detection, computer games, etc.). In many application scenarios, it is important that goal recognition algorithms can recognize goals of an observed agent as fast as possible. However, many early approaches in the area of Plan Recognition As Planning, require quite large amounts of computation time to calculate a solution. Mainly to address this issue, recently, Pereira et al. developed an approach that is based on planning landmarks and is much more computationally efficient than previous approaches. However, the approach, as proposed by Pereira et al., also uses trivial landmarks (i.e., facts that are part of the initial state and goal description are landmarks by definition). In this paper, we show that it does not provide any benefit to use landmarks that are part of the initial state in a planning landmark based goal recognition approach. The empirical results show that omitting initial state landmarks for goal recognition improves goal recognition performance.
... Therefore, the goal that best explains the observed path is more likely to be the real goal. This can be used to generate a probability distribution over the candidate goal-set (Masters and Sardina 2017a). Inverse reinforcement learning (IRL) is the task of deducing a reward function given traces of the agent's behaviour (Ng and Russell 2000). ...
Preprint
Full-text available
This paper investigates deceptive reinforcement learning for privacy preservation in model-free and continuous action space domains. In reinforcement learning, the reward function defines the agent's objective. In adversarial scenarios, an agent may need to both maximise rewards and keep its reward function private from observers. Recent research presented the ambiguity model (AM), which selects actions that are ambiguous over a set of possible reward functions, via pre-trained Q-functions. Despite promising results in model-based domains, our investigation shows that AM is ineffective in model-free domains due to misdirected state space exploration. It is also inefficient to train and inapplicable in continuous action space domains. We propose the deceptive exploration ambiguity model (DEAM), which learns using the deceptive policy during training, leading to targeted exploration of the state space. DEAM is also applicable in continuous action spaces. We evaluate DEAM in discrete and continuous action space path planning environments. DEAM achieves similar performance to an optimal model-based version of AM and outperforms a model-free version of AM in terms of path cost, deceptiveness and training efficiency. These results extend to the continuous domain.
... Other domain-theory methods are very relevant to our approach. Plan recognition as planning (PRP) (Ramírez and Geffner 2009;Freedman and Zilberstein 2017;Masters and Sardina 2017) uses domain theories with planners to dynamically generate hypotheses for plan recognition. The (2009) formulation relied on modified planners and could not probabilistically rank the hypotheses. ...
Article
Plan recognition is the task of inferring the plan of an agent, based on an incomplete sequence of its observed actions. Previous formulations of plan recognition commit early to discretizations of the environment and the observed agent's actions. This leads to reduced recognition accuracy. To address this, we first provide a formalization of recognition problems which admits continuous environments, as well as discrete domains. We then show that through mirroring---generalizing plan-recognition by planning---we can apply continuous-world motion planners in plan recognition. We provide formal arguments for the usefulness of mirroring, and empirically evaluate mirroring in more than a thousand recognition problems in three continuous domains and six classical planning domains.
... • First, the optimal (i.e., rational) plan to reach the goal, while matching the observations (i.e., observed actions are part of the plan). • Second, the optimal plan for the same goal that is either unconstrained to match the observations (Ramirez and Geffner, 2009;Sohrabi et al., 2016;Freedman and Zilberstein, 2017;Masters and Sardina, 2017;Shvo and McIlraith, 2017;Vered and Kaminka, 2017;Kaminka et al., 2018;Masters and Sardina, 2018;Masters and Sardina, 2019), or is constrained to not match the observations (Ramirez and Geffner, 2010). ...
Article
Full-text available
Recently, we are seeing the emergence of plan- and goal-recognition algorithms which are based on the principle of rationality. These avoid the use of a plan library that compactly encodes all possible observable plans, and instead generate plans dynamically to match the observations. However, recent experiments by Berkovitz (Berkovitz, The effect of spatial cognition and context on robot movement legibility in human-robot collaboration, 2018) show that in many cases, humans seem to have reached quick (correct) decisions when observing motions which were far from rational (optimal), while optimal motions were slower to be recognized. Intrigued by these findings, we experimented with a variety of rationality-based recognition algorithms on the same data. The results clearly show that none of the algorithms reported in the literature accounts for human subject decisions, even in this simple task. This is our first contribution. We hypothesize that humans utilize plan-recognition in service of goal recognition, i.e., match observations to known plans, and use the set of recognized plans to conclude as to the likely goals. To test this hypothesis, a second contribution in this paper is the introduction of a novel offline recognition algorithm. While preliminary, the algorithm accounts for the results reported by Berkovitz significantly better than the existing algorithms. Moreover, the proposed algorithm marries rationality-based and plan-library-based methods seamlessly.
... A number of approaches have been proposed to improve plan recognition performance using ideas such as goal graph (Hong, 2001) analysis, pruning heuristics (Yolanda et al., 2015;Vered & Kaminka, 2017;Masters & Sardina, 2017), landmark (Hoffmann et al., 2004) detection (Pereira et al., 2017;Pozanco et al., 2018), and even meta reasoning about the time required to recognize symbolic plans and whether to take an action that could disambiguate plans (Fagundes et al., 2014). Unfortunately, these approaches still have several major drawbacks. ...
Article
Full-text available
Opponent modeling is the ability to use prior knowledge and observations in order to predict the behavior of an opponent. This survey presents a comprehensive overview of existing opponent modeling techniques for adversarial domains, many of which must address stochastic, continuous, or concurrent actions, and sparse, partially observable payoff structures. We discuss all the components of opponent modeling systems, including feature extraction, learning algorithms, and strategy abstractions. These discussions lead us to propose a new form of analysis for describing and predicting the evolution of game states over time. We then introduce a new framework that facilitates method comparison, analyze a representative selection of techniques using the proposed framework, and highlight common trends among recently proposed methods. Finally, we list several open problems and discuss future research directions inspired by AI research on opponent modeling and related research in other disciplines.
... A number of computational approaches to model key abilities for collaborating with others have been proposed. Examples include joint attention [24], goal recognition [49,32], online planning [52], or collaborative discourse [42]. Most importantly for our work, significant advancements have been made in developing computational models of interpreting an action of another agent in terms of the intentions, beliefs, or emotional states that may have caused said action. ...
Preprint
Full-text available
Working together on complex collaborative tasks requires agents to coordinate their actions. Doing this explicitly or completely prior to the actual interaction is not always possible nor sufficient. Agents also need to continuously understand the current actions of others and quickly adapt their own behavior appropriately. Here we investigate how efficient, automatic coordination processes at the level of mental states (intentions, goals), which we call belief resonance, can lead to collaborative situated problem-solving. We present a model of hierarchical active inference for collaborative agents (HAICA). It combines efficient Bayesian Theory of Mind processes with a perception-action system based on predictive processing and active inference. Belief resonance is realized by letting the inferred mental states of one agent influence another agent's predictive beliefs about its own goals and intentions. This way, the inferred mental states influence the agent's own task behavior without explicit collaborative reasoning. We implement and evaluate this model in the Overcooked domain, in which two agents with varying degrees of belief resonance team up to fulfill meal orders. Our results demonstrate that agents based on HAICA achieve a team performance comparable to recent state-of-the-art approaches, while incurring much lower computational costs. We also show that belief resonance is especially beneficial in settings where the agents have asymmetric knowledge about the environment. The results indicate that belief resonance and active inference allow for quick and efficient agent coordination, and thus can serve as a building block for collaborative cognitive agents.
... A number of computational approaches to model key abilities for collaborating with others have been proposed. Examples include joint attention [27], goal recognition [28,29], online planning [30], or collaborative discourse [31]. Most importantly for our work, significant advancements have been made in developing computational models of interpreting an action of another agent in terms of the intentions, beliefs, or emotional states that may have caused said action. ...
Article
Full-text available
Working together on complex collaborative tasks requires agents to coordinate their actions. Doing this explicitly or completely prior to the actual interaction is not always possible nor sufficient. Agents also need to continuously understand the current actions of others and quickly adapt their own behavior appropriately. Here we investigate how efficient, automatic coordination processes at the level of mental states (intentions, goals), which we call belief resonance, can lead to collaborative situated problem-solving. We present a model of hierarchical active inference for collaborative agents (HAICA). It combines efficient Bayesian Theory of Mind processes with a perception–action system based on predictive processing and active inference. Belief resonance is realized by letting the inferred mental states of one agent influence another agent’s predictive beliefs about its own goals and intentions. This way, the inferred mental states influence the agent’s own task behavior without explicit collaborative reasoning. We implement and evaluate this model in the Overcooked domain, in which two agents with varying degrees of belief resonance team up to fulfill meal orders. Our results demonstrate that agents based on HAICA achieve a team performance comparable to recent state-of-the-art approaches, while incurring much lower computational costs. We also show that belief resonance is especially beneficial in settings where the agents have asymmetric knowledge about the environment. The results indicate that belief resonance and active inference allow for quick and efficient agent coordination and thus can serve as a building block for collaborative cognitive agents.
... The cost difference formula has since been analysed by others (Escudero-Martin et al., 2015;Masters and Sardina, 2017a;Masters and Sardina, 2019a) and we adopt a less computationally demanding construction than the original, proved by Masters and Sardina (2019a) to return identical results in all but one corner case: ...
Article
Full-text available
The “science of magic” has lately emerged as a new field of study, providing valuable insights into the nature of human perception and cognition. While most of us think of magic as being all about deception and perceptual “tricks”, the craft—as documented by psychologists and professional magicians—provides a rare practical demonstration and understanding of goal recognition. For the purposes of human-aware planning, goal recognition involves predicting what a human observer is most likely to understand from a sequence of actions. Magicians perform sequences of actions with keen awareness of what an audience will understand from them and—in order to subvert it—the ability to predict precisely what an observer’s expectation is most likely to be. Magicians can do this without needing to know any personal details about their audience and without making any significant modification to their routine from one performance to the next. That is, the actions they perform are reliably interpreted by any human observer in such a way that particular (albeit erroneous) goals are predicted every time. This is achievable because people’s perception, cognition and sense-making are predictably fallible. Moreover, in the context of magic, the principles underlying human fallibility are not only well-articulated but empirically proven. In recent work we demonstrated how aspects of human cognition could be incorporated into a standard model of goal recognition, showing that—even though phenomena may be “fully observable” in that nothing prevents them from being observed—not all are noticed, not all are encoded or remembered, and few are remembered indefinitely. In the current article, we revisit those findings from a different angle. We first explore established principles from the science of magic, then recontextualise and build on our model of extended goal recognition in the context of those principles. While our extensions relate primarily to observations, this work extends and explains the definitions, showing how incidental (and apparently incidental) behaviours may significantly influence human memory and belief. We conclude by discussing additional ways in which magic can inform models of goal recognition and the light that this sheds on the persistence of conspiracy theories in the face of compelling contradictory evidence.
... Measures We measured: (1) the total path cost, which is the inverse of the discounted reward; (2) the probability assigned to the true reward function, calculated using a naïve intention recognition algorithm [19,36]; and (3) the simulation value of the paths from Masters and Sardina [20]: ...
Preprint
In this paper, we study the problem of deceptive reinforcement learning to preserve the privacy of a reward function. Reinforcement learning is the problem of finding a behaviour policy based on rewards received from exploratory behaviour. A key ingredient in reinforcement learning is a reward function, which determines how much reward (negative or positive) is given and when. However, in some situations, we may want to keep a reward function private; that is, to make it difficult for an observer to determine the reward function used. We define the problem of privacy-preserving reinforcement learning, and present two models for solving it. These models are based on dissimulation, a form of deception that 'hides the truth'. We evaluate our models both computationally and via human behavioural experiments. Results show that the resulting policies are indeed deceptive, and that participants can determine the true reward function less reliably than that of an honest agent.
... Goal Recognition: We use single-observation cost-based goal recognition (Masters and Sardina 2018) to rank goals according to their probability of being the goal that the pilot is trying to reach. Using their single-observation formula, we find a cost difference that assumes that the pilot is rational. ...
Article
We introduce Detection and Recognition of Airplane GOals with Navigational Visualization (DRAGON-V), a visualization system that uses probabilistic goal recognition to infer and display the most probable airport runway that a pilot is approaching. DRAGON-V is especially useful in cases of miscommunication, low visibility, or lack of airport familiarity which may result in a pilot deviating from the assigned taxiing route. The visualization system conveys relevant information, and updates according to the airplane's current geolocation. DRAGON-V aims to assist air traffic controllers in reducing incidents of runway incursions at airports.
... Equally, GRD can be applied to the problem of finding behaviors that obfuscate an agent's goal. This is relevant to both adversarial agents (Kabanza et al., 2010;Keren et al., 2015;Masters & Sardina, 2017) and privacy-preserving agents (Keren et al., 2016b). Specifically, Keren et al. (2016b) show that agents can use GRD tools to identify WCD paths that lead to their destination while keeping it ambiguous as long as possible. ...
Article
Goal recognition design (GRD) facilitates understanding the goals of acting agents through the analysis and redesign of goal recognition models, thus offering a solution for assessing and minimizing the maximal progress of any agent in the model before goal recognition is guaranteed. In a nutshell, given a model of a domain and a set of possible goals, a solution to a GRD problem determines (1) the extent to which actions performed by an agent within the model reveal the agent’s objective; and (2) how best to modify the model so that the objective of an agent can be detected as early as possible. This approach is relevant to any domain in which rapid goal recognition is essential and the model design can be controlled. Applications include intrusion detection, assisted cognition, computer games, and human-robot collaboration. A GRD problem has two components: the analyzed goal recognition setting, and a design model specifying the possible ways the environment in which agents act can be modified so as to facilitate recognition. This work formulates a general framework for GRD in deterministic and partially observable environments, and offers a toolbox of solutions for evaluating and optimizing model quality for various settings. For the purpose of evaluation we suggest the worst case distinctiveness (WCD) measure, which represents the maximal cost of a path an agent may follow before its goal can be inferred by a goal recognition system. We offer novel compilations to classical planning for calculating WCD in settings where agents are bounded-suboptimal. We then suggest methods for minimizing WCD by searching for an optimal redesign strategy within the space of possible modifications, and using pruning to increase efficiency. We support our approach with an empirical evaluation that measures WCD in a variety of GRD settings and tests the efficiency of our compilation-based methods for computing it. We also examine the effectiveness of reducing WCD via redesign and the performance gain brought about by our proposed pruning strategy.
... Much recent work on goal recognition as planning takes as input a planning domain, usually described in PDDL, and a set of possible goals. Its output is one of the goals or a distribution over all possible goals [Ramírez and Geffner, 2010;Sohrabi et al., 2016;Freedman and Zilberstein, 2017;Shvo et al., 2017;Vered and Kaminka, 2017;Masters and Sardina, 2017]. The benefit of this approach is that it is model-free, in the sense that possible plan executions are implicit. ...
Chapter
Goal recognition is the task of inferring the goal of an actor given its observed actions. Attack graphs are a common representation of assets, vulnerabilities, and exploits used for analysis of potential intrusions in computer networks. This paper introduces new goal recognition algorithms on attack graphs. The main challenges involving goal recognition in cyber security include dealing with noisy and partial observations as well as the need for fast, near-real-time performance. To this end we propose improvements to existing planning-based algorithms for goal recognition, reducing their time complexity and allowing them to handle noisy observations. We also introduce two new metric-based algorithms for goal recognition. Experimental results show that the metric-based algorithms improve performance when compared to the planning-based algorithms, in terms of accuracy and runtime, thus enabling goal recognition to be carried out in near-real-time. These algorithms can potentially improve both risk management and alert correlation mechanisms for intrusion detection.
... Much recent work on goal recognition as planning takes as input a planning domain, usually described in PDDL, and a set of possible goals. Its output is one of the goals or a distribution over all possible goals [Ramírez and Geffner, 2010;Sohrabi et al., 2016;Freedman and Zilberstein, 2017;Shvo et al., 2017;Vered and Kaminka, 2017;Masters and Sardina, 2017]. The benefit of this approach is that it is model-free, in the sense that possible plan executions are implicit. ...
Poster
Full-text available
Attack graphs are a common domain representation used as the basis for static analysis of potential attacks and defenses in cybersecurity. So far they have not been used for online goal recognition. This paper bridges this gap by using attack graphs as a basis for goal recognition algorithms to potentially improve intrusion detection and incident response. It provides novel methods to deal with noise and partial observability that are common properties of intrusion detection scenarios in the real world. It compares the efficacy of several goal recognition paradigms, including those using planning algorithms to guide the search, as well as distance-based metrics using attack graphs as an underlying domain representation. Results on a real network of a large scale academic organization show that there is a clear tradeoff between the run-time of the algorithm and its ability to recognize attacks given noisy and partial information. However, using our proposed techniques, most of the computation efforts are invested in preprocessing, thus enabling the online component to perform well in real-time.
... In (Vered, Kaminka, & Biham, 2016), Vered et al. introduce the concept of mirroring to develop an online goal recognition approach for continuous domains. Masters and Sardiña (2017) propose a fast and accurate goal recognition approach for path-planning, providing a new probabilistic framework for goal recognition. In (Vered & Kaminka, 2017), Vered and Kaminka develop a heuristic approach for online goal recognition that deals with continuous domains. ...
Preprint
The task of recognizing goals and plans from missing and full observations can be done efficiently by using automated planning techniques. In many applications, it is important to recognize goals and plans not only accurately, but also quickly. To address this challenge, we develop novel goal recognition approaches based on planning techniques that rely on planning landmarks. In automated planning, landmarks are properties (or actions) that cannot be avoided to achieve a goal. We show the applicability of a number of planning techniques with an emphasis on landmarks for goal and plan recognition tasks in two settings: (1) we use the concept of landmarks to develop goal recognition heuristics; and (2) we develop a landmark-based filtering method to refine existing planning-based goal and plan recognition approaches. These recognition approaches are empirically evaluated in experiments over several classical planning domains. We show that our goal recognition approaches yield not only accuracy comparable to (and often higher than) other state-of-the-art techniques, but also substantially faster recognition time over such techniques.
Chapter
Goal recognition is an important problem in many application domains (e.g., pervasive computing, intrusion detection, computer games, etc.). In many application scenarios, it is important that goal recognition algorithms can recognize goals of an observed agent as fast as possible. However, many early approaches in the area of Plan Recognition As Planning, require quite large amounts of computation time to calculate a solution. Mainly to address this issue, recently, Pereira et al. developed an approach that is based on planning landmarks and is much more computationally efficient than previous approaches. However, the approach, as proposed by Pereira et al., considers trivial landmarks (i.e., facts that are part of the initial state and goal description are landmarks by definition) for goal recognition. In this paper, we show that it does not provide any benefit to use landmarks that are part of the initial state in a planning landmark based goal recognition approach. The empirical results show that omitting initial state landmarks for goal recognition improves goal recognition performance.
Conference Paper
Being a fully algorithmic procedure, symbolic controller synthesis undoubtedly comes with weighty advantages over other established synthesis procedures. In fact, the returned controllers provably enforce the given specification on the control loop, making verification steps obsolete. However, the curse of dimensionality prevents this scheme from being applied to industrial problems. Applications to real experiments are indeed rare. In this note, we demonstrate how to utilize symbolic optimal control to control miniature quadcopters at the level of mission guidance. Specifically, a firefighting scenario using a Crazyflie 2.1 drone is considered, which involves reach-avoid and reach-and-stay control tasks. Furthermore, we present a runtime monitor, automatically derived from the synthesized symbolic controller. Based on the methodologies of plan recognition, this monitor observes the drone's flightpath and infers the current controller mode. Thus, it is able to predict the upcoming manoeuvres of the drone.
Article
We study the idea of “deception by motion” through a two-player dynamic game played between a Mover who must reach its goal to retrieve resources, and an Eater who can consume resources from two candidate goals. The Mover seeks to minimize the resource consumption at the true goal it must reach, while the Eater tries to maximize it without knowing which one the true goal is. Unlike existing works on deceptive motion control that measures the deceptiveness through the quality of inference made by a distant observer (an estimator), we incorporate its actions to directly measure the efficacy of deception through the outcome of the game. We identify a pair of equilibrium strategies and demonstrate that if the observing agent optimizes for the worst-case scenario, hiding the intention (deception by ambiguity) is effective, whereas trying to fake the true goal (deception by exaggeration) is not.
Chapter
Mobile robots have been widely used in military and other fields. By observing the path of a mobile robot, it is easy to infer the destination of that path. However, in some special situations, the user of the mobile robot does not want to reveal this information, for example when carrying out military attacks or covert reconnaissance missions. To protect the mobile robot's path information from being leaked, we combine reinforcement-learning path planning with deception to propose a computationally efficient deceptive path-planning method for mobile robots. The path planned by this method balances path cost against a goal-deception weight, making it difficult for observers of the mobile robot's path to predict its real goal, while ensuring a lower path cost for the mobile robot.
Article
Most approaches for goal recognition rely on specifications of the possible dynamics of the actor in the environment when pursuing a goal. These specifications suffer from two key issues. First, encoding these dynamics requires careful design by a domain expert, which is often not robust to noise at recognition time. Second, existing approaches often need costly real-time computations to reason about the likelihood of each potential goal. In this paper, we develop a framework that combines model-free reinforcement learning and goal recognition to alleviate the need for careful, manual domain design, and the need for costly online executions. This framework consists of two main stages: Offline learning of policies or utility functions for each potential goal, and online inference. We provide a first instance of this framework using tabular Q-learning for the learning stage, as well as three measures that can be used to perform the inference stage. The resulting instantiation achieves state-of-the-art performance against goal recognizers on standard evaluation domains and superior performance in noisy environments.
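As an illustration of the two-stage framework described in this abstract (offline learning, then online inference), the following sketch uses tabular Q-values; the function name and the particular fitness measure are assumptions for illustration, not the paper's exact measures.

    import math

    def q_based_goal_posterior(trajectory, q_tables, beta=1.0):
        # q_tables: goal -> {(state, action): Q-value}, learned offline
        # trajectory: observed (state, action) pairs
        # Assumes every observed pair appears in each table.
        scores = {}
        for goal, q in q_tables.items():
            fit = 0.0
            for s, a in trajectory:
                best = max(v for (s2, _), v in q.items() if s2 == s)
                fit += q[(s, a)] - best  # 0 when the observed action is greedy
            scores[goal] = math.exp(beta * fit)
        z = sum(scores.values())
        return {goal: v / z for goal, v in scores.items()}

Because the Q-tables are learned offline, the online step reduces to cheap table lookups, which is the efficiency argument the abstract makes.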
Chapter
The transition to Computer-Aided Systems Engineering (CASE) changed engineers' day-to-day tasks in many disciplines, such as mechanical and electronic engineering. Systems engineers are still looking for the right set of tools to embrace this opportunity. Indeed, they deal with many kinds of data that evolve considerably during the development life cycle. Model-Based Systems Engineering (MBSE) should be an answer to this, but it has failed to convince systems engineers and architects and to gain their acceptance. The complexity of creating, editing, and annotating systems engineering models has several roots: high abstraction levels, static representations, complex interfaces, and the time-consuming activity of keeping a model and its associated diagrams consistent. As a result, system architects still rely heavily on traditional methods (whiteboards, paper, and pens) to outline a problem and its solution, and then rely on expert modelers to digitize the informal data in modeling tools. In this chapter, we present an approach based on automated plan recognition to capture sketches of systems engineering models and incrementally formalize them using specific representations. We present a first implementation of our approach with AI plan recognition, and we detail an experiment applying plan recognition to systems engineering.
Article
Contemporary cost-based goal-recognition assumes rationality: that observed behaviour is more or less optimal. Probabilistic goal recognition systems, however, explicitly depend on some degree of sub-optimality to generate probability distributions. We show that, even when an observed agent is only slightly irrational (sub-optimal), state-of-the-art systems produce counter-intuitive results (though these may only become noticeable when the agent is highly irrational). We provide a definition of rationality appropriate to situations where the ground truth is unknown, define a rationality measure (RM) that quantifies an agent's expected degree of sub-optimality, and define an innovative self-modulating probability distribution formula for goal recognition. Our formula recognises sub-optimality and adjusts its level of confidence accordingly, thereby handling irrationality—and rationality—in an intuitive, principled manner. Building on that formula, moreover, we strengthen a previously published result, showing that “single-observation” recognition in the path-planning domain achieves identical results to more computationally expensive techniques, where previously we claimed only to achieve equivalent rankings though values differed.
Article
Intention recognition is the process of using behavioural cues, such as deliberative actions, eye gaze, and gestures, to infer an agent's goals or future behaviour. In artificial intelligence, one approach to intention recognition is to use a model of possible behaviour to rate intentions as more likely if they are a better 'fit' to the actions observed so far. In this paper, drawing on literature linking gaze and visual attention, we propose a novel model of online human intention recognition that combines gaze and model-based AI planning to build probability distributions over a set of possible intentions. In human-behavioural experiments (n=40) involving a multi-player board game, we demonstrate that adding gaze-based priors to model-based intention recognition improved the accuracy of intention recognition by 22% (p<0.05), determined those intentions ≈90 seconds earlier (p<0.05), and at no additional computational cost. We also demonstrate that, when evaluated in the presence of semi-rational or deceptive gaze behaviours, the proposed model is significantly more accurate (9% improvement) (p<0.05) than model-based or gaze-only approaches. Our results indicate that the proposed model could be used to design novel human-agent interactions in cases where we are unsure whether a person is honest, deceitful, or semi-rational.
Article
Recognizing goals and plans from complete or partial observations can be efficiently achieved through automated planning techniques. In many applications, it is important to recognize goals and plans not only accurately, but also quickly. To address this challenge, we develop novel goal recognition approaches based on planning techniques that rely on planning landmarks. In automated planning, landmarks are properties (or actions) that cannot be avoided to achieve a goal. We show the applicability of a number of planning techniques with an emphasis on landmarks for goal recognition tasks in two settings: (1) we use the concept of landmarks to develop goal recognition heuristics; and (2) we develop a landmark-based filtering method to refine existing planning-based goal and plan recognition approaches. These recognition approaches are empirically evaluated in experiments over several classical planning domains. We show that our goal recognition approaches yield not only accuracy comparable to (and often higher than) other state-of-the-art techniques, but also substantially faster recognition times than existing techniques.