Richard Dazeley

Richard Dazeley
  • PhD
  • Professor at Deakin University

About

  • 114 Publications
  • 33,419 Reads
  • 2,721 Citations
Introduction
Professor Richard Dazeley works in the School of Information Technology at Deakin University. He is a highly experienced researcher in multi-objective, interactive, deep, safe, and explainable reinforcement learning (RL). His current project is 'Implementing safe AI using multiobjective reinforcement learning'.
Current institution
Deakin University
Current position
  • Professor
Additional affiliations
June 2018 - present
Deakin University
Position
  • Associate Professor
July 2011 - September 2015
Federation University
Position
  • Associate Dean (Learning and Teaching)
January 2007 - present
Federation University
Education
March 2002 - March 2006
University of Tasmania
Field of study
  • Machine Learning, Artificial Intelligence

Publications

Publications (114)
Preprint
Point Transformers (PoinTr) have shown great potential in point cloud completion recently. Nevertheless, effective domain adaptation that improves transferability toward target domains remains unexplored. In this paper, we delve into this topic and empirically discover that direct feature alignment on point Transformer's CNN backbone only brings li...
Preprint
Full-text available
Apologies are a powerful tool used in human-human interactions to provide affective support, regulate social processes, and exchange information following a trust violation. The emerging field of AI apology investigates the use of apologies by artificially intelligent systems, with recent research suggesting how this tool may provide similar value...
Preprint
Full-text available
Emerging research in Pluralistic Artificial Intelligence (AI) alignment seeks to address how intelligent systems can be designed and deployed in accordance with diverse human needs and values. We contribute to this pursuit with a dynamic approach for aligning AI with diverse and shifting user preferences through Multi Objective Reinforcement Learni...
Preprint
Full-text available
Reinforcement learning (RL) is a valuable tool for the creation of AI systems. However it may be problematic to adequately align RL based on scalar rewards if there are multiple conflicting values or stakeholders to be considered. Over the last decade multi-objective reinforcement learning (MORL) using vector rewards has emerged as an alternative t...
Article
Full-text available
In the last two decades there has been an increase in research and discussion regarding curriculum in Higher Education (HE). The literature in this field tends to focus on curriculum change at either the whole institution or individual program or unit level. Formal writing on HE curriculum also does not offer a framework that openly draws upon cont...
Preprint
Full-text available
In many critical Machine Learning applications, such as autonomous driving and medical image diagnosis, the detection of out-of-distribution (OOD) samples is as crucial as accurately classifying in-distribution (ID) inputs. Recently Outlier Exposure (OE) based methods have shown promising results in detecting OOD inputs via model fine-tuning with a...
Preprint
In human-AI coordination scenarios, human agents usually exhibit asymmetric behaviors that are significantly sparse and unpredictable compared to those of AI agents. These characteristics introduce two primary challenges to human-AI coordination: the effectiveness of obtaining sparse rewards and the efficiency of training the AI agents. To tackle t...
Preprint
Full-text available
In this era of advanced manufacturing, it's now more crucial than ever to diagnose machine faults as early as possible to guarantee their safe and efficient operation. With the massive surge in industrial big data and advancement in sensing and computational technologies, data-driven Machinery Fault Diagnosis (MFD) solutions based on machine/deep l...
Preprint
In the last two decades there has been an increase in research and discussion regarding curriculum in Higher Education (HE). The literature in this field tends to focus on curriculum change at either the whole institution or individual program or unit level. Formal writing on HE curriculum also does not offer a framework that openly draws upon cont...
Article
Full-text available
This paper describes a language wrapper for the NetHack Learning Environment (NLE) [1]. The wrapper replaces the non-language observations and actions with comparable language versions. The NLE offers a grand challenge for AI research while MiniHack [2] extends this potential to more specific and configurable tasks. By providing a language interfac...
Article
Full-text available
For an Artificially Intelligent (AI) system to maintain alignment between human desires and its behaviour, it is important that the AI account for human preferences. This paper proposes and empirically evaluates the first approach to aligning agent behaviour to human preference via an apologetic framework. In practice, an apology may consist of an...
Article
Full-text available
Broad-XAI moves away from interpreting individual decisions based on a single datum and aims to provide integrated explanations from multiple machine learning algorithms into a coherent explanation of an agent’s behaviour that is aligned to the communication needs of the explainee. Reinforcement Learning (RL) methods, we propose, provide a potentia...
Article
Full-text available
Deep Reinforcement Learning (DeepRL) methods have been widely used in robotics to learn about the environment and acquire behaviours autonomously. Deep Interactive Reinforcement Learning (DeepIRL) includes interactive feedback from an external trainer or expert giving advice to help learners choose actions to speed up the learning process. Howeve...
Conference Paper
Full-text available
Real-world sequential decision-making tasks are usually complex, and require trade-offs between multiple, often conflicting, objectives. However, the majority of research in reinforcement learning (RL) and decision-theoretic planning assumes a single objective, or that multiple objectives can be handled via a predefined weighted sum over the objecti...
Preprint
Full-text available
The use of interactive advice in reinforcement learning scenarios allows for speeding up the learning process for autonomous agents. Current interactive reinforcement learning research has been limited to real-time interactions that offer relevant user advice to the current state only. Moreover, the information provided by each interaction is not r...
Preprint
Full-text available
The Deep Q-Networks algorithm (DQN) was the first reinforcement learning algorithm to use a deep neural network to surpass human-level performance in a number of Atari learning environments. However, divergent and unstable behaviour have been long-standing issues in DQNs. The unstable behaviour is often characterised by overestimation in the...
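The overestimation mentioned above comes from taking a max over noisy value estimates in the Q-learning target. As a rough illustration only, the sketch below contrasts that target with a Double-DQN-style decoupled target; this is a generic construction for illustration, not necessarily the remedy proposed in this preprint.

```python
import numpy as np

# Illustrative sketch: the max over noisy estimates biases the DQN target
# upward; selecting with one network and evaluating with another reduces it.
rng = np.random.default_rng(0)
true_q = np.zeros(5)                       # all actions are equally valuable
noisy_q = true_q + rng.normal(0, 1, 5)     # online network's noisy estimates
noisy_q2 = true_q + rng.normal(0, 1, 5)    # independent (target-network) estimates

gamma, reward = 0.99, 0.0

# Standard DQN target: max over the same noisy estimates -> upward bias.
dqn_target = reward + gamma * noisy_q.max()

# Double-DQN-style target: select with one network, evaluate with the other.
a_star = int(np.argmax(noisy_q))
double_target = reward + gamma * noisy_q2[a_star]

print(f"standard target: {dqn_target:.3f}")    # typically > 0
print(f"decoupled target: {double_target:.3f}")  # closer to 0 on average
```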
Preprint
Full-text available
Explainable artificial intelligence is a research field that tries to provide more transparency for autonomous intelligent systems. Explainability has been used, particularly in reinforcement learning and robotic scenarios, to better understand the robot decision-making process. Previous work, however, has been widely focused on providing technical...
Article
Full-text available
Real-world sequential decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear co...
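For readers unfamiliar with the setting, the vector-valued return and the linear combination the abstract refers to can be written as follows (standard multi-objective RL notation; the symbols are illustrative rather than taken from the paper). A fixed linear scalarisation of this kind can only recover policies on the convex hull of the Pareto front, which is one reason a simple weighted sum is often inadequate.

```latex
% Vector-valued return in a multi-objective MDP and its linear scalarisation.
\begin{align}
\mathbf{V}^{\pi}(s) &= \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\,\mathbf{r}_{t+1} \,\middle|\, s_0 = s\right] \in \mathbb{R}^{d}, \\
u_{\mathbf{w}}\!\left(\mathbf{V}^{\pi}\right) &= \mathbf{w}^{\top}\mathbf{V}^{\pi} = \sum_{i=1}^{d} w_i V_i^{\pi}, \qquad w_i \ge 0,\ \sum_{i} w_i = 1.
\end{align}
```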
Article
Full-text available
A common approach to address multiobjective problems using reinforcement learning methods is to extend model-free, value-based algorithms such as Q-learning to use a vector of Q-values in combination with an appropriate action selection mechanism that is often based on scalarisation. Most prior empirical evaluation of these approaches has focused o...
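A minimal sketch of the approach this abstract describes, assuming a tabular setting with a vector-valued Q-table and epsilon-greedy selection over a linearly scalarised value; the environment, weights, and sizes below are placeholders rather than values from the paper.

```python
import numpy as np

# Tabular Q-learning extended to a vector of Q-values (one per objective),
# with action selection via linear scalarisation of the Q-vectors.
n_states, n_actions, n_objectives = 10, 4, 2
Q = np.zeros((n_states, n_actions, n_objectives))   # vector-valued Q-table
w = np.array([0.7, 0.3])                             # scalarisation weights (assumed)
alpha, gamma, epsilon = 0.1, 0.95, 0.1
rng = np.random.default_rng(0)

def select_action(state):
    """Epsilon-greedy over the scalarised values w . Q(s, a)."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[state] @ w))

def update(state, action, reward_vec, next_state):
    """Q-learning backup applied component-wise; the greedy next action is
    chosen by scalarising the next state's Q-vectors."""
    next_action = int(np.argmax(Q[next_state] @ w))
    td_target = reward_vec + gamma * Q[next_state, next_action]
    Q[state, action] += alpha * (td_target - Q[state, action])
```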
Article
Full-text available
Neural networks are effective function approximators, but hard to train in the reinforcement learning (RL) context mainly because samples are correlated. In complex problems, a neural RL approach is often able to learn a better solution than tabular RL, but generally takes longer. This paper proposes two methods, Discrete-to-Deep Supervised Policy...
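The abstract names D2D-SPL without room to detail it; the sketch below illustrates only the general idea of distilling a learned tabular policy into a neural network via supervised learning. The data, network, and training loop are assumptions for illustration, not the D2D-SPL algorithm itself.

```python
import numpy as np

# Generic "tabular policy -> supervised policy network" distillation sketch.
rng = np.random.default_rng(0)
n_states, n_actions, feat_dim = 50, 3, 8

# 1) Assume a converged tabular Q-table from a discretised RL run.
Q_table = rng.normal(size=(n_states, n_actions))
greedy_actions = Q_table.argmax(axis=1)

# 2) Each discrete state has a continuous feature representation.
state_features = rng.normal(size=(n_states, feat_dim))

# 3) Train a softmax policy (single linear layer) to imitate the greedy policy
#    using a cross-entropy loss and plain gradient descent.
W = np.zeros((feat_dim, n_actions))
for _ in range(500):
    logits = state_features @ W
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    onehot = np.eye(n_actions)[greedy_actions]
    grad = state_features.T @ (probs - onehot) / n_states
    W -= 0.5 * grad

accuracy = ((state_features @ W).argmax(axis=1) == greedy_actions).mean()
print("imitation accuracy:", accuracy)
```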
Article
Full-text available
In this paper, we consider a convex quadratic multiobjective optimization problem, where both the objective and constraint functions involve data uncertainty. We employ a deterministic approach to examine robust optimality conditions and find robust (weak) Pareto solutions of the underlying uncertain multiobjective problem. We first present new nec...
Article
Full-text available
Interactive reinforcement learning proposes the use of externally sourced information in order to speed up the learning process. When interacting with a learner agent, humans may provide either evaluative or informative advice. Prior research has focused on the effect of human-sourced advice by including real-time feedback on the interactive reinfo...
Preprint
Full-text available
In this study, we used grammatical evolution to develop a customised particle swarm optimiser by incorporating adaptive building blocks. This makes the algorithm self-adaptable to the problem instance. Our objective is to provide the means to automatically generate novel population-based meta-heuristics by scoring the building blocks. We propose a...
Preprint
Full-text available
The recent paper "Reward is Enough" by Silver, Singh, Precup and Sutton posits that the concept of reward maximisation is sufficient to underpin all intelligence, both natural and artificial. We contest the underlying assumption of Silver et al. that such reward can be scalar-valued. In this paper we explain why scalar rewards are insufficient to...
Preprint
Full-text available
Deep Reinforcement Learning (DeepRL) methods have been widely used in robotics to learn about the environment and acquire behaviors autonomously. Deep Interactive Reinforcement Learning (DeepIRL) includes interactive feedback from an external trainer or expert giving advice to help learners choose actions to speed up the learning process. However...
Article
Full-text available
A long-term goal of reinforcement learning agents is to be able to perform tasks in complex real-world scenarios. The use of external information is one way of scaling agents to more complex problems. However, there is a general lack of collaboration or interoperability between different approaches using external information. In this work, while re...
Article
Full-text available
Interactive reinforcement learning has allowed speeding up the learning process in autonomous agents by including a human trainer providing extra information to the agent in real-time. Current interactive reinforcement learning research has been limited to real-time interactions that offer relevant user advice to the current state only. Additionall...
Article
Full-text available
Robotic systems are more present in our society every day. In human–robot environments, it is crucial that end-users can correctly understand their robotic team-partners, in order to collaboratively complete a task. To increase action understanding, users demand more explainability about the decisions made by the robot in particular situations. Recently,...
Preprint
Full-text available
Broad Explainable Artificial Intelligence moves away from interpreting individual decisions based on a single datum and aims to provide integrated explanations from multiple machine learning algorithms into a coherent explanation of an agent's behaviour that is aligned to the communication needs of the explainee. Reinforcement Learning (RL) methods...
Preprint
Full-text available
Explainable reinforcement learning allows artificial agents to explain their behavior in a human-like manner aiming at non-expert end-users. An efficient alternative of creating explanations is to use an introspection-based method that transforms Q-values into probabilities of success used as the base to explain the agent's decision-making process....
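As a hedged sketch of the introspection idea, one simple way to map a Q-value to an estimated probability of success is to normalise it against the maximum attainable discounted return; this particular transformation is an assumption for illustration, not necessarily the one used in the paper.

```python
import numpy as np

# Illustration only: turn a state-action value into an estimated "probability
# of success" for explanation purposes by normalising against the best
# achievable discounted return and clipping to [0, 1].
def success_probability(q_value, r_max, gamma=0.95, horizon=50):
    max_return = r_max * (1 - gamma**horizon) / (1 - gamma)
    return float(np.clip(q_value / max_return, 0.0, 1.0))

# The agent could then explain: "I chose this action because I estimate an
# x% chance of reaching the goal from here."
print(success_probability(q_value=15.3, r_max=1.0))
```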
Article
Full-text available
An increasing number of complex problems have naturally posed significant challenges in decision-making theory and reinforcement learning practices. These problems often involve multiple conflicting reward signals that inherently cause agents’ poor exploration in seeking a specific goal. In extreme cases, the agent gets stuck in a sub-optimal solut...
Article
Full-text available
Reinforcement learning refers to a machine learning paradigm in which an agent interacts with the environment to learn how to perform a task. The characteristics of the environment may change over time or be affected by uncontrolled disturbances, preventing the agent from finding a proper policy. Some approaches attempt to address these problems, as int...
Preprint
Full-text available
Over the last few years there has been rapid research growth into eXplainable Artificial Intelligence (XAI) and the closely aligned Interpretable Machine Learning (IML). Drivers for this growth include recent legislative changes and increased investments by industry and governments, along with increased concern from the general public. People are a...
Article
Over the last few years there has been rapid research growth into eXplainable Artificial Intelligence (XAI) and the closely aligned Interpretable Machine Learning (IML). Drivers for this growth include recent legislative changes and increased investments by industry and governments, along with increased concern from the general public. People are a...
Article
Full-text available
In real-world applications, data can be represented using different units/scales. For example, weight in kilograms or pounds and fuel-efficiency in km/l or l/100 km. One unit can be a linear or non-linear scaling of another. The variation in metrics due to the non-linear scaling makes Anomaly Detection (AD) challenging. Most existing AD algorithms...
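The km/l and l/100 km example above already shows the problem: the two scales are related by a reciprocal, so equal steps in one unit are unequal in the other. A quick numeric check:

```python
# km/l and l/100 km are related by a non-linear (reciprocal) transform, so
# distances between points are not preserved across the two representations.
def l_per_100km(km_per_l: float) -> float:
    return 100.0 / km_per_l

for km_per_l in (5, 10, 15, 20):
    print(km_per_l, "km/l ->", round(l_per_100km(km_per_l), 2), "l/100km")
# Equal 5 km/l steps map to shrinking steps (20 -> 10 -> 6.67 -> 5 l/100km),
# which is why distance-based anomaly detectors can behave differently
# depending on which unit the data happens to be recorded in.
```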
Article
The concept of impact-minimisation has previously been proposed as an approach to addressing the safety concerns that can arise from utility-maximising agents. An impact-minimising agent takes into account the potential impact of its actions on the state of the environment when selecting actions, so as to avoid unacceptable side-effects. This paper...
Preprint
Full-text available
Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination....
Article
Full-text available
Interactive reinforcement learning methods utilise an external information source to evaluate decisions and accelerate learning. Previous work has shown that human advice could significantly improve learning agents’ performance. When evaluating reinforcement learning algorithms, it is common to repeat experiments as parameters are altered or to gai...
Preprint
Interactive reinforcement learning has allowed speeding up the learning process in autonomous agents by including a human trainer providing extra information to the agent in real-time. Current interactive reinforcement learning research has been limited to interactions that offer relevant advice to the current state only. Additionally, the informat...
Article
Full-text available
This paper introduces a new scalable multi-objective deep reinforcement learning (MODRL) framework based on deep Q-networks. We develop a high-performance MODRL framework that supports both single-policy and multi-policy strategies, as well as both linear and non-linear approaches to action selection. The experimental results on two benchmark probl...
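For context on the "non-linear" action selection mentioned above, one widely used non-linear rule in multi-objective RL is thresholded lexicographic ordering; the sketch below shows that generic rule, not necessarily the exact mechanism implemented in the MODRL framework.

```python
import numpy as np

# Thresholded lexicographic ordering (TLO): clip each objective's Q-value at a
# threshold, then compare actions lexicographically on the clipped vectors.
def tlo_select(q_vectors, thresholds):
    """q_vectors: (n_actions, n_objectives); thresholds: (n_objectives,)."""
    clipped = np.minimum(q_vectors, thresholds)        # cap each objective
    keys = [tuple(row) for row in clipped]             # lexicographic keys
    return max(range(len(keys)), key=lambda a: keys[a])

q = np.array([[10.0, 0.2],    # very high on objective 0, poor on objective 1
              [ 6.0, 0.9],    # meets the first threshold, better on objective 1
              [ 5.0, 1.0]])
print(tlo_select(q, thresholds=np.array([6.0, np.inf])))  # -> 1
```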
Preprint
Full-text available
Reinforcement learning is an approach used by intelligent agents to autonomously learn new skills. Although reinforcement learning has been demonstrated to be an effective learning approach in several different contexts, a common drawback exhibited is the time needed in order to satisfactorily learn a task, especially in large state-action spaces....
Article
Full-text available
Robots are extending their presence in domestic environments every day, and it is becoming more common to see them carrying out tasks in home scenarios. In the future, robots are expected to increasingly perform more complex tasks and, therefore, be able to acquire experience from different sources as quickly as possible. A plausible approach to address this...
Preprint
Full-text available
Research on humanoid robotic systems involves a considerable amount of computational resources, not only for the involved design but also for its development and subsequent implementation. For robotic systems to be implemented in real-world scenarios, in several situations, it is preferred to develop and test them under controlled environments in o...
Preprint
Full-text available
Robots are extending their presence in domestic environments every day, and it is becoming more common to see them carrying out tasks in home scenarios. In the future, robots are expected to increasingly perform more complex tasks and, therefore, be able to acquire experience from different sources as quickly as possible. A plausible approach to address this is...
Preprint
Full-text available
A long-term goal of reinforcement learning agents is to be able to perform tasks in complex real-world scenarios. The use of external information is one way of scaling agents to more complex problems. However, there is a general lack of collaboration or interoperability between different approaches using external information. In this work, we propo...
Preprint
Full-text available
Robotic systems are more present in our society every day. In human-robot interaction scenarios, it is crucial that end-users develop trust in their robotic team-partners, in order to collaboratively complete a task. To increase trust, users demand more understanding about the decisions by the robot in particular situations. Recently, explainable r...
Preprint
Neural networks are effective function approximators, but hard to train in the reinforcement learning (RL) context mainly because samples are correlated. For years, scholars have got around this by employing experience replay or an asynchronous parallel-agent system. This paper proposes Discrete-to-Deep Supervised Policy Learning (D2D-SPL) for trai...
Preprint
Full-text available
We report a previously unidentified issue with model-free, value-based approaches to multiobjective reinforcement learning in the context of environments with stochastic state transitions. An example multiobjective Markov Decision Process (MOMDP) is used to demonstrate that under such conditions these approaches may be unable to discover the policy...
Chapter
Reinforcement learning (RL) is a learning approach based on behavioral psychology used by artificial agents to learn autonomously by interacting with their environment. An open issue in RL is the lack of visibility and understanding for end-users in terms of decisions taken by an agent during the learning process. One way to overcome this issue is...
Chapter
Reinforcement learning techniques for solving complex problems are resource-intensive and take a long time to converge, prompting a need for methods that encourage faster learning. In this paper we show our successful application of actor-critic reinforcement learning to the air combat simulation domain and how reward structures affect the learning...
Thesis
Full-text available
Reinforcement Learning (RL) has seen increasing interest over the past few years, partially owing to breakthroughs in the digestion and application of external information. The use of external information results in improved learning speeds and solutions to more complex domains. This thesis, a collection of five key contributions, demonstrates that...
Data
These are simple artificial datasets used to compare the performance of neural regression algorithms on data where the output is a function of the input, and data where the output is not a function of the input (i.e. where the number of outputs may vary, depending on the value of input). These are the datasets used in our forthcoming paper "Non-Fun...
Article
This work identifies an important, previously unaddressed issue for regression based on neural networks – learning to accurately approximate problems where the output is not a function of the input (i.e. where the number of outputs required varies across input space). Such non-functional regression problems arise in a number of applications, and ca...
Chapter
Integrated Prudence Analysis has been proposed as a method to maximize the accuracy of rule based systems. The paper presents evaluation results of the three Prudence methods on public datasets which demonstrate that combining attribute-based and structural Prudence produces a net improvement in Prudence Accuracy.
Article
Full-text available
As the capabilities of artificial intelligence systems improve, it becomes important to constrain their actions to ensure their behaviour remains beneficial to humanity. A variety of ethical, legal and safety-based frameworks have been proposed as a basis for designing these constraints. Despite their variations, these frameworks share the common c...
Article
Background: This regional pilot site 'end-user attitudes' study explored nurses' experiences and impressions of using the Nurses' Behavioural Assistant (NBA) (a knowledge-based, interactive ehealth system) to assist them to better respond to behavioural and psychological symptoms of dementia (BPSD) and will be reported here. Methods: Focus group...
Conference Paper
Conventional Knowledge-Based Systems (KBS) have no way of detecting or signalling when their knowledge is insufficient to handle a case. Consequently, these systems may produce an uninformed conclusion when presented with a case beyond their current knowledge (brittleness) which results in the KBS giving incorrect conclusions due to insufficient kn...
Article
For reinforcement learning tasks with multiple objectives, it may be advantageous to learn stochastic or non-stationary policies. This paper investigates two novel algorithms for learning non-stationary policies which produce Pareto-optimal behaviour (w-steering and Q-steering), by extending prior work based on the concept of geometric steering. Em...
Article
Despite growing interest over recent years in applying reinforcement learning to multiobjective problems, there has been little research into the applicability and effectiveness of exploration strategies within the multiobjective context. This work considers several widely-used approaches to exploration from the single-objective reinforcement learn...
Poster
Full-text available
CALL FOR PAPERS AAMAS 2017 Workshop on Multi-Objective Decision Making (MODeM 2017)
Conference Paper
There has been little research into multiobjective reinforcement learning (MORL) algorithms using stochastic or non-stationary policies, even though such policies may Pareto-dominate deterministic stationary policies. One approach is steering which forms a non-stationary combination of deterministic stationary base policies. This paper presents two...
Conference Paper
Full-text available
We argue that multi-objective methods are underrepresented in RL research, and present three scenarios to justify the need for explicitly multi-objective approaches. Key to these scenarios is that although the utility the user derives from a policy — which is what we ultimately aim to optimize — is scalar, it is sometimes impossible, undesirable or...
Conference Paper
Full-text available
Value-based approaches to reinforcement learning (RL) maintain a value function that measures the long term utility of a state or state-action pair. A long standing issue in RL is how to create a finite representation in a continuous, and therefore infinite, state environment. The common approach is to use function approximators such as tile coding...
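For readers unfamiliar with tile coding, the fixed function approximator mentioned above overlays several offset grids on the continuous state and activates one tile per grid; a minimal sketch with arbitrary sizes and offsets follows.

```python
import numpy as np

# Minimal tile coding over a scalar state in [low, high]: each of n_tilings
# offset grids contributes exactly one active feature (tile index).
def tile_indices(x, low=0.0, high=1.0, n_tilings=4, tiles_per_dim=8):
    indices = []
    width = (high - low) / tiles_per_dim
    for t in range(n_tilings):
        offset = t * width / n_tilings          # each tiling is shifted slightly
        i = int((x - low + offset) / width)
        i = min(i, tiles_per_dim)               # allow the overflow tile from the offset
        indices.append(t * (tiles_per_dim + 1) + i)
    return indices

weights = np.zeros(4 * 9)                        # one weight per tile
active = tile_indices(0.37)
value_estimate = weights[active].sum()           # linear value estimate
print(active, value_estimate)
```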
Article
Full-text available
Sequential decision-making problems with multiple objectives arise naturally in practice and pose unique challenges for research in decision-theoretic planning and learning, which has largely focused on single-objective settings. This article surveys algorithms designed for sequential decision-making problems with multiple objectives. Though there...
Article
Full-text available
Authorship Analysis aims to extract information about the authorship of documents from features within those documents. Typically, this is performed as a classification task with the aim of identifying the author of a document, given a set of documents of known authorship. Alternatively, unsupervised methods have been developed primarily as visuali...
Article
Aliases play an important role in online environments by facilitating anonymity, but also can be used to hide the identity of cybercriminals. Previous studies have investigated this alias matching problem in an attempt to identify whether two aliases are shared by an author, which can assist with identifying users. Those studies create their traini...
Article
Our approach to the author identification task uses existing authorship attribution methods using local n-grams (LNG) and performs a weighted ensemble. This approach came in third for this year's competition, using a relatively simple scheme of weights by training set accuracy. LNG models create profiles, consisting of a list of character n-grams t...
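A small sketch of the character n-gram profiling idea described above: build a profile of each author's most frequent character n-grams and score an unknown document by how closely its profile matches. The squared-difference dissimilarity below is a simplified stand-in for the measure actually used by the LNG method.

```python
from collections import Counter

def profile(text: str, n: int = 3, size: int = 200) -> dict:
    """Relative frequencies of the `size` most common character n-grams."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.most_common(size)}

def dissimilarity(p1: dict, p2: dict) -> float:
    """Simplified profile distance: sum of squared frequency differences."""
    keys = set(p1) | set(p2)
    return sum((p1.get(k, 0.0) - p2.get(k, 0.0)) ** 2 for k in keys)

author_profile = profile("the quick brown fox jumps over the lazy dog " * 20)
unknown_profile = profile("the quick brown fox leaps over a sleepy dog " * 20)
print(dissimilarity(author_profile, unknown_profile))
```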
Article
Full-text available
Authorship analysis on phishing websites enables the investigation of phishing attacks, beyond basic analysis. In authorship analysis, salient features from documents are used to determine properties about the author, such as which of a set of candidate authors wrote a given document. In unsupervised authorship analysis, the aim is to group documen...
Conference Paper
Full-text available
Most commercial Fraud Detection components of Internet banking systems use some kind of hybrid setup usually comprising a Rule-Base and an Artificial Neural Network. Such rule bases have been criticised for a lack of innovation in their approach to Knowledge Acquisition and maintenance. Furthermore, the systems are brittle; they have no way of know...
Conference Paper
Rated Multiple Classification Ripple Down Rules (RM) and Ripple Down Models (RDM) are two of the successful prudent RDR approaches published. To date, there has not been a published, dedicated comparison of the two. This paper presents a systematic preliminary evaluation and analysis of the two techniques. The tests and results reported in this pap...
Conference Paper
Full-text available
It is well known that classification models produced by the Ripple Down Rules are easier to maintain and update. They are compact and can provide an explanation of their reasoning making them easy to understand for medical practitioners. This article is devoted to an empirical investigation and comparison of several ensemble methods based on Rippl...
Article
Unsupervised Authorship Analysis (UAA) aims to cluster documents by authorship without knowing the authorship of any documents. An important factor in UAA is the method for calculating the distance between documents. This choice of the authorship distance method is considered more critical to the end result than the choice of cluster analysis algor...
Article
Authorship attribution methods aim to determine the author of a document, by using information gathered from a set of documents with known authors. One method of performing this task is to create profiles containing distinctive features known to be used by each author. In this paper, a new method of creating an author or document profile is present...
Article
Full-text available
Wireless sensor networks (WSNs) are widely used in battle fields, logistic applications, healthcare, habitat monitoring, environmental monitoring, home security, and variety of other areas. The existing routing algorithms focus on the delivery of data packets to the sink using the shortest path; however, calculating the shortest path is not a cost-...
Article
BitTorrent is a widely used protocol for peer-to-peer (P2P) file sharing, including material which is often suspected to be infringing content. However, little systematic research has been undertaken to measure the true extent of illegal file sharing. In this paper, we propose a new methodology for measuring the extent of infringing co...
Article
Prudence analysis (PA) is a relatively new, practical and highly innovative approach to solving the problem of brittleness in knowledge based system (KBS) development. PA is essentially an online validation approach where, as each situation or case is presented to the KBS for inferencing, the result is simultaneously validated. Therefore, instead of...
Article
Full-text available
While a number of algorithms for multiobjective reinforcement learning have been proposed, and a small number of applications developed, there has been very little rigorous empirical evaluation of the performance and limitations of these algorithms. This paper proposes standard methods for such empirical evaluation, to act as a foundation for futur...
Conference Paper
Full-text available
Wireless sensor devices are used for monitoring patients with serious medical conditions. Communication of content-sensitive and context sensitive datasets is crucial for the survival of patients so that informed decisions can be made. The main limitation of sensor devices is that they work on a fixed threshold to notify the relevant Healthcare Pro...
Conference Paper
Full-text available
Phishing fraudsters attempt to create an environment which looks and feels like a legitimate institution, while at the same time attempting to bypass filters and suspicions of their targets. This is a difficult compromise for the phishers and presents a weakness in the process of conducting this fraud. In this research, a methodology is presented t...
Conference Paper
Ripple Down Rules (RDR) is a maturing collection of methodologies for the incremental development and maintenance of medium to large rule-based knowledge systems. While earlier knowledge based systems relied on extensive modeling and knowledge engineering, RDR instead takes a simple no-model approach that merges the development and maintenance stag...
Conference Paper
This article investigates internet commerce security applications of a novel combined method, which uses unsupervised consensus clustering algorithms in combination with supervised classification methods. First, a variety of independent clustering algorithms are applied to a randomized sample of data. Second, several consensus functions and sophist...
Article
Full-text available
Authorship attribution is a growing field, moving from beginnings in linguistics to recent advances in text mining. Through this change came an increase in the capability of authorship attribution methods both in their accuracy and the ability to consider more difficult problems. Research into authorship attribution in the 19th century considered i...
Article
The highly sophisticated and rapidly evolving area of internet commerce security presents many novel challenges for the organization of discourse in reasoning communities. This chapter suggests appropriate reasoning methods and demonstrates how establishing reasoning communities of security experts and enabling productive group discourse among them...
Conference Paper
Full-text available
Multiobjective reinforcement learning algorithms extend reinforcement learning techniques to problems with multiple conflicting objectives. This paper discusses the advantages gained from applying stochastic policies to multiobjective tasks and examines a particular form of stochastic policy known as a mixture policy. Two methods are proposed for d...
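A mixture policy of the kind examined here can be illustrated very simply: choose one deterministic base policy per episode according to mixing probabilities, so the long-run average return interpolates between the base policies' returns. The base policies and return vectors below are stand-ins for illustration.

```python
import numpy as np

# Mixture policy over two deterministic base policies in a two-objective task.
rng = np.random.default_rng(0)
base_returns = np.array([[10.0, 0.0],    # policy A: strong on objective 0
                         [ 0.0, 8.0]])   # policy B: strong on objective 1
mix = np.array([0.25, 0.75])             # probability of running A or B per episode

episodes = rng.choice(len(mix), size=10_000, p=mix)
average_return = base_returns[episodes].mean(axis=0)
print(average_return)                    # close to 0.25*[10, 0] + 0.75*[0, 8]
```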
Article
Full-text available
Optimization of multiple classifiers is an important problem in data mining. We introduce additional structure on the class sets of the classifiers using string rewriting systems with a convenient matrix representation. The aim of the present paper is to develop an efficient algorithm for the optimization of the number of errors of individual class...
Conference Paper
Full-text available
Increasingly, researchers and developers of knowledge based systems (KBS) have been incorporating the notion of context. For instance, Repertory Grids, Formal Concept Analysis (FCA) and Ripple-Down Rules (RDR) all integrate either implicit or explicit contextual information. However, these methodologies treat context as a static entity, neglecting...
Conference Paper
Prudence analysis (PA) is a relatively new, practical and highly innovative approach to solving the problem of brittleness in knowledge based systems (KBS). PA is essentially an online validation approach where, as each situation or case is presented to the KBS for inferencing, the result is simultaneously validated. This paper introduces a new appr...
Conference Paper
Full-text available
Multiobjective reinforcement learning (MORL) extends RL to problems with multiple conflicting objectives. This paper argues for designing MORL systems to produce a set of solutions approximating the Pareto front, and shows that the common MORL technique of scalarisation has fundamental limitations when used to find Pareto-optimal policies. The work...
Conference Paper
Many researchers and developers of knowledge based systems (KBS) have been incorporating the notion of context. However, they generally treat context as a static entity, neglecting many connectionists’ work in learning hidden and dynamic contexts, which aids generalization. This paper presents a method that models hidden context within a symbolic d...
