A hierarchical learning architecture with multiple-goal representations based on adaptive dynamic programming

Conference Paper · May 2010
DOI: 10.1109/ICNSC.2010.5461483 · Source: IEEE Xplore
Conference: Networking, Sensing and Control (ICNSC), 2010 International Conference on
In this paper we propose a hierarchical learning architecture with multiple-goal representations based on adaptive dynamic programming (ADP). The key idea of this architecture is to integrate a reference network that provides an internal reinforcement representation (a secondary reinforcement signal) to interact with the operation of the learning system. This reference network plays an important role in building the internal goal representations. Furthermore, motivated by recent research in neurobiology and psychology, the proposed ADP architecture can be designed hierarchically, so that different levels of internal reinforcement signals can be developed to represent multi-level goals for the intelligent system. The detailed system-level architecture, the learning and adaptation principles, and simulation results are presented to demonstrate the effectiveness of the proposed architecture.
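The signal flow described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the single-neuron "networks", the class and function names (`ThreeNetworkSketch`, `temporal_errors`, `hierarchical_signals`), the input wiring, and the temporal-difference error forms are all assumptions made for illustration, since the abstract gives none of these details. The sketch only shows where a reference network could sit between the primary reinforcement r and the critic's value estimate J, and how reference networks might be stacked.

```python
import math
import random


def neuron(weights, inputs):
    """One tanh unit standing in for a full network (illustrative only)."""
    return math.tanh(sum(w * v for w, v in zip(weights, inputs)))


class ThreeNetworkSketch:
    """Hypothetical action / reference / critic wiring, not the paper's design."""

    def __init__(self, state_dim, seed=0):
        rng = random.Random(seed)

        def init(n):
            return [rng.uniform(-0.5, 0.5) for _ in range(n)]

        self.w_action = init(state_dim)         # state -> control u
        self.w_reference = init(state_dim + 1)  # (state, u) -> internal signal s
        self.w_critic = init(state_dim + 2)     # (state, u, s) -> value estimate J

    def forward(self, state):
        u = neuron(self.w_action, state)
        s = neuron(self.w_reference, state + [u])
        j = neuron(self.w_critic, state + [u, s])
        return u, s, j


def temporal_errors(j_now, j_prev, s_now, r_now, discount=0.95):
    """Assumed TD-style errors: the reference network is driven by the
    primary reinforcement r, the critic by the internal signal s."""
    reference_error = discount * j_now - (j_prev - r_now)
    critic_error = discount * j_now - (j_prev - s_now)
    return reference_error, critic_error


def hierarchical_signals(level_weights, state, u):
    """Assumed stacking: each level sees (state, u) plus the signal from the
    level before it; the first level sees only (state, u)."""
    signals = []
    for weights in level_weights:
        inputs = state + [u] + signals[-1:]
        signals.append(neuron(weights, inputs))
    return signals


if __name__ == "__main__":
    net = ThreeNetworkSketch(state_dim=2)
    u, s, j = net.forward([0.1, -0.2])
    print("u, s, J:", u, s, j)
    print("errors:", temporal_errors(j_now=j, j_prev=0.0, s_now=s, r_now=0.0))
```

In this sketch the critic never sees r directly; it is trained against s, which is the sense in which the reference network supplies a secondary reinforcement signal. Whether the paper uses exactly this wiring or these error definitions is not stated in the abstract.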
    • "A multiagent system (MAS) is a system that has multiple interacting autonomous agents, and there are increasing numbers of application domains that are more suitable to be solved by a multiagent system instead of a centralized single agent [88], [100]. MORL in multiagent systems is a very important research topic, due to the multiobjective nature of many practical multiagent systems. "
    ABSTRACT: Reinforcement learning (RL) is a powerful paradigm for sequential decision-making under uncertainties, and most RL algorithms aim to maximize some numerical value which represents only one long-term objective. However, multiple long-term objectives are exhibited in many real-world decision and control systems, so recently there has been growing interest in solving multiobjective reinforcement learning (MORL) problems where there are multiple conflicting objectives. The aim of this paper is to present a comprehensive overview of MORL. The basic architecture, research topics, and naïve solutions of MORL are introduced at first. Then, several representative MORL approaches and some important directions of recent research are comprehensively reviewed. The relationships between MORL and other related research are also discussed, which include multiobjective optimization, hierarchical RL, and multiagent RL. Moreover, research challenges and open problems of MORL techniques are suggested.
    Article · Mar 2015
  • ABSTRACT: The development of an intelligent electric grid of the future, a smart grid, has recently attracted significant attention from academia, industry, and government. Among the many efforts toward this objective, computational intelligence research could provide important technical support to help society accomplish this goal. In this paper, I present a high-level discussion of the vision of the smart grid and of how computational intelligence research can provide critical technical support to this vision. Specifically, wide-area situational awareness and adaptive dynamic programming (ADP) based intelligent control are used as two examples to illustrate the potential technical contributions of computational intelligence research toward the long-term objective of a smart grid. Numerous recent smart grid activities, as well as future challenges and opportunities in this field, are also highlighted and discussed.
    Conference Paper · Aug 2010
  • ABSTRACT: In this paper, we investigate the application of adaptive dynamic programming (ADP) to a real industrial control problem. Our focus includes two aspects. First, we consider the multiple-input and multiple-output (MIMO) ADP design for online learning and control. Specifically, we consider an action network with multiple outputs as the control signals sent to the system under control, which makes this approach more applicable to real engineering problems with multiple control variables. Second, we apply this approach to a real industrial application: controlling the tension and height of the looper system in a hot strip mill. Our intention is to demonstrate the adaptive learning and control performance of ADP on such a real system. Our results demonstrate the effectiveness of this approach.
    Conference Paper · May 2011 · IEEE Transactions on Systems, Man, and Cybernetics: Systems