A hierarchical learning architecture with multiple-goal representations based on adaptive dynamic programming
Dept. of Electr., Comput., & Biomed. Eng., Univ. of Rhode Island, Kingston, RI, USA
DOI: 10.1109/ICNSC.2010.5461483
Conference: Networking, Sensing and Control (ICNSC), 2010 International Conference on
Source: IEEE Xplore
In this paper we propose a hierarchical learning architecture with multiple-goal representations based on adaptive dynamic programming (ADP). The key idea of this architecture is to integrate a reference network that provides an internal reinforcement representation (a secondary reinforcement signal) to interact with the operation of the learning system. This reference network plays an important role in building the internal goal representations. Furthermore, motivated by recent neurobiological and psychological research, the proposed ADP architecture can be designed in a hierarchical way, in which different levels of internal reinforcement signals are developed to represent multi-level goals for the intelligent system. The detailed system-level architecture, the learning and adaptation principles, and simulation results are presented to demonstrate the effectiveness of this approach.
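The signal flow described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the paper's implementation: the network sizes, the `tanh` units, the helper names (`make_mlp`, `forward`, `step`, `td_errors`), the discount factor `alpha`, and in particular the two error expressions, which follow a common ADP temporal-difference form and are not given in the abstract. The sketch covers only a forward pass and the error signals; no weight update is shown.

```python
import math
import random

def make_mlp(n_in, n_hidden, rng):
    """Random weights for a one-hidden-layer network with a single output."""
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return w1, w2

def forward(mlp, inputs):
    """Forward pass with tanh hidden units and a tanh output in [-1, 1]."""
    w1, w2 = mlp
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs))) for row in w1]
    return math.tanh(sum(w * h for w, h in zip(w2, hidden)))

def step(state, action_net, reference_net, critic_net):
    """One forward pass through the three networks (hypothetical wiring)."""
    u = forward(action_net, state)             # control action
    s = forward(reference_net, state + [u])    # internal (secondary) reinforcement
    j = forward(critic_net, state + [u, s])    # estimated cost-to-go
    return u, s, j

def td_errors(r, s_prev, s_now, j_prev, j_now, alpha=0.95):
    """Assumed error signals: the reference network tracks the external
    reinforcement r, and the critic tracks the internal signal s instead of r."""
    e_ref = alpha * s_now - (s_prev - r)
    e_critic = alpha * j_now - (j_prev - s_now)
    return e_ref, e_critic

rng = random.Random(0)
n_state = 4                                    # illustrative state dimension
action_net = make_mlp(n_state, 6, rng)
reference_net = make_mlp(n_state + 1, 6, rng)
critic_net = make_mlp(n_state + 2, 6, rng)

state = [0.1, -0.2, 0.05, 0.0]
u, s, j = step(state, action_net, reference_net, critic_net)
```

The point of the wiring is that the critic never sees the external reinforcement directly; it learns from the reference network's output. A hierarchical variant would stack further reference networks, each feeding its signal to the next level.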
ABSTRACT: The development of an intelligent electric grid of the future, the smart grid, has recently attracted a significant amount of attention from academia, industry, and government. Among the many efforts toward this objective, computational intelligence research could provide important technical support. In this paper, I present a high-level discussion of the smart grid vision and of how computational intelligence research can provide critical technical support for it. Specifically, wide-area situational awareness and adaptive dynamic programming (ADP) based intelligent control are used as two examples to illustrate the potential technical contributions of computational intelligence research toward the long-term objective of a smart grid. Numerous recent smart grid activities, as well as future challenges and opportunities in this field, are also highlighted and discussed.
ABSTRACT: In this paper, we investigate the application of adaptive dynamic programming (ADP) to a real industrial control problem. Our focus includes two aspects. First, we consider a multiple-input, multiple-output (MIMO) ADP design for online learning and control. Specifically, we consider an action network with multiple outputs that serve as control signals sent to the system under control, which makes the approach more applicable to real engineering problems with multiple control variables. Second, we apply this approach to a real industrial application: controlling the tension and height of the looper system in a hot strip mill. Our intention is to demonstrate the adaptive learning and control performance of ADP on such a real system. Our results demonstrate the effectiveness of this approach.
Conference Paper: Adaptive dynamic programming with balanced weights seeking strategy
ABSTRACT: In this paper we propose to integrate the recursive Levenberg-Marquardt (LM) method into the adaptive dynamic programming (ADP) design for improved learning and adaptive control performance. Our key motivation is a balanced weight-updating strategy that considers both robustness and convergence during the online learning process. Specifically, a modified recursive LM method is integrated into both the action network and the critic network of the ADP design, and a detailed learning algorithm is proposed to implement this approach. We test our approach on the triple-link inverted pendulum, a popular benchmark in the community, to demonstrate the online learning and control strategy. Experimental results and a comparative study under different noise conditions demonstrate the effectiveness of this approach.