Fig. 3: Results of running in the POMDP setting with pseudoset size 100 and relearning gap 100; the left plot shows a single run and the right plot the average over 10 runs.
Source publication
Catastrophic forgetting is of special importance in reinforcement learning, as the data distribution is generally non-stationary over time. We study and compare several pseudorehearsal approaches for Q-learning with function approximation in a pole balancing task. We have found that pseudorehearsal seems to assist learning even in such very simple...
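For readers unfamiliar with the technique, the sketch below illustrates the pseudorehearsal idea referenced throughout this page. It is a minimal, hypothetical C++ example (Eigen assumed available; the class and function names are illustrative, and a linear approximator stands in for the paper's neural network): a pseudo-item pairs a random input with the network's current output for that input, and replaying such pairs while training on new data discourages the approximator from overwriting what it previously learned.

```cpp
// Minimal sketch (illustrative names, not the paper's code) of pseudorehearsal
// for a linear Q-value approximator using Eigen.
#include <Eigen/Dense>
#include <cstddef>
#include <random>
#include <vector>

struct PseudoItem {
    Eigen::VectorXd input;   // randomly generated state-like vector
    Eigen::VectorXd target;  // Q-values the network produced for it at creation time
};

class LinearQ {
public:
    LinearQ(int stateDim, int nActions) : W(Eigen::MatrixXd::Zero(nActions, stateDim)) {}

    Eigen::VectorXd qValues(const Eigen::VectorXd& s) const { return W * s; }

    // One gradient step towards a full vector of target Q-values.
    void fitTowards(const Eigen::VectorXd& s, const Eigen::VectorXd& target, double lr) {
        Eigen::VectorXd err = target - W * s;   // per-action error
        W += lr * err * s.transpose();          // outer-product update
    }

    Eigen::MatrixXd W;
};

// Build a pseudo-set by probing the current approximator with random inputs.
std::vector<PseudoItem> makePseudoSet(const LinearQ& q, int size, int stateDim,
                                      std::mt19937& rng) {
    std::uniform_real_distribution<double> u(-1.0, 1.0);
    std::vector<PseudoItem> set;
    for (int i = 0; i < size; ++i) {
        Eigen::VectorXd s(stateDim);
        for (int d = 0; d < stateDim; ++d) s(d) = u(rng);
        set.push_back({s, q.qValues(s)});
    }
    return set;
}

// After each real Q-learning update, replay a few pseudo-items so the new
// update does not silently overwrite the network's earlier responses.
void rehearse(LinearQ& q, const std::vector<PseudoItem>& set, int nReplays,
              double lr, std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, set.size() - 1);
    for (int i = 0; i < nReplays; ++i) {
        const PseudoItem& item = set[pick(rng)];
        q.fitTowards(item.input, item.target, lr);
    }
}
```

In a configuration like the one in Fig. 3, the pseudo-set would hold 100 such items; the relearning gap of 100 presumably controls how often the set is rebuilt from the current network, though the exact schedule is a choice made in the paper itself.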
Similar publications
Catastrophic forgetting has a serious impact in reinforcement learning, as the data distribution is generally sparse and non-stationary over time. The purpose of this study is to investigate whether pseudorehearsal can increase performance of an actor-critic agent with neural-network based policy selection and function approximation in a pole balan...
Catastrophic forgetting has a significant negative impact on reinforcement learning. The purpose of this study is to investigate how pseudorehearsal can change the performance of an actor-critic agent with neural-network function approximation. We tested the agent in a pole balancing task and compared different pseudorehearsal approaches. We have found tha...
Citations
... PR is a simple and computationally efficient method for solving the CF problem, which has proven successful in unsupervised learning [17], supervised learning problems [21], [16] and sometimes in reinforcement learning as well [22], [14], [23]. It is interesting to note that the results of Baddeley suggest that the widely studied ill conditioning might not be the main bottleneck of reinforcement learning, while CF may be. ...
Catastrophic forgetting has a significant negative impact on reinforcement learning. The purpose of this study is to investigate how pseudorehearsal can change the performance of an actor-critic agent with neural-network function approximation. We tested the agent in a pole balancing task and compared different pseudorehearsal approaches. We have found that pseudorehearsal can assist learning and decrease forgetting.
... We have shown that pseudorehearsal can significantly improve performance in Q-learning algorithms [1], and now we want to test it on the more interesting and complex actor-critic algorithm. Actor-critic methods are a family of reinforcement learning algorithms based on TD-learning. ...
Catastrophic forgetting has a serious impact in reinforcement learning, as the data distribution is generally sparse and non-stationary over time. The purpose of this study is to investigate whether pseudorehearsal can increase the performance of an actor-critic agent with neural-network-based policy selection and function approximation in a pole balancing task, and to compare different pseudorehearsal approaches. We expect that pseudorehearsal assists learning even in such very simple problems, given proper initialization of the rehearsal parameters.
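To make the cited setup more concrete, here is a condensed, hypothetical C++ sketch of a TD(0) actor-critic update with a pseudorehearsal hook on the critic. Linear approximators and softmax action preferences are used only to keep the example short; the cited work uses neural networks for both policy selection and value estimation, and the class and member names here are assumptions, not the authors' code.

```cpp
// Hypothetical sketch of a TD(0) actor-critic update with a pseudorehearsal hook.
#include <Eigen/Dense>

struct ActorCritic {
    Eigen::MatrixXd actorW;   // action preferences: nActions x stateDim
    Eigen::VectorXd criticW;  // state-value weights: stateDim

    ActorCritic(int stateDim, int nActions)
        : actorW(Eigen::MatrixXd::Zero(nActions, stateDim)),
          criticW(Eigen::VectorXd::Zero(stateDim)) {}

    double value(const Eigen::VectorXd& s) const { return criticW.dot(s); }

    // Softmax over linear action preferences.
    Eigen::VectorXd policy(const Eigen::VectorXd& s) const {
        Eigen::VectorXd pref = actorW * s;
        Eigen::VectorXd p = ((pref.array() - pref.maxCoeff()).exp()).matrix();
        return p / p.sum();
    }

    // One TD(0) step: delta = r + gamma * V(s') - V(s).
    void update(const Eigen::VectorXd& s, int a, double r,
                const Eigen::VectorXd& sNext, bool terminal,
                double gamma, double lrActor, double lrCritic) {
        double target = terminal ? r : r + gamma * value(sNext);
        double delta = target - value(s);
        criticW += lrCritic * delta * s;          // move critic towards the TD target
        Eigen::VectorXd grad = -policy(s);        // d log pi(a|s) / d preferences
        grad(a) += 1.0;
        actorW += lrActor * delta * grad * s.transpose();  // policy-gradient-style step
    }

    // Pseudorehearsal hook: pull the critic back towards the value it produced
    // earlier for a random pseudo-state (pseudoTarget was recorded at that time).
    void rehearseCritic(const Eigen::VectorXd& pseudoState, double pseudoTarget, double lr) {
        criticW += lr * (pseudoTarget - value(pseudoState)) * pseudoState;
    }
};
```

One natural axis of variation, whether the actor, the critic, or both are rehearsed, is shown here only for the critic; the abstract does not specify which variants were compared.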
... Mazzara has already shown that pseudorehearsal can significantly improve performance in Q-learning algorithms [1], and I now expect it to show a similar result for the more interesting and complex actor-critic algorithm. ...
... Pseudorehearsal is a simple and computationally efficient method for solving the catastrophic forgetting problem, which has proven successful in unsupervised learning [20], supervised learning problems [23], [18] and sometimes in reinforcement learning as well [1], [8], [24]. It is interesting to note that the results of Baddeley suggest that the widely studied ill conditioning might not be the main bottleneck of reinforcement learning, while catastrophic forgetting may be. ...
... All elements of the experiment (agent, environment, neural networks, and pseudorehearsal rules) are programmed in C++, using the external library Eigen for more convenient linear-algebra computations. In addition to the code necessary for the thesis, it contains some artifacts connected with the Q-learning agent, as it was first used for my previous paper [1] and was initially designed so that different reinforcement learning algorithms and environments could be implemented with minimal changes. ...
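The description above suggests a small set of abstract interfaces separating the environment from the learning algorithm. The sketch below is a guess at that kind of layout; the class names are illustrative rather than the thesis's actual code, and it only shows how a generic episode loop allows agents and environments to be swapped with minimal changes.

```cpp
// Hypothetical interface layout (illustrative names, not the thesis's code).
#include <Eigen/Dense>

struct Step {
    Eigen::VectorXd nextState;
    double reward;
    bool terminal;
};

class Environment {
public:
    virtual ~Environment() = default;
    virtual Eigen::VectorXd reset() = 0;     // start a new episode
    virtual Step step(int action) = 0;       // apply an action, observe the outcome
    virtual int actionCount() const = 0;
    virtual int stateDim() const = 0;
};

class Agent {
public:
    virtual ~Agent() = default;
    virtual int act(const Eigen::VectorXd& state) = 0;              // choose an action
    virtual void observe(const Eigen::VectorXd& state, int action,  // learn from one
                         const Step& outcome) = 0;                  // transition
};

// A generic episode loop then works for any (environment, agent) pair,
// whether the agent is Q-learning, actor-critic, or something else.
double runEpisode(Environment& env, Agent& agent, int maxSteps) {
    Eigen::VectorXd s = env.reset();
    double total = 0.0;
    for (int t = 0; t < maxSteps; ++t) {
        int a = agent.act(s);
        Step out = env.step(a);
        agent.observe(s, a, out);
        total += out.reward;
        if (out.terminal) break;
        s = out.nextState;
    }
    return total;
}
```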
Neural networks can achieve excellent results in a wide variety of applications. However, when they attempt to learn sequentially, they tend to learn the new task while catastrophically forgetting previous ones. We propose a model that overcomes catastrophic forgetting in sequential reinforcement learning by combining ideas from continual learning in both the image classification domain and the reinforcement learning domain. This model features a dual memory system which separates continual learning from reinforcement learning, and a pseudo-rehearsal system that “recalls” items representative of previous tasks via a deep generative network. Our model sequentially learns Atari 2600 games without demonstrating catastrophic forgetting and continues to perform above human level on all three games. This result is achieved without demanding additional storage as the number of tasks increases, storing raw data, or revisiting past tasks. In comparison, previous state-of-the-art solutions are substantially more vulnerable to forgetting on these complex deep reinforcement learning tasks.
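The pseudo-rehearsal component of such a model can be sketched as follows. The interfaces are assumptions standing in for the authors' deep generative network and policy network (no specific architecture is implied); only the mechanism is shown: sample a latent vector, generate a pseudo-state, label it with a frozen copy of the earlier network, and mix the resulting pair into the current task's training batch.

```cpp
// Illustrative sketch (assumed interfaces, not the authors' code) of generative
// pseudo-rehearsal for continual reinforcement learning.
#include <Eigen/Dense>
#include <random>
#include <vector>

class Generator {                 // stands in for a deep generative network
public:
    virtual ~Generator() = default;
    virtual Eigen::VectorXd generate(const Eigen::VectorXd& latent) const = 0;
    virtual int latentDim() const = 0;
};

class PolicyNet {                 // frozen snapshot trained on previous tasks
public:
    virtual ~PolicyNet() = default;
    virtual Eigen::VectorXd outputs(const Eigen::VectorXd& state) const = 0;
};

struct TrainingPair {
    Eigen::VectorXd state;
    Eigen::VectorXd target;
};

// "Recall" states from the generator and label them with what the old policy
// network would have done; the pairs are then interleaved with new-task data.
std::vector<TrainingPair> pseudoBatch(const Generator& gen, const PolicyNet& oldPolicy,
                                      int batchSize, std::mt19937& rng) {
    std::normal_distribution<double> gauss(0.0, 1.0);
    std::vector<TrainingPair> batch;
    for (int i = 0; i < batchSize; ++i) {
        Eigen::VectorXd z(gen.latentDim());
        for (int d = 0; d < z.size(); ++d) z(d) = gauss(rng);
        Eigen::VectorXd s = gen.generate(z);
        batch.push_back({s, oldPolicy.outputs(s)});
    }
    return batch;
}
```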