Figure 2 - uploaded by Astrid Jackson
Process of extracting the successor states by utilizing the transition case base. (a) Retrieve similar states ($s_1, \ldots, s_n$) via kNN. (b) Determine the actions previously performed in these states. (c) Reuse by identifying similar actions ($a_1, \ldots, a_k$) using cosine similarity. (d) Select $s_{i+1}$ as possible successor states.


Source publication
Conference Paper
Reinforcement learning (RL) is a popular choice for solving robotic control problems. However, applying RL techniques to controlling humanoid robots with high degrees of freedom remains problematic due to the difficulty of acquiring sufficient training data. The problem is compounded by the fact that most real-world problems involve continuous stat...

Contexts in source publication

Context 1
... process is detailed in Figure 3: Estimation of the state transition probabilities for a given state and a given action. The forward algorithm of the HMM is used to calculate the probability for each one-step sequence $(s_i, s_{i+1})$ that was identified using the case base (Figure 2(d)). Algorithm 1. ...
Context 2
... the potential successor states must be identified. This is accomplished by performing case retrieval on the case base, which collects the states similar to $s_i$ into the set $C_S$ (see Figure 2(a)). Depending on the domain, the retrieval method is either a k-nearest neighbor or a radius-based neighbor algorithm using the Euclidean distance as the metric. ...
Context 3
... on the domain, the retrieval method is either a k-nearest neighbor or a radius-based neighbor algorithm using the Euclidean distance as the metric. The retrieved cases are then filtered into the set $C_A$ by the reuse stage (see Figure 2(b) and 2(c)). $C_A$ consists of all those cases $c_k = (s_k, a_k, \Delta s_k) \in C_S$ whose actions $a_k$ are cosine similar to $a_i$, which is the case if $d_{\cos}(a_k, a_i) \geq \rho$. ...
Context 4
... A consists of all those cases $c_k = (s_k, a_k, \Delta s_k) \in C_S$ whose actions $a_k$ are cosine similar to $a_i$, which is the case if $d_{\cos}(a_k, a_i) \geq \rho$. At this point all successor states $s_{i+1}$ can be predicted by the vector addition $s_i + \Delta s_k$ (see Figure 2(d)). ...
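
The four steps of Figure 2 can be sketched in a few lines of Python. This is only an illustration assembled from the excerpts above, not the authors' implementation: the function name `successor_states`, the case layout as `(s_k, a_k, delta_s_k)` tuples, the defaults for `k` and `rho`, and the toy case base are all assumptions, and only the k-nearest-neighbor variant of retrieval is shown (the radius-based variant is omitted).

```python
import numpy as np

def successor_states(s_i, a_i, cases, k=3, rho=0.9):
    """Predict candidate successor states for (s_i, a_i).

    cases: list of hypothetical (s_k, a_k, delta_s_k) tuples (the case base).
    k:     number of neighbors to retrieve (assumed default).
    rho:   cosine-similarity threshold (assumed default).
    """
    s_i = np.asarray(s_i, dtype=float)
    a_i = np.asarray(a_i, dtype=float)
    states = np.array([c[0] for c in cases], dtype=float)
    actions = np.array([c[1] for c in cases], dtype=float)
    deltas = np.array([c[2] for c in cases], dtype=float)

    # (a) Retrieve: the k cases whose states are closest to s_i
    #     under the Euclidean distance; their indices form C_S.
    dists = np.linalg.norm(states - s_i, axis=1)
    c_s = np.argsort(dists, kind="stable")[:k]

    # (b) Determine the actions previously performed in those states.
    a_sel = actions[c_s]

    # (c) Reuse: keep the cases whose action is cosine similar to a_i,
    #     i.e. d_cos(a_k, a_i) >= rho; their indices form C_A.
    norms = np.linalg.norm(a_sel, axis=1) * np.linalg.norm(a_i)
    cos = np.divide(a_sel @ a_i, norms,
                    out=np.zeros(len(c_s)), where=norms > 0)
    c_a = c_s[cos >= rho]

    # (d) Predict the successors by vector addition s_i + delta_s_k.
    return s_i + deltas[c_a]

# Toy case base (made-up values, for illustration only).
cases = [
    ([0.0, 0.0], [1.0, 0.0], [1.0, 0.0]),
    ([0.1, 0.0], [0.0, 1.0], [0.0, 1.0]),   # near state, dissimilar action
    ([0.0, 0.1], [1.0, 0.1], [1.0, 0.2]),
    ([5.0, 5.0], [1.0, 0.0], [1.0, 0.0]),   # similar action, far state
]
successors = successor_states([0.0, 0.0], [1.0, 0.0], cases, k=3, rho=0.9)
```

In this toy run the second case is dropped by the cosine filter and the fourth by the neighbor retrieval, so two candidate successor states remain. Zero-length actions are given a similarity of 0 here, which is a choice made for the sketch rather than something stated in the excerpts.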
