Source publication
Reinforcement learning refers to powerful algorithms for solving goal-related problems by maximizing the reward over many time steps. By incorporating them into dynamic movement primitives (DMPs), which are now a widely used parametric representation in robotics, movements obtained from a single human demonstration can be adapted so that a robot...
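To make the DMP representation mentioned in the abstract concrete, below is a minimal sketch of a one-dimensional discrete DMP in the standard Ijspeert-style formulation: the weights of a radial-basis forcing term are fit to a single demonstration, and the motion is then replayed toward a new goal. The gains, basis count, and the minimum-jerk demonstration are illustrative assumptions, not values from the paper.

```python
# Minimal one-dimensional discrete DMP sketch (assumed Ijspeert-style form;
# all gains and sizes below are illustrative choices).
import numpy as np

class DMP:
    def __init__(self, n_basis=20, alpha_z=25.0, beta_z=6.25, alpha_x=8.0):
        self.n_basis = n_basis
        self.alpha_z, self.beta_z, self.alpha_x = alpha_z, beta_z, alpha_x
        # Basis centers spaced along the canonical phase x in (0, 1].
        self.c = np.exp(-alpha_x * np.linspace(0, 1, n_basis))
        self.h = 1.0 / np.gradient(self.c) ** 2
        self.w = np.zeros(n_basis)  # shape parameters learned from the demo

    def _forcing(self, x, g, y0):
        psi = np.exp(-self.h * (x - self.c) ** 2)
        return (psi @ self.w) / (psi.sum() + 1e-10) * x * (g - y0)

    def fit(self, y_demo, dt):
        """Fit weights to one demonstrated 1-D trajectory by weighted
        regression on the target forcing term."""
        T = len(y_demo)
        tau = (T - 1) * dt
        y0, g = y_demo[0], y_demo[-1]
        yd = np.gradient(y_demo, dt)
        ydd = np.gradient(yd, dt)
        x = np.exp(-self.alpha_x * np.arange(T) * dt / tau)
        # Target forcing term, rearranged from the transformation system
        # tau^2*ydd = alpha_z*(beta_z*(g - y) - tau*yd) + f.
        f_target = tau**2 * ydd - self.alpha_z * (self.beta_z * (g - y_demo) - tau * yd)
        scale = x * (g - y0)
        for i in range(self.n_basis):
            psi = np.exp(-self.h[i] * (x - self.c[i]) ** 2)
            self.w[i] = (scale * psi) @ f_target / ((scale**2 * psi).sum() + 1e-10)
        self.y0, self.g, self.tau = y0, g, tau
        return self

    def rollout(self, dt, g=None):
        """Integrate the DMP, optionally toward a new goal g."""
        g = self.g if g is None else g
        y, z, x = self.y0, 0.0, 1.0
        Y = []
        for _ in range(int(self.tau / dt)):
            f = self._forcing(x, g, self.y0)
            zd = (self.alpha_z * (self.beta_z * (g - y) - z) + f) / self.tau
            z += zd * dt
            y += (z / self.tau) * dt
            x += -self.alpha_x * x / self.tau * dt
            Y.append(y)
        return np.array(Y)

# Example: learn from a smooth demonstration, then replay to a new goal.
dt = 0.01
t = np.linspace(0, 1, 101)
demo = 10 * t**3 - 15 * t**4 + 6 * t**5   # minimum-jerk-like 0 -> 1 motion
dmp = DMP().fit(demo, dt)
adapted = dmp.rollout(dt, g=1.5)           # same shape, adapted goal
```

Goal adaptation of this kind is the starting point for the RL-based refinement the abstract describes: the weights `w` become the search space that learning perturbs.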
Similar publications
The dynamic window approach (DWA) serves as a pivotal collision avoidance strategy for mobile robots, meticulously guiding a robot to its target while ensuring a safe distance from any perceivable obstacles in the vicinity. While the DWA has seen various enhancements and applications, its foundational computational process has predominantly remaine...
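As a rough illustration of the computational process this abstract refers to, the sketch below implements the classic DWA loop for a differential-drive robot: sample admissible velocities from the dynamic window, forward-simulate short trajectories, and score them by goal heading, obstacle clearance, and speed. The cost weights, limits, and helper names are assumptions for illustration, not details from the cited publication.

```python
# Minimal sketch of the classic dynamic window approach (DWA); all
# parameters here are illustrative, assumed values.
import numpy as np

def simulate(state, v, w, dt=0.1, horizon=2.0):
    """Forward-simulate (x, y, theta) under constant velocities (v, w)."""
    x, y, th = state
    traj = []
    for _ in range(int(horizon / dt)):
        th += w * dt
        x += v * np.cos(th) * dt
        y += v * np.sin(th) * dt
        traj.append((x, y, th))
    return np.array(traj)

def dwa_step(state, vel, goal, obstacles,
             v_max=1.0, w_max=2.0, a_v=0.5, a_w=2.0, dt=0.1):
    """Pick the best admissible (v, w) from the dynamic window."""
    v0, w0 = vel
    # Dynamic window: velocities reachable within one control cycle.
    vs = np.linspace(max(0.0, v0 - a_v * dt), min(v_max, v0 + a_v * dt), 11)
    ws = np.linspace(max(-w_max, w0 - a_w * dt), min(w_max, w0 + a_w * dt), 21)
    best, best_cost = (0.0, 0.0), np.inf
    for v in vs:
        for w in ws:
            traj = simulate(state, v, w)
            # Clearance: distance from the trajectory to the nearest obstacle.
            d = np.min(np.linalg.norm(
                traj[:, None, :2] - obstacles[None, :, :], axis=2))
            if d < 0.2:            # trajectory collides -> inadmissible
                continue
            heading = np.linalg.norm(traj[-1, :2] - goal)
            cost = 1.0 * heading + 0.3 / d + 0.1 * (v_max - v)
            if cost < best_cost:
                best, best_cost = (v, w), cost
    return best

# Example: one control step toward a goal with a single obstacle.
state = np.array([0.0, 0.0, 0.0])          # x, y, heading
v, w = dwa_step(state, vel=(0.5, 0.0),
                goal=np.array([3.0, 1.0]),
                obstacles=np.array([[1.5, 0.2]]))
```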
Citations
... One of the common methods for autonomous refinement of skills is reinforcement learning (RL), which offers a framework and a set of tools for the design of sophisticated and hard-to-engineer behaviors [24]. RL refers to powerful algorithms used for solving problems by maximizing a reward over many time steps [33]. The reward is maximized through iterative trials that explore different variations of the already learned knowledge (different parameters, also known as the search space) while observing the corresponding reward for each trial. ...
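The iterative trial-and-reward scheme described in this excerpt can be sketched as a simple episodic policy search: perturb the learned parameters with exploration noise, observe the reward of each variation, and update the parameters by reward-weighted averaging, in the spirit of PoWER-style methods. The toy reward function and hyper-parameters below are illustrative assumptions.

```python
# Minimal sketch of episodic, reward-weighted policy search over action
# parameters (e.g., DMP weights); the reward function is a toy stand-in.
import numpy as np

rng = np.random.default_rng(0)

def reward(theta):
    """Toy reward: higher as parameters approach an unknown optimum."""
    target = np.linspace(-1.0, 1.0, theta.size)
    return np.exp(-np.sum((theta - target) ** 2))

theta = np.zeros(10)        # parameters from a single demonstration
sigma = 0.3                 # exploration noise
for epoch in range(200):
    # Roll out several perturbed variations of the current parameters.
    eps = rng.normal(0.0, sigma, size=(8, theta.size))
    rewards = np.array([reward(theta + e) for e in eps])
    # Reward-weighted update: better trials pull the parameters harder.
    w = rewards / (rewards.sum() + 1e-10)
    theta = theta + w @ eps
    sigma *= 0.995          # slowly reduce exploration

print(f"final reward: {reward(theta):.3f}")
```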
In order to operate in everyday human environments, humanoid robots will have to autonomously adapt their actions, using, among other things, reinforcement learning methods. However, determining an appropriate reward function for reinforcement learning remains a complex problem even for domain experts. In this thesis we investigate the possibility of utilizing a simple, qualitatively determined reward, which enables the extension of current algorithms to include a human agent in the learning process while keeping their original formulation intact. Even with a working human reward system that can lead a robot to successful skill refinement, current RL methods are not appropriate for practical use on complex, high degree-of-freedom robots due to the high number of parameters that must be learned. Different methods of reducing the dimensionality have been proposed in the literature. The most suitable approach for our work is the use of special neural networks called autoencoders. They are capable of extracting only the important features of each action, thus reducing the dimensionality of the parameters that need to be found by reinforcement learning. As with all neural networks, the biggest obstacle to their wider use is obtaining a large enough database for learning. We test the required number of database samples and the network architecture that would enable us to achieve the desired precision of trajectories while still keeping the number of parameters low. We extend this analysis to real-world problems and examine the possibilities of extending the database without executing a huge number of actions on the real system. We use the generalization method for this purpose and inspect the influence of the error introduced by such methods.
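A minimal sketch of the autoencoder-based dimensionality reduction this abstract describes: a network compresses high-dimensional action parameters (e.g., DMP weights) into a small latent space, and RL then searches the latent space while the decoder maps each candidate back to executable parameters. The framework (PyTorch), layer sizes, and surrogate training data are assumptions for illustration, not the architecture evaluated in the thesis.

```python
# Illustrative autoencoder for compressing action parameters; sizes and
# training data are placeholders, not the thesis configuration.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, n_params=100, n_latent=5):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_params, 32), nn.Tanh(),
            nn.Linear(32, n_latent))
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, 32), nn.Tanh(),
            nn.Linear(32, n_params))

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Train on a database of action parameters (here: random surrogate data).
data = torch.randn(1000, 100)   # 1000 sample actions, 100 parameters each
model = Autoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(100):
    loss = nn.functional.mse_loss(model(data), data)
    opt.zero_grad()
    loss.backward()
    opt.step()

# RL then explores only the 5-D latent vector; the decoder maps each
# candidate back to full action parameters for execution on the robot.
latent = torch.zeros(1, 5)
candidate_params = model.decoder(latent)
```

This is also where the database-size question in the abstract bites: with too few training actions, the decoder reproduces trajectories imprecisely, which motivates extending the database by generalization rather than by executing many actions on the real system.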