-
ABSTRACT: We propose using kernel dimension reduction (KDR) to extract a low-dimensional feature space for humanoid locomotion tasks. Although humanoids have many degrees of freedom, a task-relevant feature space can have far fewer dimensions than the original state space. We apply the proposed approach to improve the locomotive performance of humanoid robots using an extracted low-dimensional state space. To improve the locomotive performance, we use a reinforcement learning (RL) framework. While RL is a useful non-linear optimizer, it is usually difficult to apply to real robotic systems because of the large number of iterations required to acquire suitable policies. In this study, we use the extracted low-dimensional feature space for RL so that the learning system can improve task performance quickly. KDR allows us to extract the feature space even if the task-relevant mapping is non-linear. This property is essential for improving humanoid locomotive performance, since stepping and walking involve highly non-linear dynamics. We show that we can improve stepping and walking policies by applying an RL method to a feature space extracted with KDR.
Robotics and Automation, 2008. ICRA 2008. IEEE International Conference on; 06/2008
-
ABSTRACT: We propose to improve the locomotive performance of humanoid robots by using approximated biped stepping and walking dynamics with reinforcement learning (RL). Although RL is a useful non-linear optimizer, it is usually difficult to apply to real robotic systems because of the large number of iterations required to acquire suitable policies. In this study, we first approximated the dynamics using data from a real robot, and then used the estimated dynamics in RL to improve stepping and walking policies. Gaussian processes were used to approximate the dynamics. With Gaussian processes, we could estimate a probability distribution over a target function given a covariance function. Thus, RL can take the uncertainty of the approximated dynamics into account throughout the learning process. We show that we can improve stepping and walking policies by using an RL method with the approximated models in both simulated and real environments. Experimental validation on a real humanoid robot of the proposed
Intelligent Robots and Systems, 2007. IROS 2007. IEEE/RSJ International Conference on; 12/2007
-
ABSTRACT: We propose a model-based reinforcement learning (RL) algorithm for biped walking in which the robot learns to appropriately modulate an observed walking pattern. Via-points are detected from the observed walking trajectories using the minimum jerk criterion. The learning algorithm controls the via-points based on a learned model of the Poincaré map of the periodic walking pattern. The model maps from a state in the single support phase and the controlled via-points to a state in the next single support phase. We applied this approach to both a simulated robot model and an actual biped robot. We show that successful walking policies were acquired.
IEEE Robotics & Automation Magazine 07/2007; · 1.99 Impact Factor
-
ABSTRACT: We propose a model-based reinforcement learning algorithm for biped walking in which the robot learns to appropriately modulate an observed walking pattern. Via-points are detected from the observed walking trajectories using the minimum jerk criterion. The learning algorithm modulates the via-points as control actions to improve walking trajectories. This decision is based on a learned model of the Poincaré map of the periodic walking pattern. The model maps from a state in the single support phase and the control actions to a state in the next single support phase. We applied this approach to both a simulated robot model and an actual biped robot. We show that successful walking policies are acquired.
Robotics and Automation, 2005. ICRA 2005. Proceedings of the 2005 IEEE International Conference on; 05/2005
-
ABSTRACT: We explore the use of computational optimal control techniques for automated construction of policies in complex dynamic environments. Our implementation of dynamic programming is performed in a reduced-dimensional subspace of a simulated four-DOF biped robot with point feet. We show that a solution to this problem can be computed and that it yields empirically stable walking that can handle various types of disturbances.
Robotics and Automation, 2005. ICRA 2005. Proceedings of the 2005 IEEE International Conference on; 05/2005
-
ABSTRACT: We have developed a robust control policy design method for high-dimensional state spaces by using differential dynamic programming with a minimax criterion. As an example, we applied our method to a simulated five-link biped robot. The results show lower joint torques with the optimal control policy than with a hand-tuned PD servo controller. Results also show that the simulated biped robot can successfully walk under unknown disturbances that cause controllers generated by standard differential dynamic programming and the hand-tuned PD servo to fail. Learning to compensate for modeling error and previously unknown disturbances in conjunction with robust control design is also demonstrated. We applied the proposed method to a real biped robot to optimize swing leg trajectories.
Intelligent Robots and Systems, 2003. (IROS 2003). Proceedings. 2003 IEEE/RSJ International Conference on; 11/2003
-
ABSTRACT: We propose a model-based reinforcement learning algorithm for biped walking in which the robot learns to appropriately place the swing leg. This decision is based on a learned model of the Poincaré map of the periodic walking pattern. The model maps from a state at the middle of a step and the foot placement to a state at the middle of the next step. We also modify the desired walking cycle frequency based on online measurements. We present simulation results, and are currently implementing this approach on an actual biped robot.
Robotics and Automation, 2004. Proceedings. ICRA '04. 2004 IEEE International Conference on;