Programming by Demonstration of Robot Manipulators
To my family
Örebro Studies in Technology 34
© Alexander Skoglund, 2009
Title: Programming by Demonstration of Robot Manipulators
Publisher: Örebro University 2009
Editor: Jesper Johanson
Printer: Intellecta Infolog, V Frölunda 05/2009
If a non-expert wants to program a robot manipulator, he or she needs a natural
interface that does not require rigorous robot programming skills.
Programming-by-demonstration (PbD) is an approach that enables the user to
program a robot by simply showing the robot how to perform a desired task. In
this approach, the robot recognizes what task it should perform and learns how
to perform it by imitating the teacher.
One fundamental problem in imitation learning arises from the fact that
embodied agents often have different morphologies. Thus, a direct skill transfer
from a human to a robot is not possible in the general case. Therefore, a
systematic approach to PbD is needed, one that takes the capabilities of the
robot into account, regarding both perception and body structure. In addition,
the robot should be able to learn from experience and improve over time. This
raises the question of how to determine the demonstrator’s goal or intentions.
It is shown that these can, to some degree, be inferred from multiple
demonstrations.
This thesis addresses the problem of generating a reach-to-grasp motion
that produces the same results as a human demonstration. It is also of interest
to learn what parts of a demonstration provide important information about
The major contribution is the investigation of a next-state-planner using
a fuzzy time-modeling approach to reproduce a human demonstration on a
robot. It is shown that the proposed planner can generate executable robot
trajectories based on a generalization of multiple human demonstrations. The
notion of hand-states is used as a common motion language between the human
and the robot. It allows the robot to interpret the human motions as its own,
and it also synchronizes reaching with grasping. Other contributions include
model-free learning of a human-to-robot mapping, and a demonstration of how an
imitation metric can be used for reinforcement learning of new robot skills.
The experimental part of this thesis presents the implementation of PbD
of pick-and-place tasks on different robotic hands/grippers. The different
platforms consist of manipulators and motion capturing devices.
Keywords: programming-by-demonstration, imitation learning, hand-state,
next-state-planner, fuzzy time-modeling approach.
First, I would like to thank my supervisors for their help, our discussions and,
most importantly, their scientific guidance. I thank my supervisor, Rainer Palm,
for the valuable and inspiring conversations, for teaching me about robotics,
and for introducing me to important methods. My assisting supervisor, Boyko
Iliev, has been a great source of knowledge and cooperation as well as
inspiration. I am very thankful to my supervisors for proofreading and
commenting on several versions of each chapter, and for correcting embarrassing
mistakes.
I would also like to thank Johan Tegin at the Royal Institute of Technology,
Stockholm, for his assistance in paper writing, valuable feedback, and general
discussions, and for providing access to the KTHand, which was a most useful
tool. I thank Jacopo Aleotti at Parma University for the collaboration we
had during his stay in Örebro in 2004. Dimitar Dimitrov should be acknowledged
for his outstanding knowledge in robotic simulation and control, and
for always being helpful and sharing his time. Without Krzysztof Charusta, my
final experiments would not have been possible; thank you! I am also happy to
have worked with Tom Duckett, Achim Lilienthal, Bourhane Kadmiry and Ivan
Kalaykov during my years as a Ph.D. student.
Our research engineers Per Sporrong and Bo-Lennert Silfverdahl should be
acknowledged for their help in the lab with our robots and motion capturing
systems. All the Ph.D. students at AASS should also be acknowledged for the
great social environment they create, both during and, often, after work.
Finally, I would like to thank my wife, Johanna, for all her love, for her
support, and for being my best friend, although I am not always present. And
my daughter Juni, for her love, laughs and warm welcome, although she thinks
I work with “sopor”¹ because I bring them out almost every morning.
Örebro, March 25, 2009
¹ Swedish for garbage.
Publications in the series
Örebro Studies in Technology
1. Bergsten, Pontus (2001) Observers and Controllers for Takagi-Sugeno
Fuzzy Systems. Doctoral Dissertation.
2. Iliev, Boyko (2002) Minimum-Time Sliding Mode Control of Robot
Manipulators. Licentiate Thesis.
3. Spännar, Jan (2002) Grey Box Modelling for Temperature Estimation.
4. Persson, Martin (2002) A Simulation Environment for Visual Servoing.
5. Boustedt, Katarina (2002) Flip Chip for High Volume and Low Cost –
Materials and Production Technology. Licentiate Thesis.
6. Biel, Lena (2002) Modeling of Perceptual Systems – A Sensor Fusion
Model with Active Perception. Licentiate Thesis.
7. Otterskog, Magnus (2002) Produktionstest av mobiltelefonantenner i
mod-växlande kammare. Licentiate Thesis.
8. Tolt, Gustav (2004) Fuzzy-Similarity-Based Low-level Image Processing.
9. Loutfi, Amy (2003) Communicating Perceptions: Grounding Symbols to
Artificial Olfactory Signals. Licentiate Thesis.
10. Iliev, Boyko (2004) Minimum-time Sliding Mode Control of Robot
Manipulators. Doctoral Dissertation.
11. Pettersson, Ola (2004) Model-Free Execution Monitoring in
Behavior-Based Mobile Robotics. Doctoral Dissertation.
12. Överstam, Henrik (2004) The Interdependence of Plastic Behaviour and
Final Properties of Steel Wire, Analysed by the Finite Element Method.
13. Jennergren, Lars (2004) Flexible Assembly of Ready-to-Eat Meals.
14. Li, Jun (2004) Towards Online Learning of Reactive Behaviors in
Mobile Robotics. Licentiate Thesis.
15. Lindquist, Malin (2004) Electronic Tongue for Water Quality
Assessment. Licentiate Thesis.
16. Wasik, Zbigniew (2005) A Behavior-Based Control System for Mobile
Manipulation. Doctoral Dissertation.
17. Berntsson, Tomas (2005) Replacement of Lead Baths with Environment
Friendly Alternative Heat Treatment Processes in Steel Wire Production.
18. Tolt, Gustav (2005) Fuzzy Similarity-based Image Processing. Doctoral
Dissertation.
19. Munkevik, Per (2005) Artificial sensory evaluation – appearance-based
analysis of ready meals. Licentiate Thesis.
20. Buschka, Pär (2005) An Investigation of Hybrid Maps for Mobile
Robots. Doctoral Dissertation.
21. Loutfi, Amy (2006) Odour Recognition using Electronic Noses in
Robotic and Intelligent Systems. Doctoral Dissertation.
22. Gillström, Peter (2006) Alternatives to Pickling; Preparation of Carbon
and Low Alloyed Steel Wire Rod. Doctoral Dissertation.
23. Li, Jun (2006) Learning Reactive Behaviors with Constructive Neural
Networks in Mobile Robotics. Doctoral Dissertation.
24. Otterskog, Magnus (2006) Propagation Environment Modeling Using
Scattered Field Chamber. Doctoral Dissertation.
25. Lindquist, Malin (2007) Electronic Tongue for Water Quality
Assessment. Doctoral Dissertation.
26. Cielniak, Grzegorz (2007) People Tracking by Mobile Robots using
Thermal and Colour Vision. Doctoral Dissertation.
27. Boustedt, Katarina (2007) Flip Chip for High Frequency Applications –
Materials Aspects. Doctoral Dissertation.
28. Soron, Mikael (2007) Robot System for Flexible 3D Friction Stir
Welding. Doctoral Dissertation.
29. Larsson, Sören (2007) An industrial robot as carrier of a laser profile
scanner – Motion control, data capturing and path planning. Doctoral
Dissertation.
30. Persson, Martin (2008) Semantic Mapping Using Virtual Sensors and
Fusion of Aerial Images with Sensor Data from Ground Vehicle.
31. Andreasson, Henrik (2008) Local Visual Feature based Localisation and
Mapping by Mobile Robots. Doctoral Dissertation.
32. Bouguerra, Abdelbaki (2008) Robust Execution of Robot Task-Plans: A
Knowledge-based Approach. Doctoral Dissertation.
33. Lundh, Robert (2009) Robots that Help Each Other:
Self-Configuration of Distributed Robot Systems. Doctoral Dissertation.
34. Skoglund, Alexander (2009) Programming by Demonstration of Robot
Manipulators. Doctoral Dissertation.