Book

The Playful Machine - Theoretical Foundation and Practical Realization of Self-Organizing Robots



Autonomous robots may become our closest companions in the near future. While the technology for physically building such machines is already available today, generating behavior for such complex machines remains a problem. Nature proposes a solution: young children and higher animals learn to master their complex brain-body systems by playing. Can this be an option for robots? How can a machine be playful? The book provides answers by developing a general principle, homeokinesis: the dynamical symbiosis between brain, body, and environment. This principle is shown to drive robots to self-determined, individual development in a playful and distinctly embodiment-related way: a dog-like robot starts playing with a barrier, eventually jumping or climbing over it; a snakebot develops coiling and jumping modes; humanoids develop climbing behaviors when fallen into a pit, or engage in wrestling-like scenarios when encountering an opponent. The book also develops guided self-organization, a new method that helps make the playful machines fit for fulfilling tasks in the real world. The book provides two levels of presentation. Students and scientific researchers interested in robotics, self-organization, and dynamical systems theory will find an in-depth mathematical analysis of the principle, the bootstrapping scenarios, and the emerging behaviors. In addition, the book comes with a robotics simulator that invites the non-scientific reader to simply enjoy the fabulous world of playful machines by performing the numerous experiments.

Chapters (17)

Robots and their relation to mankind have undergone a long and diversified development. Starting from the centuries-old romantic desire for a workmate and playmate, the modern history of robot control begins with the birth of artificial intelligence (AI) about 50 years ago. In the hype of that time, robots were considered machines under the total control of an artificial intelligence, thought to understand the world and the physics of the body well enough to control the robot by a set of rules defining its behavior.
Self-organization in the sense used in the natural sciences means the spontaneous creation of patterns in space and/or time in dissipative systems consisting of many individual components. Central in this context is the notion of emergence, meaning the spontaneous creation of structures or functions that are not directly explainable from the interactions between the constituents of the system. This chapter first presents several examples of prominent self-organizing systems in nature with the aim of identifying the underlying mechanisms. While self-organization in natural systems shares a common scheme, self-organization in machines is more diversified. An exception is swarm robotics, because of its similarity to a system of many constituents interacting via local laws, as encountered in physics (particles), biology (insects), and technology (robots). The chapter aims at providing a common basis for translating self-organization effects to single robots, considered as complex physical systems consisting of many constituents that constrain each other in an intensive manner.
This chapter aims at providing a basic understanding of the sensorimotor loop as a feedback system. First we will give some insight into the richness of behavior resulting from simple closed-loop control structures in a robotic system called the Barrel. This richness is a lesson we can learn from dynamical systems theory: even very simple systems can produce highly complicated behavior. Nearly everything is possible in such a feedback system, provided it receives enough energy from outside. Surprisingly, this is accomplished even with extremely simple, fixed controllers, to which we will restrict ourselves here. In later chapters we will see how the homeokinetic principle makes these systems adaptive and drives them towards specific working regimes of moderate complexity, loosely speaking somewhere between order and chaos.
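The qualitative point can be made with a minimal, hypothetical one-dimensional loop (a toy stand-in, not the book's Barrel model): a fixed tanh controller with gain c reads the sensor, and the world returns the motor value scaled by a gain a. Already this fixed controller shows a behavioral transition, governed by the loop gain a*c.

```python
import math

def run_loop(c, a, x0=0.1, steps=200):
    """Iterate the closed sensorimotor loop x_{t+1} = a * tanh(c * x_t):
    sensor -> fixed controller (gain c) -> motor -> world (gain a) -> sensor."""
    x = x0
    for _ in range(steps):
        x = a * math.tanh(c * x)
    return x

# The loop gain a*c decides the fate of the do-nothing state x = 0:
low = run_loop(c=1.0, a=0.5)   # a*c < 1: activity dies out
high = run_loop(c=1.0, a=1.5)  # a*c > 1: a nonzero fixed point emerges
```

With a*c below 1 the resting state is stable and the loop falls silent; above 1 the same fixed controller settles into self-sustained activity, a first taste of the richness discussed in the chapter.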
We have seen in Chap. 3 that there are specific working regimes in closed sensorimotor loops where the agents exhibit interesting behaviors. The challenge is now to develop general principles so that the agent finds these regions by itself. One essential point at this level of autonomy is the ability to survive in hostile situations, which, as a first prerequisite, requires a certain stability against external perturbations. An example of this is homeostasis, one of the prominent self-regulation scenarios in living beings. This chapter introduces Ashby’s homeostat as a concrete example from cybernetics and develops a general principle of self-regulation as a first step towards a general basis for the self-organization of behavior.
In this chapter we will introduce the concept of homeokinesis, formulate it in mathematical terms, and develop a first understanding of its functionality. The preceding chapter on homeostasis made clear that the objective of “keeping things under control” cannot lead to a system which has a drive of its own to explore its behavioral options in a self-determined manner. This is not surprising, since the homeostatic objective drives the controller to minimize the future effects of unpredictable perturbations. This chapter uses a different objective, the so-called time-loop error, derives learning rules by gradient descent on that error, and discusses first consequences of the new approach. Minimizing the time-loop error is shown to generate a dynamical entanglement between state and parameter dynamics that has been termed homeokinesis, since it realizes a dynamical regime jointly involving the physical, the neural, and the synaptic dynamics of the brain-body system.
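A rough one-dimensional sketch of the idea (our toy construction, not the book's derivation): with a sensor x, a controller y = tanh(c*x + h), and an internal model predicting the next sensor value as a*y, the time-loop error divides the prediction error by the loop sensitivity. A sluggish loop (small sensitivity) inflates the error, so gradient descent drives the controller away from the do-nothing regime while still rewarding predictability.

```python
import math, random

def tle(x, x_next, c, h, a, eps=1e-6):
    """Toy one-dimensional time-loop error: prediction error xi mapped
    back through the loop sensitivity L = d(a*tanh(c*x+h))/dx."""
    y = math.tanh(c * x + h)
    xi = x_next - a * y                # model prediction error
    L = a * c * (1.0 - y * y) + eps    # sensitivity of the sensorimotor loop
    return (xi / L) ** 2

def learn(c=0.2, h=0.0, a=1.2, lr=0.01, steps=500, seed=1):
    """Gradient-descend the toy TLE on c via finite differences; the world
    here is x' = a*tanh(c*x+h) + noise, so the model is exact up to noise."""
    random.seed(seed)
    x = 0.1
    for _ in range(steps):
        x_next = a * math.tanh(c * x + h) + random.gauss(0.0, 0.05)
        d = 1e-4
        grad = (tle(x, x_next, c + d, h, a) - tle(x, x_next, c - d, h, a)) / (2 * d)
        c -= lr * grad
        x = x_next
    return c

c_final = learn()
# starting near the do-nothing regime (c = 0.2), the feedback strength grows
```

Note how the same noise produces a large error when the loop is nearly inactive and a small one at moderate feedback strength; this asymmetry is what supplies the intrinsic drive to activity.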
Homeokinesis realizes the self-organization of artificial brain-body systems by gradient descent on the time-loop error, a quantity that is truly internal to the robot since it is defined exclusively in terms of its sensorimotor dynamics. Homeokinesis can therefore be considered a self-supervised learning procedure with the special effect of making the brain-body system self-referential. We will study this phenomenon here in an idealized one-dimensional world in order to identify key features of our self-referential dynamical systems independently of any specific embodiment effects. In particular, we will gain some insight into the entanglement of state and parameter dynamics and investigate how the latter induces behavioral variability.
This chapter extends the preceding one to two-dimensional systems. We will identify new features of our self-referential dynamical system, again independently of any specific embodiment effects. The additional dimension opens the possibility of state oscillations, where the entanglement of state and parameter dynamics leads to interesting phenomena and induces behavioral variability. The most prominent effects are driving oscillatory systems into a second-order hysteresis (by a self-organized frequency-sweeping effect) and getting into resonance with latent oscillatory modes of the controlled system.
In this chapter we will demonstrate the performance of the homeokinetic control system when applied to physical robots. We will recognize many of the effects observed under the idealized world conditions of the previous chapters, but most prominently witness new features originating from the interaction of the learning dynamics with the respective embodiment. Among them are non-trivial sensorimotor coordination, the excitation of resonance modes, and the adaptation to different environments, all emerging from the unspecific homeokinetic learning rules. The entanglement effect is seen to make emerging motion patterns transient, so that the behavioral options are explored and a playful behavior is observed. In order to keep things simple enough for analysis, we consider here only low-dimensional systems and leave the high-dimensional ones for Chap. 10.
This chapter discusses several aspects concerning the simultaneous learning of controller and internal model. We start by discussing the bootstrapping dilemma arising in this context and the consequences of insufficient sampling. Homeokinetic learning turns out to solve these problems naturally, which we illustrate with several examples. Further, we extend the implementation of the internal model by a sensor branch. This is seen to increase the applicability of the homeokinetic controller because it allows for situations where the sensor values are subject to action-independent dynamics. The extended model is prone to an ambiguity in the learning process, which can lead to instabilities. The problem can be resolved if the time-loop error is used as an additional objective for the model learning.
This chapter contains many applications of homeokinetic learning to high-dimensional robotic systems. The examples chosen for investigation, and proposed as experiments to the reader, comprise various robots ranging from dog-like and snake-like to humanoid robots in different environmental situations. The aim of the experiments is to understand how the controller can learn to “feel” the specific physical properties of the body in its environment and manages to get into a kind of functional resonance with the physical system. In order to better bring out the characteristics of homeokinetic learning in these systems, we use a kind of physical scaffolding, for instance suspending the Humanoid like a bungee jumper, putting it in the Rhoenrad, or hanging it on the high bar. Interestingly, in all situations the robots develop whole-body motion patterns that are seemingly related to the specific environmental situation: the Dog starts playing with a barrier, eventually jumping or climbing over it; the Snake develops coiling and jumping modes; we observe emerging climbing behaviors of a Humanoid as if trying to get out of a pit, and wrestling-like scenarios when a Humanoid encounters a companion. Eventually, in our robotic zoo all kinds of robots are brought together so that homeokinesis can prove its robustness against heavy interactions with other robots or dynamical objects. Essentially, this chapter provides a phenomenological overview and invites the reader to play around with numerous simulations to see the “playful machine” in action.
Homeokinesis has been introduced and analyzed in the preceding chapters on the basis of the time-loop error (TLE). This chapter presents an alternative approach to the general homeokinetic objective by introducing a new representation of the sensorimotor dynamics. This new representation corrects the state dynamics for the predictable changes in the sensor values so that the transformed state is constant except for the interactions with the unknown part of the dynamics. The single-step interaction term will be seen to be identical to the TLE so that the learning dynamics is not altered. However, besides giving an additional motivation for the TLE, this chapter will extend the considerations to the case of several time steps and will eventually consider infinite time horizons making contact with the global Lyapunov exponents and chaos theory.
Here, and in the following chapters, we introduce guided self-organization, the combination of specific goals with self-organizing control. As a first realization, we propose in this chapter guidance by supervised learning signals. First, we investigate how these signals can be incorporated into the learning dynamics and then present a simple scenario with direct motor teaching signals. We find that the homeokinetic controller explores around the given motor patterns and thus may find a behavior more suitable for the particular body. Second, we transfer this to teaching at the level of sensor signals, which is very natural in our setup. This mechanism of guidance builds the basis for the higher-level guiding mechanisms discussed in Chap. 13.
Many desired behaviors are distinguished by a certain structure in the motor or sensor activity. In particular, the phase relations between different motors or sensors capture much of this structure. We now propose a way to embed these relations as soft constraints in the learning system, such that certain symmetries are broken and desired behaviors emerge. Starting from guidance by teaching, we introduce the concept of cross-motor teaching, which allows us to specify abstract relations between motor channels. First we study simple pairwise relations and shape the behavior of the TwoWheeled robot to drive mostly straight via a relation between its two motor neurons. Then we consider a high-dimensional robot, the Armband, and demonstrate fast locomotion behaviors obtained from scratch by guided self-organization.
In this chapter we investigate how to guide the self-organization process by providing an online reward or punishment. The starting point for the following considerations is that the homeokinetic controller explores the behavioral space of the controlled system and that those behaviors which are well predictable persist longer than others. The idea we pursue in this chapter is to regulate the lifetimes of the transients according to the reward or punishment. The mechanism is applied to the Spherical with two goals: fast motion and curved rolling.
This chapter presents a unified algorithm implementing the homeokinetic learning rules, including a number of extensions partly discussed in earlier chapters of this book. We continue with guidelines and tips on how to use the homeokinetic “brain,” and discuss techniques and special methods to make the self-supervised learning of embodied systems more reliable from a practical point of view. This includes regularization procedures for the singularities in the time-loop error and different norms of the error for the gradient descent. The internal complexity of the controller and the model is extended by the generalization to multilayer networks. Apart from that, the computational complexity of the learning algorithm is substantially reduced by simplifying the non-trivial matrix inversions, which is important for truly autonomous hardware realizations.
In this chapter we describe our robot simulator. We start with the overall structure of the software package, containing the controller framework, the physics simulator, and external tools. The controller framework makes it very easy to develop and test our algorithms, be it in simulations or with real robots. The physics simulator can handle rigid bodies with fixed geometric representation that are connected by actuated joints. Particular effort has been put into an elaborate treatment of physical object interactions, including friction, elasticity, and slip. The chapter also briefly discusses the generation of virtual creatures, the user interface, and the most important features of the LpzRobots simulation environment.
This book started with a question. Now, more than 300 pages and, if you were eager, nearly 30 experiments later, let us try to draw a balance. Have we solved the problem of getting robots into controlled activities without telling them what to do, without giving a task, a goal, or any other external pressure for development? The answer to this question is a clear yes. We have formulated a general principle—homeokinesis—that was seen to provide, in a natural and unbiased way, the desired drive to activity while still “keeping things under control.”
... A line of work on self-organizing robot control [26,27,28,14] studies how to obtain a variety of coordinated behaviors from generic principles. A key idea is that an exploring policy should create a feedback loop with the environment whose dynamics is at the edge of chaos: self-amplifying small perturbations while keeping intermittently coherent dynamics. The concept of Homeokinesis [27] balances instability with predictability, while predictive information maximization [28] uses information theory to create a learning rule for producing structured but variable behavior. Differential extrinsic plasticity (DEP) [14] is a simplified and biologically plausible implementation of the same idea, manifested in a closed-loop control network with rapidly changing weights to create active behaviors with high correlation between the degrees of freedom. ...
Muscle-actuated organisms are capable of learning an unparalleled diversity of dexterous movements despite their vast number of muscles. Reinforcement learning (RL) on large musculoskeletal models, however, has not been able to show similar performance. We conjecture that ineffective exploration in large overactuated action spaces is a key problem. This is supported by the finding that common exploration noise strategies are inadequate in synthetic examples of overactuated systems. We identify differential extrinsic plasticity (DEP), a method from the domain of self-organization, as being able to induce state-space-covering exploration within seconds of interaction. By integrating DEP into RL, we achieve fast learning of reaching and locomotion in musculoskeletal systems, outperforming current approaches in all considered tasks in sample efficiency and robustness.
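A minimal sketch of the DEP idea (our simplification, not the implementation from the papers cited here): the controller matrix is relaxed toward the outer product of current and time-delayed sensor derivatives, normalized so the feedback gain stays bounded. Channels whose velocities co-vary thus get wired together, which is the mechanism behind the highly correlated whole-body exploration described above.

```python
import numpy as np

def dep_step(C, dx_now, dx_delayed, kappa=1.0, tau=0.1):
    """One simplified DEP update: C chases the normalized correlation of
    current and delayed sensor derivatives (velocity signals)."""
    M = np.outer(dx_now, dx_delayed)
    M = kappa * M / (np.linalg.norm(M) + 1e-9)
    return (1.0 - tau) * C + tau * M

# Feed a stream where channel 0 now moves as channel 1 did one delay ago:
C = np.zeros((2, 2))
for _ in range(100):
    C = dep_step(C, dx_now=np.array([1.0, 0.0]),
                 dx_delayed=np.array([0.0, 1.0]))
# C has picked up the cross-coupling: motor 0 is driven by sensor 1.
```

Because the weights track recent velocity correlations rather than being trained slowly, the resulting closed-loop controller latches onto whatever coordinated motion the body currently affords, which is what makes DEP attractive as an exploration generator for RL.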
... These latter neuroplasticity-generated spontaneous behaviors, detailed in the book "The Playful Machine" (Der and Martius, 2012) and in related papers (Martius, 2015, 2017), can drive simulated agents to explore and react to their environments in a manner that is highly suggestive of natural behaviors without building in any goals or higher-level planning of any sort. The most recent iteration of this research uses a particular neuroplasticity scheme called Differential Extrinsic Plasticity (DEP) (Der and Martius, 2015; Pinneri and Martius, 2018) to generate intriguing behaviors that are tightly coupled with the environment: a four-legged creature will appear to search for and find ways to climb over a fence; a humanoid will eventually clamber out of a hole it is trapped in. ...
The neuroplasticity rule Differential Extrinsic Plasticity (DEP) has been studied in the context of goal-free simulated agents, producing realistic-looking, environmentally aware behaviors, but no successful control mechanism has yet been implemented for intentional behavior. The goal of this paper is to determine whether “short-circuited DEP,” a simpler, open-loop variant, can generate desired trajectories in a robot arm. DEP dynamics, both transients and limit cycles, are poorly understood. Experiments were performed to elucidate these dynamics and to test the ability of a robot to leverage them for target reaching and circular motions.
... Hence, it would probably be wise to let the robot decide for itself what to do next. In this way, if the actions are a direct result of the robot's inherent physical properties and the environment, its behavior is likely to be recognized as plausible [16], [17], i.e. it looks and feels realistic to humans. ...
Social robots can be an alternative to pets for people who cannot, do not want to, or are not allowed to keep animals. But there are only a few robots that move toward fulfilling the function of a pet. We present flatcat, a new minimalist pet-like social robot that reacts to human touch in a way not seen with such robots before and that does not mimic existing animals. Here, we describe its mechanical, electrical, and software design and present early user reactions. With this robot we strive to provide an immersive tactile experience through cognitive sensorimotor loops and aim to maintain user interest through variations in robot behavior driven by intrinsic motivation.
... We are not fixated on a particular category of machines, but we do recognise the diversity of urban machines that consist of both non-digital machinery operating on conventional mechanisms and modern devices integrated with digital computing systems. We also include the spectrum of multi-sensory, autonomous robots varying across levels of functional sophistication and replication that reflects or mimics the human body (Der and Martius, 2012;Kanniah, 2014). Although not all complex machines are robots, it is valid to consider all robots and drones deployed, functioning, and automated in public spaces as urban machines. ...
In this book, we compare and contrast the various forms of play that occur in urban environments or are dedicated to their design and planning, with the notion of the playable city. In a playable city, the sensors, actuators, and digital communication networks that form the backbone of smart city infrastructure are used to create novel interfaces and interventions intended to inject fun and playfulness into the urban environment, both as a simple source of pleasure and as a means of facilitating and fostering urban and social interactions.
Within the paradigm of the smart and playable city, the urban landscape and street furniture have provided a fertile platform for pragmatic and hedonic goals of urban liveability through technology augmentation. Smart street furniture has grown from being a novelty to become a common sight in metropolitan cities, co-opted for improving the efficiency of services. However, as we consider technologies that are increasingly smarter, with human-like intelligence, we navigate towards uncharted waters when discussing the consequences of their integration with the urban landscape. The implications of a new genre of street furniture embedded with artificial intelligence, where the machine has autonomy and is an active player itself, are yet to be fully understood. In this article, we analyse the evolving design of public benches along the axes of smartness and disruption to understand their qualities as playful, urban machines in public spaces. We present a concept-driven speculative design case study as an exploration of a smart, sensing, and disruptive urban machine for playful placemaking. With the emergence of artificial intelligence, we expand on the potential of urban machines to take on an increasingly active role as co-creators of play and playful placemaking in the cities of tomorrow.
... The robot has six legs, each of which has a thoraco-coxal (TC) joint allowing forward and backward motion, a coxa-trochanteral (CTr) joint allowing elevation and depression motion, as well as a femur-tibia (FTi) joint allowing extension and flexion motions, respectively [see Fig. 1(b)]. Here, we use a modular robot control environment (MOROCO) [27] and the physically realistic robot simulator LPZRobots [28] to test our proposed controller in AMOS (see Section I in the supplementary material for the specification of AMOS). ...
Bayesian filters have been considered to help refine and develop theoretical views on spatial cell functions for self-localization. However, extending a Bayesian filter to reproduce insect-like navigation behaviors (e.g., home searching) remains an open and challenging problem. To address this problem, we propose an embodied neural controller for self-localization, foraging, backward homing (BH), and home searching of an advanced mobility sensor (AMOS)-driven insect-like robot. The controller, comprising a navigation module for the Bayesian self-localization and goal-directed control of AMOS and a locomotion module for coordinating the 18 joints of AMOS, leads to its robust insect-like navigation behaviors. As a result, the proposed controller enables AMOS to perform robust foraging, BH, and home searching against various levels of sensory noise, compared to conventional controllers. Its implementation relies only on self-localization and heading perception, rather than global positioning and landmark guidance. Interestingly, the proposed controller makes AMOS achieve spiral searching patterns comparable to those performed by real insects. We also demonstrated the performance of the controller for real-time indoor and outdoor navigation in a real insect-like robot without any landmark and cognitive map.
... Walking animals demonstrate impressive self-organized locomotion and adaptation to body property changes by skillfully manipulating their complicated and redundant musculoskeletal systems (Taga et al., 1991;Dickinson et al., 2000;Der and Martius, 2012;Grabowska et al., 2012). Many studies have clarified that adaptive interlimb coordination plays a crucial role in this achievement (Aoi et al., 2017;Mantziaris et al., 2017). ...
Walking animals demonstrate impressive self-organized locomotion and adaptation to body property changes by skillfully manipulating their complicated and redundant musculoskeletal systems. Adaptive interlimb coordination plays a crucial role in this achievement. It has been identified that interlimb coordination is generated through dynamical interactions between the neural system, musculoskeletal system, and environment. Based on this principle, two classical interlimb coordination mechanisms (continuous phase modulation and phase resetting) have been proposed independently. These mechanisms use decoupled central pattern generators (CPGs) with sensory feedback, such as ground reaction forces (GRFs), to generate robot locomotion autonomously without predefining it (i.e., self-organized locomotion). A comparative study was conducted on the two mechanisms under decoupled CPG-based control implemented on a quadruped robot in simulation. Their characteristics were compared by observing their CPG phase convergence processes at different control parameter values. Additionally, the mechanisms were investigated when the robot faced various unexpected situations, such as noisy feedback, leg motor damage, and carrying a load. The comparative study reveals that the phase modulation and resetting mechanisms demonstrate satisfactory performance when they are subjected to symmetric and asymmetric GRF distributions, respectively. This work also suggests a strategy for the appropriate selection of adaptive interlimb coordination mechanisms under different conditions and for the optimal setting of their control parameter values to enhance their control performance.
Currently, the autonomy of artificial systems, robotic systems in particular, is certainly one of the most debated issues, both from the perspective of technological development and its social impact and ethical repercussions. While theoretical considerations often focus on scenarios far beyond what can be concretely hypothesized from the current state of the art, the term autonomy is still used in a vague or overly general way. This reduces the possibilities of a precise analysis of such an important issue, thus leading to often polarized positions (naive optimism or unfounded defeatism). The intent of this paper is to clarify what is meant by artificial autonomy, and what the prerequisites are that can allow the attribution of this characteristic to a robotic system. Starting from some concrete examples, we will try to indicate a way towards artificial autonomy that can hold together the advantages of developing adaptive and versatile systems with the management of the inevitable problems that this technology poses both from the viewpoint of safety and ethics. Our proposal is that a real artificial autonomy, especially if expressed in the social context, can only be achieved through interdependence with other social actors (human and otherwise), through continuous exchanges and interactions which, while allowing robots to explore the environment, guarantee the emergence of shared practices, behaviors, and ethical principles, which otherwise could not be imposed with a top-down approach, if not at the price of giving up artificial autonomy itself.
As robots increasingly become part of our everyday lives, questions arise with regards to how to approach them and how to understand them in social contexts. The Western history of human–robot relations revolves around competition and control, which restricts our ability to relate to machines in other ways. In this study, we take a relational approach to explore different manners of socializing with robots, especially those that exceed an instrumental approach. The nonhuman subjects of this study are built to explore non-purposeful behavior, in an attempt to break away from the assumptions of utility that underlie the hegemonic human–machine interactions. This breakaway is accompanied by ‘learning to be attuned’ on the side of the human subjects, which is facilitated by continuous relations at the level of everyday life. Our paper highlights this ground for the emergence of meanings and questions that could not be subsumed by frameworks of control and domination. The research-creation project Machine Ménagerie serves as a case study for these ideas, demonstrating a relational approach in which the designer and the machines co-constitute each other through sustained interactions, becoming attuned to one another through the performance of research. Machine Ménagerie attempts to produce affective and playful—if not unruly—nonhuman entities that invite interaction yet have no intention of serving human social or physical needs. We diverge from other social robotics research by creating machines that do not attempt to mimic human social behaviours.