A Robotic Head Neuro-controller Based on
Biologically-Inspired Neural Models
Gioel Asuni, Giancarlo Teti, Cecilia Laschi, Eugenio Guglielmelli, Paolo Dario
ARTS Lab (Advanced Robotics Technology and System Laboratory)
Scuola Superiore Sant’Anna
Piazza Martiri della Libertà 33, 56127 Pisa, Italy
{asuni, teti, cecilia, eugenio, dario}@arts.sssup.it
Abstract— This paper presents the application of a neural approach to the control of a 7-DOF robotic head. The inverse kinematics problem is addressed for the control of the gaze fixation point of two cameras mounted on the robotic head. The proposed approach is based on a biologically-inspired model, which replicates the human brain's capability of creating associations between motor and sensory data through learning. The model is implemented here by self-organizing neural maps. During learning, the system creates relations between the motor data associated with endogenous movements performed by the robotic head and the sensory consequences of such motor actions, i.e. the final position of the gaze fixation point. The learnt relations are stored in the neural map structure and are then used, after learning, to generate motor commands aimed at reaching a given fixation point. The proposed approach solves the inverse kinematics and joint redundancy problems for the ARTS robotic head with good accuracy and robustness. Experimental trials confirmed the system's capability to control the gaze direction and fixation point, and to manage the redundancy of the robotic head in reaching the target fixation point even under additional constraints, such as a clamped joint or two symmetric joint angles (e.g. the eye joints).
Index Terms— biorobotics, robotic head control, sensory-motor coordination, neural control.
I. INTRODUCTION
In this work, a model based on neural networks is proposed to solve the inverse kinematics problem for a 7-DOF robotic head. The head supports two cameras and can direct the fixation point in 3D space through a combination of neck movements (4 DOFs) and eye movements (1 common tilt and 2 separate pans, allowing vergence). The proposed model implements a mapping between the direction of gaze, in the external spatial reference system of the head, and its internal reference system, i.e. the joint space. This mapping solves the inverse kinematics problem by computing a joint configuration of the head that obtains a given fixation point in the external spatial reference system.
Inverse kinematics is one of the basic problems in developing robot controllers. Traditionally, solutions to the inverse kinematics problem are obtained by techniques based on mathematical computational models, such as the inverse transform or iterative methods. These methods may suffer from drawbacks, especially when the number of degrees of freedom increases: the inverse transform does not always guarantee a closed-form solution, while iterative methods may not converge and may be computationally expensive.
Neural network approaches, which provide robustness and adaptability, represent an alternative solution to the inverse kinematics problem, especially when the number of degrees of freedom to be controlled is high, or when the external spatial reference system is hard to model, as in visuo-motor coordination tasks.
Two main approaches have been proposed for using neural networks to solve the inverse kinematics problem. The first, based on mathematical models, treats artificial neural networks as a tool for solving nonlinear mappings with little or no knowledge of the robotic structure [1], [2]. The second approach builds the mapping between motor commands and sensory input on the basis of repeated sensory-motor loops; the final map is then used to generate appropriate motor commands that drive the robot towards the target sensory input. These latter methods make use of self-organizing maps [3] to build the internal mapping: their self-organization and topology-preservation features make them well suited to capturing mechanical constraints; moreover, they show a good capability of processing artificial sensory signals, according to the analogous mapping features of the somatosensory cortex by which they are inspired.
These abilities can be useful in closed-loop control systems that have to deal with somatosensory signals coming from anthropomorphic artificial sensors, such as proprioceptive and tactile ones, in order to generate proper sensorial inputs for the motor areas [4]. In the past, visuo-motor coordination problems have been extensively and quite successfully approached using Kohonen's maps [5], [6]. Nevertheless, this type of network may require a priori knowledge of the probability distribution of the input domain in order to choose a proper cardinality and structure of the net, so as to avoid over- or under-fitting problems. Furthermore, such networks are not suitable for dynamic environments or continuous learning [7]. In general, approaches involving self-organizing techniques assign a sub-area of the input domain (e.g. the joint space) to a single neuron, according to a specified resolution. This methodology divides the whole space into a set of equally probable sub-spaces, disregarding the correlations between the task space and
the sensory domain in which they are performed. The Growing Neural Gas (GNG), proposed by Bernd Fritzke [8], is an incremental neural model able to learn the topological relations of input patterns. Unlike other methods, this model avoids the need to pre-specify the network size: starting from a minimal network, a growth process takes place and continues until a given condition is satisfied.
II. DESCRIPTION OF THE PROPOSED MODEL
The proposed model is based on a self-organizing neural network that learns the coordination of motor actions with sensory feedback. It generates trajectories by solving the inverse kinematics and joint redundancy problems for a robotic head.
The neural network starts with very little information about the kinematic structure of the head, i.e. the number of joints and their ranges of motion. In an initial learning phase, through endogenous action-perception loops, it generates the associative information needed to build the transformation between a spatial map, which encodes gaze directions in the external space, and a motor map, which encodes joint rotations. This process is applied repeatedly during the learning phase, involving many locations in the head workspace. In the performance phase, after learning, the learned transformation is used in the opposite way: given a gaze target direction, the model provides the joint rotations that drive the current gaze in the target direction.
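To make the two phases concrete, the following minimal Python sketch outlines the loop structure; all names (`robot`, `net`, `joint_rotations`) are illustrative placeholders rather than interfaces from the paper, and the stopping criterion is an assumption.

```python
import numpy as np

def learning_phase(net, robot, n_trials):
    """Motor babbling: endogenous random commands are paired with
    their sensory consequences (the resulting gaze fixation point)."""
    for _ in range(n_trials):
        joints = robot.random_joint_configuration()   # endogenous command
        gaze = robot.forward_kinematics(joints)       # sensory consequence
        net.learn(np.concatenate([gaze, joints]))     # store the association

def performance_phase(net, robot, target_gaze, tol=1e-2, max_steps=300):
    """Use the learned transformation in the opposite direction: from a
    target gaze fixation point to the joint rotations that realize it."""
    joints = robot.current_joints()
    gaze = robot.forward_kinematics(joints)
    for _ in range(max_steps):
        if np.linalg.norm(target_gaze - gaze) <= tol:
            break
        dv = target_gaze - gaze                       # spatial difference vector
        joints = joints + net.joint_rotations(dv, joints)
        gaze = robot.forward_kinematics(joints)
    return joints
```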
The proposed architecture for correlating the external reference system to the internal reference system takes inspiration from the DIRECT model (Direction-to-Rotation Effector Control Transform) proposed in [9]. The DIRECT model includes descriptions of the role of the sensorimotor cortex in learning visuo-motor coordination, correlating proprioceptive and visual feedback for the generation of joint movements. It implements a coordinate transformation from spatial directions to joint rotations. DIRECT embodies a solution of the classical motor equivalence problem, emphasizing peculiar human-like characteristics such as the successful reaching of targets under operational conditions different from those encountered during learning. Fig. 1 shows an overall block diagram of the DIRECT model.
The proposed model is composed of four modules, as shown in Fig. 2:
- the Spatial Position Map (SPM), which contains an appropriate coding of the gaze fixation points (intersection of the lines of sight outgoing from the eyes) in the external spatial reference system;
- the Motor Position Map (MPM), which contains an appropriate coding of the robot proprioception in the internal motor reference system (i.e. the joint space);
- the Integration Map (IM), which contains a mapping between the SPM and the MPM and provides the activations to the r cells in order to follow a given direction;
- the Motor Area (MA), which contains three groups of cells, named x, r and a cells respectively.
[Figure: blocks for the Spatial Present Position Vector (PPVs), Spatial Target Position Vector (TPVs), Spatial Direction Vector (DVs), Spatiomotor Present Position Vector (PPVsm), Position-Direction Map (PDMms), Motor Direction Vector (DVm), Motor Present Position Map (PPMm), Motor Present Position Vector (PPVm) and Endogenous Random Generator (ERG), connected by vision feedback, proprioceptive feedback and the end-effector target.]
Fig. 1. The overall block diagram of the DIRECT model (redrawn from [9]).
Fig. 2. The implementation of the proposed neural model.
The main processing steps that allow an external target gaze fixation point (TGFP) to guide changes in the current gaze fixation point (CGFP) during foveation tasks are:
1) computation of a spatial difference vector by comparing the TGFP with the CGFP, measured in the same coordinate system; the spatial difference vector specifies the spatial displacement needed to bring the CGFP onto the TGFP;
2) computation of the joint angle changes, or rotations, needed to move the CGFP along the spatial difference vector toward the TGFP; the computation of the appropriate joint rotations requires information about both the direction of the spatial difference vector and the CGFP;
3) integration of the joint angle increments or decrements to compute angular values for all the joints, in order to control the CGFP.
More precisely:
- defining h_c as the coding for the CGFP and h_t for the TGFP, the spatial difference vector is defined as dv = h_t − h_c;
- the Integration Map contains a map of cells, each of which is maximally sensitive to a particular spatial direction in a particular position of the joint space; the input to the IM is composed of the vector dv from the SPM and of a coding of the current positions in the motor reference system from the MPM;
- the group of cells r encodes a set of joint rotation commands to be integrated by the a populations, as in Fig. 2. This group of cells receives inputs from two sources: the x cells and the IM cells. The training phase of the proposed model is achieved through autonomously generated repetitions of an action-perception loop: during the motor babbling phase, the model endogenously generates random motor commands and learns sensory-motor coordination by correlating proprioceptive feedback and visual feedback (the gaze fixation point). More specifically, during the training phase the x cells activate the r cells with random values, while during the running phase the contribution of the x cells is null and the contribution to the r cells is provided only by the signals b_k c_k z_ki deriving from the active sites k of the IM. Thus, the r activities may be approximated by the following equation:

    r_i = x_i + Σ_k b_k c_k z_ki,   i = 1, ..., n;   (1)

where x_i is the Endogenous Random Generator (ERG) activation (present only during the learning phase), z_ki is the adaptive weight from cell k of the IM to unit i of r, b_k is an appropriate constant, and c_k is 1 if cell k of the IM is activated and 0 otherwise. r is a population of units devoted to the generation of the joint movements; their number is equal to the number of joints of the head multiplied by 2, according to the use of a pair of muscles (agonist and antagonist) for each joint [10]. Each unit of r receives connections from all the cells of the IM. The adaptive weight between a generic IM cell k and a unit i of r is modified according to the following learning equation (Grossberg's "outstar" law [11]):

    d/dt z_ki = γ c_k (−δ z_ki + r_i)   (2)

where γ is a small learning-rate parameter and δ is a decay-rate parameter;
- a is a population of units that integrate signals from the corresponding r populations to produce an outflow command specifying a set of joint angles. Each a_i cell codes the angle of a particular joint and receives input from the antagonistic pair corresponding to the i-th joint angle. The net input integrated by the a cells is a function of the difference between the activities of the two corresponding r cells. The updating rule used is:

    d/dt a_i = ε (r_i^E − r_i^I) g(r_i^E, r_i^I, ψ_i)   (3)

where a_i is the angle command for the i-th joint, r_i^E and r_i^I are the corresponding excitatory and inhibitory r-cell activities respectively, ε is an integration-rate parameter, and the g function is defined as follows:

    g(e, i, ψ_i) = ψ_max − ψ_i   if (e − i) ≥ 0;
    g(e, i, ψ_i) = ψ_i           if (e − i) < 0.   (4)

where ψ_i is the angle of the joint corresponding to a_i and ψ_max is the maximum angle of that joint. This update rule has been introduced in order to favour head postures whose joint positions are close to the middle of the ranges of motion, which can be considered the "equilibrium points".
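For clarity, equations (1)-(4) can be condensed into a short numerical sketch. The Python fragment below is a hedged reading of the update rules, using explicit Euler integration with a step dt (an implementation choice not specified in the paper); array shapes and parameter values are assumptions.

```python
import numpy as np

def r_activity(x, b, c, z):
    """Eq. (1): r_i = x_i + sum_k b_k c_k z_ki.
    x: ERG activations (nonzero only while babbling); c: 0/1 mask of
    active IM cells; z: (n_IM_cells, n_r_cells) adaptive weights."""
    return x + (b * c) @ z

def outstar_update(z, c, r, gamma, delta, dt):
    """Eq. (2), Grossberg's outstar law:
    dz_ki/dt = gamma * c_k * (-delta * z_ki + r_i)."""
    return z + dt * gamma * c[:, None] * (-delta * z + r)

def g(e, i, psi, psi_max):
    """Eq. (4): pushes joints toward mid-range 'equilibrium points'."""
    return np.where(e - i >= 0.0, psi_max - psi, psi)

def integrate_angles(psi, rE, rI, psi_max, eps, dt):
    """Eq. (3): dpsi_i/dt = eps * (rE_i - rI_i) * g(rE_i, rI_i, psi_i),
    where psi_i is the angle command a_i for joint i."""
    return psi + dt * eps * (rE - rI) * g(rE, rI, psi, psi_max)
```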
III. IMPLEMENTATION OF THE PROPOSED MODEL
A first implementation of the proposed model was presented in [12], for the control of a generic robot arm. In this work, the implementation of the model has been improved with respect to [12]. In that work, the maps described in the previous section were implemented using Growing Neural Gas (GNG) networks, which find topological structures that closely reflect the structure of the input distribution. The output of these maps is given by the following equation:
    Φ(ξ) = (w_s1 + Σ_{i ∈ N_s1} µ w_i) / (|N_s1| + 1)   (5)

where ξ is the input pattern to the map, µ is an appropriate constant < 1, w_s1 is the weight vector associated with the winner unit, and N_s1 is the set of direct topological neighbours of the winner unit.
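A direct transcription of equation (5) in Python; the adjacency-list representation of the GNG graph and the value of µ are assumptions of this sketch:

```python
import numpy as np

def map_output(xi, weights, neighbors, mu=0.5):
    """Eq. (5): Phi(xi) = (w_s1 + sum_{i in N_s1} mu * w_i) / (|N_s1| + 1),
    where s1 is the unit whose weight vector is closest to the input xi."""
    s1 = np.argmin(np.linalg.norm(weights - xi, axis=1))  # winner unit
    ns1 = neighbors[s1]                # direct topological neighbours of s1
    return (weights[s1] + mu * weights[ns1].sum(axis=0)) / (len(ns1) + 1)
```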
With respect to the previous implementation, a main variation has been the use of a single neural map (the Sensory-Motor Map) to code all the information of the Motor Position Map, the Spatial Position Map and the Integration Map (see Fig. 3).
Fig. 3. The implementation of the proposed neural model.
This modification has permitted a significant reduction in the required memory space and, above all, in computational time. The Sensory-Motor Map has been implemented using self-organizing growing neural networks (GNG) [13] to correlate the visual perception (gaze fixation point) and proprioception, and to integrate them. Starting with two nodes and using a growth mechanism, the network builds a graph in which nodes are considered neighbours if they are connected by an edge. The reference weights of the cells are modified during execution by a variant of competitive Hebbian learning [14]. The growth process continues until an ending condition is fulfilled.
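The adaptation step can be sketched as follows, after Fritzke's GNG [13] and competitive Hebbian learning [14]; the learning rates, the maximum edge age, and the node-insertion bookkeeping (performed every λ steps in the full algorithm, omitted here together with the removal of isolated units) are assumptions of this sketch rather than values from the paper.

```python
import numpy as np

def gng_step(xi, W, edges, age, error, eps_w=0.05, eps_n=0.006, max_age=50):
    """One GNG adaptation step (simplified from [13]).
    W: (n_units, dim) reference weights; edges/age: symmetric boolean
    adjacency and edge-age matrices; error: per-unit accumulated error."""
    d = np.linalg.norm(W - xi, axis=1)
    s1, s2 = np.argsort(d)[:2]              # two units nearest to the input
    error[s1] += d[s1] ** 2                 # accumulate local error at winner
    W[s1] += eps_w * (xi - W[s1])           # move the winner toward the input
    nbrs = np.where(edges[s1])[0]
    W[nbrs] += eps_n * (xi - W[nbrs])       # move its neighbours a little
    age[s1, nbrs] += 1; age[nbrs, s1] += 1  # age the winner's edges
    edges[s1, s2] = edges[s2, s1] = True    # competitive Hebbian edge s1-s2
    age[s1, s2] = age[s2, s1] = 0
    edges[age > max_age] = False            # drop edges that grew too old
```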
Learning parameters are constant over time, in contrast to other models which rely heavily on decaying parameters. Only the weights of the winner unit and those of its neighbours are allowed to learn, in order to move the CGFP towards the target. Thus, Equation (1) has been implemented in the following way:

    r_i = x_i + (z_wi + Σ_{k ∈ N_w} ν z_ki) / |N_w|   (6)

where w is the winner unit index, N_w is the set of direct topological neighbours of the winner unit, and ν is an appropriate constant less than 1.
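Equation (6) restricts equation (1) to the winner unit and its direct neighbours; a minimal transcription, with the graph represented as in the GNG sketch above and the value of ν assumed:

```python
import numpy as np

def r_activity_local(x, z, w, neighbors, nu=0.5):
    """Eq. (6): r_i = x_i + (z_wi + sum_{k in N_w} nu * z_ki) / |N_w|,
    with w the winner unit and N_w its direct topological neighbours."""
    nw = neighbors[w]                  # indices of the winner's neighbours
    return x + (z[w] + nu * z[nw].sum(axis=0)) / len(nw)
```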
A second improvement has been to take advantage of the topological structure of the neural map during the task, i.e. to consider only the receptive fields activated most recently. This means that, at each step, instead of considering all the cells of the neural map, only the winner cell and its direct topological neighbours are considered, thus decreasing the computational burden.
The last variation of the implementation has been to allow the reference weight vectors of the GNG to be modified not only during the learning phase but also during task execution. This improves the performance of the model, because the GNGs can adapt themselves to slow changes of the input distribution, i.e. move the nodes so as to cover the new distribution.
Like the DIRECT model, the proposed model provides robust performance of movement trajectories. In particular, the model is able to reach a target point even when constraints are posed on some joints, without requiring a new learning phase. For example, it is possible to clamp one or more joints or to force two joints to have the same value.
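The paper does not spell out how such constraints enter the control loop; one plausible mechanism, sketched below, is to project the integrated joint command onto the constraint set after each update step. The function names and the choice of averaging the symmetric pair are assumptions, not the authors' stated method.

```python
import numpy as np

def apply_constraints(a, clamped=None, symmetric=None):
    """Project a joint command a onto the constraint set.
    clamped: {joint_index: frozen_angle}; symmetric: list of (i, j)
    pairs forced to the same value (here: their mean)."""
    a = a.copy()
    for j, angle in (clamped or {}).items():
        a[j] = angle                  # joint held at a fixed angle
    for i, j in (symmetric or []):
        mean = 0.5 * (a[i] + a[j])    # force equal angles for the pair
        a[i] = a[j] = mean
    return a
```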
IV. EXPERIMENTAL RESULTS
The implemented model has been tested and evaluated in experimental trials aimed at verifying its learning capabilities and its sensory-motor coordination functionality. The experimental trials were conducted on the ARTS robotic head, shown in Fig. 4. The robotic head has 7 DOFs:
- 4 DOFs for the neck, allowing rotation, ventral and dorsal flexion at two different levels, and lateral flexion;
- 3 DOFs for the eyes, allowing a common tilt for both eyes and a separate pan for each eye.
Specifications of the robotic head are reported in Table I.
The model learns the inverse kinematics of the head by mapping the "directions" of the gaze fixation point in the head Cartesian reference system onto the joint space. After learning, the model is able to generate a sequence of joint configurations for the head in order to achieve a given fixation point, as shown in Fig. 5.

Fig. 4. The ARTS robotic head: rotation axes (left); reference system and gaze fixation point (right).

TABLE I
SPECIFICATIONS OF THE ARTS ROBOTIC HEAD

Joint              Range           Velocity     Acceleration
J0 - Roll          −28°÷32°        20°/sec      200°/sec²
J1 - Lower Pitch   −32°÷26°        20°/sec      200°/sec²
J2 - Yaw           −108°÷108°      170°/sec     750°/sec²
J3 - Upper Pitch   −20°÷30°        120°/sec     750°/sec²
J4 - Eye Pitch     −25°÷53°        400°/sec     4500°/sec²
J5 - Left Eye      −45°÷45°        600°/sec     10000°/sec²
J6 - Right Eye     −45°÷45°        600°/sec     10000°/sec²
Fig. 5. The overall block diagram of the proposed model.
The learning phase was executed off-line, using the direct kinematics of the head to provide the 3D coordinates of the gaze fixation point in the head reference system (see Fig. 4). 14395 random movements in the joint space, each corresponding to a different gaze fixation point, were used to train the system. The total time needed to train the system is about 15 minutes on an Intel Pentium 4 processor, and the cardinality obtained for the final Sensory-Motor Map is 3928 units.
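Under these figures, the off-line learning phase reduces to the loop sketched below: random joint configurations are drawn within the ranges of Table I (as reconstructed above) and paired with the fixation point returned by the direct kinematics. `forward_kinematics` and `net.learn` are hypothetical interfaces standing in for the head model and the growing Sensory-Motor Map.

```python
import numpy as np

# Joint ranges in degrees (min, max), as reconstructed in Table I above.
JOINT_RANGES = np.array([(-28, 32), (-32, 26), (-108, 108), (-20, 30),
                         (-25, 53), (-45, 45), (-45, 45)], dtype=float)

def train_offline(net, forward_kinematics, n_movements=14395, seed=0):
    """Off-line motor babbling: random joint configurations paired with
    the 3D gaze fixation point given by the direct kinematics."""
    rng = np.random.default_rng(seed)
    lo, hi = JOINT_RANGES[:, 0], JOINT_RANGES[:, 1]
    for _ in range(n_movements):
        joints = rng.uniform(lo, hi)               # endogenous random movement
        gaze = forward_kinematics(joints)          # fixation point, head frame
        net.learn(np.concatenate([gaze, joints]))  # grow/adapt the map
```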
Three types of experimental trials were executed: a normal task, a task with a clamped joint, and a task with symmetric angles for the eye joints.
Figs. 6 and 7 show the behaviour of the error distance and the trajectories of the joints during a normal fixation task. The almost monotonic trend shown by the graph in Fig. 7 is a general behaviour found in all the trials. The monotonic trend and the absence of notable oscillations ensure a nearly linear movement of the CGFP towards the TGFP, which corresponds to a minimal waste of energy according to the results reported in [15].
Fig. 6. Distance between the current gaze fixation point and the target during a normal task (distance in mm vs. steps ×10).
Fig. 7. Joint trajectories during the normal task (joint angles in deg vs. steps ×10, one curve per joint 0-6).
Fig. 8 shows the initial and final postures of the head in the normal task.

Fig. 8. Initial (left) and final (right) postures of the robotic head in the normal task.
Figs. 9 and 10 show the performance of the system, in terms of the error distance from the TGFP and the joint trajectories, in a fixation task with a clamped joint. This task shows the adaptability of the neural network, as the net provides a new joint configuration able to reach the target even using one joint less.
Fig. 9. Distance between the current gaze fixation point and the target during a foveation task with a clamped joint (distance in mm vs. steps ×10).
Fig. 10. Joint trajectories during the foveation task with a clamped joint (joint angles in deg vs. steps ×10, one curve per joint 0-6).
By clamping joint 0 (roll) it is possible to obtain more human-like postures for the head. Fig. 11 shows the head foveating the same point with no clamped joint and with joint 0 clamped.

Fig. 11. Final posture of the robotic head in the normal task (left) and in the task with joint 0 clamped (right).
Finally, Figs. 12 and 13 show the error distance from the TGFP and the joint trajectories in a foveation task with symmetric angles for the eye joints.
Fig. 12. Distance between the current gaze fixation point and the target in a foveation task with symmetric angles for the eye joints (distance in mm vs. steps ×10).
Fig. 13. Joint trajectories in a foveation task with symmetric angles for the eye joints (joint angles in deg vs. steps ×10, one curve per joint 0-6).
As in the previous example, by constraining the eye joints to have symmetric values we obtain a more anthropomorphic behaviour of the head. Fig. 14 shows the head foveating the same point with no constraints and with symmetric angles for the eye joints. This functionality could be used, for example, to implement a VOR (Vestibulo-Ocular Reflex), i.e. the mechanism that stabilizes gaze during head motion.

Fig. 14. ARTS robotic head final posture in a normal task (left) and in a task with symmetric angles for the eye joints (right).
V. CONCLUSIONS
This paper presented a neural controller for a redundant robotic head with 7 DOFs. After a learning phase of about 15 minutes, the system is capable of controlling the gaze movements of the head so as to follow a given target spatial direction using different combinations of the joints. In particular, it is able to gaze at points in 3D space, providing a solution to the well-known inverse kinematics problem. It is also able to gaze at a target point under additional constraints, such as some joints locked or symmetric angles for the eye joints.
Experimental trials with the ARTS robotic head show how more human-like postures can be obtained by acting on such constraints. The use of self-organizing maps has proven effective in managing the inverse kinematics problem when it is approached as a sensory-motor mapping problem. The use of growing neural gas has further increased the performance, both in computational terms and in the representation of the input space.
REFERENCES
[1] S. Lin and A. A. Goldenberg. Neural network control of mobile manipulators. IEEE Trans. on Neural Networks, 12(5), Sep 2001.
[2] H. D. Patino, R. Carelli, and B. Kuchen. Neural networks for advanced control of robot manipulators. IEEE Trans. on Neural Networks, 13(2), Mar 2002.
[3] T. Kohonen. Self-Organizing Maps. Springer-Verlag, second edition, 1997.
[4] F. Leoni, M. Guerrini, C. Laschi, D. Taddeucci, P. Dario, and A. Starita. Implementing robotic grasping tasks using a biological approach. In Proceedings of the International Conference on Robotics and Automation, Leuven, Belgium, pages 16-20, May 1998.
[5] M. Kuperstein and J. Rubinstein. Implementation of an adaptive neural controller for sensory-motor coordination. IEEE Control Systems Magazine, 9(3):25-30, 1989.
[6] J. A. Walter and K. J. Schulten. Implementation of self-organizing neural networks for visuo-motor control of an industrial robot. IEEE Transactions on Neural Networks, 4(1):86-95, 1993.
[7] S. Marsland, J. Shapiro, and U. Nehmzow. A self-organising network that grows when required. Neural Networks, 15:1041-1058, 2002.
[8] B. Fritzke. Growing cell structures - a self-organizing network for unsupervised and supervised learning. Neural Networks, 7(9):1441-1460, 1994. ICSI TR-93-026.
[9] D. Bullock, S. Grossberg, and F. H. Guenther. A self-organizing neural model of motor equivalent reaching and tool use by a multijoint arm. Journal of Cognitive Neuroscience, 5(4):408-435, 1993.
[10] E. R. Kandel, J. H. Schwartz, and T. M. Jessell. Principles of Neural Science. McGraw-Hill, fourth edition, 2000.
[11] S. Grossberg. Classical and instrumental learning by neural networks. Progress in Theoretical Biology, 3:51-141, 1974.
[12] G. Asuni, F. Leoni, E. Guglielmelli, A. Starita, and P. Dario. A neuro-controller for robotic manipulators based on biologically-inspired visuo-motor co-ordination neural models. In Proceedings of the 1st International IEEE EMBS Conference on Neural Engineering, Capri Island, Italy, pages 450-453, Mar 2003.
[13] B. Fritzke. A growing neural gas network learns topologies. In G. Tesauro, D. S. Touretzky, and T. K. Leen, editors, Advances in Neural Information Processing Systems 7, pages 625-632. MIT Press, Cambridge, MA, 1995.
[14] T. M. Martinetz. Competitive Hebbian learning rule forms perfectly topology preserving maps. In ICANN'93: International Conference on Artificial Neural Networks, pages 427-434. Springer, Amsterdam, 1993.
[15] M. I. Jordan and D. M. Wolpert. Computational motor control. In M. Gazzaniga, editor, The Cognitive Neurosciences, second edition. MIT Press, Cambridge, MA, 1999.