A Robotic Head Neuro-controller Based on
Biologically-Inspired Neural Models
Gioel Asuni, Giancarlo Teti, Cecilia Laschi, Eugenio Guglielmelli, Paolo Dario
ARTS Lab (Advanced Robotics Technology and System Laboratory)
Scuola Superiore Sant’Anna
Piazza Martiri della Libertà 33, 56127 Pisa, Italy
{asuni, teti, cecilia, eugenio, dario}@arts.sssup.it
Abstract— This paper presents the application of a neural
approach in the control of a 7-DOF robotic head. The
inverse kinematics problem is addressed, for the control
of the gaze fixation point of two cameras mounted on
the robotic head. The proposed approach is based on a
biologically-inspired model, which replicates the human brain
capability of creating associations between motor and sensory
data, by learning. The model is implemented here by self-
organizing neural maps. During learning, the system creates
relations between the motor data associated with endogenous
movements performed by the robotic head and the sensory
consequences of such motor actions, i.e. the final position of
the gaze fixation point. The learnt relations are stored in
the neural map structure and are then used, after learning,
for generating motor commands aimed at reaching a given
fixation point. The approach proposed here solves the
inverse kinematics and joint redundancy problems for the
ARTS robotic head with good accuracy and robustness.
Experimental trials confirmed the system's capability to control
the gaze direction and fixation point and also to manage the
redundancy of the robotic head in reaching the target fixation
point even with additional constraints, such as a clamped joint
or two symmetric joint angles (e.g. eye joints).
Index Terms— biorobotics, robotic head control, sensory-
motor coordination, neural control.
I. INTRODUCTION
In this work, a model based on neural networks has been
proposed to solve the inverse kinematics problem for a 7-
DOF robotic head. This head can support two cameras and
direct the fixation point in the 3D space by a combination
of the neck movements (4 DOFs) and of the eye movements
(1 common tilt and 2 separate pans, allowing vergence).
The proposed model implements a mapping between the
direction of gaze, in the external spatial reference system
of the head, and its internal reference system, i.e. the joint
space. This mapping solves the inverse kinematics problem,
by computing a joint configuration for the head, to obtain a
given fixation point in the external spatial reference system.
Inverse kinematics is one of the basic problems in
developing robot controllers. Traditionally, solutions to
the inverse kinematics problem are obtained by different
techniques based on mathematical computational models,
such as inverse transform or iterative methods.
Sometimes, these methods may suffer from drawbacks,
especially when the number of degrees of freedom in-
creases: inverse transform does not always guarantee a
closed-form solution, while iterative methods may not
converge and may be computationally expensive.
Neural network approaches, which provide robustness
and adaptability, represent an alternative solution to the
inverse kinematics problem, especially when the number of
degrees of freedom to be controlled is high, or the external
spatial reference system is not easy to model, such
as in visuo-motor coordination tasks.
Two main approaches have been proposed for the use
of neural networks to solve the problem of inverse kine-
matics. The first, based on mathematical models, considers
the artificial neural networks as a tool to solve nonlin-
ear mappings without, or in some cases with a limited
or partial, knowledge of the robotic structure [1], [2].
The second approach builds the mapping between motor
commands and sensory input on the basis of repeated
sensory–motor loops; the final map is then used to generate
appropriate motor commands to drive the robot towards
the target sensory input. These latter methods make use
of self-organizing maps [3] to build the internal mapping:
their self-organization and topological preservation features
make them well-suited for capturing mechanical constraints;
moreover, they show a good capability of processing arti-
ficial sensory signals, according to the analogous mapping
features of the somatosensory cortex by which they are
inspired.
These abilities could be useful in closed-loop control
systems which have to deal with somatosensory signals
coming from anthropomorphic artificial sensors, such as
proprioceptive and tactile ones, in order to allow the
generation of proper sensory inputs towards the motor areas
[4]. In the past, visuo-motor coordination problems have
been extensively and quite successfully approached using
Kohonen’s maps [5], [6]. Nevertheless, this type of network
may suffer from the need for a priori knowledge of the
probability distribution of the input domain, required to
choose a proper cardinality and structure of the net and
to avoid over- or under-fitting problems. Furthermore,
they are not well suited to dynamic environments or
continuous learning [7]. In general, approaches involving
self-organizing techniques assign a sub-area of the input
domain (e.g. joint space) to a single neuron, according
to a specified resolution. This methodology divides the
whole space into a set of equally probable sub-spaces,
disregarding the correlations between the task space and
the sensory domain in which they are performed. Growing
Neural Gas (GNG), proposed by Bernd Fritzke [8], is
an incremental neural model able to learn the topological
relations of input patterns. Unlike other methods, it does
not require the network size to be specified in advance:
starting from a minimal network, a growth process takes
place and continues until a stopping condition is satisfied.
II. DESCRIPTION OF THE PROPOSED MODEL
The proposed model is based on a self-organizing neural
network that learns to coordinate motor actions with
sensory feedback. It generates trajectories by solving
the inverse kinematics and joint redundancy problems for
a robotic head.
The neural network starts with very little information
about the kinematic structure of the head, i.e. the number
of joints and their ranges of motion. In an initial learn-
ing phase, through endogenous action–perception loops, it
generates the associative information needed to build the
transformation between a spatial map, which encodes gaze
directions in the external space, and a motor map, which
encodes joint rotations. That process is applied repeatedly,
involving many locations in the head workspace, during the
learning phase. In the performance phase, after learning, the
learned transformation is used in the opposite way: given a
gaze target direction, the model provides the joint rotations
that drive the current gaze in the target direction.
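As a rough sketch of these two phases, the loop below separates motor babbling from performance. All names here (the map object smm with its learn and query methods, random_command, execute_and_observe, the step size and tolerance) are hypothetical placeholders for illustration, not interfaces from the paper:

```python
import numpy as np

def learning_phase(smm, n_babbles, random_command, execute_and_observe):
    """Motor babbling: issue endogenous random motor commands and
    associate each with the gaze fixation point it produces."""
    for _ in range(n_babbles):
        q = random_command()              # random joint configuration
        gfp = execute_and_observe(q)      # sensory consequence: 3D gaze fixation point
        smm.learn(gfp, q)                 # store the sensory-motor association

def performance_phase(smm, target_gfp, q, execute_and_observe, step=0.1, tol=1.0):
    """After learning: repeatedly query the stored transformation for
    joint rotations that move the current gaze toward the target."""
    gfp = execute_and_observe(q)
    while np.linalg.norm(target_gfp - gfp) > tol:
        dv = target_gfp - gfp             # spatial difference vector
        dq = smm.query(dv, q)             # joint rotations for this direction
        q = q + step * dq                 # small step along the commanded rotations
        gfp = execute_and_observe(q)
    return q
```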
The proposed architecture for correlating the external
reference system to the internal reference system takes
inspiration from the DIRECT (Direction-to-Rotation Effector
Control Transform) model proposed in [9]. The DIRECT
model includes descriptions of the role of the sensorimotor
cortex in learning visuo-motor coordination, correlating
proprioceptive and visual feedback for the generation of
joint movements. It implements a coordinate transformation
from spatial directions to joint rotations. DIRECT embodies
a solution to the classical motor equivalence problem,
emphasizing some peculiar human-like characteristics, such
as the successful reaching of targets under operational
conditions different from those encountered during learning.
Fig. 1 shows an overall block diagram of the DIRECT model.
The proposed model is composed of four modules, as
shown in Fig. 2:
• the Spatial Position Map (SPM), which contains an
appropriate coding of the gaze fixation points (intersection
of the lines of sight outgoing from the eyes) in the
external spatial reference system;
• the Motor Position Map (MPM), which contains an
appropriate coding of the robot proprioception in the
internal motor reference system (i.e. the joint space);
• the Integration Map (IM), which contains a mapping
between the SPM and the MPM and provides the
activations to the r cells in order to follow a given
direction;
• the Motor Area (MA), which contains three groups of
cells named x, r and a cells, respectively.
[Figure 1 contents: blocks for the Endogenous Random Generator (ERG), the Spatial Target Position Vector (TPVs), the Spatial Present Position Vector (PPVs), the Spatial Direction Vector (DVs), the Spatio-Motor Present Position Vector (PPVsm), the Position-Direction Map (PDMms), the Motor Direction Vector (DVm), the Motor Present Position Map (PPMm) and the Motor Present Position Vector (PPVm), with vision feedback, proprioceptive feedback and the end-effector target as inputs.]
Fig. 1. The overall block diagram of the DIRECT model (redrawn from [9]).
Fig. 2. The implementation of the proposed neural model.
The main processing steps that allow the external target gaze
fixation point (TGFP) to guide changes in the current gaze
fixation point (CGFP) during foveation tasks are:
1) computation of a spatial difference vector by com-
paring the TGFP with the CGFP measured in the
same coordinate system. The spatial difference vector
specifies the spatial displacement needed to bring the
CGFP onto the TGFP;
2) computation of the joint angle changes, or rotations,
needed to move the CGFP along the spatial differ-
ence vector toward the TGFP. The computation of the
appropriate joint rotations requires information about
both the direction of the spatial difference vector and
the CGFP;
3) integration of the joint angle increments or decre-
ments to compute angular values for all the joints in
order to control the CGFP.
More precisely:
• defining $h_c$ as the coding for the CGFP and $h_t$ for the TGFP, the spatial difference vector is defined as $dv = h_t - h_c$;
• the Integration Map contains a map of cells, each of which is maximally sensitive to a particular spatial direction in a particular position of the joint space: the input for the IM is composed of the vector $dv$ from the SPM and of a coding of the current positions in the motor reference system from the MPM;
• the group of cells $r$ encodes a set of joint rotation commands to be integrated by the $a$ populations, as in Fig. 2. This group of cells receives inputs from two sources: the $x$ cells and the IM cells. The training phase of the proposed model is achieved through autonomously generated repetitions of an action-perception loop: during the motor babbling phase, the model endogenously generates random motor commands and learns sensory-motor coordination by correlating proprioceptive feedback and visual feedback (the gaze fixation point). More specifically, during the training phase the $x$ cells activate the $r$ cells with random values, while during the running phase the contribution of the $x$ cells is null and the contribution to the $r$ cells is provided only by the signals $b_k c_k z_{ki}$ deriving from the active sites $k$ of the IM. Thus, the $r$ activities may be approximated by the following equation:

$$r_i = x_i + \sum_k b_k c_k z_{ki}, \qquad i = 1, \ldots, n \qquad (1)$$

where $x_i$ is the Endogenous Random Generator (ERG) activation (present only during the learning phase), $z_{ki}$ is the adaptive weight from cell $k$ of the IM to unit $i$ of $r$, $b_k$ is an appropriate constant, and $c_k$ is 1 if cell $k$ of the IM is activated and 0 otherwise. $r$ is a population of units devoted to the generation of the joint movements. Their number is equal to the number of joints of the head multiplied by 2, according to the use of a pair of muscles (agonist and antagonist) for each joint [10]. Each unit of $r$ receives connections from all the cells of the IM. The adaptive weight between a generic IM cell $k$ and the unit $i$ belonging to $r$ is modified according to the following learning equation (Grossberg's "outstar" law [11]):

$$\frac{d}{dt} z_{ki} = \gamma c_k (-\delta z_{ki} + r_i) \qquad (2)$$

where $\gamma$ is a small learning rate parameter and $\delta$ is a decay rate parameter;
• $a$ is a population of units that integrate signals from the corresponding $r$ populations to produce an outflow command specifying a set of joint angles. Each $a_i$ cell codes the angle of a particular joint and receives input from the antagonistic pair corresponding to the $i$th joint angle. The net input integrated by the $a$ cells is a function of the difference between the activities of the two corresponding $r$ cells. The updating rule used is:

$$\frac{d}{dt} a_i = \epsilon \, (r^E_i - r^I_i) \, g(r^E_i, r^I_i, \psi_i) \qquad (3)$$

where $a_i$ is the angle command for the $i$th joint, $r^E_i$ and $r^I_i$ are the corresponding excitatory and inhibitory $r$ cell activities, respectively, $\epsilon$ is an integration rate parameter, and the $g$ function is defined as follows:

$$g(e, i, \psi_i) = \begin{cases} \psi_{max} - \psi_i & \text{if } (e - i) \geq 0 \\ \psi_i & \text{if } (e - i) < 0 \end{cases} \qquad (4)$$

where $\psi_i$ is the angle of the joint corresponding to $a_i$ and $\psi_{max}$ is the maximum angle of that joint. This update rule has been introduced in order to favour head postures whose joint positions are close to the middle of their ranges of motion, which can be considered the "equilibrium points" (a compact numerical sketch of Equations (1)-(4) is given after this list).
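Read together, Equations (1)-(4) define one update cycle of the Motor Area. The following is a minimal numerical transcription of that cycle; the array shapes, the forward Euler step dt, and the use of the command $a_i$ itself as the current joint angle $\psi_i$ are assumptions made for illustration:

```python
import numpy as np

def r_activity(x, b, c, z):
    """Eq. (1): r_i = x_i + sum_k b_k c_k z_ki.
    x: ERG activations (nonzero only during learning), shape (n,)
    b: constants and c: binary IM activations, both shape (K,)
    z: adaptive weights from IM cell k to r unit i, shape (K, n)."""
    return x + (b * c) @ z

def outstar_update(z, c, r, gamma, delta, dt):
    """Eq. (2), Grossberg's outstar law:
    dz_ki/dt = gamma * c_k * (-delta * z_ki + r_i),
    integrated with a forward Euler step; only weights leaving
    active IM cells (c_k = 1) are modified."""
    return z + dt * gamma * c[:, None] * (-delta * z + r[None, :])

def g(e, i, psi, psi_max):
    """Eq. (4): gain that favours joint angles near mid-range."""
    return np.where(e - i >= 0.0, psi_max - psi, psi)

def a_update(a, r_exc, r_inh, psi_max, eps, dt):
    """Eq. (3): each a_i integrates its antagonistic r pair into a
    joint angle command; here the current angle psi_i is taken to
    be a_i itself, and eps is the integration rate parameter."""
    return a + dt * eps * (r_exc - r_inh) * g(r_exc, r_inh, a, psi_max)
```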
III. IMPLEMENTATION OF THE PROPOSED MODEL
A first implementation of the proposed model was presented
in [12], for the case of the control of a generic robot arm.
In this work, the implementation of the model has been
improved with respect to that presented in [12]. In that
work, the maps described in the previous section were
implemented using Growing Neural Gas (GNG) networks,
which find topological structures that closely reflect the
structure of the input distribution. The output of these
maps is given by the following equation:
output of these maps is given by the following equation:
Φ(ξ)=
ws1+i∈Ns1(µwi)
|Ns1|+1 (5)
where ξis the input pattern to the map, µis an appropriate
constant <1,ws1is the weight vector associated to
the winner unit and Ns1is the set of direct topological
neighbors of the winner unit.
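In words, the map output blends the winner's reference vector with a $\mu$-discounted sum over its direct neighbours, averaged over $|N_{s1}| + 1$ units. A minimal sketch of Equation (5), assuming the reference vectors are stored row-wise in one array and the topology in per-unit neighbour sets:

```python
import numpy as np

def map_output(xi, weights, neighbors, mu):
    """Eq. (5): Phi(xi) = (w_s1 + sum_{i in N_s1} mu * w_i) / (|N_s1| + 1).
    weights: (n_units, dim) reference vectors; neighbors: list of sets,
    neighbors[u] holding the direct topological neighbours of unit u."""
    s1 = int(np.argmin(np.linalg.norm(weights - xi, axis=1)))  # winner unit
    nbrs = sorted(neighbors[s1])                               # direct neighbours
    acc = weights[s1] + mu * weights[nbrs].sum(axis=0)
    return acc / (len(nbrs) + 1)
```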
With respect to the previous implementation, a main variation
has been to use only one neural map (the Sensory-Motor Map)
to code all the information of the Motor Position Map, the
Spatial Position Map and the Integration Map (see Fig. 3).
Fig. 3. The implementation of the proposed neural model.
This modification has permitted a significant reduction in
the required memory space and, above all, in the
computational time. The Sensory-Motor Map has been
implemented using self-organizing growing neural networks
(GNG) [13] for correlating the visual perception (gaze
fixation point) and the proprioception, and for their
integration. Starting with two nodes and using
for their integration. Starting with two nodes and using
a growth mechanism, the network builds a graph in which
nodes are considered neighbours if they are connected by
an edge. The reference weights of the cells are modified
during execution by a variant of competitive Hebbian
learning [14]. The growth process continues until an
ending condition is fulfilled.
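For reference, one input presentation of the GNG adaptation loop can be sketched as follows, after Fritzke [13]; the parameter values are typical ones from that paper rather than values used here, and both the periodic node insertion and the removal of isolated units are omitted for brevity:

```python
import numpy as np

def gng_adapt(x, W, E, edges, age, eps_b=0.2, eps_n=0.006, a_max=50, beta=0.0005):
    """One input presentation of growing neural gas (after Fritzke [13]).
    W: (n, d) reference weights; E: (n,) accumulated local errors;
    edges: set of (i, j) pairs with i < j; age: dict edge -> age."""
    d = np.linalg.norm(W - x, axis=1)
    s1, s2 = (int(u) for u in np.argsort(d)[:2])   # two nearest units
    E[s1] += d[s1] ** 2                            # accumulate winner error
    W[s1] += eps_b * (x - W[s1])                   # move winner toward input
    for (i, j) in list(edges):
        if s1 in (i, j):
            other = j if i == s1 else i
            W[other] += eps_n * (x - W[other])     # drag direct neighbours
            age[(i, j)] += 1                       # age all edges of the winner
    e = (min(s1, s2), max(s1, s2))
    edges.add(e)
    age[e] = 0                                     # connect / refresh s1-s2 edge
    for k in [k for k in edges if age[k] > a_max]: # prune stale edges
        edges.discard(k)
        age.pop(k)
    E *= (1.0 - beta)                              # global error decay
```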
Learning parameters are constant over time, in contrast
to other models which heavily rely on decaying parame-
ters. Only the weights of the winner unit and those of
its neighbors are allowed to learn in order to move the
CGFP towards the target. Thus, Equation (1) has been
implemented in the following way:
$$r_i = x_i + \frac{z_{wi} + \sum_{k \in N_w} \nu z_{ki}}{|N_w|} \qquad (6)$$

where $w$ is the winner unit index, $N_w$ is the set of direct topological neighbors of the winner unit and $\nu$ is an appropriate constant less than 1.
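A minimal transcription of Equation (6), assuming the adaptive weights toward the $r$ units are stored row-wise per map cell and that the winner has at least one topological neighbour:

```python
import numpy as np

def r_activity_gng(x, z, w, Nw, nu):
    """Eq. (6): r_i = x_i + (z_wi + sum_{k in Nw} nu * z_ki) / |Nw|.
    z[k] holds the weights from map cell k to the r units; only the
    winner w and its direct neighbours Nw contribute, so the per-step
    cost does not depend on the size of the map."""
    nbrs = sorted(Nw)                      # direct topological neighbours of the winner
    return x + (z[w] + nu * z[nbrs].sum(axis=0)) / len(nbrs)
```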
A second improvement of the implementation has been
to take advantage of the topological structure of the
neural map during the task, that is, to consider only the
most recently activated receptive fields. At each step,
instead of considering all the cells of the neural map,
only the winner cell and its direct topological neighbors
are considered, thus decreasing the computational burden.
The last variation of the implementation has been to
allow the reference weight vectors of the GNG to be
modified not only during the learning phase but also
during task execution. This has improved the performance
of the model, because the GNG can adapt itself to slow
changes of the input distribution, i.e. move its nodes
so as to cover the new distribution.
Like the DIRECT model, the proposed model provides
robust generation of movement trajectories. In particular,
the model is also able to reach a target point while
imposing constraints on some joints, without requiring a
new learning phase. For example, it is possible to clamp
one or more joints or to force two joints to have the
same value.
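Such constraints can be imposed at execution time, for instance by filtering the joint rotation command before it is integrated. The sketch below illustrates this idea; the joint indices and the averaging rule are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def apply_constraints(dq, clamped=(), symmetric=()):
    """Filter a joint rotation command before it is integrated:
    clamped joints receive no rotation; joints in a symmetric pair
    receive a common (averaged) rotation, so they stay aligned
    provided they start from the same angle."""
    dq = np.asarray(dq, dtype=float).copy()
    for j in clamped:
        dq[j] = 0.0                          # clamped joint never moves
    for (j, k) in symmetric:
        m = 0.5 * (dq[j] + dq[k])            # shared rotation for the pair
        dq[j] = dq[k] = m
    return dq

# Example: clamp the roll joint (J0) and keep the eye pans (J5, J6) symmetric
# dq = apply_constraints(dq, clamped=(0,), symmetric=((5, 6),))
```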
IV. EXPERIMENTAL RESULTS
The implemented model has been tested and evaluated
by experimental trials, aiming at verifying the learning ca-
pabilities and the sensory–motor coordination functionality.
The experimental trials were conducted on the ARTS
robotic head, shown in Fig. 4. The robotic head has 7 DOFs:
•4 DOFs for the neck, that allow rotation, ventral
and dorsal flexion at two different levels and lateral
flexion;
•3 DOFs for the eyes, that allow a common tilt for both
eyes and 2 pans for the eyes separately.
Specifications of the robotic head are reported in Table I.
The model learns the inverse kinematics of the head
by mapping the ”directions” of the gaze fixation point in
the Head Cartesian Reference System onto the joint space.
After learning, the model is able to generate a sequence
Fig. 4. The ARTS robotic head: rotation axes (left); reference system and gaze fixation point (right).
TABLE I
SPECIFICATIONS OF THE ARTS ROBOTIC HEAD

Joint              Range           Velocity    Acceleration
J0 - Roll          −28° ÷ 32°      20°/sec     200°/sec²
J1 - Lower Pitch   −32° ÷ 26°      20°/sec     200°/sec²
J2 - Yaw           −108° ÷ 108°    170°/sec    750°/sec²
J3 - Upper Pitch   −20° ÷ 30°      120°/sec    750°/sec²
J4 - Eye Pitch     −25° ÷ 53°      400°/sec    4500°/sec²
J5 - Left Eye      −45° ÷ 45°      600°/sec    10000°/sec²
J6 - Right Eye     −45° ÷ 45°      600°/sec    10000°/sec²
of joint configurations for the head in order to achieve a
given fixation point as shown in Fig. 5.
Fig. 5. The overall block diagram of the proposed model.
The learning phase has been executed off-line using the
direct kinematics of the head in order to provide the 3D
coordinates of the gaze fixation point in the head reference
system (see Fig. 4). 14395 random movements in the joint
space, to which different gaze fixation points correspond,
have been used to train the system. The total time needed
to train the system is about 15 minutes using an Intel
Pentium 4 processor, and the cardinality obtained for the
final Sensory-Motor Map is 3928 units.
Three types of experimental trials have been executed: a
normal task, a task with a clamped joint and a task with
symmetric angles for the eye joints.
Figs. 6 and 7 show the behavior of the distance error and
the joint trajectories during a normal fixation task.
The nearly monotonic trend shown in Fig. 7 is a general
behavior found in all the trials. We can underline how the
monotonic trend and the absence of notable oscillations
ensure a linear movement of the CGFP towards the TGFP,
which corresponds to a minimal waste of energy, according
to the results reported in [15].
[Plot: distance (mm) between the current position and the target position vs. steps ×10]
Fig. 6. Distance between the current gaze fixation point and the target
during a normal task.
[Plot: joint angles (deg) vs. steps ×10, one trace per joint J0-J6]
Fig. 7. Joint trajectory during the normal task.
Fig. 8 shows the head's initial and final postures in the
normal task.
Fig. 8. Initial (left) and final (right) postures of the robotic head in the normal task.
Figs. 9 and 10 show the performance of the system in
terms of the distance error from the TGFP and the joint
trajectories in a fixation task with a clamped joint.
This task shows the adaptability of the neural network, as
the net provides a new joint configuration able to reach the
target even using one joint less.
[Plot: distance (mm) between the current position and the target position vs. steps ×10]
Fig. 9. Distance between the current gaze fixation point and the target
during a foveation task with a clamped joint.
[Plot: joint angles (deg) vs. steps ×10, one trace per joint J0-J6]
Fig. 10. Joint trajectory during the foveation task with a clamped joint.
By clamping joint 0 (roll) it is possible to obtain more
human-like postures for the head. Fig. 11 shows the head
foveating the same point with no clamped joint and with
joint 0 clamped.
Fig. 11. Final posture of the robotic head in the normal task (left) and in the task with joint zero clamped (right).
Finally, Figs. 12 and 13 show the distance error from the
TGFP and the joint trajectories in a foveation task with
symmetric angles for the eye joints.
[Plot: distance (mm) between the current position and the target position vs. steps ×10]
Fig. 12. Distance between the current gaze fixation point and the target
in a foveation task with symmetric angles for eye joints.
[Plot: joint angles (deg) vs. steps ×10, one trace per joint J0-J6]
Fig. 13. Joint trajectory in a foveation task with symmetric angles for
eye joints.
As in the previous example, by constraining the joints
of the eyes to have symmetric values we obtain a more
anthropomorphic behaviour for the head. Fig. 14 shows the
head foveating the same point with no constraints and with
symmetric angles for the eye joints. This functionality could
be used, for example, to implement a VOR (Vestibulo-Ocular
Reflex), i.e. the mechanism for stabilizing gaze during head
motion.
Fig. 14. ARTS robotic head final posture in a normal task (left) and in a task with symmetric angles for the eye joints (right).
V. CONCLUSIONS
This paper presented a neural controller for a redundant
robotic head with 7 DOFs.
After a learning phase of about 15 minutes, the system
is capable of controlling gaze movements of the head
to follow a given target spatial direction using different
combinations of the joints. In particular, it is able to gaze
at points in 3D space, providing a solution to the well-known
inverse kinematics problem. It is also able to gaze at a
target point under additional constraints, such as some
joints locked or symmetric angles for the eye joints.
Experimental trials with the ARTS robotic head show
how more human-like postures can be obtained by acting
on such constraints. The use of self-organizing maps has
proven effective in managing the inverse kinematics problem
when approached as a sensory-motor mapping problem. The
use of growing neural gas has further improved the
performance in computational terms and in the
representation of the input space.
REFERENCES
[1] S. Lin and A. A. Goldenberg. Neural network control of mobile
manipulators. IEEE Trans. on Neural Networks, 12(5), Sep 2001.
[2] H. D. Patino, R. Carelli, and B. Kuchen. Neural network control of
mobile manipulators. IEEE Trans. on Neural Networks, 13(2), Mar
2002.
[3] T. Kohonen. Self-organizing maps. Springer-Verlag, second edition,
1997.
[4] F. Leoni, M. Guerrini, C. Laschi, D. Taddeucci, P. Dario, and
A. Starita. Implementing robotic grasping tasks using a biological
approach. In Proceedings of the International Conference on Robotics
and Automation, Leuven, Belgium, pages 16–20, May 1998.
[5] M. Kuperstein and J. Rubinstein. Implementation of an adaptive
neural controller for sensory-motor coordination. IEEE Control
Systems Magazine, 9(3):25–30, 1989.
[6] J. A. Walter and K. J. Schulten. Implementation of self-organizing
neural networks for visuo-motor control of an industrial robot. IEEE
Transactions on Neural Networks, 4(1):86–95, 1993.
[7] S. Marsland, J. Shapiro, and U. Nehmzow. A self-organising
network that grows when required. Neural Networks, 15:1041–
1058, 2002.
[8] B. Fritzke. Growing cell structures: a self-organizing network for
unsupervised and supervised learning. Neural Networks, 7(9):1441–
1460, 1994. ICSI TR-93-026.
[9] D. Bullock, S. Grossberg, and F. H. Guenther. A self-organizing
neural model of motor equivalent reaching and tool use by a
multijoint arm. Journal of Cognitive Neuroscience, 5(4):408–435,
1993.
[10] E. R. Kandel, J. H. Schwartz, and T. M. Jessell. Principles of neural
science. McGraw Hill., fourth edition, 2000.
[11] S. Grossberg. Classical and instrumental learning by neural net-
works. Progress in Theoretical Biology, 3:51–141, 1974.
[12] G. Asuni, F. Leoni, E. Guglielmelli, A. Starita, and P. Dario. A
neuro-controller for robotic manipulators based on biologically-
inspired visuo-motor co-ordination neural models. In Proceedings
of the 1st International IEEE EMBS Conference on Neural Engi-
neering, Capri Island, Italy, pages 450–453, Mar 2003.
[13] B. Fritzke. A growing neural gas network learns topologies. In
G. Tesauro, D. S. Touretzky, and T. K. Leen, editors, Advances in
Neural Information Processing Systems 7, MIT Press, Cambridge
MA, pages 625–632, 1995.
[14] T. M. Martinetz. Competitive Hebbian learning rule forms perfectly
topology preserving maps. In ICANN93: International Conference
on Artificial Neural Networks, Springer, Amsterdam, pages 427–434,
1993.
[15] M. I. Jordan and D. M. Wolpert. Computational motor control. In
M. Gazzaniga (Ed.), The Cognitive Neurosciences. Cambridge, MA:
MIT Press, 2nd edition, 1999.