Auton Robot (2008) 25: 85–101
DOI 10.1007/s10514-007-9065-4
A bio-inspired predictive sensory-motor coordination scheme
for robot reaching and preshaping
Cecilia Laschi · Gioel Asuni · Eugenio Guglielmelli · Giancarlo Teti · Roland Johansson · Hitoshi Konosu · Zbigniew Wasik · Maria Chiara Carrozza · Paolo Dario
Received: 30 October 2006 / Accepted: 15 November 2007 / Published online: 29 November 2007
© Springer Science+Business Media, LLC 2007
Abstract This paper presents a sensory-motor coordina-
tion scheme for a robot hand-arm-head system that provides
the robot with the capability to reach an object while pre-
shaping the fingers to the required grasp configuration and
while predicting the tactile image that will be perceived af-
ter grasping. A model for sensory-motor coordination de-
rived from studies in humans inspired the development of
this scheme. A peculiar feature of this model is the predic-
tion of the tactile image.
The implementation of the proposed scheme is based on
a neuro-fuzzy module that, after a learning phase, starting
from visual data, calculates the position and orientation of
the hand for reaching, selects the best-suited hand config-
uration, and predicts the tactile feedback. The implementa-
tion of the scheme on a humanoid robot allowed experimen-
tal validation of its effectiveness in robotics and provided
perspectives on applications of sensory predictions in robot motor control.

Keywords Predictive control · Sensory-motor coordination · Robot grasping · Robot learning · Expected perception · Internal models · Neuro-fuzzy controllers

C. Laschi () · G. Asuni · G. Teti · M.C. Carrozza · P. Dario
ARTS (Advanced Robotics Technology and Systems) Lab,
Scuola Superiore Sant’Anna, Piazza Martiri della Libertà, 33,
56127 Pisa, Italy
e-mail: cecilia@arts.sssup.it

Present address:
G. Teti
RoboTech srl, Peccioli (Pisa), Italy

E. Guglielmelli
CIR—Center for Integrated Research, Laboratory of Biomedical
Robotics and Biomicrosystems, Campus-Biomedico University,
Rome, Italy

R. Johansson
Umeå University, Umeå, Sweden

H. Konosu · Z. Wasik
Toyota Motor Europe, Brussels, Belgium
1 Introduction
In robotics, biology can be a source of inspiration, both for the development of biomimetic components and for new control principles for robotic systems (Brooks 1991; Beer et al. 1997), with the final aim of developing robots with better sensory-motor performance, especially in real-world scenarios. Biologically-inspired approaches dramatically influence robot design: biomechatronic design tends to make biomorphic robotic platforms complex and sophisticated in sensory-motor functions, and intrinsically adaptable, flexible and evolutionary, as biological systems are. Thus, controlling this kind of system requires solving problems related to complex sensory-motor coordination, for which biology can again be an effective source of inspiration (Guglielmelli et al. 2007).
According to neurophysiological findings, human motor
control is based on sensory predictions more than on sen-
sory feedback (Berthoz 1997; Johansson 1998). Due to the delays in the transmission of nervous signals, fast and coordinated movements cannot be explained by pure feedback (Kawato 1999; Miall et al. 1993; Wolpert et al. 1998).
In this account, we present the implementation of a neu-
roscientific model proposed for predictive sensory-motor
coordination in human object manipulation (Fig. 1) (Johans-
son 1998) and its application to sensory-motor coordination
of reaching actions in a humanoid hand-arm-head robotic
system. This model describes the predictive behavior that
Fig. 1 The sensorimotor coordination model for human object manip-
ulation, redrawn from Johansson (1998). From visual cues about the
target object we select and activate task-specific neural sensorimotor
programs from a set of general action programs. By using the internal
models, the sensorimotor program dynamically specifies a predicted
somatosensory input. When a mismatch between the predicted and the
actual sensory feedback occurs, the information from tactile sensors in
the hand is used to update the model of the target object
humans adopt in manipulation tasks, by predicting the sen-
sory feedback, before grasping, thanks to internal models
that are built by experience. In accordance with the neurosci-
entific model, the scheme implemented here determines the
position and shape of the robot hand for grasping visually-
detected objects and, at the same time, it predicts the tactile
feedback. When correct, sensory predictions improve con-
trol tasks, both in humans and in robots. When incorrect,
reactive behaviors are elicited and the internal models are
updated. In our work, the sensory prediction is a key ele-
ment for the next phase of grasping, controlled by a scheme
based on Expected Perception (EP) (Datteri et al. 2003a).
The grasping phase is outside the scope of this paper, which
focuses on the capability of our system to predict the tactile
image that will be perceived at grasping.
Reaching and grasping have represented basic functions for robots and have been widely studied since the 1980s. Venkataraman and Iberall (1990) and Mason and Salisbury (1985) give a very good view of the state of knowledge in those years. A more recent review of robot grasping can be found in Bicchi and Kumar (2000). Grasping still presents
complex sensory-motor coordination problems, especially
if implemented on biomorphic robots equipped with mul-
timodal sensory systems. The knowledge about the human
hand, the human grasp, and the human sensory-motor con-
trol of reaching and grasping has been often proposed to
offer suggestions for solving the technological problems of
robot reaching and grasping. First of all, a main classifica-
tion of human grasps in power grasps and precision grasps
is widely accepted in robotics (Napier 1956) and grasp tax-
onomies have been proposed and adopted (Cutkosky and
Howe 1990). Human hand preshaping has been analyzed in
robotics starting from the studies by Jeannerod (1984) and
Jeannerod et al. (1998), and robot manipulation control has
also relied on findings on human control (Arbib and Iber-
all 1985; Iberall and MacKenzie 1990). Research in robotic
reaching and grasping can be divided into two main lines,
following two main general approaches:
−the synthesis approach is based on mathematical models, assumes precise sensory information and requires a model of the gripper and the object (Iberall and MacKenzie 1990); sometimes the modeled interaction between the gripper and the object differs from the real interaction (see Shimoga 1996 for a survey and overview of the grasp synthesis approach). Control architectures have been proposed for robot grasping along this line (Bekey and Tomovic 1990; Bekey et al. 1993; Narasimhan 1990). These approaches usually require intensive computation or memory;
−the so-called heuristic approach typically addresses hand preshaping; in these approaches the sensory information is used to explore the geometry of the object (Fearing 1990; Klatzky and Lederman 1990), to model an object (Charlebois et al. 1999) or to manipulate unknown objects.
In robots, as in humans, predictive control can improve the performance of the perception-action loop, especially when sensory signal processing is laborious. Prediction has been implemented in robotics, especially to help the robot prepare the next motor action.
The grasping system developed at the University of
Tokyo (Namiki and Ishikawa 2003) grasps objects with high
velocity and partially predictable motion, with a multifin-
gered hand and a 4-dof (degrees of freedom) robotic arm.
The algorithm is essentially reactive, as the system is capa-
ble at each step of computing the 3D position of a ball in the
space and generating a trajectory for catching it accordingly.
The system may be considered predictive, in the sense that it
computes a trajectory to intersect the ball at future time in-
stants. Nevertheless, it has to process the visual feedback at each control step (segmentation, extraction of the target area, computation of image moments). High efficiency is ensured by means of a highly parallel vision system, which allows the system to perform the whole control loop in less than 1 ms.
Two systems based on a similar approach were presented
in Koivo (1991) and in Hong and Slotine (1995). Both sys-
tems are capable of directing an end effector toward a mov-
ing object by vision, the former for grasping, the latter for
catching.
In Koivo’s system, at each time step t, the system locates
the object by visual feedback analysis and predicts the po-
sition at time t+1; the predicted position is then taken as
a sub-goal for the robot end effector. Experimental results
(Fig. 3a of Koivo 1991) demonstrate that the end effector
succeeds in approaching the predicted object position.
In the system developed by Hong and Slotine, highly effi-
cient catching is achieved by visual analysis, which locates
a ball launched in the space and calculates a trajectory to
catch it.
No data are available concerning the environment in which these grasping and catching capabilities have been tested, except for images of the experimental setting reported in the published papers. In any case, the efficiency of the visuo-motor loop ultimately depends on the time needed for performing all the visual processing operations, which thoroughly analyze every acquired image in order to calculate geometric and dynamic properties of the object.
The efficiency of these systems, like that of reactive sys-
tems in general, is likely to decrease in complex environ-
ments, when detecting geometric properties of the object
of interest is intrinsically difficult. This is the main moti-
vation of the work conducted by Kragic and Christensen
(2002). The system adopts a model-based visuo-motor co-
ordination strategy: a model of the object to be tracked or
grasped increases the efficiency (precision or speed) of the
system. At each step, the object position and orientation are
predicted (by means of neural networks), and an estimation
of relevant object features in the visual space is generated
accordingly. Object features are projected onto the acquired image. They are used as a guideline for visual processing, as edges are searched for in the vicinity of the features. In this
way, the detection of real object features in the acquired im-
age becomes a sort of ‘local’ processing, taking less time to
be performed. After this detection phase, the difference be-
tween estimated and real features is used to estimate the real
position of the object.
A similar approach is adopted at DLR, Germany (Wun-
sch et al. 1997).
This feature estimation may be considered a kind of sensory anticipation, as in the Expected Perception (EP) scheme proposed by some of the authors (Datteri et al. 2003a, 2003b, 2004) and implemented here. An EP is an expected perception predicted from the current motor commands and then compared with the actual one coming from the sensors. In the examples described in the literature, the predicted perception takes the form of complex 3D representations or object meshes.
Visual processing of images is still performed at each con-
trol step, even if reduced. The approach is quite similar to
that adopted in the EP scheme, except for the fact that the
latter explicitly takes into account the possibility that the
“real” percept lies far from the vicinity of the expected one
(due, for example, to an unexpected obstacle that pulled the
object away).
The approach adopted here is a heuristic approach that uses neuro-fuzzy networks to build the internal models that allow the tactile feedback to be predicted in a reaching and grasping task. The objective of this work is to demonstrate that
it is possible to build internal models in a robot by learning,
through direct experience from real-world trials, and that the
robot can then use such internal models to predict a tactile
image before grasping a visually-detected object.
2 Structure and functionality of the proposed
sensory-motor scheme
The proposed predictive sensory-motor scheme for robot
reaching and preshaping is based on the model of human
sensory-motor coordination depicted in Fig. 1(Johansson
1998). In this model, hand preshaping and positioning para-
meters are determined based on visual information and on
learned internal models of objects and own body. In addi-
tion, a predicted somatosensory feedback is generated that
represents a predicted tactile (and proprioceptive) image that
should occur when the hand successfully contacts the object.
When the hand actually grasps the object, the predicted sen-
sory input is compared with the actual sensory feedback. In
case of mismatch (due, for example, to an incorrect estima-
tion of object surface properties), compensatory motor com-
mands (also learned) are issued and the internal model of the object is updated for improved future interactions with the object.

Fig. 2 The proposed scheme for robot sensory-motor coordination in grasping tasks: the Vision Module receives images of the target object and provides some object geometric features to the Preshaping Module and to the Tactile Prediction Module. The Preshaping Module provides the motor commands to the hand and arm in order to obtain a stable grasp. The same motor commands are also provided as input to the Tactile Prediction Module. Such motor commands are intended to control only the final position and orientation of the arm end effector, not the trajectory of the arm. The Tactile Prediction Module provides a tactile image before the grasp is executed
The proposed scheme for sensory-motor coordination in
grasping tasks for robots has been implemented on a hu-
manoid robot that includes a hand with tactile sensors, an
arm and a binocular head. The scheme has been conceived to
control the arm movements for hand positioning and the fin-
ger movements for hand preshaping, based on sensory infor-
mation given by the tactile and vision systems. The scheme
consists of three main modules (Fig. 2):
−Vision Module;
−Preshaping Module;
−Tactile Prediction Module.
The Vision Module receives binocular images of the
scene acquired by the cameras in the robot head and pro-
vides information about geometric features of the object of
interest to the other modules, such as object shape, dimen-
sions, position and orientation.
The Preshaping Module provides a proper hand/arm con-
figuration to grasp the object based on inputs from the Vision
Module about the object geometric features.
The Tactile Prediction Module receives information
about the object geometric features from the Vision Mod-
ule and information from the Preshaping Module pertaining
to the hand/arm configuration. Based on this information,
the Tactile Prediction Module provides as output the tactile image expected when the object is contacted. This infor-
mation on the predicted tactile feedback that is expected to
be perceived after grasping is generated thanks to internal
models that are built by learning through grasping experi-
ences.
The information processed by each module is described
in detail in Sect. 3.
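To make the data flow of Fig. 2 concrete, the sketch below names the quantities exchanged between the three modules using plain Python dataclasses. These types are our own illustration (the exact encodings are given in Sect. 3), not part of the original implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class VisualFeatures:           # output of the Vision Module
    pos1: Tuple[float, float]   # object centroid in camera 1 (pixels)
    pos2: Tuple[float, float]   # object centroid in camera 2 (pixels)
    dim: Tuple[float, float]    # bounding-box width and height (pixels)
    cc1: List[int]              # 20-element chain code, camera 1
    cc2: List[int]              # 20-element chain code, camera 2

@dataclass
class HandArmConfig:            # output of the Preshaping Module
    acp: Tuple[float, ...]      # wrist position and orientation (x, y, z, roll, pitch, yaw)
    hjp: Tuple[float, ...]      # hand joint values

@dataclass
class PredictedTactileImage:    # output of the Tactile Prediction Module
    pti: List[int]              # 9 on/off contact-sensor predictions (1 = contact expected)
```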
During a training phase on a real robotic platform (see
Sect. 4.1), the system learns how to suitably preshape the
hand as well as how to generate an appropriate predicted
tactile image. During training, the system grasps different
kinds of objects in different positions in the workspace. The
learning phase is aimed at collecting, through many grasps, a large amount of data composed of correct pairs of input and output data. Such training data sets are used by the robot
to learn the correlations between visual information, hand
and arm configurations, and tactile images. This phase cor-
responds to the creation of the internal models present in the
neuroscientific model of Fig. 1.
The proposed system shows three interesting and advan-
tageous properties: first of all, the robot does not only gener-
ate motor commands to reach the object detected by vision,
but it can predict the tactile image that will be perceived
when grasping the object; secondly, no model of the object
is required, since the system automatically correlates visual
information with arm and hand positions from experience
of correct grasps. This means that, during learning, when-
ever the robot grasps the object in an effective way (i.e. the
object remains in the hand when lifted) the robot correlates
the sensory and motor information (however, if extending
the scheme to incorporate prediction of forces required for
further manipulation of grasped objects, internal models per-
taining to object properties may be needed, e.g., models for
prediction of object weight for generating smooth lifting ac-
tions (Johansson and Westling 1987)); third, the motor com-
mands provided to the arm and to the hand are given with-
out any analytical transformation from the visual reference
system into the arm reference system, since this mapping is
encoded in the preshaping module. That is, the robot learns
directly a transformation between the 2D camera reference
system and the 3D arm reference system.
3 Implementation of the model
We used neuro-fuzzy networks to implement the sensory-motor coordination scheme in order to obtain learning, generalization, and adaptation capabilities as well as robustness. The
following sub-sections describe in some detail the imple-
mentation of the scheme. First, we provide a description of
the SANFIS networks we used. We then describe the Vision
Module, the Preshaping Module, and the Tactile Prediction
Module. Finally, we present the procedure used to train the
system.
3.1 The SANFIS networks
The SANFIS (Self-Adaptive Neuro-Fuzzy Inference Sys-
tem) networks (Wang et al. 1999) represent a particular cat-
egory of neuro-fuzzy networks that merge the advantages of
Fuzzy Logic Systems (i.e., human-like IF-THEN rule thinking, ease of incorporating expert knowledge, no need for complex mathematical modeling) with features of neural networks (learning, adaptation, and connectionist structure).

Fig. 3 The three structures of SANFIS networks, differing in their IF-THEN rules. All three networks have the same structure from Layer 1 to Layer 4; this means that they have the same antecedent of a fuzzy rule
These neuro-fuzzy networks are particularly suitable for
managing input data that may be noisy and affected by error,
such as the input data expected in the present sensory-motor
coordination task. At the same time, these networks are par-
ticularly able to adapt and to self-organize such that appro-
priate rules for producing proper output data are defined. Fi-
nally, the knowledge contained in the networks can be interpreted as IF-THEN rules. Three types of SANFIS ar-
chitectures can be distinguished based on the IF-THEN rule
structure (Fig. 3). The following formula defines the generic
rule $j$ for the three types:

$$\text{IF } x_1 \text{ is } A_1^j \text{ and } \ldots \text{ and } x_n \text{ is } A_n^j, \quad \text{THEN } y_1 \text{ is } f_1^j \text{ and } \ldots \text{ and } y_m \text{ is } f_m^j, \qquad (1)$$

where

$$f_k^j = \begin{cases} B_k^j & \text{(type I)}, \\ \Theta_k^j & \text{(type II)}, \\ b_{0k}^j + b_{1k}^j x_1 + \cdots + b_{nk}^j x_n & \text{(type III)}, \end{cases} \qquad (2)$$

$x$ and $y$ are the input and output variables, respectively.
Like in fuzzy systems, inputs are evaluated for membership to an input set. $A_i^j$ are the input fuzzy term sets ($i = 1, 2, \ldots, n$). While in fuzzy systems they are defined by the programmer, in this neuro-fuzzy system they are defined from a training data set.
Fig. 4 Inputs and outputs of the Vision Module: the Vision Module
receives the images from the two cameras and provides some geometric
features about the position (POS1 and POS2), size (DIM) and shape
(CC1 and CC2) of the target object to the other modules
$B_k^j$, $\Theta_k^j$, and $b_{0k}^j + b_{1k}^j x_1 + \cdots + b_{nk}^j x_n$ are output fuzzy term sets, singleton constituents, and linear combinations of input variables, respectively.
By selecting the proper equation for $f_k^j$ we obtain the three different types of SANFIS neural networks reported in the literature (Wang et al. 1999). As shown in (1), the antecedent of a fuzzy rule is the same for all three SANFIS networks; this means that they all have the same structure from Layer 1 to Layer 4 (see Fig. 3). The output of Layer 4 can be defined by the following equation:

$$p_j(x) = \frac{\prod_{i=1}^{n} \mu_{A_i^j}(x_i)}{\sum_{j=1}^{J} \prod_{i=1}^{n} \mu_{A_i^j}(x_i)}, \qquad (3)$$

where $J$ is the number of rules and $\mu_{A_i^j}(x_i)$ is the Gaussian membership function defined by:

$$\mu_{A_i^j}(x_i) = \exp\left(-\left(\frac{x_i - m_i^j}{\sigma_i^j}\right)^2\right), \qquad (4)$$

where $m_i^j$ and $\sigma_i^j$ are the centers and the widths of the Gaussian functions, respectively.
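The sketch below illustrates, in Python, how the normalized firing strengths of (3)-(4) can be computed and combined into a network output. It is an illustrative reconstruction, not the authors' code; in particular, reducing the type I output Gaussians to their centers in the weighted sum is an assumed simplification.

```python
import numpy as np

def rule_strengths(x, centers, widths):
    """x: (n,) input vector; centers, widths: (J, n) Gaussian parameters m_i^j, sigma_i^j."""
    mu = np.exp(-((x[None, :] - centers) / widths) ** 2)   # eq. (4): membership of x_i to A_i^j
    p = mu.prod(axis=1)                                    # rule firing strengths
    return p / p.sum()                                     # normalization as in eq. (3)

def sanfis_output(x, centers, widths, consequents):
    """consequents: (J, m) singleton outputs (type II) or output-Gaussian centers (type I)."""
    p = rule_strengths(x, centers, widths)                 # (J,)
    return p @ consequents                                 # weighted sum over rules -> (m,)
```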
The learning phase for a SANFIS neural network consists
of two phases:
1. structure identification;
2. parameter identification.
In the structure identification phase, initial fuzzy rules are
established based on a training input-output data set. After
establishment of these rules, the whole network structure is
built.
In the parameter identification phase, the parameters of
the membership functions are adjusted for optimal perfor-
mance.
The SANFIS learning mechanism is described in
Sect. 3.5.
3.2 Vision Module
Based on binocular images, the Vision Module provides the other modules with geometric features of the graspable object, represented by its 2D position in the first camera im-
age, its 2D position in the second camera image, its 2D di-
mension and a representation of its shape by chain codes, as
calculated in each image (Fig. 4).
Each object position (POS1 and POS2) is represented as
the 2D position of the centroid of a bounding box enclosing
the shape of the object in the camera reference system. The
width and height of the same bounding box represent the ob-
ject dimension (DIM). Both object positions and dimensions
are expressed in pixels. A 20-element chain code was used
to represent the shape of the object (CC1, CC2). A chain
code represents the boundary of an area in an image (Brice
and Fennema 1970) and is obtained by tracing the boundary
of the image in a clockwise manner, from a defined start-
ing point, while keeping track of the directions when going
from one pixel to the next (Fig. 5). Though the chain codes do not provide an accurate description of the object geometry, they are very simple to implement and very effective from a computational point of view. The neuro-fuzzy approach adopted in this work guarantees that even such inaccurate visual information is enough for the system to learn preshaping and tactile predictions.
Fig. 5 Chain code representation: from left to right, the image of an area, its boundary, the directions for tracing the boundary with the starting point, and the conventional numbering of directions. The resulting chain code is: 17066066544434223217

Standard image processing methods were used to implement the Vision Module (Fig. 6). First, the target object was identified in the images acquired by the two cameras. These images were thresholded against previously acquired background images, and the bounding box enclosing the thresholded area and its centroid were determined. The boundary of the object was then extracted by an edge detection algorithm based on the Sobel operator (Sobel 1978) and used to determine the chain code.
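A minimal sketch of how the features described above (POS, DIM and a 20-element chain code) could be extracted with standard OpenCV calls is given below. The direction-numbering convention and the helper names are assumptions, not the implementation used on the robot (the paper's own numbering is shown in Fig. 5).

```python
import cv2
import numpy as np

# 8-connected step -> direction code (numbering convention is an assumption)
DIRS = {(1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
        (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7}

def object_features(image, background, thresh=30, cc_len=20):
    """Return (POS, DIM, CC) for the largest region differing from the background."""
    diff = cv2.absdiff(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY),
                       cv2.cvtColor(background, cv2.COLOR_BGR2GRAY))
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)

    # largest external contour is assumed to be the target object (OpenCV >= 4)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    boundary = max(contours, key=cv2.contourArea).reshape(-1, 2)

    # bounding box -> centroid (POS) and dimensions (DIM), in pixels
    x, y, w, h = cv2.boundingRect(boundary)
    pos, dim = (x + w / 2.0, y + h / 2.0), (w, h)

    # chain code along the boundary, subsampled to a fixed length
    steps = np.diff(boundary, axis=0)
    codes = [DIRS[(int(dx), int(dy))] for dx, dy in steps if (dx, dy) != (0, 0)]
    idx = np.linspace(0, len(codes) - 1, cc_len).astype(int)
    return pos, dim, [codes[i] for i in idx]
```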
3.3 Preshaping Module
The Preshaping Module receives as input the geometric fea-
tures of the target object from the visual module and pro-
vides an arm position and hand configuration suitable for
grasping the object (Fig. 7).
The Arm Cartesian Position (ACP) encodes the wrist po-
sition (x,y,z) and orientation (roll,pitch,yaw) in the arm
reference system. The Hand Joints Position (HJP) is en-
coded by a vector that represents the values for the hand
joints.
This module was implemented in a type I SANFIS net-
work. As described in Sect. 3.1, the peculiarity of these net-
works with respect to the other SANFIS types is that the
membership function is a Gaussian for the output, too. For
the purposes of the Preshaping Module, this is the best suited of the three SANFIS types, as it makes it possible to manage the complexity and the redundancy of the output space, i.e. the dof of the arm and of the hand.
3.4 Tactile Prediction Module
The Tactile Prediction Module receives as input part of the
geometric features of the object provided by the Vision
Module and information about the hand/arm configurations
from the Preshaping Module (Fig. 8). It provides as output
the Predicted Tactile Image (PTI).
Fig. 6 Steps of image
processing
Fig. 7 Inputs and outputs of the Preshaping Module: the Preshaping
Module receives the target object geometric features from the Vision
Module (POS1 and POS2 are the object centroid coordinates in the two
images, DIM is the average size of the bounding box around the object,
CC1 and CC2 are the chain codes for the object boundary in the two
images) and provides the hand/arm configuration to the low level robot
controller and also to the Tactile Prediction Module (ACP is a vector
of an arm position and HJP is a vector for the hand joints)
Fig. 8 Inputs and outputs of the Tactile Prediction Module: the Tactile
Prediction Module receives the chain codes from the Vision Module
(CC1 and CC2) and the hand/arm configuration from the Preshaping
Module (HJP and ACP, respectively) and it provides as output a Predicted Tactile Image (PTI). The Predicted Tactile Image is the tactile configuration that is expected to be perceived after the grasp is executed
PTI is a vector of components representing the tactile
configuration of ON/OFF contact sensors positioned on the
fingers of the hand. PTI is encoded by a binary vector in
which component i is equal to 1 if the i-th tactile sensor is stimulated and 0 otherwise.
A type II SANFIS neural network was sufficient for the implementation of the Tactile Prediction Module because of the reduced complexity of its output space compared to that of the Preshaping Module.
3.5 System learning
As indicated above, the building of the SANFIS neural net-
work involves two learning phases: structure identification
and parameter identification. Each phase requires the same training set, i.e., a set of pairs ⟨input, output⟩ that represent “good” examples for training the system.
The following procedure was used to generate “good” training sets: the robot positions an object in a known location on a table in front of it and moves the arm away from its visual space. After the vision system has provided the position, dimensions and chain codes of the object, the hand moves to the same position where the object was released and grasps it. If the hand can successfully lift up the object, the relation ⟨input, output⟩ is considered “good” and it is added to the training set. Afterward, the hand releases the object in a different known position and the procedure is repeated. This procedure must be executed for different objects in many positions and orientations of the workspace in order to collect as many input-output relations as possible for training the system. The way this procedure is defined allows the robot to carry out the learning phase autonomously, with little human intervention.
For the Preshaping Module, the relations ⟨input, output⟩ consist of the data ⟨(POS1, POS2, DIM, CC1, CC2), (ACP, HJP)⟩, while for the Tactile Prediction Module the relations ⟨input, output⟩ consist of the data ⟨(CC1, CC2, ACP, HJP), (PTI)⟩.
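The following sketch outlines the autonomous data-collection loop described above. The robot object and all its methods are hypothetical placeholders for the platform's actual primitives; only the loop structure mirrors the procedure in the text.

```python
def collect_training_sets(robot, objects, locations):
    """Collect 'good' <input, output> pairs for the two SANFIS networks."""
    preshape_set, tactile_set = [], []
    for obj in objects:
        for loc in locations:
            robot.place_object(obj, loc)          # release the object at a known location
            robot.move_arm_out_of_view()
            pos1, pos2, dim, cc1, cc2 = robot.acquire_visual_features()
            acp, hjp = robot.grasp_at(loc)        # reach, preshape and grasp at that location
            ti = robot.read_tactile_image()       # 9 on/off contact sensors
            if robot.lift_is_stable():            # "good" grasp: the object stays in the hand
                preshape_set.append(((pos1, pos2, dim, cc1, cc2), (acp, hjp)))
                tactile_set.append(((cc1, cc2, acp, hjp), ti))
    return preshape_set, tactile_set
```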
The phase of structure identification consists of generat-
ing a fuzzy rule for each element of the training set, in the
form IF input THEN output.
A momentum back-propagation learning rule is used for
parameter identification (Haykin 1994). Learning based on
back-propagation is powerful for adjusting weights in artifi-
cial neural networks, and it usually includes two main steps.
First, each input pattern presented to the network is propa-
gated forward to the output. Second, a method called gradient descent is used to minimize the total error with respect to the known output of the training set. During gradient descent, weights are changed in proportion to the negative of the derivative of the error with respect to each weight. Momentum back-propagation learning is defined as follows:
$$\Delta w(t) = -\eta \frac{\partial E}{\partial w(t)} - \alpha \Delta w(t-1), \qquad (5)$$

where $\eta$ ($0 \le \eta \le 1$) is the learning rate, $\alpha$ ($0 \le \alpha \le 1$) the momentum factor, and $w$ represents any single weight in the network. Weights move in the direction of steepest descent on the error surface defined by the total error (summed across patterns):

$$E = \frac{1}{2}\sum\bigl(y(t) - \hat{y}(t)\bigr)^2, \qquad (6)$$

where $y(t)$ is the desired output, and $\hat{y}(t)$ is the current output.
In our system, the backpropagation learning phase makes
use of the same training set: the values in the input set are
fed to the SANFIS network and the output is evaluated with
respect to the output set, in the backpropagation learning
rule.
As described above, the Preshaping Module and the Tac-
tile Prediction Module were implemented using SANFIS
networks of type I and type II, respectively. Therefore, in
our case, the weights represent the centers and the widths of
the Gaussian membership functions (in SANFIS I) and con-
stants (in SANFIS II). The changes in the weights reduce
the overall error in the network, as defined by (6).
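As an illustration of the parameter-identification step, the sketch below applies a momentum back-propagation update to a flat vector of SANFIS parameters. grad_fn is a hypothetical function returning ∂E/∂w for the error of (6); the momentum term is written here with the conventional positive sign (Haykin 1994), while sign conventions for Δw(t−1) vary.

```python
import numpy as np

def train_momentum(w, grad_fn, eta=0.1, alpha=0.9, epochs=100):
    """w: flat vector of SANFIS parameters (Gaussian centers/widths or constants)."""
    delta = np.zeros_like(w)                 # previous update, Δw(t-1)
    for _ in range(epochs):
        grad = grad_fn(w)                    # ∂E/∂w, with E as in eq. (6), summed across patterns
        delta = -eta * grad + alpha * delta  # momentum update (conventional + sign)
        w = w + delta
    return w
```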
Fig. 9 The ARTS humanoid platform
4 Experimental methods and tools
We tested and evaluated the implemented system in experi-
ments on the ARTS humanoid platform (Dario et al. 2005),
which is a 1-link trunk that supports one arm/hand system
and a head/eye system (Fig. 9). The 2-dof trunk is part of
the arm (Dexter arm, by S.M. Scienzia Machinale srl, Pisa,
Italy) which has in total 8 dof, and integrates the 4 motors
of the three-fingered hand on the forearm. The hand has an-
thropomorphic dimensions and weight. Each finger consists
of 3 underactuated dof driven by one cable allowing flex-
ion/extension. A 2-dof trapezo-metacarpal joint at the base
of the palm allows thumb opposition movement (adduc-
tion/abduction). In total, the hand has 10 dofs, 6 of which are
underactuated. The perception system of the hand includes
proprioceptive and exteroceptive sensory systems (Carrozza
et al. 2003). Each finger is equipped with 3 on/off contact
sensors (one per phalanx). In addition, the 4 hand motors
are equipped with encoders. The anthropomorphic robotic
head was designed with reference to the physical structure
and performance of the human head regarding dofs, ranges
of motion, speeds and accelerations (Laschi et al. 2008). The
head has 7 dof, equipped with incremental encoders, which
can reproduce the human eye movements: 4 dof in the neck
(1 yaw, 2 pitches at different heights, 1 roll), 1 dof for a
common eye tilt movement and 2 dof for independent eye
pan movements. The head is equipped with two cameras.
4.1 Training
For building both SANFIS neural networks, 3 objects (a ball,
a bottle and a cassette) were placed in different positions
and orientations in the workspace. The 3 objects and their
positions and orientations were chosen so as to offer the ro-
bot different cases where different kinds of grasps have to
be used. Though limited in information and in accuracy, the
chain codes encode enough information on the shape that
the networks can learn different preshaping movements and
configurations.
The ball has a diameter of 80 mm, the bottle has a di-
ameter of 60 mm with a height of 200 mm, the cassette has
a dimension of 110 ×13 ×13 mm. The bottle in a stand-
ing position and the ball were each located in 12 different
positions. When lying down, the bottle was presented in 36
different positions/orientations. Its orientation varied from
−45° to +45°, with respect to a horizontal line perpendic-
ular to the robot y-axis. The cassette was presented in 36
different positions/orientations. Its orientation varied from
−45° to +45°, as well. The workspace was 20 cm wide
and about 30 cm long with reference to the table top where
the objects were placed. These objects afforded three types
of grasps (Fig. 10): a lateral palmar grasp for the bottle in
standing up position, a palmar grasp from the top for the
ball and for the bottle in the lying down position, and a pinch
grasp from the top for the cassette.
For each trial, we collected the following data:
−POS1: position of the object centroid (x1, y1) in the reference system of camera 1;
−POS2: position of the object centroid (x2, y2) in the reference system of camera 2;
−DIM: width and height of the bounding box enclosing the object;
−CC1: chain codes of the object in camera 1;
−CC2: chain codes of the object in camera 2;
−ACP: arm position and orientation (x, y, z, roll, pitch, yaw);
−HJP: hand configuration (4 joint angles);
−TI: tactile image of the 9 contact sensors.
After data collection, the fuzzy rules for both the Preshaping Module and the Tactile Prediction Module were built. As explained in Sect. 3.5, the fuzzy rules are generated from correct pairs ⟨input, output⟩, as IF input THEN output. The momentum back-propagation algorithm was then
executed to refine the membership function parameters as
described in Sect. 3.5. The resulting neural networks were:
−a SANFIS network of type I with 307 fuzzy rules for the
Preshaping Module;
−a SANFIS network of type II with 123 fuzzy rules for the
Tactile Prediction Module.
4.2 Validation trials
For evaluating the system, several trials were executed, as
reported in Sect. 4.1. For each trial, an object was located in
a position in the workspace. The output of the Vision Module was fed to the Preshaping Module and its output (i.e., arm position and orientation) was fed to the arm controller. After that, the hand grasped the object and lifted it up.

Fig. 10 Types of grasps
A trial was considered successful if the robot grasped,
lifted up and kept the object with a stable grasp.
4.3 Evaluation of the Preshaping Module
The evaluation of the Preshaping Module concerned the ro-
bot capability to generate a proper grasping position for an
object positioned in the workspace. The performance of the
Preshaping Module was evaluated in terms of success rate
across trials (SR):
$$SR = \frac{\text{number of successful trials}}{\text{number of trials}}. \qquad (7)$$

The SR ranges from 0 to 1, where 0 means failure in all trials and 1 means success in all trials.
4.4 Evaluation of the Tactile Prediction Module
The evaluation of the Tactile Prediction Module concerned
the capability of the system to predict the tactile image
at grasping. This capability was measured as mean error
(MET) of the tactile image defined as:
$$MET = \frac{\sum_{i=1}^{N} \mathrm{dist}(t_i, \hat{t}_i)}{N}, \qquad (8)$$

where $N$ is the number of trials and $\mathrm{dist}(t, \hat{t})$ is the distance between the output of the Tactile Prediction Module ($t$) and the achieved tactile image ($\hat{t}$). This distance is given by the number of differing on/off values of the tactile sensors in the two tactile images (predicted and real). The MET ranges from 0 to 9, as 9 is the total number of tactile sensors: 0 means all 9 predicted sensor responses are correct, 9 means all 9 are wrong.
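For concreteness, the two metrics can be computed as in the sketch below (an illustration under the encoding of Sect. 3.4, not the authors' code).

```python
def success_rate(n_successes, n_trials):
    """Eq. (7): fraction of trials ending with a stable grasp, in [0, 1]."""
    return n_successes / n_trials

def mean_tactile_error(predicted, actual):
    """Eq. (8): predicted/actual are lists of 9-element 0/1 tactile images, one pair per trial."""
    dists = [sum(p != a for p, a in zip(t_pred, t_real))
             for t_pred, t_real in zip(predicted, actual)]
    return sum(dists) / len(dists)     # in [0, 9]; 0 = all predictions correct
```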
Figure 11 shows an example of visual processing and tac-
tile prediction with the ball.
Fig. 11 The output of the Vision Module in the two cameras and the output of the Tactile Prediction Module
5 Experimental validation
For evaluating the system, 120 trials were executed. The sys-
tem was evaluated in 3 different sessions:
−evaluation of the system in the same conditions of the
training (i.e. same objects and same positions);
−evaluation of generalization capability in positioning and
orientation (same objects in different locations than train-
ing);
−evaluation of generalization capability in size and shape
(different objects in different positions than training).
5.1 Evaluation of the system in the same conditions
of the training
These trials aimed at evaluating the capability of the system
to grasp the same objects of the training in the same position
and orientation; 40 trials were performed:
−10 trials with the ball in different positions of the
workspace, but belonging to the training set;
−10 trials with the bottle standing in different positions of
the working space, all belonging to the training set;
−10 trials with the bottle lying in different positions of the workspace, all belonging to the training set;
−10 trials with the cassette in different positions of the workspace, all belonging to the training set.

Table 1 Performance of the system with inputs from the training set

                       SR    MET
Ball                   0.90  1.00
Bottle (standing up)   0.90  0.22
Bottle (lying down)    1.00  1.60
Cassette               0.40  2.00
Table 1 reports the evaluation of the system in the same
conditions of the training. The average values obtained were
0.8 for the SR and 1.0938 for the MET (see Fig. 12).
The system is able to reach and grasp two objects of the training set (the ball and the bottle) with good performance; with the cassette, instead, the system makes a larger error. This error consists of a hand preshaping position that does not allow the object to be grasped and lifted effectively in all cases. The limitation of the Preshaping Module here lies mainly in the accuracy of positioning: the orientation of the hand was good in the trials that were not successful, but a higher accuracy in the hand position was required. With the other objects, the accuracy was sufficient, thanks to their different geometry, and especially their size.

Fig. 12 Evaluation in the same conditions of the training. (a) Success rate. (b) Distance between the outputs of the Tactile Prediction Module (t) and the achieved tactile image (t̂); the range is between 0 and 9
Despite the mean error of 1 cm made by the vision sys-
tem, the system has achieved good results in terms of SR
and MET.
5.2 Evaluation of generalization capability in positioning
and orientation
These trials aimed at evaluating the capability of the network
to generalize with respect to modifications of object position
and orientation. The evaluation has been performed with the
same objects of the training set placed in random positions
of the working space and with random orientations; 40 trials
have been performed:
−10 trials with the ball;
−10 trials with the standing bottle;
−10 trials with the lying bottle;
−10 trials with the cassette.

Fig. 13 Evaluation of generalization capability in positioning and orientation. (a) Success rate. (b) Distance between the outputs of the Tactile Prediction Module (t) and the achieved tactile image (t̂); the range is between 0 and 9
Table 2 reports the evaluation of generalization capabil-
ity in positioning and orientation. The average values ob-
tained were 0.775 for the SR and 2.0 for the MET (see
Fig. 13).
The performance of the system in this case is very similar
to that obtained in the previous trials. It is possible to note
that the system works well with the ball and the bottle in both standing and lying positions, while with the cassette the success rate is 0.4, because the dimensions of the cassette are very small with respect to the mean error that the overall system can make.

Table 2 Performance of the system with inputs from the same objects of the training set, in random positions

                       SR    MET
Ball                   0.80  3.88
Bottle (standing up)   0.90  0.22
Bottle (lying down)    1.00  2.10
Cassette               0.40  2.00

Fig. 14 Evaluation of generalization capability in size and shape. (a) Success rate. (b) Distance between the outputs of the Tactile Prediction Module (t) and the achieved tactile image (t̂); the range is between 0 and 9
5.3 Evaluation of generalization capability in size and
shape
These trials aimed at evaluating the capability of the net-
work to generalize with respect to variation of object size
and shape. The following objects were taken into account:
−Large ball (diameter = 100 mm);
−Small ball (diameter = 65 mm);
−Large bottle (diameter = 65 mm, height = 230 mm);
−Small bottle (diameter = 50 mm, height = 150 mm);
−Large cassette (size = 185 × 105 × 22 mm);
−Small cassette (size = 70 × 50 × 13 mm);
−Parallelepiped (size = 95 × 50 × 50 mm);
−Glass (diameter = 65 mm, height = 135 mm).
40 trials were performed, 5 for each item of the list above.
The average values obtained were 0.7 for the SR and
2.1786 for the MET (see Fig. 14).
The system works well with the different balls and bottles, but it is not able to reach and grasp cassettes of different sizes. We noted that the system provided a good orientation but not a good position in the z coordinate. This happens because the training set contains no samples with different values of z and thus the system is not able to generalize in the z coordinate. Finally, the system was also able to reach and grasp the parallelepiped and the glass. With the parallelepiped, the Tactile Prediction Module made a larger error in predicting the tactile image, due to the lack of training with objects of similar geometry.
6 Conclusions
We implemented a neuroscientific model of human sensory-
motor coordination in grasping and manipulation on a hu-
manoid robotic system with an arm, a sensorized hand and
a head with a binocular vision system. We enabled the robot to reach and grasp an object detected by vision, and to predict the tactile feedback.
We showed that an implementation based on neuro-fuzzy networks allows the robot to build, by learning, the internal models required to calculate the hand and arm motor parameters and to predict the tactile feedback.
The experimental results with our robotic system showed
that the robot is able to reach and grasp different objects us-
ing different types of grasps. In particular the system has
shown a good performance in terms of success rate, with
success given by the object kept in the hand during a lifting
action. The experimental results also showed a good capa-
bility of our system to predict the tactile feedback, as given
by the low difference between the predicted tactile image
and the actual one. In experimental conditions different from those of the training phase, the system is capable of generalizing with respect to variations of object position and orientation. The experimental trials showed that this generalization capability is lower on the z coordinate, because in the training phase the system never had examples with different values of z. With objects quite different in size and shape, the system generalizes as well; for example, with the glass, the system provided an arm orientation between that of the ball and that of the bottle.
At present, our system is able to grasp only simple objects with a few types of grasps. We think that, in order to extend our system to more complex objects requiring a wide variety of preshapes, more visual features are needed. On-line learning is under investigation to increase the generalization capabilities: when grasping objects different from those used during the training phase, the system can continue to learn by using the new examples.
The sensory prediction capabilities of the system support the perspective of applying predictive control schemes in robotics.
References
Arbib, M. A., & Iberall, T. M. L. D. (1985). Coordinated control pro-
grams for movements of the hand. In A. W. Goodwin & I. Darian-
Smith (Eds.), Hand function and the neocortex (pp. 111–129).
Berlin: Springer.
Beer, R. D., Quinn, R. D., Chiel, H. J., & Ritzmann, R. E. (1997).
Biologically inspired approaches to robotics. Communications of
the ACM,40(3), 30–38.
Bekey, G. A., & Tomovic, R. (1990). Biologically based robot control.
In Proceedings of the annual international conference of the IEEE
engineering in medicine and biology society (Vol. 12, pp. 1938–
1939).
Bekey, G. A., Liu, H., Tomovic, R., & Karplus, W. J. (1993).
Knowledge-based control of grasping in robot hands using heuris-
tics from human motor skills. IEEE Transactions on Robotics and
Automation,9, 709–722.
Berthoz, A. (1997). Le Sens Du Mouvement. Paris: O. Jacob.
Bicchi, A., & Kumar, V. (2000). Robotic grasping and contact: A re-
view. In Proceedings of the conference on robotics and automa-
tion, San Francisco (pp. 348–353).
Brice, C. L., & Fennema, C. R. (1970). Scene analysis using regions.
Artificial Intelligence,1, 205–226.
Brooks, R. A. (1991). New approaches to robotics. Science,253, 1227–
1232.
Carrozza, M. C., Vecchi, F., Sebastiani, F., Cappiello, G., Roccella, S.,
Zecca, M., Lazzarini, R., & Dario, P. (2003). Experimental analy-
sis of an innovative prosthetic hand with proprioceptive sensors.
In Proceedings of the IEEE international conference on robotics
and automation (pp. 2230–2235).
Charlebois, M., Gupta, K., & Payandeh, S. (1999). Shape description
of curved surfaces from contact sensing using surface normals.
International Journal of Robotics Research,18(8), 779–787.
Cutkosky, M. R., & Howe, R. D. (1990). Human grasp choice and ro-
botic grasp analysis. In S. T. Venkataraman & T. Iberall (Eds.),
Dextrous robot hands (pp. 5–31). New York: Springer.
Dario, P., Carrozza, M. C., Guglielmelli, E., Laschi, C., Menciassi, A.,
Micera, S., & Vecchi, F. (2005). Robotics as a ‘future and emerg-
ing technology’: Biomimetics, cybernetics and neuro-robotics in
European projects. IEEE Robotics and Automation Magazine,
12(2), 29–43.
Datteri, E., Teti, G., Laschi, C., Tamburrini, G., Dario, P., &
Guglielmelli, E. (2003a). Expected perception: An anticipation-
based perception-action scheme in robots. In IROS 2003, 2003
IEEE/RSJ international conference on intelligent robots and sys-
tems, Las Vegas, Nevada (pp. 934–939).
Datteri, E., Teti, G., Laschi, C., Tamburrini, G., Dario, P., &
Guglielmelli, E. (2003b). Expected perception in robots: A bi-
ologically driven perception-action scheme. In Proceedings of
ICAR 2003, 11th international conference on advanced robotics
(Vol. 3, pp. 1405–1410).
Datteri, E., Asuni, G., Teti, G., Laschi, C., & Guglielmelli, E. (2004).
Experimental analysis of the conditions of applicability of a ro-
bot sensorimotor coordination scheme based on expected percep-
tion. In Proceedings of 2004 IEEE/RSJ international conference
on intelligent robots and systems (IROS), Sendai, Japan (Vol. 2,
pp. 1311–1316).
Fearing, R. S. (1990). Tactile sensing for shape interpretation. In
S. T. Venkataraman & T. Iberall (Eds.), Dextrous robot hands
(pp. 209–238). New York: Springer.
Guglielmelli, E., Asuni, G., Leoni, F., Starita, A., & Dario, P. (2007,
in press). A neuro-controller for robotic manipulators based on
biologically-inspired visuo-motor co-ordination neural models. In
Handbook of neural engineering: Vol. 26.Neural engineering se-
ries (pp. 433–448). New York: Wiley/IEEE Press.
Haykin, S. (1994). Neural networks, a comprehensive foundation (2nd
edn.). New York: Prentice Hall.
Hong, W., & Slotine, J. J. E. (1995). Experiments in hand-eye coordi-
nation using active vision. In Proceedings of the fourth interna-
tional symposium on experimental robotics, Stanford, CA.
Iberall, T., & MacKenzie, C. L. (1990). Opposition space and human
prehension. In S. T. Venkataraman & T. Iberall (Eds.), Dextrous
robot hands (pp. 32–54). New York: Springer.
Jeannerod, M. (1984). The timing of natural prehension movements.
Journal of Motor Behavior,16(3), 235–254.
Jeannerod, M., Paulignan, Y., & Weiss, P. (1998). Grasping an object:
One movement, several components. In Novartis found sympo-
sium (Vol. 218, pp. 5–20).
Johansson, R. S. (1998). Sensory input and control of grip. In M. Glick-
stein (Ed.), Sensory guidance of movements (pp. 45–59). Chich-
ester: Wiley.
Johansson, R. S., & Westling, G. (1987). Signals in tactile afferents
from the fingers eliciting adaptive motor responses during preci-
sion grip. Experimental Brain Research,66, 141–154.
Kawato, M. (1999). Internal models for motor control and trajectory
planning. Current Opinion in Neurobiology,9, 718–727. Elsevier
Science
Klatzky, R. L., & Lederman, S. J. (1990). Intelligent exploration by the
human hand. In S. Venkataraman & T. Iberall (Eds.), Dextrous
hands for robots (pp. 66–81). New York: Springer.
Koivo, A. J. (1991). Real-time vision feedback for servoing robotic ma-
nipulator with self-tuning controller. IEEE Transactions on Sys-
tems, Man, and Cybernetics,21(1), 134–142.
Kragic, D., & Christensen, H. I. (2002). Model based techniques for ro-
botic servoing and grasping. In Proceedings of the 2002 IEEE/RSJ
international conference on intelligent robots and systems, EPFL,
Lausanne, Switzerland (pp 299–304).
Laschi, C., Patanè, F., Maini, E. S., Manfredi, L., Teti, G., Zollo, L.,
Guglielmelli, E., & Dario, P. (2008). An anthropomorphic robotic
head for investigating gaze control. Advanced Robotics,22(1).
Mason, M., & Salisbury, K. (1985). Robot hands and the mechanics of
manipulation. Cambridge: MIT Press.
Miall, R. C., Weir, D. J., Wolpert, D. M., & Stein, J. F. (1993). Is the
cerebellum a smith predictor? Journal of Motor Behaviour,25,
203–216.
Namiki, A., & Ishikawa, M. (2003). Robotic catching using a direct
mapping from visual information to motor command. In Proceed-
ings of the IEEE international conference on robotics and au-
tomation (pp. 2400–2405).
Napier, J. R. (1956). The prehensile movements of the human hand.
Journal of Bone and Joint Surgery,36B(4), 902–913.
Narasimhan, S., Spiegel, D. M., & Hollerbach, J. M. (1990). Condor:
A computational architecture for robots. In S. T. Venkataraman &
T. Iberall (Eds.), Dextrous robot hands (pp. 117–135). New York:
Springer.
Shimoga, K. B. (1996). Robot grasp synthesis algorithms: A survey.
The International Journal of Robotics Research (MIT Press).
Sobel, I. (1978). Neighbourhood coding of binary images fast contour
following and general array binary processing. Computer Graph-
ics and Image Processing,8, 127–135.
Venkataraman, S. T., & Iberall, T. (Eds.). (1990). Dextrous robot hands.
New York: Springer.
Wang, J. S., Lee, C. S. G., & Juang, C. H. (Eds.). (1999). Structure and
learning in self-adaptive neural fuzzy inference systems. In Pro-
ceedings of the eighth int’l fuzzy syst. Association world congress
(IFSA’99), Taipei, Taiwan (pp. 975–980).
Wolpert, D. M., Miall, R. C., & Kawato, M. (1998). Internal models in
the cerebellum. Trends in Cognitive Sciences,2(9), 338–347.
Wunsch, P., Winkler, S., & Hirzinger, G. (1997). Real-time pose esti-
mation of 3-d objects from camera images using neural networks.
In Proceedings of the 1997 IEEE international conference on ro-
botics and automation (pp. 3232–3237).
Cecilia Laschi is Associate Professor of Bio-
medical Engineering at the Scuola Superiore
Sant’Anna in Pisa, Italy. She graduated in Com-
puter Science at the University of Pisa in 1993
and received the Ph.D. in Robotics from the
University of Genoa in 1998. Since 1992 she is
with the ARTS Lab (Advanced Robotics Tech-
nology and Systems Laboratory) of the Scuola
Superiore Sant’Anna in Pisa, Italy. From July
2001 to June 2002 she was visiting researcher
at the Humanoid Robotics Institute of the Waseda University in Tokyo,
as JSPS (Japan Society for the Promotion of Science) Fellow and
she now serves in the scientific committee of the Italy–Japan joint
lab RoboCasa. Her research interests are in the field of biorobot-
ics. Starting from basic robotics research, she has been investigating
bioinspired solutions for personal robotics and bionics. She has been
working in neuro-robotics, that is the application of robotics in neu-
roscience research, and she investigated and developed bioinspired
sensory-motor control schemes for robotic systems. She is currently
working on biomimetics, investigating animal and vegetal systems
from an engineering point of view and with engineering tools, and de-
signing robotic replicas that can fully explain the biological working
principles and mechanisms. She has been and currently is involved in
many National and EU-funded projects, in the field of biorobotics. She
has authored/co-authored more than 100 papers, appeared in interna-
tional journals and conference proceedings. She is Guest Co-Editor of
the Special Issue of the journal Autonomous Robots on “Bioinspired
Sensory-Motor Coordination” and of the Special Issue of the IEEE
Transactions on Robotics on “Human-Robot Interaction” and Asso-
ciate Editor of Advanced Robotics. She is member of the IEEE, of the
Engineering in Medicine and Biology Society, and of the Robotics &
Automation Society, in which she co-chairs the Technical Committee
on Human-Robot Interaction and Coordination.
Gioel Asuni was born in Cagliari, Italy. He re-
ceived a Master degree in Computer Science
in 2002 from the University of Pisa, Italy. He
received a Ph.D. in Robotics from the Univer-
sity of Genova in 2006. His interests are in the
research and development of fuzzy logic and
neural networks applied to robot control and
sensory-motor coordination.
Eugenio Guglielmelli received the Laurea de-
gree in Electronics Engineering and the Ph.D.
in Robotics from the University of Pisa, Italy,
in 1991 and in 1995, respectively. He is cur-
rently Associate Professor of Bioengineering
at Campus Bio-Medico University in Rome,
Italy, where he teaches the courses of Bio-
Mechatronics and of Rehabilitation Bioengi-
neering, and he is the Head of the Research Unit
on Biomedical Robotics. He has been working
in the field of biomedical robotics over the last fifteen years at Scuola
Superiore Sant’Anna where he served from 2002 to 2004 as the Head
of the Advanced Robotics Technology and Systems Laboratory (ARTS
Lab). His main current research interests are in the fields of novel the-
oretical and experimental approaches to human-centered robotics and
to biomorphic control of mechatronic systems, and in their applica-
tion to robot-mediated motor therapy, assistive robotics, and neuro-
developmental engineering. He participates as principal investigator
in several projects funded by the European Commission under the VI
Framework programme (IST/FET, NEST, IST/e-Health and other sub-
programmes) in the field of biomedical engineering. He serves in the
Editorial Board of the International Journal on Applied Bionics and
Biomechanics. He has been Guest Co-Editor of the Special Issue on
Rehabilitation Robotics of the International Journal ‘Autonomous Ro-
bots’ and of the Special Issue on Robotics Platforms for Neuroscience
of Int. Journal Advanced Robotics. He is member of the IEEE Robot-
ics & Automation Society, of the IEEE Engineering in Medicine &
Biology Society. He served (2002-03) as Secretary of the IEEE Robot-
ics & Automation Society (RAS) and he is currently Co-chair of the
RAS Technical Committee on Rehabilitation Robotics. He serves in
the Programme Committees of several International Conferences, such
as ICRA, IROS, ICAR, AIM and others. He was/is a member of the Or-
ganizing Committees of ICAR2003, IROS2004, IFAC/SYROCO2006
and ICRA2007.
Giancarlo Teti graduated in Computer Science
at the University of Pisa, Italy, in 1996, and
he received the Ph.D. in Biomedical Robotics
in 2002 from the Scuola Superiore Sant’Anna,
Pisa, Italy. From 1996 to 2005 he was with
the ARTS Lab (Advanced Robotics Technol-
ogy and Systems Laboratory) of the Scuola Su-
periore Sant’Anna. Since 2005 he is Technical
Manager at RoboTech srl, a spin-off company
of the Scuola Superiore Sant’Anna, whose mis-
sion is Edutainment Robotics. His research interests are in the field
of robot control architectures and bio-inspired schemes for sensory-
motor control of robots. Main application areas are Personal Robot-
ics and Humanoid Robotics. He has been and currently is involved in
many national and EU-funded projects in the field of robotics. He has
authored/co-authored more than 30 papers, appeared in international
journals and conference proceedings.
Roland Johansson has been professor of physiology in the Medical School of Umeå University, Umeå, Sweden, since 1988. His publications repre-
sent pioneering research concerning the neural
control of object manipulation in humans, and
in particular the sensory, mnemonic and predic-
tive mechanisms that underlie the control of fin-
gertip actions in a variety of prototypical manip-
ulatory tasks. Special interest has been devoted
to the nature and use of somatosensory and vi-
sual information in adapting the behavior to constraints imposed by the
physical properties of the target objects. This work involves showing
the nature and the roles of tactile sensory signals from the hand and the
role of gaze in planning and control of goal directed object manipula-
tions.
Hitoshi Konosu entered Toyota Motor Cor-
poration in 1987. He graduated from the Toyota Technological Institute in 1994 and received the master's degree from the Toyota Technological Institute in 1999. He is interested in machine intelli-
gence and human-robot interaction.
Zbigniew Wasik received the M.S. degree in
Robotics and Automation from the Wroclaw
University of Technology, Poland, in 1998, and
the Ph.D. degree in Computer Science and Arti-
ficial Intelligence from Orebro University, Swe-
den, in 2004. In 2004, he joined the Advanced
Technology Laboratory, Production Engineer-
ing Department, at Toyota Motor Europe, Bel-
gium. Since 2007, he is Assistant Manager in
R&D division at Toyota Motor Corporation,
Japan. His current research interests include the study of control ar-
chitectures for autonomous humanoid robots, pattern recognition from
incomplete data, intelligent robotics. He is a member of the IEEE Ro-
botics and Automation Society.
Maria Chiara Carrozza received the Laurea
degree in Physics from the University of Pisa,
Pisa, Italy, in 1990. She received the Ph.D. in Engineering from Scuola Superiore Sant’Anna in
1994. Since 2006, she has been a Professor
of biomedical robotics at the Scuola Superiore
Sant’Anna, Pisa, Italy. She is Director of the Re-
search Division and Deputy Director of Scuola
Superiore Sant’Anna (http://www.sssup.it). She
teaches Biomechatronics and Rehabilitation En-
gineering to Master students of Biomedical Engineering at the Univer-
sity of Pisa. She is elected Member of the national Board of the Ital-
ian association of Biomedical Engineering (Gruppo Nazionale di Bioingegneria). Prof. Carrozza has been visiting professor at the Techni-
cal University of Wien, Austria with a graduate course entitled Bio-
mechatronics, and she is involved in the scientific management of
the Italy-Japan joint laboratory for Humanoid Robotics ROBOCASA,
Waseda University, Tokyo, Japan where she is responsible for artificial
hand design. Prof. Carrozza is the Coordinator of the Advanced Ro-
botics Technology and Systems Laboratory (http://www.arts.sssup.it),
founded by Prof. Paolo Dario, where more than 50 people are in-
volved in research projects aimed at design, simulation and develop-
ment of biomedical robots for rehabilitation engineering, functional
support and humanoid robotics. She is active in several national and
international projects in the fields of biomechatronics and biomedical
robotics. Her research interests comprise biomedical robotics (cyber-
netic and robotic artificial hands, upper limb exoskeletons), rehabil-
itation engineering (neurorehabilitation, domotic, and robotic aids for
functional support and personal assistance), and biomedical microengi-
neering (microsensors, tactile sensors). The Arts Lab team coordinated
by Prof. Carrozza has designed and developed the CYBERHAND arti-
ficial hand (http://www.cyberhand.org) and is currently responsible for
the design of an Exoskeleton for functional support and enhancement
of the upper limb, in the framework of the NEUROBOTICS project
(http://www.neurobotics.org). In these projects, bioinspired design and
the fusion between neuroscience and robotics are addressed for going
“beyond robotics”. Prof. Carrozza is Member of IEEE RAS and EMBS
societies and she is an author of several scientific papers and interna-
tional patents. In addition, she is promoting industrial innovation and
start-up creation, she is co-founder of two spin-off companies of the
Scuola Superiore Sant’Anna and she is member of their Administra-
tive Boards.
Paolo Dario received his Dr. Eng. Degree in
Mechanical Engineering from the University of
Pisa, Italy, in 1977. He is currently a Professor
of Biomedical Robotics at the Scuola Superi-
ore Sant’Anna in Pisa. He also teaches courses
at the School of Engineering of the University
of Pisa and at the Campus Biomedico Univer-
sity in Rome. He has been Visiting Professor
at Brown University, Providence, RI, USA, at
the Ecole Polytechnique Federale de Lausanne
(EPFL), Lausanne, Switzerland, at Waseda University, Tokyo, Japan,
at the College de France, Paris, at the Ecole Normale Superieure de
Cachan, France and at Zhejiang University, China. He was the founder
of the ARTS (Advanced Robotics Technologies and Systems) Labo-
ratory and is currently the Co-ordinator of the CRIM (Center for the
Research in Microengineering) Laboratory of the Scuola Superiore
Sant’Anna, where he supervises a team of about 70 researchers and
Ph.D. students. He is also the Director of the Polo Sant’Anna Valdera
of the Scuola Superiore Sant’Anna. His main research interests are
in the fields of medical robotics, bio-robotics, mechatronics and mi-
cro/nanoengineering, and specifically in sensors and actuators for the
above applications, and in robotics for rehabilitation. He is the coordi-
nator of many national and European projects, the editor of two books
on the subject of robotics, and the author of more than 200 scientific
papers (130 on ISI journals). He is Editor-in-Chief, Associate Editor
and member of the Editorial Board of many international journals. He
has been a plenary invited speaker in many international conferences.
Prof. Dario has served as President of the IEEE Robotics and Automa-
tion Society in the years 2002–2003. He has been the General Chair of
the IEEE RAS-EMBS BioRob’06 Conference and General Co-Chair
of the ICRA 2007 Conference. Prof. Dario is an IEEE Fellow, a Fel-
low of the European Society on Medical and Biological Engineering,
and a recipient of many honors and awards, such as the Joseph Engel-
berger Award. He is also a member of the Board of the International
Foundation of Robotics Research (IFRR).