Model-free and learning-free grasping
by Local Contact Moment matching
Maxime Adjigble1, Naresh Marturi1, Valerio Ortenzi2, Vijaykumar Rajasekaran1,
Peter Corke2, and Rustam Stolkin1
Abstract—This paper addresses the problem of grasping arbi-
trarily shaped objects, observed as partial point-clouds, without
requiring: models of the objects, physics parameters, training
data, or other a-priori knowledge. A grasp metric is proposed
based on Local Contact Moment (LoCoMo). LoCoMo combines
zero-moment shift features, of both hand and object surface
patches, to determine local similarity. This metric is then used
to search for a set of feasible grasp poses with associated grasp
likelihoods. LoCoMo overcomes some limitations of both classical
grasp planners and learning-based approaches. Unlike force-
closure analysis, LoCoMo does not require knowledge of physical
parameters such as friction coefficients, and avoids assumptions
about fingertip contacts, instead enabling robust contacts of large
areas of hand and object surface. Unlike more recent learning-
based approaches, LoCoMo does not require training data, and
does not need any prototype grasp configurations to be taught
by kinesthetic demonstration. We present results of real-robot
experiments grasping 21 different objects, observed by a wrist-
mounted depth camera. All objects are grasped successfully when
presented to the robot individually. The robot also successfully
clears cluttered heaps of objects by sequentially grasping and
lifting objects until none remain.
I. INTRODUCTION
Robots have been routinely and reliably grasping a vast
variety of objects in manufacturing environments for several
decades. This is based on simple pre-programmed actions, on
exactly pre-defined objects, in highly structured environments.
However, autonomous, vision-guided grasping, in unstructured
environments, remains an open research problem. In this paper,
we assume that the robot has a model of itself, but does not
have any models or prior knowledge of the objects that it is
tasked with grasping. These objects may take arbitrary shape
and appear amidst clutter, observed as noisy partial point-
clouds. Our main contribution is to show how this problem
can be approached without needing either classical physics
analysis, or any learning from training data.
Classical grasping methods based on physics analysis [1],
[2] typically require the robot to have detailed knowledge of
the grasped object’s shape, mass and mass distribution, and
friction coefficients between object surfaces and hand parts. It
is common to assume point or fingertip contacts, with contacts
of large surface areas of the hand becoming analytically
intractable. More recent work has investigated a variety of
machine learning approaches to grasping [3]–[5]. Learning
1M.Adjigble, N. Marturi, V. Rajasekaran and R. Stolkin are with the
Extreme Robotics Laboratory, School of Metallurgy and Materials, University
of Birmingham, UK. maxime.adjigble@gmail.com
2V. Ortenzi and P. Corke are with the ARC Centre of Excellence for
Robotic Vision, Queensland University of Technology, Brisbane QLD 4001,
Australia. http://www.roboticvision.org
Figure 1. (Top-left) Point cloud of the object. (Top-right) Contact moment
features for a single finger with a planar surface. Red, yellow and green encode, in this order, increasing values of the metric computed using (3). (Bottom-left) Generated grasp with the highest contact probability.
(Bottom-right) Grasp executed on the robot.
approaches seek to encode a more direct link between the
geometry of a scene (typically observed as a point-cloud) and
grasp hypotheses. Such methods have significantly contributed
to overcoming limitations of classical methods. However,
all of these methods require training data (some more and
some less). Most of these methods also require prototypical
grasps (pinch-grasp, power-grasp, edge-grasp etc.) to be taught
by kinesthetic demonstration or pre-programming, albeit that
learning-based methods can often adapt these pre-taught hand
configurations to new object shapes (generalisation) with some
success.
In this paper, we propose a novel algorithm for computing
robust grasp hypotheses on arbitrarily shaped objects. The
overall grasping pipeline is depicted in Fig. 1. Given a point-
cloud view of a surface, and the kinematics of the robot’s
arm and hand, our algorithm outputs a variety of feasible
grasp poses for the hand, and evaluates each according to a
novel grasp likelihood metric. A collision-free reach-to-grasp
trajectory is then sought, and the highest-likelihood reachable
grasp is executed. Like recent learning-based methods, our
method also maps observed surface shapes directly to grasp
hypotheses. However, this mapping is not achieved by learn-
ing, does not require any training data, nor does it require any
kinesthetic teaching or pre-programming of prototypical grasp
configurations. Instead, we propose a novel grasp likelihood
metric, the local contact moment probability function, which
evaluates the shape compatibility between local parts of hand
or finger surface, and local parts of an observed point-cloud.
Local contact moment (LoCoMo) is based on computing
zero-moment shift features for local parts of the observed point
cloud, and also parts of the robot’s hand. First described in the
computer graphics literature [6], zero moment shift features
represent the characteristics of limited regions of surfaces,
and are especially good at encoding information about surface
curvature, Fig. 2, which is particularly important for matching
hand parts to a grasped object. These features represent the
surface characteristics of a limited region of the point cloud,
hence they are “local” features. Also, they are computed on
the point cloud without the need of any a-priori knowledge of
the object (i.e., model-free).
Using LoCoMo as a fitness function, a point-cloud surface
can be efficiently searched for good matches to finger surface
geometry. Kinematic analysis then yields a set of feasible
grasps, with each grasp associated with a grasp likelihood.
The motion-space of the arm is then explored to find collision-
free reach-to-grasp trajectories to the highest likelihood grasp
poses.
The main contributions of this work are:
- We propose the use of zero-moment shift features [6] for robotic grasp-planning.
- We propose a new metric, the local contact moment probability function, for evaluating compatibility between the surface geometries of local parts of both object and gripper. This metric is model-free, and does not need to be learned from training data.
- We exploit the kinematics of the robot to select a subset of the graspable points, first identified by LoCoMo, that are kinematically reachable and feasible for the arm and hand system.
The remainder of this paper is structured as follows: Section
II highlights the novelties of this work with respect to related
literature. Section III describes the technical details of our
proposed method. Section IV shows the results of a number
of experiments conducted using a Schunk industrial two-finger
hand mounted on a KUKA LBR iiwa manipulator arm. Section
V provides concluding remarks.
II. RELATED WORK
Classical approaches to grasping predominantly use
physics-based analysis to compute force-closure [7]–[11].
Most of these approaches rely on a large amount of a-
priori knowledge. They typically assume that an accurate and
complete 3D model of the object is known, as well as its mass,
mass distribution and also coefficients of friction between the
object’s surfaces and parts of the robot hand. In contrast, in
many real applications, a robot may be required to grasp a
previously unknown object of arbitrary shape, observed as a
partial point-cloud view, for which friction coefficients and
mass distribution are generally unknown. Many of these classical force-closure approaches are also restricted to assumptions of fingertip contacts only. Physics-based analysis becomes problematic when large patches of hand surface come into
contact with the object (unlike many human grasps such as the
“power grasp” where large surfaces of the hand are wrapped
around the object).
More recent approaches have explored various forms of
learning, [3], [12]–[15]. Learning-based methods overcome
some of the limitations of classical methods, and have shown
potential for generalising to grasping novel object shapes. [3]
achieved moderately successful grasping, by learning a direct
mapping between visual stimuli and motor outputs. Learning
was achieved via robots making exploratory motions coupled
with reinforcement. The system was able to synthesize novel
grasping policies, but relied on enormous amounts of training
data, involving large numbers of robots performing exploratory
actions over a long period of time. [15] minimised the amount
of reinforcement learning needed, by initiating learning from
close-to-good grasp poses by kinesthetic demonstration using
a data glove. In contrast, [13] showed significant ability to
generalise grasping to novel objects, achieved by “one-shot”
learning, i.e., the robot was taught a single grasp on a single
object, and was then able to plan successful grasps on new
shapes. [13] learned “local” models of relationships between
hand-parts and the curvatures of object surface patches. How-
ever, these must be combined with a “global” model of hand
shape, corresponding to a grasp prototype (pinch grasp, power
grasp, etc.) which is taught by demonstration. The method
therefore remains unable to synthesize novel grasp prototypes
that have not been taught.
Like the above learning approaches, our method also does
not rely on object models or physics knowledge. Like [13] it
exploits local descriptors of finger contacts (but a different
kind). However, our method requires no training data, and
can synthesize its own grasp hypotheses without any need of
demonstration.
III. METHOD
We present a method to address robotic grasping based on
the LoCoMo metric between the object and the gripper. This
similarity metric between the features on the object and the
features on the gripper is used to select viable finger poses on
the surface of the object which are then combined with the
kinematics of the gripper to form a grasp. In the following,
we assume a model of the gripper, in this case a parallel jaw
gripper.
The algorithm is given a (partial) point cloud of an object,
and first computes the zero-moment shift features on the point
cloud. The same features are extracted on the point cloud
of the gripper model. These features of object and gripper
are then used together to compute a local shape similarity
metric between object and gripper. The main idea is to find the
points that maximise the contact surface and to use only areas
of the object that match the surface curvature of the fingers
of the gripper for the grasp. Finally, a feasibility analysis is
performed to select the subset of pairs of points which are returned from the previous action and which are kinematically feasible for the gripper.
A. Feature Extraction and Matching Metric
Over the years, various local visual features have been presented in the literature and used for tasks such as 2D/3D object recognition and pose estimation [16]–[20]. In this work, we propose the use of zero-moment shift features for grasping arbitrarily shaped objects.
Let $B_\rho(X)$ represent the Euclidean sphere of radius $\rho$ centered at a point $X \in \mathbb{R}^3$. Given a set of points $\mathcal{X}$ in $\mathbb{R}^3$, the zero-moment shift $n_\rho$ of the set of points $\xi = \mathcal{X} \cap B_\rho(X)$, belonging to the sphere $B_\rho(X)$, can be expressed as

$$n_\rho = M^0_\rho(\xi) - X \quad (1)$$

$$M^0_\rho(\xi) = \frac{1}{N}\sum_{i=1}^{N} X_i \quad (2)$$

where $M^0_\rho(\xi)$ represents the zero moment (or centroid) of the set of points $\xi$ belonging to the sphere $B_\rho(X)$, $X_i$ is a point sampled from $\xi$, and $N$ is the total number of points in $\xi$.
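To make the definition concrete, the following is a minimal NumPy sketch of equations (1) and (2). It assumes the point cloud is available as an N x 3 array; it is an illustration of the definitions above, not the implementation used in our experiments.

```python
import numpy as np

def zero_moment_shift(cloud, X, rho):
    """Zero-moment shift n_rho of the points of `cloud` inside the
    sphere B_rho(X), following equations (1) and (2).

    cloud : (N, 3) array of 3D points
    X     : (3,) centre of the sphere
    rho   : sphere radius
    """
    # xi = cloud intersected with B_rho(X): points within distance rho of X
    dists = np.linalg.norm(cloud - X, axis=1)
    xi = cloud[dists <= rho]
    if len(xi) == 0:
        return np.zeros(3)
    # Equation (2): zero moment (centroid) of xi
    M0 = xi.mean(axis=0)
    # Equation (1): shift of the centroid relative to the query point
    return M0 - X

# Example: on a flat patch the shift is close to zero, on a curved patch it is not.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    plane = np.column_stack([rng.uniform(-1, 1, 1000),
                             rng.uniform(-1, 1, 1000),
                             np.zeros(1000)])
    print(np.linalg.norm(zero_moment_shift(plane, np.zeros(3), 0.2)))  # approx. 0
```

The L1 norm $|n_\rho|$ discussed next is then simply the sum of absolute values of the returned vector.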
The L1 norm $|n_\rho|$ of the zero-moment shift is a good indicator of the characteristics of the underlying surface of the set of points, as shown in Fig. 2. It can be used in conjunction with a classifier to robustly distinguish smooth surfaces from edges, and can also be combined with the first moment of the set of points to provide a robust surface classification for noisy point clouds or mesh models, as presented in [6]. In this
work, we focus on the use of the zero-moment shift to compute
a similarity metric between two arbitrary surfaces. We assume that the set of points is already preprocessed and filtered of outliers. Comparing two local surfaces is then reduced to comparing the zero-moment shifts of the two surfaces. To this end, we introduce the LoCoMo probability function $C$ between two surfaces

$$C_\rho = 1 - \frac{\max_{x}\,\phi(x;\vec{0},\Sigma) - \phi(\varepsilon;\vec{0},\Sigma)}{\max_{x}\,\phi(x;\vec{0},\Sigma)} \quad (3)$$

where $\phi$ represents the multivariate Gaussian density function

$$\phi(X;\mu,\Sigma) = \frac{1}{\sqrt{(2\pi)^{n}|\Sigma|}}\exp\!\left(-\frac{1}{2}(X-\mu)^{\top}\Sigma^{-1}(X-\mu)\right) \quad (4)$$

where $X, \mu \in \mathbb{R}^{n}$, $\Sigma$ is the covariance matrix and $n$ is the space dimension. $\vec{0}$ is the null vector of $\mathbb{R}^{3}$, and $\varepsilon$ is the error between the two zero-moment shift vectors, defined as

$$\varepsilon = n^{1}_{\rho} - n^{2}_{\rho} \quad (5)$$

where $n^{1}_{\rho}$ and $n^{2}_{\rho}$ are expressed in the same reference frame. $\max_{x}\,\phi(x;\cdot)$ is the maximum value of the function $\phi$ over all $x \in \mathbb{R}^{3}$. The zero-moment shift vectors can be projected onto the axis of the normal and the axis orthogonal to the normal of the surface to obtain a new set of coordinates $(n_{\parallel}, n_{\perp}, 0)$, which can be used for the computation of (5).
This LoCoMo metric based on zero-moment shift features is
extremely useful for grasping, as it provides a clear indication
of the local contact between the surfaces of a gripper and an
object.
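For illustration, equation (3) can be evaluated very cheaply: since the Gaussian $\phi$ attains its maximum at its mean, $C_\rho$ reduces to $\exp(-\frac{1}{2}\varepsilon^{\top}\Sigma^{-1}\varepsilon)$. The sketch below assumes a user-chosen covariance $\Sigma$ (treated here as a tuning parameter) and illustrates the metric rather than our full implementation.

```python
import numpy as np

def locomo_probability(n_obj, n_grip, Sigma):
    """LoCoMo probability C_rho between two local surfaces, eq. (3).

    n_obj, n_grip : zero-moment shift vectors of the object patch and the
                    gripper patch, expressed in the same reference frame
                    (e.g. projected as (n_par, n_perp, 0), see eq. (5)).
    Sigma         : covariance of the Gaussian in eq. (4); assumed here to
                    be a tuning parameter chosen by the user.
    """
    eps = n_obj - n_grip                       # eq. (5)
    # Eq. (3) simplifies to phi(eps)/max_x phi(x) because the Gaussian
    # peaks at its mean (the null vector):
    return float(np.exp(-0.5 * eps @ np.linalg.inv(Sigma) @ eps))

# Identical local shapes give C_rho = 1; dissimilar shapes decay towards 0.
Sigma = np.diag([1e-4, 1e-4, 1e-4])
print(locomo_probability(np.array([0.0, 0.0, 0.01]),
                         np.array([0.0, 0.0, 0.01]), Sigma))   # 1.0
print(locomo_probability(np.array([0.0, 0.0, 0.03]),
                         np.array([0.0, 0.0, 0.0]),  Sigma))   # much smaller than 1
```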
B. Grasp Selection and Ranking
Selecting stable grasps is crucial to guarantee the success
of a grasp. Several analytic methods use force closure, such as
[21] and [22]. Force closure guarantees a static equilibrium be-
tween the contact forces. Furthermore, the interaction between
two surfaces in contact can be reduced to one or multiple
contact points, as described in [23]. These assumptions provide necessary conditions for selecting a stable grasp; however, they are not sufficient conditions for grasp stability, as noted in [24].
The problem of generating grasp candidates can be formu-
lated as sampling finger poses on the surface of the object,
and combining them using the kinematics of the gripper to
form a grasp as described in [25]. Our method computes the
contact probability $C_i$, as given by (7), for each finger and uses the kinematics of the gripper to select a set of finger poses to form a grasp. The local contact probability $C_\rho$ is computed for an infinitesimal surface in a sphere of radius $\rho$. In order to account for the entire shape of a finger, $C_\rho$ needs to be integrated over its entire surface. We also introduce $R$, the ranking metric (given by (6)), to rank the grasps by computing the weighted product of the contact probability for each finger.

$$R = k \prod_{i=1}^{n_f} C_i^{w_i} \quad (6)$$

$$C_i = \frac{1}{N_s}\sum_{j=1}^{n} C^{j,X_j}_{\rho} \quad (7)$$

where $k$ is a normalizing term, $w_i$ are weights satisfying $\sum_{i=1}^{n_f} w_i = 1$, $n_f$ is the number of fingers, $C_i$ is the contact probability for a finger defined in (7), $n$ is the number of points in the vicinity of the finger, $N_s$ is a normalizing term representing the maximum number of points in the vicinity of the finger, and $C^{j,X_j}_{\rho}$ is the local contact moment probability between a point $X_j$ on the point cloud and its orthogonal projection on the surface of the gripper. More information on how to combine probability distributions can be found in [26].
Figure 2. Local surface classification based on the zero-moment shift of the Stanford Bunny. The colors red, yellow, green and blue encode, in increasing order, the magnitude of the L1 norm of the zero-moment shift vector. High values (blue) occur on the ears, which have high curvature, and low values (red) on surfaces with low curvature. Left: ρ = 0.008; right: ρ = 0.016.
Algorithm 1: Grasp generation and ranking.
Data: Point cloud $\mathcal{X}$, fingers' 3D model, sphere radius $\rho$
Result: Top-k grasps
1  Compute the surface normal at each point $X \in \mathcal{X}$
2  for $X \in \mathcal{X}$ do
3      Select the set of points $\xi$ in $B_\rho(X)$
4      Compute $n_\rho$ with (1)
5  end
6  for each finger do
7      for $X \in \mathcal{X}$ do
8          Sample several finger poses $\mathcal{P}_f$ around $X$
9          for $p \in \mathcal{P}_f$ do
10             Select the set of points $\mathcal{X}_s$ within a distance $d$ from the surface of the finger
11             for $X_s \in \mathcal{X}_s$ do
12                 Project $X_s$ on the finger's surface
13                 Compute $C^{s,X_s}_{\rho}$ with (3)
14             end
15             Compute $C_i$ with (7)
16             Append $\mathcal{P}_f$ to $\mathcal{P}$
17         end
18     end
19 end
20 Find $\mathcal{F}$, the set of finger poses in $\mathcal{P}$ satisfying the kinematic constraints of the gripper
21 for $f \in \mathcal{F}$ do
22     Compute $R$ with (6)
23 end
24 Order $\mathcal{F}$ by decreasing order of $R$
25 Sample gripper poses from $\mathcal{F}$
26 return the top-k grasp poses
A summary of the method can be found in Alg. 1.
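The ranking stage of Alg. 1 can be sketched in a few lines: the per-finger contact probability (7) averages the local LoCoMo values of the points found near that finger, and (6) combines the fingers by a weighted product. The weights, the normaliser $N_s$ and the example numbers below are illustrative placeholders rather than values prescribed by the method.

```python
import numpy as np

def finger_contact_probability(local_probs, N_s):
    """Eq. (7): average the local contact moment probabilities C_rho of the
    points found in the vicinity of one finger, normalised by N_s, the
    maximum number of points expected in that vicinity."""
    return float(np.sum(local_probs)) / N_s

def grasp_rank(finger_probs, weights, k=1.0):
    """Eq. (6): weighted product of the per-finger contact probabilities."""
    assert np.isclose(np.sum(weights), 1.0)
    return k * float(np.prod(np.power(finger_probs, weights)))

# Two-finger example: both fingers see well-matching surface patches.
C1 = finger_contact_probability(np.array([0.9, 0.8, 0.95]), N_s=4)
C2 = finger_contact_probability(np.array([0.7, 0.85, 0.9, 0.8]), N_s=4)
print(grasp_rank(np.array([C1, C2]), weights=np.array([0.5, 0.5])))
```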
IV. EXPERIMENTAL RESULTS
A. Experimental setup
Our experimental setup (shown in Fig. 3) comprises a 7-degrees-of-freedom KUKA LBR iiwa arm whose end-effector is fitted with a Schunk PG70 parallel-jaw gripper with flat fingers. The maximum stroke of the gripper is 68 mm.
The developed method neither requires any prior knowledge of the scene nor uses any object models. However, for each
grasping trial, the robot workspace containing test objects
is observed by moving a robot wrist-mounted Ensenso N35
depth camera to six different locations. Resulting partial point
clouds from all viewpoints are stitched together, in robot base
coordinate frame, to form a point cloud of the work scene.
After segmenting the ground plane, the resulting cloud is then
used by our method to generate grasp hypotheses. Hand-eye
calibration has been performed beforehand to transform the camera-acquired point-cloud data into the robot's coordinate system, as well as to simplify the computations.

Figure 3. Hardware setup used to validate the proposed grasping method: KUKA 7-DoF robot with wrist-mounted 3D camera, gripper, and test objects.

Figure 4. The 21 objects used for the experiments. (left column) spring clamp, aluminum profile, multi-head screwdriver, screwdriver, plastic strawberry, golf ball; (middle column) racquetball, plastic lemon, plastic nectarine, wood block, potted meat can, electric hand drill, plastic bottle, gray pipe, white pipe; (right column) blue cup, hammer, bleach cleanser, gas knob, bamboo bowl, mustard container.
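For illustration, the ground-plane segmentation step mentioned above can be realised with a standard RANSAC plane fit; the short self-contained sketch below is a generic example of such preprocessing (with placeholder thresholds), not the exact filter used in our pipeline.

```python
import numpy as np

def segment_plane(points, dist_thresh=0.005, iters=500, rng=None):
    """Fit a dominant plane to `points` (M, 3) with RANSAC and return a
    boolean mask of the inliers (e.g. the table surface to be removed)."""
    rng = np.random.default_rng() if rng is None else rng
    best_mask = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        p1, p2, p3 = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(p2 - p1, p3 - p1)
        norm = np.linalg.norm(normal)
        if norm < 1e-9:            # degenerate sample, skip
            continue
        normal /= norm
        dists = np.abs((points - p1) @ normal)
        mask = dists < dist_thresh
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask

# Usage: keep only the non-planar (object) points of the stitched cloud, e.g.
# objects_only = cloud[~segment_plane(cloud)]
```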
The proposed grasping method has been tested on 21
objects, as shown in Fig.4, comprising a wide variety of
shapes, masses, materials, and textures. 13 of them are from
the YCB object set [27]. The objects are selected such that
they are small enough to be physically graspable by the gripper used.
Two sets of experiments were conducted. Firstly, we tested
the robot’s ability to grasp and lift individual objects from
Table I
SET OF OBJECTS USED FOR THE EXPERIMENT.
Object Success Rate 1st Grasp (5 Trials)
bleach cleanser 80% (4/5)
racquetball 100% (5/5)
blue cup 80% (4/5)
aluminium profile 100% (5/5)
plastic bottle 100% (5/5)
bamboo bowl 100% (5/5)
spring clamp 100% (5/5)
electric hand drill 80% (4/5)
gas knob 100% (5/5)
golf ball 100% (5/5)
hammer 100% (5/5)
plastic lemon 80% (4/5)
mustard container 100% (5/5)
plastic nectarine 100% (5/5)
gray pipe 100% (5/5)
potted meat can 40% (2/5)
screwdriver 100% (5/5)
plastic strawberry 100% (5/5)
multi-head screwdriver 100% (5/5)
white pipe 60% (3/5)
wood block 100% (5/5)
Success Rate 91.43% (96/105)
the surface of a table. A second set of tests was performed
to analyse the robot’s ability to clear randomly piled heaps of
objects, by grasping and lifting objects successively, until none
remained. During trials, running on an Intel Core i7-4790K
CPU @ 4.00GHz and 16 GB RAM, our method took 13.53 seconds (on average) to generate 1500 grasp hypotheses for a point cloud with 31183 data points, corresponding to a cluttered scene of 13 objects. This computational time is distributed as follows: the local contact moment computation is performed in 1.26 seconds (9.3%), the selection of finger pairs with feasible gripper kinematics takes 6.29 seconds (46.5%), and the robot's end-effector pose sampling and inverse kinematics checking take up to 5.98 seconds (44.2%).
B. Grasping individual objects
Our first experiment evaluates the robot’s ability to grasp
and lift individual objects off a flat table surface. 21 objects
were used, with five grasping trials performed on each object.
For each of the five trials, we randomly placed each object
on the table with different orientations and positions. After
capturing and registering partial point-clouds from multiple
views, points belonging to the table surface are filtered out and
the resulting object point cloud is then used to generate grasp
hypotheses, as described in Alg.1. The grasps are ranked, and
the grasp with the highest likelihood, Eq. (6), is executed. A
grasp is recorded as successful if the robot manages to grasp
and lift the object to a post-grasp position 20 cm above the
table, and hold the object for more than 10 seconds without
dropping it.
Table I shows the results of our algorithm when grasping
objects that are individually placed on a table. Fig.5 shows
images of successful grasps. The overall success rate for all
five trials on all 21 objects is 91.43% (96 successful grasps out of 105). In 97.14% of trials (102/105), the LoCoMo algorithm suggested viable grasps; in the trials that nevertheless failed, objects were dropped for other reasons. For example, in some cases the object was heavy and the selected grasp was far from the centre of mass, placing a large torque on the gripper jaws and causing the object to twist loose. In the case of the potted meat can, the success rate was only 40% (2/5); this was due to shiny surfaces, which caused a very noisy point cloud.

Figure 5. Successful grasps for various objects. In each row, from left to right: the first image shows the point cloud of the object with the contact moment probability and the highest-ranked grasp; the second image shows the pre-grasp position of the gripper; the third image shows the grasp; finally, the fourth image shows the post-grasp position with the object grasped.
In safety-critical, high-consequence industries, such as nu-
clear waste handling or other extreme environments, au-
tonomous robotics methods are likely to be introduced as
“operator-assistance technologies”, i.e., human-supervised au-
tonomy. In such cases, a human operator might select between
several grasps that have been suggested by an autonomous
grasp planner. As a small step towards exploring such a
system, we repeated the first experiment, however in each
attempt we allowed a human to choose one of the best five
grasp candidates suggested by the LoCoMo algorithm. In this
case, grasp success rose to 98%. This suggests that improved
performance might be obtained by combining LoCoMo with
other kinds of information, e.g., selecting grasps which result
in minimal torques.
C. Grasping objects from a cluttered heap
The second set of experiments was performed on cluttered,
self-occluding heaps of objects. For each heap, at least 6
objects were placed in a random pile. Three different heaps
were used, as shown in Fig. 6. The robot is tasked with clearing each heap by successively grasping and lifting objects until none remain.

Figure 6. Three different cluttered scenes generated for validating our approach.

No ground plane segmentation was performed in this
second experiment. However, the LoCoMo algorithm was able
to automatically label the flat table surface as ungraspable; i.e., excluding flat surfaces and focusing attention on objects appears to be an inherent behaviour of the algorithm.
At each iteration, grasps are generated, and the highest
ranked grasp is executed. Each object is removed without
replacement if the grasp is successful, and the experiment
is repeated until all the objects are grasped or the algorithm
reports that it cannot identify any more feasible grasps. The
success of each grasp attempt is determined in the same way
as in the first experiment.
Table II shows the results for the heap-picking experiments.
We report the results of three different heaps containing at least
six objects each. For the first heap, 100% of the objects were
grasped successfully from the table, one after the other. Only
the gas knob required two trials to be successfully grasped,
with all other objects grasped on the first attempt.
For the second heap, all objects were grasped at the first
attempt, and the success rate was 100%. During its third
grasp, the robot chose to grasp and lift the bowl object, while
the bowl still held three other objects inside it (multi-head
screwdriver, plastic bottle and nectarine). In order to continue the experiment, these objects were placed back on the table and then successfully grasped, needing only one attempt each.

Table II
CLUTTERED SCENE EXPERIMENT RESULTS.
Scene Attempt Object Success / Failure
#1
1 blue cup success
2 golf ball success
3 white pipe success
4 electric hand drill success
5 gas knob failure
6 wood block success
7 gas knob success
8 plastic nectarine success
#2
1 gray pipe success
2 aluminum profile success
3 bamboo bowl success
4 multi-head screwdriver success
5 plastic bottle success
6 plastic nectarine success
#3
1 mustard container success
2 plastic bottle success
3 spring clamp failure
4 plastic lemon success
5 spring clamp success
6 hammer failure
7 hammer success
8 plastic strawberry rolled off table
For the third heap, 83% of the objects (5 out of 6) were
successfully grasped. The spring clamp and hammer proved
to be difficult, due to sparse point clouds. However, only two
attempts were required to grasp these objects. The system did not fail to plan a grasp for the final object (plastic strawberry); unfortunately, lifting the hammer caused the strawberry to roll off the table, so that this final object of the heap could not be grasped.
Fig. 7 shows the generated grasps in cluttered scene 1.
The robot was able to clear all three heaps successfully, the
only exception being the final object of the third heap, which
was pushed off the table during lifting of one of the other
objects.
D. Discussion
Overall, results suggest that the LoCoMo algorithm is very
promising. For lifting individual objects, a success rate of
91.43% was obtained over five different trials on 21 dif-
ferent objects, featuring a very wide variety of shapes and
appearances. This result is remarkable considering that the
system did not have any model or other a-priori knowledge
of the objects being grasped. Additionally, no training data
was required, and no learning was involved to obtain these
results. Moreover, in the heap-picking experiments, featuring
extreme clutter conditions, LoCoMo was able to grasp most
of the objects at the first attempt (15 out of 19 objects) and
was able to successfully grasp all objects, of all heaps, with
the exception of the final object of the final heap (plastic
strawberry) which rolled off the table during earlier activity.
Figure 7. Results of grasp execution in cluttered scene 1 (columns: attempts 1 to 8). First row: images of the point cloud of the scene and the gripper with the planned grasp. Second row: pre-grasp position of the gripper with respect to the cluttered scene. Third row: execution of the grasp. Fourth row: post-grasp position of the gripper. The chronological sequence is from left to right, i.e. the first column shows the grasping of the first object, the second column the grasping of the second object, and so on. Detailed results can be found in the provided supplementary video.

Aside from a small number of unusual incidents, the proposed algorithm appears to have planned robust grasps almost 100% of the time. However, we believe that we can improve
robustness in several ways. We noted earlier that the set of
five highest-ranked grasps occasionally contains a grasp that
performs better than the highest ranked grasp. This is because
LoCoMo selects grasps based purely on the geometry of
surfaces. Combining LoCoMo's robust selection of graspable geometrical features with other kinds of information, such as mass distribution [28], may enable more robust performance. Additionally, combining multiple grasp hypotheses with human-supervised autonomy appears to outperform pure autonomy based on LoCoMo alone.
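As a rough numerical illustration of the torque-related failure mode noted in Section IV-B, the gravitational torque about the grasp grows linearly with the lever arm between the grasp point and the centre of mass; the masses and distances below are arbitrary examples, not measurements from our experiments.

```python
import numpy as np

def gravity_torque(mass_kg, grasp_point, com, g=9.81):
    """Magnitude of the gravitational torque (N*m) about the grasp point,
    tau = r x F, with F = m*g acting downwards at the centre of mass."""
    r = np.asarray(com) - np.asarray(grasp_point)     # lever arm (m)
    F = np.array([0.0, 0.0, -mass_kg * g])            # gravity (N)
    return float(np.linalg.norm(np.cross(r, F)))

# A 1.5 kg object grasped 2 cm vs. 15 cm from its centre of mass:
print(gravity_torque(1.5, [0, 0, 0], [0.02, 0, 0]))   # ~0.29 N*m
print(gravity_torque(1.5, [0, 0, 0], [0.15, 0, 0]))   # ~2.2 N*m
```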
V. CONCLUSION
In this paper, we proposed a novel grasp generation method,
based on the LoCoMo metric which searches for similarities
between the shape of finger surfaces, and the local shape
of an object, observed as a partial point-cloud. The metric
is based on zero-moment shift visual features, which encode
useful information about local surface curvature. Our method
does not rely on any a-priori knowledge about objects or
their physical parameters, and also does not require learning
from any kind of training data. Grasps are planned from
point-cloud images of objects, viewed from a depth-camera
mounted on the robot's wrist. Experimental trials, with a real robot and a wide variety of objects, suggest that our method
generalises well to many shapes. We also demonstrated very
robust performance in extremely cluttered scenes. Moreover,
the algorithm is also capable of classifying certain objects
(e.g., flat table surfaces) as not graspable.
Our future work will focus on improving the performance
of the method in terms of speed and extending it to perform
multi-finger grasping. We will also focus on accomplishing
complex manipulations in challenging scenarios, e.g., nuclear,
automotive, etc., by integrating it with our previous state
estimation and control methodologies [29], [30].
VI. ACKNOWLEDGEMENTS
This work forms part of the UK National Centre for
Nuclear Robotics initiative, funded by EPSRC EP/R02572X/1.
It is also supported by H2020 RoMaNS 645582, and EP-
SRC EP/P017487/1, EP/P01366X/1. Stolkin is supported by
a Royal Society Industry Fellowship. Ortenzi and Corke are
supported by the Australian Research Council Centre of Ex-
cellence for Robotic Vision (project number CE140100016).
REFERENCES
[1] A. T. Miller and P. K. Allen, “Graspit! a versatile simulator for robotic
grasping,” IEEE Robotics & Automation Magazine, vol. 11, no. 4, pp.
110–122, 2004.
[2] V.-D. Nguyen, “Constructing force-closure grasps,” The International
Journal of Robotics Research, vol. 7, no. 3, pp. 3–16, 1988.
[3] S. Levine, P. Pastor, A. Krizhevsky, J. Ibarz, and D. Quillen, “Learning
hand-eye coordination for robotic grasping with deep learning and large-
scale data collection,” The International Journal of Robotics Research,
vol. 37, no. 4-5, pp. 421–436, 2018.
[4] N. Marturi, M. Kopicki, A. Rastegarpanah, V. Rajasekaran, M. Adjigble,
R. Stolkin, A. Leonardis, and Y. Bekiroglu, “Dynamic grasp and
trajectory planning for moving objects,” Autonomous Robots, in-press.
[5] A. ten Pas, M. Gualtieri, K. Saenko, and R. Platt, “Grasp pose detection
in point clouds,” The International Journal of Robotics Research, p.
0278364917735594, 2017.
[6] U. Clarenz, M. Rumpf, and A. Telea, “Robust feature detection and
local classification for surfaces based on moment analysis,” IEEE
Transactions on Visualization and Computer Graphics, vol. 10, no. 5,
pp. 516–524, 2004.
[7] J. Weisz and P. K. Allen, “Pose error robust grasping from contact
wrench space metrics,” in Robotics and Automation (ICRA), 2012 IEEE
International Conference on. IEEE, 2012, pp. 557–562.
[8] C. Rosales, R. Suárez, M. Gabiccini, and A. Bicchi, “On the synthesis
of feasible and prehensile robotic grasps,” in Robotics and Automation
(ICRA), 2012 IEEE International Conference on. IEEE, 2012, pp.
550–556.
[9] M. A. Roa and R. Suárez, “Computation of independent contact regions
for grasping 3-d objects,” IEEE Transactions on Robotics, vol. 25, no. 4,
pp. 839–850, 2009.
[10] D. Prattichizzo and J. C. Trinkle, “Grasping,” in Springer handbook of
robotics. Springer, 2008, pp. 671–700.
[11] J.-W. Li, H. Liu, and H.-G. Cai, “On computing three-finger force-
closure grasps of 2-d and 3-d objects,” IEEE Transactions on Robotics
and Automation, vol. 19, no. 1, pp. 155–161, 2003.
[12] M. Gualtieri, A. ten Pas, K. Saenko, and R. Platt, “High precision
grasp pose detection in dense clutter,” in Intelligent Robots and Systems
(IROS), 2016 IEEE/RSJ International Conference on. IEEE, 2016, pp.
598–605.
[13] M. Kopicki, R. Detry, M. Adjigble, R. Stolkin, A. Leonardis, and J. L.
Wyatt, “One-shot learning and generation of dexterous grasps for novel
objects,” The International Journal of Robotics Research, vol. 35, no. 8,
pp. 959–976, 2016.
[14] I. Lenz, H. Lee, and A. Saxena, “Deep learning for detecting robotic
grasps,” The International Journal of Robotics Research, vol. 34, no.
4-5, pp. 705–724, 2015.
[15] H. B. Amor, O. Kroemer, U. Hillenbrand, G. Neumann, and J. Peters,
“Generalization of human grasping for multi-fingered robot hands,” in
Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International
Conference on. IEEE, 2012, pp. 2043–2050.
[16] M. Ma, N. Marturi, Y. Li, A. Leonardis, and R. Stolkin, “Region-
sequence based six-stream cnn features for general and fine-grained
human action recognition in videos,” Pattern Recognition, vol. 76, pp.
506–521, 2018.
[17] M. Ma, N. Marturi, Y. Li, R. Stolkin, and A. Leonardis, “A local-global
coupled-layer puppet model for robust online human pose tracking,”
Computer Vision and Image Understanding, vol. 153, pp. 163–178,
2016.
[18] D. Smeets, J. Keustermans, D. Vandermeulen, and P. Suetens, “meshsift:
Local surface features for 3d face recognition under expression varia-
tions and partial data,” Computer Vision and Image Understanding, vol.
117, no. 2, pp. 158–169, 2013.
[19] E. Paquet, M. Rioux, A. Murching, T. Naveen, and A. Tabatabai,
“Description of shape information for 2-d and 3-d objects,” Signal
processing: Image communication, vol. 16, no. 1-2, pp. 103–122, 2000.
[20] R. B. Rusu, G. Bradski, R. Thibaux, and J. Hsu, “Fast 3d recognition and
pose using the viewpoint feature histogram,” in Intelligent Robots and
Systems (IROS), 2010 IEEE/RSJ International Conference on. IEEE,
2010, pp. 2155–2162.
[21] R. M. Murray, Z. Li, and S. S. Sastry, A mathematical introduction to robotic manipulation. CRC Press, 1994.
[22] C. Ferrari and J. Canny, “Planning optimal grasps,” in Robotics and
Automation, 1992. Proceedings., 1992 IEEE International Conference
on. IEEE, 1992, pp. 2290–2295.
[23] A. Bicchi and V. Kumar, “Robotic grasping and contact: A review,” in
Robotics and Automation, 2000. Proceedings. ICRA’00. IEEE Interna-
tional Conference on, vol. 1. IEEE, 2000, pp. 348–353.
[24] J. Bohg, A. Morales, T. Asfour, and D. Kragic, “Data-driven grasp
synthesis - a survey,” IEEE Transactions on Robotics, vol. 30, no. 2,
pp. 289–309, 2014.
[25] M. Kopicki, R. Detry, F. Schmidt, C. Borst, R. Stolkin, and J. L. Wyatt,
“Learning dexterous grasps that generalise to novel objects by combining
hand and contact models,” in Robotics and Automation (ICRA), 2014
IEEE International Conference on. IEEE, 2014, pp. 5358–5365.
[26] S. Kaplan, “Combining probability distributions from experts in risk
analysis,” Risk Analysis, vol. 20, no. 2, pp. 155–156, 2000.
[27] B. Calli, A. Walsman, A. Singh, S. Srinivasa, P. Abbeel, and A. M.
Dollar, “Benchmarking in manipulation research: Using the yale-cmu-
berkeley object and model set,” IEEE Robotics & Automation Magazine,
vol. 22, no. 3, pp. 36–52, 2015.
[28] N. Mavrakis, R. Stolkin, L. Baronti, M. Kopicki, M. Castellani et al.,
“Analysis of the inertia and dynamics of grasped objects, for choosing
optimal grasps to enable torque-efficient post-grasp manipulations,” in
Humanoid Robots (Humanoids), 2016 IEEE-RAS 16th International
Conference on. IEEE, 2016, pp. 171–178.
[29] V. Ortenzi, N. Marturi, R. Stolkin, J. A. Kuo, and M. Mistry, “Vision-
guided state estimation and control of robotic manipulators which lack
proprioceptive sensors,” in Intelligent Robots and Systems (IROS), 2016
IEEE/RSJ International Conference on. IEEE, 2016, pp. 3567–3574.
[30] N. Marturi, A. Rastegarpanah, C. Takahashi, M. Adjigble, R. Stolkin,
S. Zurek, M. Kopicki, M. Talha, J. A. Kuo, and Y. Bekiroglu, “Towards
advanced robotic manipulation for nuclear decommissioning: a pilot
study on tele-operation and autonomy,” in Robotics and Automation for
Humanitarian Applications (RAHA), 2016 International Conference on.
IEEE, 2016, pp. 1–8.
... Although geometric methods may not be as fast as some learning-based methods, they are: easy to generalise for multi-fingered hands; work well for grasping novel objects with a-priori unknown geometries; and can handle heaps of objects in cluttered and unstructured environments [10]. Recent methods, based on shape similarities between hand fingers and object surfaces, have demonstrated remarkable results for robotic grasping of unknown objects [10]- [14]. ...
... Although geometric methods may not be as fast as some learning-based methods, they are: easy to generalise for multi-fingered hands; work well for grasping novel objects with a-priori unknown geometries; and can handle heaps of objects in cluttered and unstructured environments [10]. Recent methods, based on shape similarities between hand fingers and object surfaces, have demonstrated remarkable results for robotic grasping of unknown objects [10]- [14]. These methods synthesise grasp hypotheses by matching the surface regions of gripper fingers with similar surface patches on the object, to maximize the contact area during grasping. ...
... The sampled grasp candidates are then ranked or sorted by virtue of custom grasp ranking metrics to find the candidate with highest probability of success. Our previous work in this area presented a Local Contact Moment (LoCoMo) similarity metric-based grasp planner [10]. This method demonstrated high success rates in grasping individual unknown objects and also in clearing random cluttered heaps of objects. ...
Conference Paper
Full-text available
This paper presents a spectral correlation-based method (SpectGRASP) for robotic grasping of arbitrarily shaped, unknown objects. Given a point cloud of an object, SpectGRASP extracts contact points on the object's surface matching the hand configuration. It neither requires offline training nor a-priori object models. We propose a novel Binary Extended Gaussian Image (BEGI), which represents the point cloud surface normals of both object and robot fingers as signals on a 2-sphere. Spherical harmonics are then used to estimate the correlation between fingers and object BEGIs. The resulting spectral correlation density function provides a similarity measure of gripper and object surface normals. This is highly efficient in that it is simultaneously evaluated at all possible finger rotations in SO(3). A set of contact points are then extracted for each finger using rotations with high correlation values. We then use our previous work, Local Contact Moment (LoCoMo) similarity metric, to sequentially rank the generated grasps such that the one with maximum likelihood is executed. We evaluate the performance of SpectGRASP by conducting experiments with a 7-axis robot fitted with a parallel-jaw gripper, in a physics simulation environment. Obtained results indicate that the method not only can grasp individual objects, but also can successfully clear randomly organized groups of objects. The SpectGRASP method also outperforms the closest state-of-the-art method in terms of grasp generation time and grasp-efficiency.
... Although geometric methods may not be as fast as some learning-based methods, they are: easy to generalise for multi-fingered hands; work well for grasping novel objects with a-priori unknown geometries; and can handle heaps of objects in cluttered and unstructured environments [10]. Recent methods, based on shape similarities between hand fingers and object surfaces, have demonstrated remarkable results for robotic grasping of unknown objects [10]- [14]. ...
... Although geometric methods may not be as fast as some learning-based methods, they are: easy to generalise for multi-fingered hands; work well for grasping novel objects with a-priori unknown geometries; and can handle heaps of objects in cluttered and unstructured environments [10]. Recent methods, based on shape similarities between hand fingers and object surfaces, have demonstrated remarkable results for robotic grasping of unknown objects [10]- [14]. These methods synthesise grasp hypotheses by matching the surface regions of gripper fingers with similar surface patches on the object, to maximize the contact area during grasping. ...
... The sampled grasp candidates are then ranked or sorted by virtue of custom grasp ranking metrics to find the candidate with highest probability of success. Our previous work in this area presented a Local Contact Moment (LoCoMo) similarity metric-based grasp planner [10]. This method demonstrated high success rates in grasping individual unknown objects and also in clearing random cluttered heaps of objects. ...
Preprint
Full-text available
This paper presents a spectral correlation-based method (SpectGRASP) for robotic grasping of arbitrarily shaped, unknown objects. Given a point cloud of an object, SpectGRASP extracts contact points on the object's surface matching the hand configuration. It neither requires offline training nor a-priori object models. We propose a novel Binary Extended Gaussian Image (BEGI), which represents the point cloud surface normals of both object and robot fingers as signals on a 2-sphere. Spherical harmonics are then used to estimate the correlation between fingers and object BEGIs. The resulting spectral correlation density function provides a similarity measure of gripper and object surface normals. This is highly efficient in that it is simultaneously evaluated at all possible finger rotations in SO(3). A set of contact points are then extracted for each finger using rotations with high correlation values. We then use our previous work, Local Contact Moment (LoCoMo) similarity metric, to sequentially rank the generated grasps such that the one with maximum likelihood is executed. We evaluate the performance of SpectGRASP by conducting experiments with a 7-axis robot fitted with a parallel-jaw gripper, in a physics simulation environment. Obtained results indicate that the method not only can grasp individual objects, but also can successfully clear randomly organized groups of objects. The SpectGRASP method also outperforms the closest state-of-the-art method in terms of grasp generation time and grasp-efficiency.
... In this paper, we take a different approach for the design of a dual quaternion-based visual controller by explicitly integrating grasping into our visual servoing framework. With this aim, we explore our previous local contact moment (LoCoMo)-based unknown object grasping [16] to grasp arbitrarily moving objects in a 3D space. Firstly, LoCoMo will generate a ranked list of stable grasp poses on the perceived point cloud data of the scene object. ...
... Throughout this process, by using visual servoing instead of conventional path planning, we are able to accomplish moving object grasping. 1 Visual tracking from fiducial markers, 2 grasp planning from LoCoMo [16], 3 velocity regulator and robot joint space visual servo control, 4 joint limit avoidance as a secondary task projected at the robot null-space and 5 grasping and testing. The process starts with 1 , where we capture the object point cloud and an image to estimate initial object pose. ...
... Moving object grasping is performed by combining a grasp generator with the proposed visual servoing control scheme. As mentioned earlier, we have used LoCoMo-based grasp planner [16] to synthesise grasps on a task object. The advantage of using the LoCoMo grasp planner is twofold. ...
Preprint
Full-text available
This paper presents a new dual quaternion-based formulation for pose-based visual servoing. Extending our previous work on local contact moment (LoCoMo) based grasp planning, we demonstrate grasping of arbitrarily moving objects in 3D space. Instead of using the conventional axis-angle parameterization, dual quaternions allow designing the visual servoing task in a more compact manner and provide robustness to manipulator singularities. Given an object point cloud, LoCoMo generates a ranked list of grasp and pre-grasp poses, which are used as desired poses for visual servoing. Whenever the object moves (tracked by visual marker tracking), the desired pose updates automatically. For this, capitalising on the dual quaternion spatial distance error, we propose a dynamic grasp re-ranking metric to select the best feasible grasp for the moving object. This allows the robot to readily track and grasp arbitrarily moving objects. In addition, we also explore the robot null-space with our controller to avoid joint limits so as to achieve smooth trajectories while following moving objects. We evaluate the performance of the proposed visual servoing by conducting simulation experiments of grasping various objects using a 7-axis robot fitted with a 2-finger gripper. Obtained results demonstrate the efficiency of our proposed visual servoing.
... In this paper, we take a different approach for the design of a dual quaternion-based visual controller by explicitly integrating grasping into our visual servoing framework. With this aim, we explore our previous local contact moment (LoCoMo)-based unknown object grasping [16] to grasp arbitrarily moving objects in a 3D space. Firstly, LoCoMo will generate a ranked list of stable grasp poses on the perceived point cloud data of the scene object. ...
... Throughout this process, by using visual servoing instead of conventional path planning, we are able to accomplish moving object grasping. 1 Visual tracking from fiducial markers, 2 grasp planning from LoCoMo [16], 3 velocity regulator and robot joint space visual servo control, 4 joint limit avoidance as a secondary task projected at the robot null-space and 5 grasping and testing. The process starts with 1 , where we capture the object point cloud and an image to estimate initial object pose. ...
... Moving object grasping is performed by combining a grasp generator with the proposed visual servoing control scheme. As mentioned earlier, we have used LoCoMo-based grasp planner [16] to synthesise grasps on a task object. The advantage of using the LoCoMo grasp planner is twofold. ...
Conference Paper
Full-text available
This paper presents a new dual quaternion-based formulation for pose-based visual servoing. Extending our previous work on local contact moment (LoCoMo) based grasp planning, we demonstrate grasping of arbitrarily moving objects in 3D space. Instead of using the conventional axis-angle parameterization, dual quaternions allow designing the visual servoing task in a more compact manner and provide robustness to manipulator singularities. Given an object point cloud, LoCoMo generates a ranked list of grasp and pre-grasp poses, which are used as desired poses for visual servoing. Whenever the object moves (tracked by visual marker tracking), the desired pose updates automatically. For this, capitalising on the dual quaternion spatial distance error, we propose a dynamic grasp re-ranking metric to select the best feasible grasp for the moving object. This allows the robot to readily track and grasp arbitrarily moving objects. In addition, we also explore the robot null-space with our controller to avoid joint limits so as to achieve smooth trajectories while following moving objects. We evaluate the performance of the proposed visual servoing by conducting simulation experiments of grasping various objects using a 7-axis robot fitted with a 2-finger gripper. Obtained results demonstrate the efficiency of our proposed visual servoing.
... As an alternative to precise grasp planning computation, with complex fully actuated robotic hands, under-actuated gripper designs have also been proposed, which mechanically adapt to different object shapes [14]. Grasp planning algorithms for unknown objects can be categorized into global grasping approaches [15] considering the whole object to find the best grasp, and local grasping approaches [16], [17] using partial data from the object, e.g. local contact moments [17], or a hierarchical representation using edge and texture information to generate grasps from only visible parts of objects that may cause failures [16]. ...
... Grasp planning algorithms for unknown objects can be categorized into global grasping approaches [15] considering the whole object to find the best grasp, and local grasping approaches [16], [17] using partial data from the object, e.g. local contact moments [17], or a hierarchical representation using edge and texture information to generate grasps from only visible parts of objects that may cause failures [16]. In general, global approaches are preferred for multi-finger grasping [13]. ...
Preprint
Full-text available
This paper addresses the problem of simultaneously exploring an unknown object to model its shape, using tactile sensors on robotic fingers, while also improving finger placement to optimise grasp stability. In many situations, a robot will have only a partial camera view of the near side of an observed object, for which the far side remains occluded. We show how an initial grasp attempt, based on an initial guess of the overall object shape, yields tactile glances of the far side of the object which enable the shape estimate and consequently the successive grasps to be improved. We propose a grasp exploration approach using a probabilistic representation of shape, based on Gaussian Process Implicit Surfaces. This representation enables initial partial vision data to be augmented with additional data from successive tactile glances. This is combined with a probabilistic estimate of grasp quality to refine grasp configurations. When choosing the next set of finger placements, a bi-objective optimisation method is used to mutually maximise grasp quality and improve shape representation during successive grasp attempts. Experimental results show that the proposed approach yields stable grasp configurations more efficiently than a baseline method, while also yielding improved shape estimate of the grasped object.
... As an alternative to precise grasp planning computation, with complex fully actuated robotic hands, under-actuated gripper designs have also been proposed, which mechanically adapt to different object shapes [14]. Grasp planning algorithms for unknown objects can be categorized into global grasping approaches [15] considering the whole object to find the best grasp, and local grasping approaches [16], [17] using partial data from the object, e.g. local contact moments [17], or a hierarchical representation using edge and texture information to generate grasps from only visible parts of objects that may cause failures [16]. ...
... Grasp planning algorithms for unknown objects can be categorized into global grasping approaches [15] considering the whole object to find the best grasp, and local grasping approaches [16], [17] using partial data from the object, e.g. local contact moments [17], or a hierarchical representation using edge and texture information to generate grasps from only visible parts of objects that may cause failures [16]. In general, global approaches are preferred for multi-finger grasping [13]. ...
Article
Full-text available
This paper addresses the problem of simultaneously exploring an unknown object to model its shape, using tactile sensors on robotic fingers, while also improving finger placement to optimise grasp stability. In many situations, a robot will have only a partial camera view of the near side of an observed object, for which the far side remains occluded. We show how an initial grasp attempt, based on an initial guess of the overall object shape, yields tactile glances of the far side of the object which enable the shape estimate and consequently the successive grasps to be improved. We propose a grasp exploration approach using a probabilistic representation of shape, based on Gaussian Process Implicit Surfaces. This representation enables initial partial vision data to be augmented with additional data from successive tactile glances. This is combined with a probabilistic estimate of grasp quality to refine grasp configurations. When choosing the next set of finger placements, a bi-objective optimisation method is used to mutually maximise grasp quality and improve shape representation during successive grasp attempts. Experimental results show that the proposed approach yields stable grasp configurations more efficiently than a baseline method, while also yielding improved shape estimate of the grasped object.
... In robotic grasping, grasp selection is usually complex as its stability takes into account different factors such as gripper geometry, object shape, mass, and surface friction. Although methods have been proposed to grasp without needing the estimation of physical properties [24], most common approaches for grasping rely on metrics based on physical properties to evaluate grasp quality. In particular, a set of best grasps are sampled and evaluated from a grasp sampling and evaluation algorithm. ...
Article
Full-text available
Accurately modeling local surface properties of objects is crucial to many robotic applications, from grasping to material recognition. Surface properties like friction are however difficult to estimate, as visual observation of the object does not convey enough information over these properties. In contrast, haptic exploration is time consuming as it only provides information relevant to the explored parts of the object. In this work, we propose a joint visuo-haptic object model that enables the estimation of surface friction coefficient over an entire object by exploiting the correlation of visual and haptic information, together with a limited haptic exploration by a robotic arm. We demonstrate the validity of the proposed method by showing its ability to estimate varying friction coefficients on a range of real multi-material objects. Furthermore, we illustrate how the estimated friction coefficients can improve grasping success rate by guiding a grasp planner toward high friction areas.
... Nowadays, manipulation has become an increasingly important standing research topic in robotics. Most of the related works in this field consider the grasping of rigid bodies as an extensively studied area, which is rich with theoretical analysis and implementations using different robotic hands ( [1][2][3][4][5][6][7][8]). Robotic grasping of deformable objects has also acquired importance recently due to several potential applications in various areas, including biomedical processing, the food processing industry, service robotics, robotized surgery, etc. ( [9][10][11][12][13][14]). ...
Article
Full-text available
In the grasping and manipulation of 3D deformable objects by robotic hands, the physical contact constraints between the fingers and the object have to be considered in order to validate the robustness of the task. Nevertheless, previous works rarely establish contact interaction models based on these constraints that enable the precise control of forces and deformations during the grasping process. This paper considers all steps of the grasping process of deformable objects in order to implement a complete grasp planning pipeline by computing the initial contact points (pregrasp strategy), and later, the contact forces and local deformations of the contact regions while the fingers close over the grasped object (grasp strategy). The deformable object behavior is modeled using a nonlinear isotropic mass-spring system, which is able to produce potential deformation. By combining both models (the contact interaction and the object deformation) in a simulation process, a new grasp planning method is proposed in order to guarantee the stability of the 3D grasped deformable object. Experimental grasping experiments of several 3D deformable objects with a Barrett hand (3-fingered) and a 6-DOF industrial robotic arm are executed. Not only will the final stable grasp configuration of the hand + object system be obtained, but an arm + hand approaching strategy (pregrasp) will also be computed.
... The task to perform plays an important role in such a choice (Ansuini et al., 2006, 2008; Feix et al., 2014a; Vergara et al., 2014; Hjelm et al., 2015; Detry et al., 2017; Cini et al., 2019). The robotics community has proposed grasping strategies (Adjigble et al., 2018; Morrison et al., 2020) whose success is defined using traditional metrics such as stability (Bicchi and Kumar, 2000) and speed (Mahler et al., 2018). However, task-oriented grasping has gained momentum recently, especially thanks to improved techniques in vision and learning (Do et al., 2018; Cavalli et al., 2019), and metrics shaped by the task. ...
Article
Task-aware robotic grasping is critical if robots are to cooperate successfully with humans. The choice of a grasp is multi-faceted; however, the task to perform primes this choice in terms of hand shaping and placement on the object. The grasping strategy is particularly important for a robot companion, as a poor choice can hinder the success of the collaboration with humans. In this work, we investigate how different grasping strategies of a robot passer influence the performance and the perceptions of the interaction of a human receiver. Our findings suggest that a grasping strategy that accounts for the subsequent task of the receiver substantially improves the performance of the human receiver in executing that task. The time to complete the task is reduced by eliminating the need for a post-handover readjustment of the object. Furthermore, the human's perception of the interaction improves when a task-oriented grasping strategy is adopted. The influence of the robotic grasp strategy increases as the constraints induced by the object's affordances become more restrictive. The results of this work can benefit the wider robotics community, with applications ranging from industrial to household human-robot interaction for cooperative and collaborative object manipulation.
Conference Paper
We present early pilot-studies of a new international project, developing advanced robotics to handle nuclear waste. Despite enormous remote handling requirements, there has been remarkably little use of robots by the nuclear industry. The few robots deployed have been directly teleoperated in rudimentary ways, with no advanced control methods or autonomy. Most remote handling is still done by an aging workforce of highly skilled experts, using 1960s style mechanical Master-Slave devices. In contrast, this paper explores how novice human operators can rapidly learn to control modern robots to perform basic manipulation tasks; also how autonomous robotics techniques can be used for operator assistance, to increase throughput rates, decrease errors, and enhance safety. We compare humans directly teleoperating a robot arm, against human-supervised semi-autonomous control exploiting computer vision, visual servoing and autonomous grasping algorithms. We show how novice operators rapidly improve their performance with training; suggest how training needs might scale with task complexity; and demonstrate how advanced autonomous robotics techniques can help human operators improve their overall task performance. An additional contribution of this paper is to show how rigorous experimental and analytical methods from human factors research, can be applied to perform principled scientific evaluations of human test-subjects controlling robots to perform practical manipulative tasks.
Article
This paper shows how a robot arm can follow and grasp moving objects tracked by a vision system, as is needed when a human hands an object to the robot during collaborative working. While the object is being arbitrarily moved by the human co-worker, a set of likely grasps, generated by a learned grasp planner, is evaluated online to find a feasible grasp with respect to both the current configuration of the robot relative to the target grasp, and the constraint of finding a collision-free trajectory to reach that configuration. A task-based cost function enables relaxation of motion-planning constraints, allowing the robot to continue following the object by keeping its end-effector near a likely pre-grasp position throughout the object's motion. We propose a method of dynamic switching between a local planner, in which the hand smoothly tracks the object while maintaining a steady relative pre-grasp pose, and a global planner, which rapidly moves the hand to a new grasp on a completely different part of the object if the previously graspable part becomes unreachable. Various experiments are conducted using a real collaborative robot, and the obtained results are discussed.
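The switching logic described above can be summarised in a few lines. In this hedged sketch, is_reachable, local_track, and global_replan are hypothetical callbacks standing in for the paper's reachability check, local tracker, and grasp re-planner.

```python
def plan_next_motion(current_grasp, object_pose, is_reachable, local_track, global_replan):
    """Follow the current graspable part while it remains reachable;
    otherwise re-plan a grasp on a different part of the object."""
    if is_reachable(current_grasp, object_pose):
        # Local planner: smoothly maintain a steady relative pre-grasp pose.
        return local_track(current_grasp, object_pose)
    # Global planner: jump to a new grasp on a different, reachable part.
    return global_replan(object_pose)
```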
Article
This paper addresses the problems of both general and fine-grained human action recognition in video sequences. Compared with general human actions, fine-grained action information is more difficult to detect and occupies relatively small image regions. Our work seeks to improve fine-grained action discrimination while retaining the ability to perform general action recognition. Our method first estimates human pose and body-part positions in video sequences, by extending our recent work on human pose tracking, and crops patches at different scales to capture richer action information across a variety of appearance and motion cues. We then process each image patch with a Convolutional Neural Network (CNN). Instead of using the one-dimensional output feature of the fully connected layer, we use the outputs of the CNN's pooling layer, which retain more spatial information. The high-dimensional pooling features are then reduced by encoding to generate the final human action descriptors for classification. Our method reduces feature dimensionality while effectively combining appearance and motion information in a unified framework. We carried out empirical experiments on two publicly available human action datasets, comparing the action recognition results of our algorithm against six recent state-of-the-art methods from the literature. The results suggest comparatively strong performance of our method.
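A minimal sketch of taking pooling-layer activations as a descriptor is shown below. The backbone (a torchvision ResNet-18) and the forward hook are illustrative choices, not the paper's exact architecture or encoding step.

```python
import torch
import torchvision.models as models

backbone = models.resnet18(weights=None).eval()   # untrained backbone, purely illustrative
pooled = {}
# Capture the output of the global average-pooling layer instead of the FC output.
backbone.avgpool.register_forward_hook(
    lambda module, inp, out: pooled.update(feat=out.flatten(1)))

with torch.no_grad():
    patch = torch.randn(1, 3, 224, 224)   # one cropped body-part patch
    backbone(patch)                        # forward pass fills pooled["feat"]

descriptor = pooled["feat"]                # pooling-layer feature, shape (1, 512)
```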
Article
We consider the problem of detecting robotic grasps in an RGB-D view of a scene containing objects. In this work, we apply a deep learning approach to solve this problem, which avoids time-consuming hand-design of features. This presents two main challenges. First, we need to evaluate a huge number of candidate grasps. In order to make detection fast, as well as robust, we present a two-step cascaded structure with two deep networks, where the top detections from the first are re-evaluated by the second. The first network has fewer features, is faster to run, and can effectively prune out unlikely candidate grasps. The second, with more features, is slower but has to run only on the top few detections. Second, we need to handle multimodal inputs well, for which we present a method to apply structured regularization on the weights based on multimodal group regularization. We demonstrate that our method outperforms the previous state-of-the-art methods in robotic grasp detection, and can be used to successfully execute grasps on a Baxter robot.
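The two-stage cascade can be summarised as: score every candidate cheaply, keep only the top few, then rescore the survivors with the larger network. The sketch below assumes hypothetical scoring callables fast_net and accurate_net; it is a simplified illustration, not the paper's implementation.

```python
def cascade_detect(candidates, fast_net, accurate_net, keep=100):
    """candidates: list of candidate grasp representations.
    Returns the best surviving candidate after two-stage scoring."""
    # Stage 1: cheap scoring of every candidate; prune to the top few.
    pruned = sorted(candidates, key=fast_net, reverse=True)[:keep]
    # Stage 2: expensive scoring of the survivors only.
    return max(pruned, key=accurate_net)
```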
Article
Recently, a number of grasp detection methods have been proposed that can be used to localize robotic grasp configurations directly from sensor data without estimating object pose. The underlying idea is to treat grasp perception analogously to object detection in computer vision. These methods take as input a noisy and partially occluded RGBD image or point cloud and produce as output pose estimates of viable grasps, without assuming a known CAD model of the object. Although these methods generalize grasp knowledge to new objects well, they have not yet been demonstrated to be reliable enough for wide use. Many grasp detection methods achieve grasp success rates (grasp successes as a fraction of the total number of grasp attempts) between 75% and 95% for novel objects presented in isolation or in light clutter. Not only are these success rates too low for practical grasping applications, but the light clutter scenarios that are evaluated often do not reflect the realities of real world grasping. This paper proposes a number of innovations that together result in a significant improvement in grasp detection performance. The specific improvement in performance due to each of our contributions is quantitatively measured either in simulation or on robotic hardware. Ultimately, we report a series of robotic experiments that average a 93% end-to-end grasp success rate for novel objects presented in dense clutter.
Article
This paper addresses the problem of online tracking of articulated human body poses in dynamic environments. Many previous approaches perform poorly in realistic applications: often future frames or entire sequences are used anticausally to mutually refine the poses in each individual frame, making online tracking impossible; tracking often relies on strong assumptions about, e.g., clothing styles, body-part colours, and constraints on body-part motion ranges, limiting such algorithms to a particular dataset; and the use of holistic feature models limits the ability of optimisation-based matching to distinguish between pose errors of different body parts. We overcome these problems by proposing a coupled-layer framework that builds on previous notions of deformable structure (DS) puppet models. The underlying idea is to decompose the global pose candidate in any particular frame into several local parts in order to obtain a refined pose. We introduce an adaptive penalty into our model to widen the search scope for each local part pose and to avoid relying on fixed constraints. Since the pose is computed using only the current and previous frames, our method is suitable for online sequential tracking. We carried out empirical experiments on three public benchmark datasets, comparing two variants of our algorithm against four recent state-of-the-art (SOA) methods from the literature. The results suggest comparatively strong performance of our method, despite weaker constraints and fewer assumptions about the scene, and despite the fact that our algorithm performs online sequential tracking, whereas the comparison methods perform mutual optimisation backwards and forwards over all frames of the entire video sequence.
Book
A Mathematical Introduction to Robotic Manipulation presents a mathematical formulation of the kinematics, dynamics, and control of robot manipulators. It uses an elegant set of mathematical tools that emphasizes the geometry of robot motion and allows a large class of robotic manipulation problems to be analyzed within a unified framework. The foundation of the book is a derivation of robot kinematics using the product of exponentials formula. The authors explore the kinematics of open-chain manipulators and multifingered robot hands, present an analysis of the dynamics and control of robot systems, discuss the specification and control of internal forces and internal motions, and address the implications of the nonholonomic nature of rolling contact. The wealth of information, numerous examples, and exercises make A Mathematical Introduction to Robotic Manipulation valuable both as a reference for robotics researchers and as a text for students in advanced robotics courses.
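For reference, the product of exponentials formula on which that formulation rests can be stated as follows (a standard result, reproduced here only for context):

\[
  g_{st}(\theta) \;=\; e^{\hat{\xi}_1 \theta_1}\, e^{\hat{\xi}_2 \theta_2} \cdots e^{\hat{\xi}_n \theta_n}\, g_{st}(0),
\]

where \(\hat{\xi}_i \in se(3)\) is the twist of joint \(i\), \(\theta_i\) the corresponding joint variable, and \(g_{st}(0)\) the pose of the tool frame relative to the base frame in the reference (zero) configuration.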
Conference Paper
Multi-fingered robot grasping is a challenging problem that is difficult to tackle using hand-coded programs. In this paper we present an imitation learning approach for learning and generalizing grasping skills based on human demonstrations. To this end, we split the task of synthesizing a grasping motion into three parts: (1) learning efficient grasp representations from human demonstrations, (2) warping contact points onto new objects, and (3) optimizing and executing the reach-and-grasp movements. We learn low-dimensional latent grasp spaces for different grasp types, which form the basis for a novel extension to dynamic motor primitives. These latent-space dynamic motor primitives are used to synthesize entire reach-and-grasp movements. We evaluated our method on a real humanoid robot. The results of the experiment demonstrate the robustness and versatility of our approach.
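For background, a standard single-degree-of-freedom dynamic movement primitive (the construct that the latent-space extension above builds on) can be written as

\[
  \tau \dot{z} = \alpha_z\bigl(\beta_z (g - y) - z\bigr) + f(x), \qquad
  \tau \dot{y} = z, \qquad
  \tau \dot{x} = -\alpha_x x,
\]

with forcing term

\[
  f(x) = \frac{\sum_i \psi_i(x)\, w_i}{\sum_i \psi_i(x)}\; x \,(g - y_0),
\]

where \(y\) is the commanded variable, \(g\) the goal, \(x\) a phase variable, \(\psi_i\) radial basis functions, and \(w_i\) weights fitted from a demonstration. The cited work learns and executes such primitives in a low-dimensional latent grasp space rather than directly in joint or task space.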