An Intuitive, Affordances Oriented Telemanipulation Framework for a
Dual Robot Arm Hand System: On the Execution of Bimanual Tasks
Gal Gorjup, Anany Dwivedi, Nathan Elangovan, and Minas Liarokapis
Abstract: The concept of teleoperation has been studied
since the advent of robotics and has found use in a wide range
of applications, including exploration of remote or danger-
ous environments (e.g., space missions, disaster management),
telepresence based time optimisation (e.g., remote surgery) and
robot learning. While a significant amount of research has
been invested into the field, intricate manipulation tasks still
remain challenging from the user perspective due to control
complexity. In this paper, we propose an intuitive, affordances
oriented telemanipulation framework for a dual robot arm hand
system. An object recognition module is utilised to extract scene
information and provide grasping and manipulation assistance
to the user, simplifying the control of adaptive, multi-fingered
hands through a commercial Virtual Reality (VR) interface. The
system’s performance was experimentally validated in a remote
operation setting, where the user successfully performed a set
of bimanual manipulation tasks.
I. INTRODUCTION
Robot arm hand systems are reaching remarkable levels of
speed and accuracy, making them invaluable for applications
that require precise and repetitive manipulation in unstruc-
tured environments. Apart from working independently, such
systems may also be configured to serve as extensions of
human operators in scenarios where autonomous operation
is infeasible or undesired. In such teleoperation frameworks,
full or partial control of the robot agent is granted to the
remote user, allowing them to accomplish their intent through
the agent. Thus, human guidance enables the agent to solve
complex tasks that would not have been feasible to attempt
through autonomous system development.
Teleoperation has proven its effectiveness in a wide range
of applications that require human skill without their phys-
ical presence. For operations in hazardous or inaccessible
environments, robot deployment is considerably safer and
more practical, compared to human expeditions. In NASA’s
Robonaut project [1], for example, a teleoperated robot was
developed for the purposes of space station maintenance.
After the Fukushima Daiichi nuclear disaster in 2011, the
plant environment was too dangerous for recovery through
direct human involvement and several remotely operated
robots were considered for the missions [2]. Apart from
environmental factors, teleoperation may also be desired for
the objective of reducing travel-related time delays and in-
creasing operator efficiency. An application with high impact
in healthcare is remote surgery, where expert knowledge
Gal Gorjup, Anany Dwivedi, Nathan Elangovan, and Minas
Liarokapis are with the New Dexterity research group, Department
of Mechanical Engineering, The University of Auckland, New Zealand.
E-mails: ggor290@aucklanduni.ac.nz, adwi592@aucklanduni.ac.nz,
sela886@aucklanduni.ac.nz, minas.liarokapis@auckland.ac.nz
can be applied over great distances, with no time spent on
transport [3], [4]. Other examples also include teleopera-
tion for more commercial purposes such as communication,
telepresence, or mobile manipulation in home environments
[5]. In contrast to purely practical applications, remote robot
operation is also frequently used in research, namely as a
means of providing examples in various learning frameworks
[6], [7]. In this context, the data is recorded on the robot
itself while the human operator compensates for differences
in kinematics and dynamics of the platform. The learning
framework is therefore not burdened with considering ad-
ditional embodiment mappings, which generally results in
shorter training times and better results.
In manipulation tasks, it is most intuitive for the user
to operate a bimanual system that maps to their left and
right hand. With two arms, the ability of the system to
execute complex tasks drastically increases, especially when
equipped with appropriately dexterous hands. While mapping
human motion to robot arms in 3D space is relatively
straightforward for most applications, issues often arise when
it comes to controlling hands and grippers with several
degrees of freedom (DoF). Various mapping strategies have
been proposed to tackle this challenge [8]–[11], but most
of them require human finger pose data captured through
expensive sensory systems. Alternatively, when using com-
mercial, widely accessible controllers, managing dexterous
hands often becomes difficult, tedious or both. Attempting to
directly map individual hand actuators to the limited interface
inputs is not feasible and would most likely confuse the user.
On the other hand, defining a fixed set of grasp primitives
requires the operator to comb through the grasp types in
order to find one appropriate for the target task.
This work presents an assistive, affordance oriented tele-
operation framework aimed at simplifying remote bimanual
manipulation with dexterous hands using commercial virtual
reality controllers. Instead of defining a static library of
hand-specific grasp types, the framework relies on visual
object recognition to propose a set of affordances to the
user. Based on the desired task and object in focus, the
system determines an appropriate grasp type that enables
successful task execution with minimal input from the user.
The framework was tested and experimentally validated to
determine its suitability in practical applications.
The rest of this paper is organised as follows: Section II
introduces the related work in this field, Section III presents
the developed framework and the experimental setup, Section
IV presents the obtained results, while Section V concludes
the work and discusses future directions.
II. RELATED WORK
As one of the earliest aspects of robotics, the field
of teleoperation has seen significant development since its
introduction in the early 1950s. In the context of remote
manipulation with arm hand systems, teleoperation was, in its early stages, explored mostly from a control theoretic
perspective [12], [13]. Interfaces were generally implemented
in the form of classic I/O devices (joysticks, keyboards and
monitors) or master/slave robots, where the master module
often kinematically resembled the slave to provide intuitive
control to the user. As technology progressed, the operated
system functionality expanded through increased robot hand
dexterity, arm range of motion and number of manipulators.
To maximise control efficiency and convenience, alternative
interfacing and control options were explored and adapted to
the novel frameworks.
The standard solution for intuitively guiding a robot arm is
mapping operator motion to the tool position and orientation.
Examples employ inertial [14] or magnetic [15] motion
sensors to track the human hand and achieve stable arm
control. The above works also recorded elbow pose of the
human arm and used it to grant a degree of anthropo-
morphism to the robot motion. Attempts with no sensors
mounted on the arm were also explored, but the human
motion data obtained through a vision system was noisy
and not as reliable [16]. Concerning end-effector control, a
trivial parallel gripper may be managed by a simple button or
switch accessible by the operator. For frameworks employing
dexterous, possibly anthropomorphic hands, data gloves are a popular choice, as they can track finger poses, forces and wrist orientation [1], [17]. A solution that combines user
hand pose tracking and a rich robot hand control potential
came in the form of Virtual Reality (VR) interfaces, which
were promptly considered for telemanipulation [18]. Some
recent examples include utilising the Leap Motion tracking system to obtain hand and finger poses and using them in a gesture-based, VR-powered teleoperation framework [19]. In
[20], the authors used the commercial HTC Vive VR system
to collect teleoperation data in a learning framework for a
PR2 robot.
The natural inconsistencies and imperfections of human
motion propagate through the tracking system to produce a
jittery and, depending on the choice of sensor, noisy signal.
If left uncompensated, these errors lead to unstable robot
motion which can hinder execution of precise tasks and
frustrate the user. To account for this, the concept of shared
control and assistive teleoperation was introduced, where the
robot operates with a degree of autonomy to reduce user
effort. Early assistive frameworks assumed the user’s intent to be known [21], an approach which later evolved into classifying motion into
a predefined set of paths or behaviours to aid with execution
of the task [22], [23]. More recent approaches focused on
predicting arbitrary goals in real-world environments which
expanded the application range [24], [25]. It is worth noting
that the bulk of work in assistive teleoperation targeted move-
ment compensation and neglected any support with grasping
and manipulation. Even though the topic of grasp detection
and synthesis has received much attention in the robotics
research sphere, it has yet to be successfully integrated in
remote operation frameworks. This paper moves towards
manipulation aid in teleoperation by presenting a framework
with an incorporated affordance system that simplifies con-
trol of multi-DoF hands with commercial controllers.
Affordances can be described as the sum of "all action
possibilities" for a given object [26]. They are characterised
by the properties of the object and robot to determine the pos-
sible interactions between them [27]. Affordances can either
be defined explicitly, where only the object attributes are con-
sidered, or implicitly, where information about the action and
outcomes is incorporated (object-robot interaction). Studying
explicit affordances, the work of [28] extended the affordance
attributes to identify visual cues (such as handles) as suitable
interaction points. A later work [29] proposed a system for
detection of functional object attributes (such as "liftable")
based on affordance cues ("handle") from image features. In
[27], the authors introduced visual object grouping (balls,
boxes, etc.) as an intermediate step that enabled scalability
of their affordance methodology. Such explicit affordances
only offer object attributes, without considering robot capa-
bilities. Contrary to this, studies such as [30], [31] utilised
the implicit affordance representation that involves mapping
object and robot affordances to predicted outcomes that can
be directly used for planning and control. Authors of [31]
proposed object-action complexes to map representational
differences between the high level intelligence planning and
low level robot control. This was achieved by pairing the
transition states of object-action pairs as an instantiated state
transition fragment (ISTF). In [30], the relationships between
the actions, objects and effects were encoded in affordances,
whose learning and usage was then discussed in detail.
III. METHODS
A. Hardware and Framework
Concerning hardware, the system was based on two 6-DoF
serial manipulators by Universal Robots (UR5 and UR10),
equipped with adaptive robot hands developed by the New
Dexterity research group at the University of Auckland [32].
The control interface was implemented with a commercial
virtual reality system (HTC Vive).
The software side of the framework was created within
the Robot Operating System (ROS) [33], which provided the
necessary communication, testing and visualisation utilities.
The framework layout is presented in Fig. 1, where the
blocks conceptually correspond to the implemented ROS
node architecture. For clarity, the diagram only presents the
architecture corresponding to a single arm hand system. For bimanual operation, the arm and hand control components are replicated for the second arm hand system.
Fig. 1. Framework architecture for the proposed methodology. The robot arm trajectories are controlled using the Vive controllers. The objects in proximity to the robot arms are identified using a camera-based environment sensing module. The grasp affordances of the objects closest to the end-effector are displayed on the visual interface. A particular grasp is selected and executed by the user.

Fig. 2. Example subset of the object set used for affordance analysis.

A VR interface module provides connectivity to the HTC Vive system, tracking the controller poses and button states, in addition to offering haptic feedback functionality through controller vibration. The tracked controller position, orientation and button states are passed to the Pose Mapping and Inverse Kinematics node which maps them to the robot end-
effectors and computes the corresponding inverse kinematics
(IK). A safety mechanism is incorporated into the module
and the controller pose is only forwarded to the robot if
an "enable" button is held on the controller. The robot tool
position follows the relative controller offset with respect to
the initial state where tracking was enabled through the safety
switch, while orientation tracking is absolute. The computed
robot configuration q is forwarded to the Arm Controller,
which is based on the UR modern driver [34] and ROS
Control [35] packages. To enable smooth and responsive
real-time pose tracking, a closed-loop velocity controller for
the UR robots was implemented, as joint speed commands provide the best performance with the used robot models and
control boxes [36]. The controller loops at 125 Hz, which
corresponds to the maximum frequency allowed by the UR
system real-time interface.
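
The relative-position, absolute-orientation mapping and the safety switch described above can be summarised with a minimal Python sketch. This is an illustration under assumptions (the helper names, the controller state fields and the data layout are ours), not the authors' implementation:

import numpy as np

RATE_HZ = 125.0  # maximum update rate accepted by the UR real-time interface

class PoseMapper:
    """Maps VR controller motion to a target robot tool pose."""

    def __init__(self):
        self.ref_controller_pos = None  # controller position when "enable" was pressed
        self.ref_tool_pos = None        # robot tool position at that same instant

    def update(self, controller_pos, controller_quat, tool_pos, enable_pressed):
        """Return (target_position, target_orientation), or None if tracking is disabled."""
        if not enable_pressed:
            # releasing the safety switch stops tracking and clears the reference
            self.ref_controller_pos = None
            return None
        if self.ref_controller_pos is None:
            # latch the reference frames at the moment tracking is enabled
            self.ref_controller_pos = np.asarray(controller_pos, dtype=float)
            self.ref_tool_pos = np.asarray(tool_pos, dtype=float)
        # position follows the relative controller offset since enabling ...
        target_pos = self.ref_tool_pos + (np.asarray(controller_pos, dtype=float)
                                          - self.ref_controller_pos)
        # ... while orientation tracking is absolute
        return target_pos, controller_quat

The returned target pose would then be fed to the IK computation and the closed-loop velocity controller at the 125 Hz rate mentioned above.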
Affordance analysis is based on object detection per-
formed on a 2D video stream of the teleoperated system
workspace. Images from an HD webcam are passed to the
Object Detection module employing a pre-trained deep Con-
volutional Neural Network (CNN) on common household
items (Tensorflow ssd_mobilenet_v1_coco model [37]) to
extract the labels and bounding boxes for objects of interest.
These are processed with the methods described in Section III-B to produce a grasp appropriate for the task that the
user wishes to execute. The Hand Controller translates the
obtained grasp into positions and velocities of motors used in
the robot hands. To provide the user with a sense of executed
grasp strength, hand motor current is mapped to vibration of
the VR controllers. The webcam feed with overlaid object
bounding boxes and affordance lists is streamed to a remote
screen visible to the operator. A simple selection mechanism
was implemented within the affordance analysis module,
allowing the user to loop through and select the desired
task through buttons on the VR controllers. The selected
option is highlighted in the affordance list, where the default selection for every object is a neutral rest state, to avoid accidental triggering of the grasping motion.
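
Two of the user-facing mechanisms described above, the grasp-strength haptic feedback and the affordance selection menu, are small enough to sketch directly. The snippet below is an illustrative reading of the text, with assumed current limits and hypothetical class and function names, not the authors' implementation:

def current_to_vibration(motor_current, i_rest=0.1, i_max=1.5):
    """Map hand motor current (A) to a VR controller vibration amplitude in [0, 1]."""
    amplitude = (motor_current - i_rest) / (i_max - i_rest)
    return min(max(amplitude, 0.0), 1.0)

class AffordanceMenu:
    """Affordance list shown for the object currently linked to a hand."""

    def __init__(self, affordances):
        # a neutral "rest" option is the default, so no grasp is triggered accidentally
        self.options = ["rest"] + list(affordances)
        self.index = 0

    def cycle(self):
        """Advance the highlighted option (bound to a button on the VR controller)."""
        self.index = (self.index + 1) % len(self.options)
        return self.options[self.index]

    def selected(self):
        """Return the currently highlighted option."""
        return self.options[self.index]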
Fig. 3. Bimanual telemanipulation with the developed framework. The user
guides robot arms with VR controllers, receiving visual and haptic feedback.
The framework offers grasping and manipulation assistance in the form of
an affordance menu, where the user selects the option corresponding to the
desired task.
TABLE I
FUNCTIONAL AND GRASP AFFORDANCES OF THE EXAMPLE OBJECT SET

Objects        Functional Affordance    Grasp Affordance
Apple          Move - Side              Power
               Move - Top               Pinch
Cup            Move - Side              Power
               Move - Top               Pinch
               Drink                    Power
Sports Ball    Move                     Tripod
               Throw                    Spherical
Bottle         Move - Side              Power
               Move - Top               Pinch
               Drink                    Power
Mouse          Move                     Palm
               Click                    Palm
B. Object Affordance Analysis
The implicit approach to affordance analysis (refer to
Section II) is used when linking the object functionality and
robot grasp affordances to find actions that can be performed
on the object by the robot hand. For this proof of concept,
the object and grasp affordances were defined manually and
an example subset for the objects examined in Fig. 2 is
presented in Table I.
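
For illustration, the example set in Table I could be encoded as a small, manually defined lookup structure mapping object labels to pairs of functional and grasp affordances. This is only a sketch of one possible encoding, not the database format used in the framework:

AFFORDANCE_DB = {
    "apple":       [("move - side", "power"), ("move - top", "pinch")],
    "cup":         [("move - side", "power"), ("move - top", "pinch"), ("drink", "power")],
    "sports ball": [("move", "tripod"), ("throw", "spherical")],
    "bottle":      [("move - side", "power"), ("move - top", "pinch"), ("drink", "power")],
    "mouse":       [("move", "palm"), ("click", "palm")],
}

def query_affordances(label):
    """Return (functional affordance, grasp type) pairs for a detected object label."""
    return AFFORDANCE_DB.get(label, [])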
Following object detection through methods presented
in Section III-A, the objects are filtered to find the ones
with which the user most likely intends to interact. This is
achieved by projecting the current robot end-effector pose
onto the camera image and comparing the distances of
the detected objects to the tool projection (Algorithm 1).
Once the closest objects for each hand have been found,
the database is queried for the appropriate set of functional
affordances. These are presented in a tree structure to the
user, who chooses the desired functionality through the inter-
face. The offered options also include information regarding
the corresponding grasp type so the user can account for it
when deciding on the angle of hand approach. For example,
if the robot recognises an open water bottle, the bottle-cap
and a glass, it might provide the functional affordances
of "drink", "pour" or "close" to the user, as well as their
corresponding grasp types. If the user chooses to "close",
the framework selects the power grasp to hold the bottle
with the left arm, while the pinch grasp is proposed for the
right arm picking up the cap.
Algorithm 1: Affordance analysis: Linking detected objects to hands

objects = FindObjects(video_stream, threshold);
for hand in [left_hand, right_hand] do
    hand_pos = ProjectToolToImage(tool_pose);
    closest = objects[0];
    d_min = Distance(closest, hand_pos);
    for object in objects do
        d = Distance(object, hand_pos);
        if d < d_min then
            closest = object;
            d_min = d;
        end
    end
    affordances = Map(closest, database[hand]);
    Display(affordances, video_stream);
end
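
A compact Python equivalent of Algorithm 1 is sketched below, assuming that detections are given as (label, pixel centre) pairs and that the tool pose has already been projected to image coordinates; the function names and example values are illustrative rather than the authors' code:

import math

def link_objects_to_hand(detections, hand_pixel, hand_database):
    """Return the label and affordances of the detection closest to the projected tool."""
    if not detections:
        return None, []

    def pixel_distance(detection):
        label, (u, v) = detection
        return math.hypot(u - hand_pixel[0], v - hand_pixel[1])

    closest_label, _ = min(detections, key=pixel_distance)
    return closest_label, hand_database.get(closest_label, [])

# Example: the tool projects to pixel (320, 240) and two objects are detected
detections = [("cup", (300, 250)), ("bottle", (520, 180))]
database = {"cup": [("drink", "power")], "bottle": [("drink", "power")]}
label, options = link_objects_to_hand(detections, (320, 240), database)  # -> "cup"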
C. Experiments
In the first stage, the pose tracking capabilities of the
developed framework were investigated by recording and ex-
amining the desired and actual pose of the robot end-effector.
The operator first performed a one-dimensional motion with
varying frequency which was used to characterise system lag.
This was followed by arbitrary 2D and 3D motion which was
examined in terms of tracking error. The second experiment
was aimed at testing the affordance analysis and object-to-
hand linking. A selection of objects from the set presented
in Table I was placed on the table, in clear view of the
camera. One of the hands was then hovered over each of the
objects while observing its matching affordance list. Through
this, the object recognition, robot tool-to-image projection
and affordance linking was verified.
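
The lag characterisation from the one-dimensional motion can, for instance, be obtained by cross-correlating the recorded target and actual position signals. The paper does not state the estimation method, so the following is only one plausible way to compute it:

import numpy as np

def estimate_lag_seconds(target, actual, dt):
    """Estimate how many seconds `actual` trails `target`, both sampled every dt seconds."""
    target = np.asarray(target, dtype=float) - np.mean(target)
    actual = np.asarray(actual, dtype=float) - np.mean(actual)
    corr = np.correlate(actual, target, mode="full")
    shift = int(np.argmax(corr)) - (len(target) - 1)  # positive shift: actual lags target
    return shift * dt

# Example with synthetic data: a 0.5 Hz sine delayed by 0.3 s, sampled at 125 Hz
dt = 1.0 / 125.0
t = np.arange(0.0, 10.0, dt)
target = np.sin(2 * np.pi * 0.5 * t)
actual = np.sin(2 * np.pi * 0.5 * (t - 0.3))
print(estimate_lag_seconds(target, actual, dt))  # approximately 0.3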
The final set of experiments utilised the entire framework
in a remote manipulation setting. The VR system and visual
interface were set up in a remote location, streaming the VR
controller data and receiving the video feed. The operator
was in control of both robot arms and hands, monitoring
them on the feedback screen (Fig. 3). Two tasks were
performed in this context:
1) Pick and place: A ball was placed at an arbitrary
position on the table and the operator was asked to
pick it up, choosing a grasp from the different available
affordance options.
2) Pouring: A bottle filled with small loose components and a cup were placed in the workspace and the operator was asked to pour the contents of the bottle into the cup.

Fig. 4. Tracking of the controller motion by the UR robot. Subfigure A shows the target and actual position for one-dimensional periodic motion. Subfigure B shows the tracking for a two-dimensional trajectory, while subfigure C shows the trajectory tracking in a three-dimensional motion.

Fig. 5. Linking object affordances to current robot end-effector pose. Robot tool pose is projected to the camera image and the affordances of the closest detected object are presented in the selection menu.
During task execution, the user and robot actions were
recorded and compiled into a video.
IV. RES U LTS A ND DISCUSSION
Fig. 4 presents results of the teleoperation tracking ex-
periment. In the one-dimensional case, it is visible that the
robot tool smoothly follows the controller with a lag of
roughly 0.3 s. This allows stable control as long as the user
does not perform movement exceeding the critical frequency.
Observing the 2D and 3D motions, it is visible that some
amount of tracking error is present, although it is insignificant for this application, as it can be easily compensated by
the user through visual feedback. Fig. 5 shows snapshots of
a robot arm hovering over a set of objects in the workspace.
As the hand moves close to an object, the system assumes
the user intends to interact with it and responds accordingly.
The displayed object affordance menu updates and changes
to the options available for the closest object. Concerning
the final set of experiments, the results are best observed in
the accompanying video, also available in HD quality at the
following URL:
www.newdexterity.org/telemanipulation
The clip highlights the advantages of using an assistive,
affordance based telemanipulation framework for simple
bimanual tasks. The tasks were successfully completed with
minimal effort from the operator’s side.
V. CONCLUSION
This paper presented an affordances oriented, assistive
telemanipulation framework implemented on a dual robot
arm hand system. A commercially available VR system was
used for guiding the arms and interfacing with the affordance
selection menu which provided grasping and manipulation
assistance. The framework was successfully tested in a
remote manipulation setting, where a number of bimanual
tasks were executed by the operator.
Regarding future work, several aspects of the concept can
be expanded and improved. Currently, the system relies on a
manually defined, static affordance database and cannot pro-
vide the affordances of completely new objects. This could
be solved by implementing a scalable knowledge base that
employs a set of inference rules to estimate affordances of
previously unseen objects. System control could be improved
by including assistance in the form of local motion correction
for object grasping, effectively shrinking the dimensionality
of the user control space. The user experience could also be
enhanced by enforcing anthropomorphism of the robot
motion and by embedding the control interface into the video
stream using augmented reality approaches. Further work
should also be invested into system validation and compari-
son with alternative solutions from an ergonomic viewpoint.
A user experience survey or an appropriate benchmark would
provide valuable information regarding the system’s ease of use and intuitiveness of operation.
REFERENCES
[1] R. O. Ambrose, H. Aldridge, R. S. Askew, R. R. Burridge, W. Blueth-
mann, M. Diftler, C. Lovchik, D. Magruder, and F. Rehnmark,
“Robonaut: NASA's space humanoid,” IEEE Intelligent Systems and
their Applications, vol. 15, no. 4, pp. 57–63, July 2000.
[2] S. Kawatsuma, M. Fukushima, and T. Okada, “Emergency response by
robots to Fukushima-Daiichi accident: summary and lessons learned,”
Industrial Robot: An International Journal, vol. 39, no. 5, pp. 428–
435, 2012.
[3] C. Meng, T. Wang, W. Chou, S. Luan, Y. Zhang, and Z. Tian, “Remote
surgery case: robot-assisted teleneurosurgery,” in IEEE International
Conference on Robotics and Automation, 2004. Proceedings. ICRA
’04. 2004, vol. 1, April 2004, pp. 819–823 Vol.1.
[4] G. T. Sung and I. S. Gill, “Robotic laparoscopic surgery: a comparison of the da Vinci and Zeus systems,” Urology, vol. 58, no. 6, pp. 893–898, 2001. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0090429501014236
[5] M. Ciocarlie, K. Hsiao, A. Leeper, and D. Gossow, “Mobile manipula-
tion through an assistive home robot,” in 2012 IEEE/RSJ International
Conference on Intelligent Robots and Systems, Oct 2012, pp. 5313–
5320.
[6] B. D. Argall, S. Chernova, M. Veloso, and B. Browning, “A Survey
of Robot Learning from Demonstration,” Robotics and Autonomous
Systems, vol. 57, no. 5, pp. 469–483, May 2009. [Online]. Available:
http://dx.doi.org/10.1016/j.robot.2008.10.024
[7] Z. Zhu and H. Hu, “Robot Learning from Demonstration in Robotic Assembly: A Survey,” Robotics, vol. 7, no. 2, 2018. [Online]. Available: http://www.mdpi.com/2218-6581/7/2/17
[8] H. Hu, X. Gao, J. Li, J. Wang, and H. Liu, “Calibrating human hand for
teleoperating the HIT/DLR hand,” in IEEE International Conference
on Robotics and Automation, 2004. Proceedings. ICRA ’04. 2004,
vol. 5, April 2004, pp. 4571–4576 Vol.5.
[9] M. Ciocarlie, C. Goldfeder, and P. Allen, “Dimensionality reduction
for hand-independent dexterous robotic grasping,” in 2007 IEEE/RSJ
International Conference on Intelligent Robots and Systems (IROS),
Oct 2007, pp. 3270–3275.
[10] L. Pao and T. H. Speeter, “Transformation of human hand positions for
robotic hand control,” in Proceedings, 1989 International Conference
on Robotics and Automation, May 1989, pp. 1758–1763 vol.3.
[11] W. B. Griffin, R. P. Findley, M. L. Turner, and M. Cutkosky, “Calibra-
tion and mapping of a human hand for dexterous telemanipulation,”
in ASME IMECE 2000 Symposium on Haptic Interfaces for Virtual
Environments and Teleoperator Systems, 2000.
[12] P. F. Hokayem and M. W. Spong, “Bilateral teleoperation: An historical survey,” Automatica, vol. 42, no. 12, pp. 2035–2057, 2006. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0005109806002871
[13] G. Niemeyer, C. Preusche, S. Stramigioli, and D. Lee, Telerobotics. Cham: Springer International Publishing, 2016, pp. 1085–1108. [Online]. Available: https://doi.org/10.1007/978-3-319-32552-1_43
[14] B. Omarali, T. Taunyazov, A. Bukeyev, and A. Shintemirov, “Real-
Time Predictive Control of an UR5 Robotic Arm Through Human
Upper Limb Motion Tracking,” in Proceedings of the Companion
of the 2017 ACM/IEEE International Conference on Human-Robot
Interaction, ser. HRI ’17. New York, NY, USA: ACM, 2017, pp. 237–
238. [Online]. Available: http://doi.acm.org/10.1145/3029798.3038918
[15] M. V. Liarokapis, P. K. Artemiadis, and K. J. Kyriakopoulos, “Map-
ping human to robot motion with functional anthropomorphism for
teleoperation and telemanipulation with robot arm hand systems,” in
2013 IEEE/RSJ International Conference on Intelligent Robots and
Systems, Nov 2013, pp. 2075–2075.
[16] G. Du, P. Zhang, J. Mai, and Z. Li, “Markerless Kinect-Based Hand
Tracking for Robot Teleoperation,” International Journal of Advanced
Robotic Systems, vol. 9, no. 2, p. 36, 2012. [Online]. Available:
https://doi.org/10.5772/50093
[17] M. V. Liarokapis, P. K. Artemiadis, and K. J. Kyriakopoulos, “Tele-
manipulation with the DLR/HIT II robot hand using a dataglove and
a low cost force feedback device,” in 21st Mediterranean Conference
on Control and Automation, June 2013, pp. 431–436.
[18] G. C. Burdea, “Invited Review: The Synergy Between Virtual Real-
ity and Robotics,” IEEE Transactions on Robotics and Automation,
vol. 15, no. 3, pp. 400–410, June 1999.
[19] L. Peppoloni, F. Brizzi, C. A. Avizzano, and E. Ruffaldi, “Immersive
ROS-integrated framework for robot teleoperation,” in 2015 IEEE
Symposium on 3D User Interfaces (3DUI), March 2015, pp. 177–178.
[20] T. Zhang, Z. McCarthy, O. Jowl, D. Lee, X. Chen, K. Goldberg,
and P. Abbeel, “Deep Imitation Learning for Complex Manipulation
Tasks from Virtual Reality Teleoperation,” in 2018 IEEE International
Conference on Robotics and Automation (ICRA), May 2018, pp. 1–8.
[21] L. B. Rosenberg, “Virtual fixtures: Perceptual tools for telerobotic
manipulation,” in Proceedings of IEEE Virtual Reality Annual Inter-
national Symposium, Sep. 1993, pp. 76–82.
[22] D. Aarno, S. Ekvall, and D. Kragic, “Adaptive Virtual Fixtures for
Machine-Assisted Teleoperation Tasks,” in Proceedings of the 2005
IEEE International Conference on Robotics and Automation, April
2005, pp. 1139–1144.
[23] W. Yu, R. Alqasemi, R. Dubey, and N. Pernalete, “Telemanipulation
Assistance Based on Motion Intention Recognition,” in Proceedings of
the 2005 IEEE International Conference on Robotics and Automation,
April 2005, pp. 1121–1126.
[24] A. Dragan and S. Srinivasa, “Formalizing Assistive Teleoperation,” in
Robotics: Science and Systems, July 2012.
[25] C. Schultz, S. Gaurav, M. Monfort, L. Zhang, and B. D. Ziebart,
“Goal-predictive robotic teleoperation from noisy sensors,” in 2017
IEEE International Conference on Robotics and Automation (ICRA),
May 2017, pp. 5377–5383.
[26] J. Gibson, The Ecological Approach to Visual Perception. Boston, MA, USA, 1979.
[27] J. Sun, J. L. Moore, A. Bobick, and J. M. Rehg, “Learning visual object categories for robot affordance prediction,” The International Journal of Robotics Research, vol. 29, no. 2-3, pp. 174–197, 2010.
[28] G. Fritz, L. Paletta, R. Breithaupt, E. Rome, and G. Dorffner, “Learning predictive features in affordance based robotic perception systems,” in 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2006, pp. 3642–3647.
[29] M. Stark, P. Lies, M. Zillich, J. Wyatt, and B. Schiele, “Functional
object class detection based on learned affordance cues,” in Interna-
tional conference on computer vision systems. Springer, 2008, pp.
435–444.
[30] L. Montesano, M. Lopes, A. Bernardino, and J. Santos-Victor, “Learn-
ing object affordances: from sensory–motor coordination to imitation,”
IEEE Transactions on Robotics, vol. 24, no. 1, pp. 15–26, 2008.
[31] C. Geib, K. Mourao, R. Petrick, N. Pugeault, M. Steedman,
N. Krueger, and F. Wörgötter, “Object action complexes as an interface
for planning and robot control,” in IEEE RAS International Conference
on Humanoid Robots, 2006.
[32] G. Gao, A. Dwivedi, N. Elangovan, Y. Cao, L. Young, and M. Liarokapis, “The New Dexterity adaptive, humanlike robot hand,” IEEE International Conference on Robotics and Automation, 2019.
[33] M. Quigley, K. Conley, B. P. Gerkey, J. Faust, T. Foote, J. Leibs,
R. Wheeler, and A. Y. Ng, “ROS: an open-source Robot Operating
System,” in ICRA Workshop on Open Source Software, 2009.
[34] T. T. Andersen, “Optimizing the Universal Robots ROS driver.” Tech-
nical University of Denmark, Department of Electrical Engineering,
Tech. Rep., 2015.
[35] S. Chitta, E. Marder-Eppstein, W. Meeussen, V. Pradeep, A. Rodríguez Tsouroukdissian, J. Bohren, D. Coleman, B. Magyar, G. Raiola, M. Lüdtke, and E. Fernández Perdomo, “ros_control: A generic and simple control framework for ROS,” The Journal of Open Source Software, 2017. [Online]. Available: http://www.theoj.org/joss-papers/joss.00456/10.21105.joss.00456.pdf
[36] O. Ravn, N. Andersen, and T. Andersen, UR10 Performance Analysis.
Technical University of Denmark, Department of Electrical Engineer-
ing, 2014.
[37] J. Huang, V. Rathod, C. Sun, M. Zhu, A. Korattikara, A. Fathi,
I. Fischer, Z. Wojna, Y. Song, S. Guadarrama, and K. Murphy,
“Speed/Accuracy Trade-Offs for Modern Convolutional Object De-
tectors,” in The IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), July 2017.