Conference Paper

Garbage Collection and Sorting with a Mobile Manipulator using Deep Learning and Whole-Body Control


Abstract

Domestic garbage management is an important aspect of a sustainable environment. This paper presents a novel garbage classification and localization system for grasping and placement in the correct recycling bin, integrated on a mobile manipulator. In particular, we first introduce and train a deep neural network (namely, GarbageNet) to detect different recyclable types of garbage. Secondly, we use a grasp localization method to identify a suitable grasp pose to pick the garbage from the ground. Finally, we perform grasping and sorting of the objects by the mobile robot through a whole-body control framework. We experimentally validate the method, both on visual RGB-D data and indoors on a real full-size mobile manipulator for collection and recycling of garbage items placed on the ground.
Jingyi Liu1, Pietro Balatti2,3, Kirsty Ellis1, Denis Hadjivelichkov1,
Danail Stoyanov1, Arash Ajoudani2, and Dimitrios Kanoulas1
Rapid urbanization over the past several years has resulted in
an excessive increase in waste generation per capita, a third
of which is not managed in an environmentally friendly
manner [1]. In domestic environments, a large amount of
garbage is thrown or left on the ground every day, heavily
polluting the environment and preventing it from being
sustainable and pleasant. Garbage collection and recycling
(i.e., sorting garbage into different types) is a common
solution to this issue. Garbage separation is essential in this
process; however, it is a labor-intensive job that might also
affect the workers' health. There are two types of garbage
sorting: 1) centralized classification, where a large amount
of garbage is dumped on a conveyor and workers sort out the
recyclable waste, and 2) piecemeal sorting, which often
happens outdoors, such as in parks and streets, where
sanitation workers pick up different garbage items and place
them into the corresponding bins. In this paper, we focus on
the second type, which significantly reduces the need for
extra sorting in the factory and reduces hazardous contact
between workers and garbage. Our intention is to allow mobile
robots to collect and sort garbage, thereby protecting workers
from physical health issues and improving recycling efficiency.
Garbage collection from the ground (Fig. 1-left) for
the purpose of recycling is considered a challenge to be
solved using robots. It involves the integration of several
subsystems. Firstly, visual or another type of perceptual
sensing is required to identify the existence of garbage items
in the environment and to localize them. Moreover, the type
of the garbage must be identified, given that the collection
is for recycling, and thus each item needs to be placed in the
right bin. Secondly, the grasp pose of the garbage object needs
to be extracted. Lastly, a planning and control method is
needed for the robot to grasp the garbage and place it into
the right bin. This whole process needs to be carried out for
all garbage items in the scene in the most efficient way.
1Department of Computer Science, University College London, Gower
Street, WC1E 6BT, London, UK. {j.liu.19, kirsty.ellis,
dennis.hadjivelichkov, danail.stoyanov,
2HRI2Lab, Istituto Italiano di Tecnologia, via Morego 30, 16163 Genova,
Italy {pietro.balatti,arash.ajoudani}
3Department of Information Engineering, University of Pisa, Pisa, Italy
equal contribution
Fig. 1: Typical garbage on the ground [2] (left); the
IIT-MOCA/UCL-MPPL mobile robot (right).
In this paper, we introduce a novel integration of the
aforementioned scheme, in order to allow a mobile manipulator
to collect garbage items from the ground, after identifying
their location, grasping pose, and type. An overview of the
system is shown in Fig. 2, while the mobile robot that was
used is shown in Fig. 1-right and has been modified to
carry three different recycling bins (paper, metal, plastic).
The process is as follows. First, RGB-D data are acquired from
the visual sensor on the robot. These data are fed to a deep
learning network (we call it GarbageNet) that is trained to
segment, classify (based on material type), and localize
all garbage items in the 2D RGB scene. The 3D location of each
garbage object can then be extracted from the associated
depth data, as well as the grasp pose of the closest one.
Last, the robot approaches the closest garbage item and a
whole-body controller enables it to grasp the target
object and place it in the right recycling bin.
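As an illustration of how a 3D location can be read off the depth data, the following minimal sketch back-projects a segmented pixel region through a pinhole camera model and takes the centroid. The intrinsics `fx`, `fy`, `cx`, `cy`, the function name `backproject_mask`, and the toy depth image are all assumptions made for illustration; the text does not specify how the sensor data are converted.

```python
import numpy as np

def backproject_mask(depth, mask, fx, fy, cx, cy):
    """Return the (N, 3) camera-frame points of the masked pixels.

    depth: (H, W) array of depths in metres (0 or NaN means no reading).
    mask:  (H, W) boolean instance mask from the 2D segmentation.
    fx, fy, cx, cy: assumed pinhole intrinsics, in pixels.
    """
    v, u = np.nonzero(mask)               # pixel rows (v) and columns (u)
    z = depth[v, u]
    valid = np.isfinite(z) & (z > 0)      # drop missing depth readings
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

# Toy example: a 4x4 depth image, 1 m everywhere, with a 2x2 mask.
depth = np.ones((4, 4))
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
points = backproject_mask(depth, mask, fx=2.0, fy=2.0, cx=1.5, cy=1.5)
centroid = points.mean(axis=0)  # approximate 3D location of the object
```

With several masks, the object whose centroid has the smallest norm would be the closest one to the camera.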
Next, we review related work on garbage collection and
sorting robots (Sec. I-A). Then, we present our novel
integration of three subsystems, namely the deep garbage
recognition and localization, the grasp pose extraction, and
the whole-body mobile manipulation (Sec. II). Moreover,
we demonstrate the performance of our introduced system
through experimental results (Sec. III). Finally, we conclude
with future directions (Sec. IV).
Fig. 2: Software architecture of the whole system. Diagram
labels: GarbageNet (Fig. 3), GPD (Fig. 4), (Fig. 5), vision data,
robot state, garbage type, grasp pose, arm/base priority.
A. Related Work
The vast majority of autonomous garbage sorting robots
focus on centralized classification, i.e., an automated
conveyor, one or more arms, and a visual detection system
are combined to sort garbage in the factory. The most
representative one is the sorting station developed by
ZenRobotics Recycler in Finland [3]. A high-resolution 3D
sensor is used to get an isometric 2D height map of the
conveyor, then a machine learning method is employed for
object recognition and manipulation. The sorting efficiency
of this system is as high as 98%, and the average sorting
speed is 3000 picks per hour. Another successful commercial
product is the Waste Robotics system developed by FANUC [4],
where convolutional neural networks are employed to classify
data collected by RGB-D cameras. After the model is trained,
the robot arm uses suction grippers to pick the recyclable
waste. A similar approach has recently been investigated
using a fast parallel manipulator with a suction gripper, for
sorting items on a conveyor [5]. Several other similar systems
have been developed recently [6], [7], [8]. The difference
from our approach is that they usually classify items against
a known background (the conveyor) in the factory, while
we look into sorting items during their collection, by
grasping them from the ground and placing them in the right bin.
The second type of garbage sorting (i.e., piecemeal), which
is the one we focus on in this paper, remains an active
research area in robotics, with several open challenges. For
instance, the environment that garbage may lie in is
potentially unstructured, and operating a robot robustly and
efficiently in such a task involves many aspects, such as
object recognition, grasp pose estimation, grasp control,
path planning, etc. Even though there has been work on
garbage detection [9], [10], the only mobile manipulation
robotic system that has developed a method to pick up garbage
on grass is the one presented by Bai et al. [11]. In
particular, a deep learning method is deployed to classify
objects on the grass (i.e., as waste or not) and a novel
navigation algorithm is presented based on grass segmentation.
However, this system does not work in real-time and is not
able to classify garbage by type for the purpose of recycling.
In this paper, we propose a novel integration of systems that
detect the type and pose of garbage items on the floor and use
state-of-the-art whole-body control to collect them and sort
them into the right bin, based on their type.
In this section, we discuss the approaches we employ to
realize the garbage recycling robot, including finding what
and where the garbage is and how the robot can grasp it.
A. GarbageNet: Deep Garbage Recognition & Localization
While object detection methods satisfy the demands of
garbage classification and localization, by providing class
labels and bounding boxes, instance segmentation methods
have the advantage of also providing pixel-level masks.
These masks can then be projected onto a depth image and
significantly simplify the robot grasp search for a given target.
Given the need to detect and localize garbage in real-time
with the mobile robot, we decided to use the YOLACT
framework introduced in [12] and train it for garbage objects.
We have named the newly trained network GarbageNet. Using
this type of network structure, it is possible to infer the
bounding box and type of an object, as well as to acquire
pixel-level object masks that can better help the robot
comprehend its surrounding environment. Its real-time
performance and high accuracy give it an advantage over other
types of object segmentation methods, such as Mask R-CNN [13],
SOLO [14] and TensorMask [15]. We have integrated the
original network in a ROS wrapper, where the robot visual
sensor is used as input and the garbage object segmentation,
bounding box, type, and grasping pose messages are generated.
Our framework produces instance masks and scores them with
mask coefficients. Masks are combined using Non-Maximum
Suppression (NMS) to ensure there is no overlap between
instances while retaining useful information. The
core structure is shown in Fig. 3.
Fig. 3: GarbageNet: Convolutional image features are
produced and passed onto two branches. The Protonet
branch produces mask prototypes, while the other estimates
their coefficients. Both are combined into an instance-level
mask [12].
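The prototype-and-coefficient combination of Fig. 3 can be sketched as follows. In YOLACT [12], each instance mask is a linear combination of the prototype masks, weighted by that instance's coefficients and passed through a sigmoid. The array shapes, the random inputs, and the 0.5 binarization threshold below are illustrative assumptions, not values taken from GarbageNet.

```python
import numpy as np

def assemble_masks(prototypes, coefficients, threshold=0.5):
    """Combine prototype masks with per-instance coefficients.

    prototypes:   (H, W, k) array, output of the Protonet branch.
    coefficients: (n, k) array, one coefficient vector per detection.
    Returns an (n, H, W) boolean array of instance masks.
    """
    # Linear combination of the k prototypes for each of the n instances.
    logits = np.einsum("hwk,nk->nhw", prototypes, coefficients)
    probs = 1.0 / (1.0 + np.exp(-logits))  # element-wise sigmoid
    return probs > threshold

rng = np.random.default_rng(0)
prototypes = rng.normal(size=(8, 8, 4))    # stand-in for Protonet output
coefficients = rng.normal(size=(3, 4))     # stand-in for 3 detections
masks = assemble_masks(prototypes, coefficients)
```

In the full network the assembled masks are additionally cropped to the predicted bounding boxes; that step is omitted here.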
1) Dataset: The original YOLACT network is trained on
the COCO [16] dataset, which was originally intended for
image recognition and does not fulfill the requirements of
garbage type characterization and segmentation. Thus, a novel
dataset to train GarbageNet for garbage identification was
needed. For this reason, we used the newly introduced TACO
dataset [2], which is specialized for garbage segmentation and
classification. The dataset uses an object taxonomy that can
be directly used for garbage sorting purposes. In particular,
it includes 1500 images with 4784 annotations and 60 categories
which belong to 28 super-categories (e.g., paper, glass, metal,
carton, plastic, polypropylene, etc.). Moreover, the objects'
backgrounds include both indoor and outdoor environments,
such as tiles, pavements, grass, roads, etc. In
this way, even deformed garbage objects in the wild can be
classified and segmented.
2) Training: To exploit our framework, we randomly
split the TACO dataset into training (80%), cross-validation
(10%), and testing (10%) sets. We used an ImageNet [17]
pre-trained model of YOLACT to fine-tune the weights on
the TACO dataset, using a batch size of 8 on two Titan Xp
GPUs for 1 day and 40,000 iterations (learning rate: $10^{-3}$,
weight decay: $5 \times 10^{-4}$, momentum: 0.9). Using ResNet-50
as backbone, we achieved a mAP75 of 40.43 (mean Average
Precision with an IoU threshold of 0.75), at roughly 30
frames per second (i.e., almost the speed of the input RGB-D
sensing). This is better than the original mAP75 of
YOLACT on the COCO dataset, which is 31.2, or of Mask
R-CNN, which is around 37.8. Notice here that the exact mask
segmentation of the object is not particularly important at
this stage, since the grasping pose is extracted by a different
process, as described in the next section.
Fig. 4: Grasps produced by GPD [19]: (a) candidate pool
and (b) axes defining each grasp.
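To make the IoU threshold behind mAP75 concrete, the sketch below computes the intersection over union of two binary masks and checks it against 0.75. It only illustrates the matching criterion for a single prediction; the full mAP computation (ranking by confidence, averaging over classes) is omitted, and the toy masks and function name are assumptions.

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection over union of two boolean masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 0.0  # both masks empty: treat as no overlap
    return float(np.logical_and(pred, gt).sum() / union)

# Ground truth: a 6x6 square (36 pixels).
gt = np.zeros((10, 10), dtype=bool)
gt[2:8, 2:8] = True

# Prediction missing one column of the square (30 pixels, all inside gt).
pred = np.zeros((10, 10), dtype=bool)
pred[2:8, 3:8] = True

iou = mask_iou(pred, gt)          # 30 / 36
is_match_at_75 = iou >= 0.75      # counts as a true positive at IoU 0.75
```

A prediction missing two columns (24 of 36 pixels) would fall below the threshold and count as a miss at this IoU level.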
3) Implementation: To allow the system to be integrated in
our ROS-based architecture, a wrapper was used to interact
easily with the other components and the real robot through
ROS topics. In particular, an interface node subscribes to
the input point cloud and the GarbageNet-produced masks,
and projects the masks onto the point cloud. The
approximate position of the closest garbage piece is produced
using these projections. The interface also filters the detected
garbage category into three super-categories (paper, metal
and plastic), based on keyword search. Finally, the interface
publishes the approximate position of the nearest object, its
projected mask points and its super-category.
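The two jobs of the interface node, keyword-based grouping and nearest-object selection, can be sketched without ROS as below. The keyword lists, the function names, and the assumption of an organized point cloud stored as an (H, W, 3) array in the camera frame are all hypothetical; the actual keywords and message types used by the node are not given in the text.

```python
import numpy as np

# Hypothetical keyword lists; the node's real keywords are not stated.
SUPER_CATEGORY_KEYWORDS = {
    "paper": ("paper", "carton", "cardboard"),
    "metal": ("metal", "can", "aluminium", "foil"),
    "plastic": ("plastic", "polypropylene"),
}

def to_super_category(label):
    """Map a detected category name to paper/metal/plastic, or None."""
    words = set(label.lower().replace("-", " ").split())
    for super_category, keywords in SUPER_CATEGORY_KEYWORDS.items():
        if words.intersection(keywords):
            return super_category
    return None  # not one of the three sorted types

def nearest_object(cloud, masks):
    """Pick the mask whose projected points lie closest to the camera.

    cloud: (H, W, 3) organized point cloud, NaN where depth is missing.
    masks: iterable of (H, W) boolean instance masks.
    Returns (index, distance, centroid, points) or None.
    """
    best = None
    for index, mask in enumerate(masks):
        points = cloud[mask]                               # mask projection
        points = points[np.isfinite(points).all(axis=1)]   # drop invalid points
        if len(points) == 0:
            continue
        centroid = points.mean(axis=0)
        distance = float(np.linalg.norm(centroid))
        if best is None or distance < best[1]:
            best = (index, distance, centroid, points)
    return best
```

For example, `to_super_category("Drink can")` yields `"metal"` with these keyword lists, while a label such as `"Glass bottle"` maps to `None` and would be ignored.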
B. GPD: Grasp Pose Detection
Traditional grasp pose generation methods [18] require
either the geometric properties or an exact 3D model of the
targeted object. However, litter thrown on the ground often
has a non-rigid structure with varying textures and shapes.
Providing precise models or establishing a large garbage
grasping database is impractical. Moreover, a mobile robot
dealing with cluttered scenes would only have access to
RGB-D information from a single view.
A more general solution that deals with these challenges
is to generate grasps directly from a voxelized
point cloud. That is the principle on which Grasp Pose
Detection (GPD) [19] is based. GPD has been successfully
integrated with object detectors in cluttered environments.
1) Method: The GPD algorithm follows several steps as
briefly outlined in Algorithm 1.
Algorithm 1: Grasp Pose Detection
input : Point cloud C;
        Subset of points where the grasps are to occur S;
        Grasp filtering parameters Θ;
output: Grasp configurations G;
1) H = HandSearch(C, S);
2) G = SelectGraspConfigurations(H, C, Θ);
In Step 1, the received point cloud data C is voxelized and
filtered. Points uniformly sampled from the subset of points
S are used to produce hand candidates (see Fig. 4a) at the
axes aligned with the points' normals. Each hand candidate
is defined by axes for approach, hand binormal and object
axis, as shown in Fig. 4b. Filtering is applied to reject any
candidates that would collide with the point cloud or that do
not contain at least one point in the closing region of the hand.
In Step 2, grasp candidates are produced from the hand
candidates, given some allowable angle deviation and
approach restrictions Θ. The candidates are encoded into
several image embeddings, which are passed through a trained
convolutional neural network based on LeNet [20]. The
output of the network assigns each candidate a score that
indicates whether it is a successful grasp. Finally, the grasp
configurations G with the highest scores are selected.
Fig. 5: Perception pipeline: Input image is passed through GarbageNet to detect garbage. In the interface, masks of detected
objects are projected onto the point cloud. The approximate position of the nearest garbage is output, while its mask
projection is used as sampling points for GPD. A garbage type label is also produced. Finally, GPD produces a grasp.
2) Implementation: The pre-trained original implementa-
tion of the GPD package [21] is used within a GPD ROS
wrapper. The input to GPD is set as the RGB-D view
received from a camera, along with sampling points based
on the detected garbage instance masks to provide a region
of interest. The output grasp with the highest score is
selected and transformed into a ROS pose message type.
Following the aforementioned framework, a unified
garbage detection, classification, localization and grasp gen-
eration pipeline is created by connecting GarbageNet and
GPD through an interface node as shown in Fig. 5.
C. Whole-Body Mobile Manipulation Grasping
With the aim of localizing and collecting garbage items
from the ground with a robotic system, we introduce in this
section the control module that has been implemented on
the research platform IIT MOCA/UCL MPPL [22]. This
versatile cobot is composed of a Robotnik SUMMIT-XL
STEEL mobile platform (3 Degrees of Freedom (DoFs))
and a Franka Emika Panda robotic arm (7 DoFs). Since
the control of the former is achieved through admittance
control while the robotic arm is torque-controlled, a
Whole-Body Impedance Controller has been developed to deal
with their different causalities, extending our methods
introduced in [23], [24]. The implementation of such a control
system makes it possible both to achieve the desired
end-effector behavior and to exploit the redundant DoFs of the
robot. This is a fundamental requirement for successfully
executing autonomous and complex manipulation tasks.
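Considering the mobile manipulator with 3 DoFs (rigid
body motion) at the mobile base and n DoFs at the
manipulator, we can define the generalised coordinates
$q = [q_v^T \; q_r^T]^T \in \mathbb{R}^{3+n}$, with $q_v$ and $q_r$ the
coordinates of the mobile base and the manipulator. We describe
the dynamics equations of the combined system as follows, taking
into account the admittance causality of the mobile base that is
velocity controlled:

$$
\begin{bmatrix} M_{adm} & 0 \\ 0 & M_r \end{bmatrix}
\begin{bmatrix} \ddot{q}_v \\ \ddot{q}_r \end{bmatrix}
+
\begin{bmatrix} D_{adm} & 0 \\ 0 & C_r \end{bmatrix}
\begin{bmatrix} \dot{q}_v \\ \dot{q}_r \end{bmatrix}
+
\begin{bmatrix} 0 \\ g_r \end{bmatrix}
=
\begin{bmatrix} \Gamma_v^{vir} \\ \Gamma_r \end{bmatrix}
+
\begin{bmatrix} \Gamma_v^{ext} \\ \Gamma_r^{ext} \end{bmatrix}
\tag{1}
$$

where $M_{adm} \in \mathbb{R}^{3 \times 3}$ and $D_{adm} \in \mathbb{R}^{3 \times 3}$ represent the
virtual inertial and virtual damping terms for the admittance
control of the mobile base, $\dot{q}_v \in \mathbb{R}^3$ is the velocity of
the generalised motion of the mobile platform, and $\Gamma_v^{vir} \in \mathbb{R}^3$
and $\Gamma_v^{ext} \in \mathbb{R}^3$ are the virtual and external torques.
$M_r \in \mathbb{R}^{n \times n}$ is the symmetric and positive definite inertial
matrix, $C_r \in \mathbb{R}^{n \times n}$ is the Coriolis and centrifugal matrix,
$g_r \in \mathbb{R}^n$ is the gravity vector, and $\Gamma_r \in \mathbb{R}^n$ and
$\Gamma_r^{ext} \in \mathbb{R}^n$ are the joint torque vector and external torque
vector of the robotic manipulator, respectively.
Let us consider $x \in \mathbb{R}^6$ as the task coordinates in
Cartesian space. It follows that the desired task-space dynamics
behaviour in response to the external wrench $F^{ext} \in \mathbb{R}^6$
(leading to the external torques
$\Gamma^{ext} = [\Gamma_v^{ext\,T} \; \Gamma_r^{ext\,T}]^T$ in (1)) can be obtained as:

$$
F^{ext} = \Lambda(q)\,\ddot{\tilde{x}} + (\mu(q) + D)\,\dot{\tilde{x}} + K\,\tilde{x}
\tag{2}
$$

where $\tilde{x} = x - x_d$ is the Cartesian error from the desired task
$x_d$, and $K \in \mathbb{R}^{6 \times 6}$ and $D \in \mathbb{R}^{6 \times 6}$ are the desired Cartesian
stiffness and damping matrices, respectively. $\Lambda(q) \in \mathbb{R}^{6 \times 6}$
represents the Cartesian inertia matrix and $\mu(q) \in \mathbb{R}^{6 \times 6}$ the
Cartesian Coriolis and centrifugal matrix. For
more details, please see our previous work [22].
Fig. 6: Example output of input point clouds (left), GarbageNet mask and classification of the closest garbage (middle,
zoomed), mask projected onto the point cloud (middle) and grasps generated via GPD (right).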
In order to navigate through an unstructured environment and
to grasp garbage items from the ground, it is crucial to
selectively assign different mobility priorities to the mobile
base or to the robotic arm when a desired trajectory is
executed at the end-effector level. Specifically, during the
exploration of the environment, the robot movements must be
performed mostly by the mobile base, whereas when collecting
objects from the ground the priority needs to be given to the
arm movements.
To this end, we implemented a weighted dynamically-
consistent pseudo-inverse to achieve such behaviours. This is
done by applying the desired motion constraints through real-
time variable weighting factors. The weighted dynamically
consistent pseudo-inverse is defined as
where $\Lambda_W = J^{-T} M W M J^{-1}$ represents the weighted
Cartesian inertia, $J \in \mathbb{R}^{6 \times (3+n)}$ denotes the whole-body
Jacobian matrix, $M \in \mathbb{R}^{(3+n) \times (3+n)}$ is the whole-body
inertial matrix, and $W \in \mathbb{R}^{(3+n) \times (3+n)}$ is the diagonal
and positive-definite weight matrix defined by
$W = \mathrm{diag}[w_1 \; w_2 \; \cdots \; w_n]$, with $w_i > 0$. Therefore, a
higher value of $w_i$ at the i-th joint will impede the motion
of that joint, and $W = I_{3+n}$ will have no effect on the
motion mapping.
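The effect of the weights can be illustrated with a small numerical sketch. For simplicity it uses a purely kinematic weighted pseudo-inverse, $J_W^{\#} = W^{-1} J^T (J W^{-1} J^T)^{-1}$, which is a stand-in for, and not the same as, the inertia-weighted dynamically consistent form used in this work. The two-joint Jacobian (one base joint and one arm joint acting along the same axis) is a toy assumption, and the weight values are example choices.

```python
import numpy as np

def weighted_pinv(J, weights):
    """Kinematic weighted right pseudo-inverse W^-1 J^T (J W^-1 J^T)^-1.

    J:       (m, n) task Jacobian with full row rank.
    weights: n positive diagonal entries of W; a larger weight
             penalizes (impedes) motion of that joint.
    """
    W_inv = np.diag(1.0 / np.asarray(weights, dtype=float))
    return W_inv @ J.T @ np.linalg.inv(J @ W_inv @ J.T)

# Toy system: joint 0 is a base joint, joint 1 an arm joint, and both
# move the end-effector along the same axis.
J = np.array([[1.0, 1.0]])
xdot = np.array([1.0])  # desired end-effector velocity

# Base priority: small weight on the base, larger weight on the arm.
qdot_base_priority = weighted_pinv(J, [1.0, 3.0]) @ xdot   # [0.75, 0.25]

# Arm priority: large weight on the base, small weight on the arm.
qdot_arm_priority = weighted_pinv(J, [5.0, 1.0]) @ xdot    # [1/6, 5/6]
```

In both cases the joint velocities still reproduce the commanded end-effector velocity; only the share contributed by each joint changes.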
Finally, the whole-body Cartesian impedance controller's
commanded torques for the main task are calculated as:
Γimp =g+¯
The desired robot poses are retrieved from the Trajectory
planner unit, which, once it receives a target pose as input,
computes the intermediate waypoints by means of a classical
fifth-order polynomial law.
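A fifth-order polynomial law of this kind can be sketched as below. The rest-to-rest boundary conditions (zero velocity and acceleration at both ends), the function name, and the sample positions are assumptions; the planner's actual boundary conditions and timing are not stated here, and orientation interpolation is left out.

```python
import numpy as np

def quintic_waypoints(x_start, x_goal, duration, n_points):
    """Positions along a rest-to-rest fifth-order polynomial.

    Uses s(tau) = 10 tau^3 - 15 tau^4 + 6 tau^5 with tau = t / duration,
    which has zero first and second derivatives at tau = 0 and tau = 1.
    Returns the time stamps and an (n_points, dim) array of waypoints.
    """
    x_start = np.asarray(x_start, dtype=float)
    x_goal = np.asarray(x_goal, dtype=float)
    t = np.linspace(0.0, duration, n_points)
    tau = t / duration
    s = 10.0 * tau**3 - 15.0 * tau**4 + 6.0 * tau**5
    return t, x_start + s[:, None] * (x_goal - x_start)

# Example: move the end-effector position from 0.5 m height down to 0.1 m
# while advancing 0.4 m, in 4 s, sampled at 5 waypoints.
t, waypoints = quintic_waypoints(
    [0.0, 0.0, 0.5], [0.4, 0.0, 0.1], duration=4.0, n_points=5
)
```

The first and last waypoints coincide with the start and goal positions, and the profile is symmetric about the midpoint in time.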
In this section, we present a brief experimental analysis
of the garbage segmentation and classification (GarbageNet),
the grasp pose proposal (GPD), and the overall performance of
the system, in which the whole-body controlled mobile
manipulation robot identifies and collects for recycling three
different types of garbage (paper, metal, plastic).
Fig. 7: GarbageNet classification and segmentation: images
with single items are classified correctly with high confidence
scores (top). Images containing multiple items are classified
with smaller confidence scores due to occlusions (bottom).
A. GarbageNet: Garbage Segmentation and Classification
To test the quality of the GarbageNet segmentation and
classification introduced in Sec. II-A, we first validated it
on the TACO test set (see Sec. II-A.2), with a
resultant mAP75 of 40.43 at 30 frames per second. We
further segmented several unseen test images (roughly 1 h
of recorded data, including objects from the categories into
which we will be sorting), both from a handheld RGB-D
RealSense camera and from the visual sensor of the mobile robot.
We found that instance segmentation of spread-out pieces
of garbage is successful (Fig. 7-top), while in some scenes
containing a cluster of many pieces of garbage it is less
successful and needs further investigation (Fig. 7-bottom).
This localization failure in cluttered scenes has
been identified as one of two typical errors encountered in
mask generation by GarbageNet, the second being leakage, i.e.,
noise that is included in the instance mask when a bounding
box is not accurate [12]. The success of the classification
of garbage provided by GarbageNet is influenced by the
quality of the images that are provided to the system. We
found that in overexposed images, the algorithm struggles to
detect features that differentiate the garbage item from the
surrounding environment.
B. GPD: Garbage Grasp Proposal
An advantage of our system is that the precision of the
garbage mask segmentation and bounding-box estimation does
not strongly influence the grasping pose extraction, since
the latter is estimated by the GPD method introduced in
Sec. II-B. Items of garbage to be picked are
provided to the GPD node sequentially in order of proximity
to the robot. Some example grasp generation sequences are
shown in Fig. 6. The quality of the grasps generated by GPD
depends on the number of sample points on the item, e.g.,
a sparse point cloud can result in no grasp candidates. This
was observed in some scenes, but it was quickly rectified
by capturing new RGB-D data. With a well-populated point
cloud, GPD produces very good grasp proposals, with a
grasping success rate of almost 90%, tested with 50 grasps
on the robotic manipulator.
Notice that we had to restrict all grasps to be from the
top of the object, to respect the reachability constraints
of the robot manipulator. GPD parameters allow for easy
selection of the approach direction as well as the allowable
angle deviation from it. It was found that when generating
grasps on objects that were seen only from the side, GPD, as
expected, struggles to produce grasps from above, and the data
must be recaptured from a different pose. Generated
grasps have been successfully transferred from simulation
to the real robot, a mobile manipulator with a two-fingered gripper.
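The top-down restriction can be pictured as a filter on the approach direction of each candidate, followed by picking the best remaining score. The sketch below is not the GPD API: the `Grasp` tuple, the downward axis `(0, 0, -1)` (in a frame whose z axis points up) and the 30-degree tolerance are hypothetical stand-ins for the package's approach-direction and angle-deviation parameters.

```python
import math
from typing import NamedTuple, Optional, Sequence, Tuple

class Grasp(NamedTuple):
    approach: Tuple[float, float, float]  # non-zero approach vector
    score: float                          # classifier score, higher is better

def angle_between_deg(a, b):
    """Angle between two non-zero 3D vectors, in degrees."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cosine))

def best_top_down_grasp(
    grasps: Sequence[Grasp],
    max_deviation_deg: float = 30.0,
    direction: Tuple[float, float, float] = (0.0, 0.0, -1.0),
) -> Optional[Grasp]:
    """Highest-scoring grasp whose approach is close to `direction`."""
    allowed = [
        g for g in grasps
        if angle_between_deg(direction, g.approach) <= max_deviation_deg
    ]
    return max(allowed, key=lambda g: g.score, default=None)

candidates = [
    Grasp((0.0, 0.0, -1.0), 0.6),  # straight down: kept
    Grasp((1.0, 0.0, 0.0), 0.9),   # from the side: rejected despite its score
    Grasp((0.0, 0.6, -0.8), 0.7),  # about 36.9 degrees off vertical: rejected
]
best = best_top_down_grasp(candidates)
```

When no candidate passes the filter the function returns `None`, which corresponds to the case where data must be recaptured from a different pose.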
C. Whole-Body Grasping Results
Exploiting the whole-body impedance controller introduced
in Sec. II-C, we performed a set of experiments
with the IIT MOCA/UCL MPPL robotic platform (Fig. 1-right).
To describe the phases of these experiments, we follow
the control flow of the Finite State Machine (FSM) (see
Fig. 2). As in a real-world scenario, the mobile robot explores
the environment until an acknowledgment (ack) message is
provided by the visual perception module. Fig. 8 shows all
the phases taking place after this ack is triggered for three
different materials: metal (a), paper (b), and plastic (c). In the
garbage detected state (light red), the robot halts its motion,
so that GarbageNet identifies the garbage type and GPD
extracts the grasp pose. These data are sent to the FSM, which
can then move on to the next phases. The grasp pose (visualized
inside the garbage detection image in Fig. 8) is reported in the
plots with point markers at the moment of detection, and with
dashed lines until the grasp takes place. Next, in the
garbage reach state (light blue), the robot moves towards
this grasp pose. During this process we can distinguish two
sub-phases. In the first one, the robot reaches the vicinity of
the goal pose, assigning a higher priority to the mobile base
through (3), i.e., setting $w_i = 1$ for the mobile base joints and
$w_i = 3$ for the arm joints, with the impedance parameters
set to a compliant value, K = diag(500 N/m). In this way,
the mobile robot can approximately reach the item pose in a
compliant way, avoiding unnecessary movements of the
arm out of the mobile base support polygon. This guarantees
a safe interaction in case of an unexpected collision with
the environment. Subsequently, the priority is switched to
the arm through (3), i.e., setting $w_i = 5$ for the mobile base
joints and $w_i = 1$ for the arm joints, and the impedance
parameters are set to be stiffer, with K = diag(1000 N/m).
In this way, the robotic arm can reach the ground towards the
grasp pose in a precise manner. From Fig. 8, it can be seen
that the robot end-effector reaches the grasp pose
with high accuracy, so that the garbage grasp state (light
green) can be performed successfully. In this state, the robot
gripper closes its fingers until a force of 3 N is sensed, to
ensure the object is firmly grasped. Lastly, in the garbage
trash state (light yellow), the robot takes the garbage item to
the corresponding trash bin placed on its back, selecting it
through the garbage type message received previously.
Fig. 8: The grasping results performed by the IIT
MOCA/UCL MPPL robotic platform exploiting the whole-body
impedance controller. Images of garbage detection
(with the grasp pose in the embedded image), reach, grasp,
and disposal in the correct type of trash bin are visualized.
Three different items were identified and collected: a tomato
juice can, classified as metal (a); a lentils carton box,
classified as paper (b); and a plastic water bottle, classified
as plastic (c).
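The control flow described above can be summarized as a small state machine. The state names, the `PHASE_SETTINGS` table and the helper functions are hypothetical and only mirror the narrative; the weights, stiffness values and the 3 N grip force are the ones quoted in the text, and nothing here reflects the actual FSM implementation on the robot.

```python
from enum import Enum, auto

class State(Enum):
    EXPLORE = auto()
    GARBAGE_DETECTED = auto()
    REACH_BASE_PRIORITY = auto()   # first reach sub-phase
    REACH_ARM_PRIORITY = auto()    # second reach sub-phase
    GRASP = auto()
    TRASH = auto()

# Joint weights and Cartesian stiffness for the two reach sub-phases.
PHASE_SETTINGS = {
    State.REACH_BASE_PRIORITY: {"w_base": 1.0, "w_arm": 3.0, "stiffness_n_per_m": 500.0},
    State.REACH_ARM_PRIORITY: {"w_base": 5.0, "w_arm": 1.0, "stiffness_n_per_m": 1000.0},
}

GRIP_FORCE_THRESHOLD_N = 3.0

# Linear sequence of phases; after disposal the robot explores again.
TRANSITIONS = {
    State.EXPLORE: State.GARBAGE_DETECTED,
    State.GARBAGE_DETECTED: State.REACH_BASE_PRIORITY,
    State.REACH_BASE_PRIORITY: State.REACH_ARM_PRIORITY,
    State.REACH_ARM_PRIORITY: State.GRASP,
    State.GRASP: State.TRASH,
    State.TRASH: State.EXPLORE,
}

def next_state(state, phase_done):
    """Advance to the following phase once the current one has finished."""
    return TRANSITIONS[state] if phase_done else state

def grasp_complete(sensed_force_n):
    """The gripper stops closing once the sensed force reaches the threshold."""
    return sensed_force_n >= GRIP_FORCE_THRESHOLD_N
```

A higher weight impedes the corresponding joints, so the first sub-phase moves mostly the base and the second mostly the arm.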
In this work, we presented a novel garbage identification and
sorting system, integrated on a mobile robot, using whole-body
control. This approach works in real-time, identifying,
localizing, and sorting garbage.
In the future, we aim to validate the integrated system
outdoors in the wild, under various weather conditions, and to
work further on the path planning and exploitation part of
the method. In particular, where to look for garbage in a
large outdoor space and how to collect it in an energy- and
time-efficient way are the problems we will address next.
This work was supported by the UCL Global Engagement
Funds 2020/21 and the EU H2020 SOPHIA project (no
871237). The Titan Xp GPUs were donated by the NVIDIA
[1] S. Kaza, L. C. Yao, P. Bhada-Tata, and F. Van Woerden, “What a
Waste 2.0: A Global Snapshot of Solid Waste Management to 2050,”
The World Bank, Washington DC, Tech. Rep., 2018.
[2] P. F. Proença and P. Simões, “TACO: Trash Annotations in Context
for Litter Detection,” arXiv preprint arXiv:2003.06975, 2020.
[3] D. T. J. Lukka, D. T. Tossavainen, D. J. V. Kujala, and D. T. Raiko,
“ZenRobotics Recycler – Robotic Sorting using Machine Learning,”
ZenRobotics Recycler, Helsinki, Finland, Tech. Rep., 2014.
[4] W. Liu, H. Qian, and Z. Pan, “Dispersion multi-object robot sorting
method in material frame based on deep learning,” China Patent
2 017 111 944 941, June 08, 2018.
[5] F. Raptopoulos, M. Koskinopoulou, and M. Maniadakis, “Robotic
Pick-and-Toss Facilitates Urban Waste Sorting,” in 2020 IEEE 16th
International Conference on Automation Science and Engineering
(CASE), 2020, pp. 1149–1154.
[6] I. Vegas, K. Broos, P. Nielsen, O. Lambertz, and A. Lisbona,
“Upgrading the quality of mixed recycled aggregates from construction
and demolition waste by using near-infrared sorting technology,”
Construction and Building Materials, vol. 75, pp. 121–128, 2015.
[7] A. Shaukat, Y. Gao, J. A. Kuo, B. A. Bowen, and P. E. Mort,
“Visual classification of waste material for nuclear decommissioning,”
Robotics and Autonomous Systems, vol. 75, pp. 365–378, 2016.
[8] G. SP, H. S, and T. A, “Multi-material classification of dry recyclables
from municipal solid waste based on thermal imaging,” Waste Man-
agement, vol. 70, pp. 13–21, 2017.
[9] R. Sultana, R. D. Adams, Y. Yan, P. M. Yanik, and M. L. Tanaka,
“Trash and Recycled Material Identification using Convolutional Neu-
ral Networks (CNN),” in 2020 SoutheastCon, 2020, pp. 1–8.
[10] X. Li, M. Tian, S. Kong, L. Wu, and J. Yu, “A modified YOLOv3
detection method for vision-based water surface garbage capture
robot,” International Journal of Advanced Robotic Systems, vol. 17,
no. 3, p. 1729881420932715, 2020.
[11] J. Bai, S. Lian, Z. Liu, K. Wang, and D. Liu, “Deep Learning Based
Robot for Automatically Picking Up Garbage on the Grass,” IEEE
Transactions on Consumer Electronics, vol. 64, no. 3, pp. 382–389,
2018.
[12] D. Bolya, C. Zhou, F. Xiao, and Y. J. Lee, “YOLACT: Real-Time
Instance Segmentation,” in 2019 IEEE/CVF International Conference
on Computer Vision (ICCV), 2019, pp. 9156–9165.
[13] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,”
in 2017 IEEE International Conference on Computer Vision (ICCV),
2017, pp. 2980–2988.
[14] X. Wang, T. Kong, C. Shen, Y. Jiang, and L. Li, “SOLO: Segmenting
Objects by Locations,” 2019.
[15] X. Chen, R. Girshick, K. He, and P. Dollar, “TensorMask: A Founda-
tion for Dense Object Segmentation,” in 2019 IEEE/CVF International
Conference on Computer Vision (ICCV), 2019, pp. 2061–2069.
[16] T.-Y. Lin, Y. Cui, G. Paterr, and etc, “Coco: Common objects
in context,” [EB/OL], Accessed
September 14, 2020.
[17] J. Deng, W. Dong, R. Socher, L. Li, Kai Li, and Li Fei-Fei, “ImageNet:
A large-scale hierarchical image database,” in 2009 IEEE Conference
on Computer Vision and Pattern Recognition, 2009, pp. 248–255.
[18] D. Kanoulas, J. Lee, D. G. Caldwell, and N. G. Tsagarakis, “Visual
Grasp Affordance Localization in Point Clouds Using Curved Contact
Patches,” International Journal of Humanoid Robotics, vol. 14, no. 01,
p. 1650028, 2017.
[19] A. ten Pas, M. Gualtieri, K. Saenko, and R. Platt, “Grasp Pose
Detection in Point Clouds,” The International Journal of Robotics
Research, vol. 36, no. 13-14, pp. 1455–1473, 2017. [Online].
[20] C. Szegedy, Wei Liu, Yangqing Jia, P. Sermanet, S. Reed,
D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going
deeper with convolutions,” in 2015 IEEE Conference on Computer
Vision and Pattern Recognition (CVPR), 2015, pp. 1–9.
[21] A. ten Pas, “Grasp Pose Estimation,” [EB/OL],
atenpas/gpd Accessed September 1, 2020.
[22] Y. Wu, P. Balatti, M. Lorenzini, F. Zhao, W. Kim, and A. Ajoudani,
“A teleoperation interface for loco-manipulation control of mobile
collaborative robotic assistant,” IEEE Robotics and Automation Letters,
vol. 4, no. 4, pp. 3593–3600, 2019.
[23] P. Balatti, D. Kanoulas, G. F. Rigano, L. Muratore, N. G. Tsagarakis,
and A. Ajoudani, “A Self-Tuning Impedance Controller for Au-
tonomous Robotic Manipulation,” in IEEE/RSJ International Confer-
ence on Intelligent Robots and Systems (IROS), 2018, pp. 5885–5891.
[24] P. Balatti, D. Kanoulas, N. G. Tsagarakis, and A. Ajoudani, “Towards
Robot Interaction Autonomy: Explore, Identify, and Interact,” in
International Conference on Robotics and Automation (ICRA), 2019,
pp. 9523–9529.