Autonomous Vision-guided Object Collection from
Water Surfaces with a Customized Multirotor
Shatadal Mishra, Student Member, IEEE, Danish Faraaz Syed, Student Member, IEEE,
Michael Ploughe, and Wenlong Zhang∗, Member, IEEE
©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including
reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or
reuse of any copyrighted component of this work in other works. Citation information: DOI 10.1109/TMECH.2021.3080701, IEEE/ASME Transactions on
Mechatronics, https://ieeexplore.ieee.org/document/9432733
Abstract—This paper presents a multirotor system with an
integrated net mechanism for autonomous object detection and
collection from water surfaces utilizing only onboard sensors.
The task of object collection on water surfaces is challenging
due to the following reasons: i) propeller outwash alters the object's dynamics, ii) current flows are unpredictable, iii) extreme reflection and glare off the water surface affect object detection, and iv) height measurements over the water surface are noisy. A two-phase object detection algorithm is developed with a linear polarization filter and a specularity removal algorithm to eliminate reflections, and an edge-based contour detector to detect objects on the water surface. Subsequently, a boundary layer sliding mode control is implemented to ensure the system is robust to
modeling uncertainties. A dynamic sliding surface is designed
based on constrained linear model predictive control. The efficacy
of the proposed collection system is validated by multiple outdoor
tests. Multiple objects of different shapes and sizes are collected
with an overall success rate of 91.6%.
Index Terms—Vision based control, Aerial grasping, Sliding
control, Unmanned aerial vehicles.
I. INTRODUCTION
THE advent of unmanned aerial vehicles (UAVs) has
opened up numerous opportunities for executing mission-
critical tasks such as search and rescue, package delivery [1],
exploration of constrained spaces [2], [3] and water sample
collection [4]. UAVs can fly over dams and canals for water
sample collection, inspection and potentially avoid physical
labor for these hazardous tasks. For instance, canals in Arizona have played a critical role in irrigation, power generation, and daily human use. These canals require frequent cleaning and inspection. Currently, they are drained periodically to collect trash, which is a time-consuming and expensive process.
The objective of this work is to develop a cost-effective and
time-efficient autonomous aerial robotic manipulation system
for collecting floating objects on water surfaces.
Autonomous object collection from water surfaces using
UAVs poses challenges in 1) aerial manipulation and object
collection and 2) landing on a moving object. The field of
aerial grasping and manipulation has witnessed great progress
recently. A multirotor equipped with a 7-degree of freedom
This work was supported by the Salt River Project.
S. Mishra, D. F. Syed, and W. Zhang are with the Polytech-
nic School, Ira A. Fulton Schools of Engineering, Arizona State
University, Mesa, AZ 85212, USA. Email: {smishr13, dsyed,
wenlong.zhang}@asu.edu
M. Ploughe is with Water Quality & Waste Management, Salt River Project,
Phoenix, AZ 85072, USA. Email: Mike.Ploughe@srpnet.com.
∗Address all correspondence to this author.
Fig. 1: Autonomous object collection from a water surface using a multirotor: a) object detected; b) multirotor descends; c) multirotor lands on object; d) multirotor picks up object. Video link: https://youtu.be/Yj0 LIz027s
(DOF) manipulator was proposed in [5] for aerial manipulation
tasks. The 7-DOF manipulator is integrated with a camera to
track an object. Pounds et al. employed a helicopter for aerial
grasping and discussed the effects of ground wash [6]. To
ensure repeatable grasping performance, precise motion con-
trol is necessary to ensure the object lies within the grasper’s
workspace. Additionally, adding a robotic manipulator would
increase the aircraft gross weight (AGW), which reduces the
overall flight time and effective payload. In our previous work
on autonomous aerial grasping [7], a hexarotor was integrated
with a three-finger soft grasper, which was made of silicone
with pneumatically-controlled channels. Experimental results
demonstrated that off-centered and irregularly-shaped objects
were grasped successfully by the soft grasper. In [8], a de-
formable quadrotor was proposed for whole-body grasping. An
origami-inspired foldable arm was proposed which could be
utilized for performing different tasks in confined spaces [9].
Although the proposed robotic arm is extremely lightweight, it has only one degree of freedom and can pick up objects from a limited range of directions. Despite considerable work
in this field, aerial grasping of objects on water surfaces poses
additional challenges such as i) random motion of floating
objects due to unpredictable current flow, and ii) partially
submerged objects. In this work, we propose a net-based
collection system, with a large workspace, to address these
challenges. In addition to an integrated net system, a reliable
vision-based control algorithm is necessary for successfully
landing and collecting the floating object.
Considerable research has been conducted for a multirotor
to autonomously land on a moving target. The Deep Determin-
istic Policy Gradients (DDPG) algorithm was used for landing
a UAV on a moving target [10]. In [11], a minimum-energy
based trajectory planning method was proposed to land on
a moving target with a constant velocity. In [12], Lee et al.
proposed a line-of-sight based trajectory tracking control for
quadrotor landing on a moving vehicle in outdoor conditions.
In this work, the velocity commands were generated based on
the relative distance between the current and target positions.
Multiple outdoor tests were performed in [13] to autonomously
land a UAV on a moving target using model predictive control
(MPC). In [14], a small quadrotor was demonstrated to track
a moving spherical ball and the quadrotor’s planned trajectory
was updated using a receding horizon trajectory planner. How-
ever, there are several significant differences between landing
on a moving target on ground and a floating object on water
surface. Generally, moving targets carry distinctive markers which reduce the complexity of tracking [10], [11], [13], [15], and the dynamics of the moving target are deterministic [11], [16]. However, tracking floating objects on a water surface is
challenging due to reflection and glare from water surfaces.
Moreover, the motion of a floating object in close proximity
to a multirotor’s propeller outwash is complex and random.
Therefore, a robust control technique is required to handle
modeling uncertainties for reliably tracking and landing on the
floating object. In this paper, a boundary layer sliding mode
control (BLSMC) with a dynamic sliding surface is proposed.
The dynamic sliding manifold is designed to eliminate the
reaching phase for sliding mode control and robustness is en-
sured from the start of the motion. The proposed robust vision
based controller in conjunction with an integrated net system
collected floating objects of different shapes and sizes with a
91.6% success rate. To the best of the authors’ knowledge, this
is the first work which demonstrates an autonomous multirotor
system for object collection from water surfaces. The main
contributions of this work are summarized as follows:
• A net-based multirotor system with flotation is proposed for object collection from water surfaces.
• A computationally efficient contour-based algorithm with specularity removal is developed for detecting objects on water surfaces under different illumination conditions. A comparison with different detectors is provided to show the advantages of our algorithm.
• A BLSMC approach with a dynamic sliding surface is developed. A constrained MPC is designed for determining the optimal sliding surface, and the controller demonstrates reliable object tracking in the presence of modeling uncertainties while enabling the system to collect objects under different weather conditions.
Fig. 2: Multirotor system setup. Top: multirotor with attached net system (labeled dimensions: 80 cm and 70 cm). Bottom row: net open, net closed, and hardware components (servo arm, servo, camera, LiDAR, wood dowel, buoyant foams, GPS, and high-level computer; front and rear ends marked).
The rest of the paper is structured as follows: Section II
describes the hardware components of the aerial manipulation
system. The vision algorithm for autonomous object detection
is introduced in Section III. Section IV describes the system
modeling and control of the combined object and multirotor
system followed by simulation results. Experimental results
are demonstrated and analyzed in Sections V. Section VI
concludes the paper and discusses the future work.
II. MULTIROTOR SETUP
In this section, the hardware and software components of
the aerial manipulation system are described. The system
demonstrates complete autonomy by utilizing the onboard
sensors and actuators for detection, tracking and collection
of floating objects on water surfaces.
A. Aerial Platform and Actuation System
A co-axial octorotor base frame (3DR Robotics, Berkeley,
CA) was utilized because of its enhanced in-air stability,
compact design, and high thrust-to-weight ratio. The base
frame is customized to attach the flotation system as shown
in Fig. 2. The octorotor has a flight time of 12 minutes with a
500-gram payload. The battery capacity is 11,000 mAh. The
entire system, excluding the payload, weighs 2,352 grams.
B. Flotation and Integrated Net System
The co-axial octorotor is equipped with two buoyant cylindrical polyethylene foams to land and float on water. The foams' dimensions are determined to provide enough buoyant force to keep the aerial system afloat. The generated buoyant force is 41.2 N, whereas the weight of the aerial system is 23.07 N. The mass of the flotation system is only 120 grams. The buoyant foams are each located 40 cm away from the central body to prevent toppling on water surfaces.
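As a quick plausibility check on these figures, Archimedes' principle relates the buoyant force to the submerged foam volume; the total volume below is back-computed from the reported 41.2 N assuming fresh water, and is not stated explicitly in the paper:

% Archimedes' principle for the fully submerged foams (V inferred from the reported force).
F_b = \rho_w \, g \, V \approx 1000 \times 9.81 \times 4.2\times10^{-3} \approx 41.2\ \text{N} \;>\; W = m g = 2.352 \times 9.81 \approx 23.1\ \text{N}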
Fig. 3: Block diagram of the aerial manipulation system. Green, orange, and blue blocks depict object detection and estimation, the control modules, and the hardware, respectively. The object detection block detects the object, which drifts due to the propeller outwash. The object's position is estimated by the object's inertial pose estimation block. The vision-based control block implements the proposed controller, and the inertial accelerations are sent to the multirotor attitude control, which generates desired attitude setpoints. Eventually, the generated torques are applied to the multirotor system. (Signal rates shown in the diagram: images 80 fps, visual features 50 Hz, multirotor position and velocity 80 Hz, multirotor orientation 150 Hz, IMU and barometer 200 Hz, GPS 15 Hz, LiDAR 15 Hz, desired torques 150 Hz, rotor signals 400 Hz, and servo commands 20 Hz.)
The object is collected by an active net mechanism, which has a
larger workspace compared to a grasper, as shown in Fig. 2. A
durable polypropylene based net is chosen as it is lightweight
while providing a large lifting capacity. A Savox SW-0231MG
(Savox, Taiwan) high torque motor actuates a servo arm, which
is attached perpendicularly to the center of a wood dowel. One
end of the net is attached to the wood dowel and the other end
of the net is fixed to the rear end of the multirotor as shown
in Fig. 2. When the servo is deactivated, the net is pushed
towards the rear end and when the servo is activated the net
covers the entire area within the buoyant foams.
C. Sensor and Computation Subsystem
The multirotor system is equipped with the PIXHAWK flight controller, which includes an inertial measurement unit (IMU) for attitude estimation and differential pressure sensors, along with GPS for position estimation. A Teraranger Evo (Terabee, France) single-shot LiDAR is used for height estimation above the water surface for water landing. An oCam-1CGN-U (Withrobot, Rep. of Korea) global-shutter monocular camera is used for object detection. The camera outputs 640×480 images at
80 FPS. The high-level computer is an Intel UpBoard which
runs the object detection and vision-based control functions.
Figure 3 shows the software blocks of the proposed approach.
The object's position is estimated based on the image frame coordinates received from the object detection block. The image frame coordinates are converted to the object's position in the camera frame based on the virtual camera approach [17]. The vision-based control approach receives the multirotor's and object's estimated position and velocity. Eventually, the generated inertial frame acceleration is scaled and sent to the multirotor attitude control.
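As an illustration of this conversion step, the sketch below maps a detected bounding-box center from pixel coordinates to a planar metric offset, assuming a simple pinhole model with hypothetical intrinsics and using the LiDAR height as depth; the paper itself uses the virtual camera approach of [17], which additionally compensates for roll and pitch.

import numpy as np

# Hypothetical intrinsics for the 640x480 oCam stream (not taken from the paper).
FX, FY = 520.0, 520.0      # focal lengths in pixels
CX, CY = 320.0, 240.0      # principal point

def pixel_to_camera_frame(u, v, height_m):
    """Back-project a pixel (u, v) to a metric offset in the camera frame,
    using the LiDAR height above the water surface as the depth."""
    x_cam = (u - CX) / FX * height_m   # right of the optical axis (m)
    y_cam = (v - CY) / FY * height_m   # below the optical axis (m)
    return np.array([x_cam, y_cam, height_m])

# Example: bounding-box center at (400, 300) with the multirotor 1.5 m above the water.
print(pixel_to_camera_frame(400, 300, 1.5))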
III. OBJECT DETECTION ALGORITHM
The goal of the vision algorithm is to detect objects on water
surfaces. We use a combination of a linear polarization filter
in front of the camera lens and a reflection removal algorithm
to suppress reflections and employ an edge detector followed
by contour extraction for object detection.
Fig. 4: Object detection algorithm: (a) Current polarized frame; (b) Largest contour extracted; (c) Specularity removal from pixels outside the extracted contour; (d) Object detected.
A. Challenges
Detection of objects on water surfaces poses a multitude of challenges. A major challenge with outdoor testing is the glare and reflection of light from the water surface, which makes object detection extremely complex. Our first approach was to remove these reflections using background subtraction techniques such as the Gaussian mixture model [18] or the per-pixel Bayesian mixture model [19]. However, these algorithms assume still backgrounds and are ineffective when applied to backgrounds that are in motion [20]. Another challenge is the change in the aspect ratio of the object as the multirotor lands on it, which calls for an object detection algorithm that is not affected by scale changes. Changing illumination conditions and partial occlusion of the bottle due to drift pose additional challenges.
B. Proposed Strategy
The vision algorithm is illustrated in Fig. 4 and the pseudo-
code is given by Alg. 1. The complete algorithm is as follows:
Read polarized video frame (Line 1): In the first phase
of the algorithm, a thin filament of polarization filter is used
in front of the camera lens to suppress a major portion of the
specular reflections, as shown in Fig. 4a.
Estimate initial position of object and store bounding
box coordinates (Lines 3-11): The polarized video feed is passed to a Canny edge detector and closed contours are extracted. A contour search is employed to find the largest closed contour. Due to the reflections on the water surface, the largest closed contour can be either an object or a reflection. Since the reflections are volatile, their contours are not consistently placed in the same region.
Algorithm 1 Object detection algorithm
1: while read polarized camera frame do
2:   if Object_detected == false then
3:     f ← get_frame()
4:     CannyEdgeDetector(f)
5:     contours(i) = FindContours(f)
6:     Obj = max_area{contours(i)}
7:     if no contour jumps in a window of 10 frames then
8:       Object_detected ← true
9:     else
10:      Object_detected ← false
11:    end if
12:  end if
13:  if Object_detected == true then
14:    f ← get_frame()
15:    I_min(p) = min{f_R(p), f_G(p), f_B(p)}
16:    T = μ_v + η · σ_v
17:    τ(p) = T, if I_min(p) > T; I_min(p), otherwise
18:    β̂(p) = 0, if p within bounding box; I_min(p) − τ(p), otherwise
19:    f_sf(p) = merge{f_R(p) − β̂(p), f_G(p) − β̂(p), f_B(p) − β̂(p)}
20:    CannyEdgeDetector(f_sf)
21:    contours(i) = FindContours(f_sf)
22:    Obj = max_area{contours(i)}
23:    if object not detected for more than 10 frames then
24:      Object_detected ← false
25:    end if
26:  end if
27: end while
If the largest closed contour is consistently placed in the same region for at least 10 consecutive frames, it indicates the presence of an object in the scene, as shown in Fig. 4b. The bounding box coordinates of this initial closed contour are then computed. To ensure reliable detection, reflections are removed from the next frame by considering the current bounding box coordinates of the object. This guarantees that the reflection removal algorithm removes the specular component of the frame without affecting the features of the object.
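A minimal OpenCV sketch of this first phase (Lines 3-6 of Alg. 1, plus the 10-frame consistency check of Line 7) is given below; the Canny thresholds and the jump tolerance are illustrative values, not parameters reported in the paper.

import cv2
import numpy as np

def largest_contour_bbox(frame_bgr):
    """Phase 1: Canny edges -> closed contours -> largest-area contour (Alg. 1, Lines 3-6)."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)                       # illustrative thresholds
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    return cv2.boundingRect(largest)                       # (x, y, w, h)

def is_consistent(history, new_bbox, max_jump_px=20, window=10):
    """Declare an object only if the largest contour stays in the same region
    for `window` consecutive frames (Alg. 1, Line 7)."""
    history.append(new_bbox)
    if len(history) < window:
        return False
    recent = history[-window:]
    centers = np.array([(x + w / 2.0, y + h / 2.0) for x, y, w, h in recent])
    return bool(np.all(np.linalg.norm(centers - centers.mean(axis=0), axis=1) < max_jump_px))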
Removing specular reflections (Lines 15-19): In the second phase, the minimum intensity value $I_{min}(p)$ across the RGB channels is computed for each pixel in the frame. An intensity threshold $T = \mu_v + \eta \cdot \sigma_v$ is then calculated to distinguish highlighted pixels, as shown in [21], where $\mu_v$ and $\sigma_v$ are the mean and standard deviation of the minimum values $I_{min}$ over all pixels, and $\eta$ is a measure of the specular degree of an image, generally set to 0.5. Based on the calculated intensity threshold $T$, an offset $\tau$ is then computed to determine which pixels have to be changed to suppress reflections. The specular component $\hat{\beta}(p)$ of the frame is computed by subtracting the offset from $I_{min}$. For any pixel inside the bounding box, we set $\hat{\beta}(p)$ to 0 to preserve the features of the object to be detected. Finally, the specular component is subtracted from each channel of the frame to obtain the specular-free image without removing reflections from the area where the object is located. Comparing Figs. 4a and 4c, it can be clearly seen that the reflections are removed without affecting the object.
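A minimal NumPy sketch of this specularity-removal step (Lines 15-19 of Alg. 1) is shown below, assuming a floating-point BGR frame and the bounding box (x, y, w, h) from the previous detection.

import numpy as np

def remove_specular(frame, bbox, eta=0.5):
    """Specular-free image per Alg. 1, Lines 15-19.
    frame: HxWx3 float array in [0, 255]; bbox: (x, y, w, h) of the tracked object."""
    i_min = frame.min(axis=2)                         # per-pixel minimum over the color channels
    T = i_min.mean() + eta * i_min.std()              # intensity threshold T = mu_v + eta * sigma_v
    tau = np.where(i_min > T, T, i_min)               # offset tau(p)
    beta = i_min - tau                                 # specular component beta_hat(p)
    x, y, w, h = bbox
    beta[y:y + h, x:x + w] = 0.0                      # preserve the object region
    return np.clip(frame - beta[..., None], 0, 255)   # subtract from every channel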
Detect object and update bounding box coordinates
(Lines 20-26): The contours are extracted again using the Canny edge detector and the object is detected, as seen in Fig. 4d. The updated bounding box coordinates are utilized for specularity removal in the next frame, and the process iterates. If no contours are detected for a specified time window, the algorithm reverts to the maximum-area contour search.
C. Performance Comparison
The performance comparison was conducted on an Intel
Core i7-10750H CPU system with 10GB RAM and 4 cores.
One-minute test videos each were recorded for five objects at
30 FPS, as shown in Table I, with a polarizing filter in front
of the camera lens. The gain and exposure values were 15 and
45, respectively. The proposed algorithm was compared with
deep-learning based detectors such as YOLOv3 and Mobilenet
SSDv2 pre-trained on the COCO dataset [22]. The metrics
used for the evaluation are the processed frames per second
(FPS), CPU usage, precision ($TP/(TP+FP)$), and recall ($TP/(TP+FN)$), where TP, FP, and FN are the numbers of true positives, false positives, and false negatives produced by the detector, respectively.
For training the deep-learning models, 2000 images were
collected and labeled under a single class for the purpose
of object detection. Operations such as shearing, rotation, and brightness modulation were performed to emulate outdoor
conditions and generate synthetic data, increasing the size of
the training dataset to 6000 images. Both YOLO and SSD
were trained until the respective average losses stabilized at
0.13 and 3.1. Subsequently, the models were deployed on the
test videos and the results are shown in Table I.

TABLE I. Performance comparison of different detectors

Objects       | Average FPS      | Total CPU usage (%)  | Precision & Recall
              | Ours  YOLO  SSD  | Ours   YOLO    SSD   | Ours  YOLO  SSD
White bottle  | 21    2.8   11   | 37.5   86.25   75    | 0.96  0.97  0.57
Dark carton   | 21    2.8   11   | 37.5   86.25   75    | 0.91  0.96  0.52
Dark can      | 21    2.8   11   | 37.5   86.25   75    | 0.78  0.90  0.22
Silver can    | 21    2.8   11   | 37.5   86.25   75    | 0.91  0.90  0.86
Juice carton  | 21    2.8   11   | 37.5   86.25   75    | 0.97  0.96  0.80

It can be noted that our algorithm is a clear choice considering all the metrics. The
average CPU usage for our algorithm is 37.5% with an average
21 FPS. The YOLO model utilizes 86.25% of the total CPU
with an average 2.8 FPS which is not suitable for real-time
experiments. SSD had a better average 11 FPS when compared
to YOLO, but it also had a high CPU usage of 75% that made
it unsuitable for our purpose. The ground truth generation and
performance evaluation were conducted in MATLAB. This
ground truth was only utilized for calculating precision and
recall, and was not used in field experiments. Precision and recall are equal in our case since the test videos and the ground truth video had an equal number of frames, which results in an equal number of true positives and false negatives. The
YOLO model has the highest precision and recall for most
of the objects. Our algorithm shows comparable performance
to YOLO. SSD has the lowest precision and recall due to its
inability to detect objects for a large number of frames.
IV. SYSTEM MODELING AND CONTROL
The dynamics of the multirotor and the floating object are outlined in this section. The multirotor's propeller outwash moves the floating object around, and a robust controller is subsequently designed for the multirotor to track and land on the object.

Fig. 5: Operation principle of the aerial manipulation system. A. The multirotor approaches the object drifted by the propeller outwash. B. The object is detected. C. The multirotor lands on the object. D. The net is activated and the object is collected. (The figure annotates the inertial North-East-Down frame, the body Forward-Right-Down frame, the multirotor position $(x_q, y_q, z_q)$, the object position $(x_o, y_o, z_o)$, and the radial outwash force $F$.)
A. Multirotor and Object Dynamics
The planar dynamics of the multirotor and object are studied
in detail as reliable 2-D tracking of the object is necessary for
successful object collection. The following assumptions are
made for dynamic modeling of the multirotor and the object.
1) The drag acting on the multirotor is negligible.
2) The propeller outwash is radially outwards from the
location of the multirotor.
3) Water currents and object’s vertical dynamics are negli-
gible.
4) The direction of the force due to propeller outwash (represented by $\vec{v}_{air}$ in (2)) is along the projection, onto the water surface, of the vector from the center of the multirotor to the center of the object. It can be seen in Fig. 5 that the vertical downwash transitions to a radial outwash upon interaction with the water surface.
As illustrated in Figure 5, the radial outwash drifts the object
and the force generated due to the airflow governs the dy-
namics of the object. The 3-D translational dynamics of the
multirotor are as follows:
\dot{x}_q = v_{xq}, \quad \dot{y}_q = v_{yq}, \quad \dot{v}_{xq} = u_x, \quad \dot{v}_{yq} = u_y, \qquad (1)
\dot{z}_q = v_{zq}, \quad \dot{v}_{zq} = g - u_z, \quad \dot{x}_o = v_{xo}, \quad \dot{y}_o = v_{yo}
The 2-D dynamics of the floating object are as follows:
m_o \dot{v}_{xo} = -b (v_{xo})^2 \mathrm{sgn}(v_{xo}) + F_x, \quad F_x = F \cos\delta, \qquad (2)
m_o \dot{v}_{yo} = -b (v_{yo})^2 \mathrm{sgn}(v_{yo}) + F_y, \quad F_y = F \sin\delta,
F_x = F_{emp} \cos\delta + \Delta F_x, \quad |\Delta F_x| \le \beta_x, \quad \beta_x \ge 0,
F_y = F_{emp} \sin\delta + \Delta F_y, \quad |\Delta F_y| \le \beta_y, \quad \beta_y \ge 0,
F_{emp} = k_1 |v_{air}|^2 \vec{v}_{air}, \quad \delta = \tan^{-1}\big((y_o - y_q)/(x_o - x_q)\big)

where $g = 9.81$ m/s$^2$, $(u_x, u_y) \in \mathbb{R}^2$ is the control input to the system, $(x_q, y_q, z_q) \in \mathbb{R}^3$ is the multirotor's position, $(v_{xq}, v_{yq}, v_{zq}) \in \mathbb{R}^3$ is the multirotor's velocity, $(x_o, y_o, z_o) \in \mathbb{R}^3$ is the object's position, and $(v_{xo}, v_{yo}, v_{zo}) \in \mathbb{R}^3$ is the object's velocity, all in the North-East-Down (NED) frame. The object's mass is $m_o \in \mathbb{R}$, and $F \in \mathbb{R}^2$ is the planar force on the object due to propeller outwash, representing the coupling dynamics between the multirotor and the object. $k_1 \in \mathbb{R}$ is a function of the object's area and the density of the surrounding fluid. $F_{emp} \in \mathbb{R}^2$ is the empirical formulation of $F$. $(\beta_x, \beta_y) \in \mathbb{R}^2$ represent the bounds on the modeling uncertainties. The damping coefficient is $b \in \mathbb{R}$, $v_{air} \in \mathbb{R}^2$ is the airflow velocity surrounding the object due to the propeller outwash, and $\vec{v}_{air} \in \mathbb{R}^2$ is the unit vector of $v_{air}$.
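For intuition, the following sketch integrates a simplified version of the object dynamics in (2) with forward Euler; the mass, damping, outwash force magnitude, and time step are illustrative placeholders rather than identified parameters from the paper.

import numpy as np

# Illustrative (not identified) parameters.
M_O, B, DT = 0.25, 0.4, 0.02          # object mass (kg), drag coefficient, time step (s)

def object_step(obj_pos, obj_vel, quad_pos, f_emp=0.3):
    """One Euler step of the planar object dynamics in (2).
    The outwash force points radially from the multirotor toward the object."""
    delta = np.arctan2(obj_pos[1] - quad_pos[1], obj_pos[0] - quad_pos[0])
    force = f_emp * np.array([np.cos(delta), np.sin(delta)])
    drag = -B * obj_vel**2 * np.sign(obj_vel)
    acc = (drag + force) / M_O
    return obj_pos + DT * obj_vel, obj_vel + DT * acc

# Example: object initially 0.5 m ahead of a multirotor hovering at the origin.
pos, vel = np.array([0.5, 0.0]), np.zeros(2)
for _ in range(100):
    pos, vel = object_step(pos, vel, np.zeros(2))
print(pos)   # the object drifts radially outward under the outwash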
B. Controller Design
The objective of the control structure is to reduce the
position and velocity errors between the multirotor and object
subject to modeling and parameter uncertainties. A boundary
layer sliding mode control (BLSMC) and constrained MPC
approach is proposed, and it offers the following advantages. The BLSMC strategy makes the system robust
against modeling uncertainties. A dynamic sliding manifold is
used to eliminate the reaching phase for BLSMC to ensure
robustness from the start of the motion. Furthermore, the
constrained MPC is designed considering the closed loop error
dynamics of our system. It predicts the future position and
velocity errors over a prediction horizon and finds an optimal
control input to drive the errors to zero. For the inertial Z
direction, a PD velocity controller is designed to descend with
a predefined constant velocity. For the 3-D rotational dynamics
of the multirotor, the attitude controller in [23] is implemented.
For object collection, the multirotor has a predefined yaw
setpoint as it can collect objects with any heading.
1) Boundary Layer Sliding Mode Control (BLSMC): A
BLSMC approach is proposed to alleviate the chattering of
control signals as seen in standard sliding mode control. As
the designed control inputs are the target accelerations in the
inertial $X$ and $Y$ directions, chattering acceleration signals
are detrimental because they cause a multirotor to constantly
roll and pitch back and forth. As a result, the camera’s
measurements of the object’s position and velocity can be
adversely affected. To design a BLSMC, dynamic sliding
manifolds are defined as the following:
s_x = (v_{xo} - v_{xq}) + \lambda_x (x_o - x_q) + \phi_x \qquad (3)
s_y = (v_{yo} - v_{yq}) + \lambda_y (y_o - y_q) + \phi_y
It can be noted that $s_x(0) = 0$ and $s_y(0) = 0$ if $\phi_x(0) = -\dot{e}_x(0) - \lambda_x e_x(0)$ and $\phi_y(0) = -\dot{e}_y(0) - \lambda_y e_y(0)$. This eliminates the reaching phase, and the system is inside the boundary layer from the beginning. Thus, the controller to keep the system in the boundary layer is designed as:
u_x = -\frac{b}{m_o} (v_{xo})^2 \mathrm{sgn}(v_{xo}) + \frac{F_{emp}\cos\delta}{m_o} + \lambda_x (v_{xo} - v_{xq}) + \dot{\phi}_x + (\eta_x + \beta_x)\,\mathrm{sat}\!\left(\frac{s_x}{\epsilon_x}\right), \qquad (4)
u_y = -\frac{b}{m_o} (v_{yo})^2 \mathrm{sgn}(v_{yo}) + \frac{F_{emp}\sin\delta}{m_o} + \lambda_y (v_{yo} - v_{yq}) + \dot{\phi}_y + (\eta_y + \beta_y)\,\mathrm{sat}\!\left(\frac{s_y}{\epsilon_y}\right)
where $u_x, u_y$ are the desired inertial accelerations, $\epsilon_x, \epsilon_y$ are the boundary layer thicknesses, and $\mathrm{sat}(\cdot)$ is the saturation function. The BLSMC is designed with the objective of keeping the system in the boundary layer and making it insensitive to modeling and parameter uncertainties. The next objective is to design $\dot{\phi}_x$ and $\dot{\phi}_y$ such that the position and velocity errors are regulated to the origin in an optimal manner.
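A compact sketch of the reconstructed control law (4) along the X axis is given below; the object and multirotor states would come from the vision pipeline and flight controller, respectively, while the numerical parameters are illustrative and the boundary-layer symbol follows the reconstruction above.

import numpy as np

def sat(s):
    """Saturation function used inside the boundary layer."""
    return np.clip(s, -1.0, 1.0)

def blsmc_ux(xo, vxo, xq, vxq, phi_x, phi_x_dot, delta,
             m_o=0.25, b=0.4, f_emp=0.3, lam=1.0, eta=0.5, beta=0.5, eps=0.1):
    """Desired inertial X acceleration per the reconstructed control law (4).
    phi_x / phi_x_dot come from the MPC-designed dynamic sliding surface."""
    s_x = (vxo - vxq) + lam * (xo - xq) + phi_x          # sliding variable (3)
    return ((-b / m_o) * vxo**2 * np.sign(vxo)
            + (f_emp * np.cos(delta)) / m_o
            + lam * (vxo - vxq)
            + phi_x_dot
            + (eta + beta) * sat(s_x / eps))

# phi_x(0) chosen so that s_x(0) = 0, eliminating the reaching phase.
phi0 = -((0.0 - 0.0) + 1.0 * (0.5 - 0.0))   # -(e_dot(0) + lam * e(0)) with e(0) = 0.5 m
print(blsmc_ux(0.5, 0.0, 0.0, 0.0, phi0, 0.0, 0.0))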
Formulation for $F_{emp}$: $F_{emp}$ is a function of $v_{air}$, which is determined empirically by collecting windspeed data using an anemometer. The windspeed due to propeller outwash is collected at horizontal distances $d$ from the multirotor every 0.5 m until $d = 3$ m. For every distance $d$, the height $h$ of the multirotor above the surface is also varied from 0.5 m to 2 m. Finally, $v_{air}$ is obtained as a function of $d$ and $h$. In field experiments, the sum of the first two terms in the controller is constrained to prevent aggressive maneuvers.
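One plausible way to realize this empirical map in software is bilinear interpolation over the measured (d, h) grid, as sketched below with scipy; the windspeed table entries are fabricated placeholders, since the measured data are not reported in the paper.

import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Measurement grid from the text: d = 0..3 m every 0.5 m, h = 0.5..2 m every 0.5 m.
d_grid = np.arange(0.0, 3.5, 0.5)
h_grid = np.arange(0.5, 2.5, 0.5)

# Placeholder anemometer readings (m/s); real values would come from the field data.
v_air_table = np.array([[6.0, 5.0, 4.0, 3.5],
                        [5.0, 4.2, 3.5, 3.0],
                        [4.0, 3.4, 2.8, 2.4],
                        [3.0, 2.6, 2.2, 1.9],
                        [2.2, 1.9, 1.6, 1.4],
                        [1.5, 1.3, 1.1, 1.0],
                        [1.0, 0.9, 0.8, 0.7]])   # shape (len(d_grid), len(h_grid))

v_air_fn = RegularGridInterpolator((d_grid, h_grid), v_air_table)

def f_emp(d, h, k1=0.02):
    """Empirical outwash force magnitude F_emp = k1 * |v_air|^2 (unit direction applied separately)."""
    v = v_air_fn([[d, h]]).item()
    return k1 * v**2

print(f_emp(1.2, 1.0))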
2) Constrained Linear Model Predictive Control (MPC):
A constrained linear MPC approach is proposed to design $\dot{\phi}_x$ and $\dot{\phi}_y$. For the sake of brevity, only the design of $\dot{\phi}_x$ is shown; $\dot{\phi}_y$ is designed in the same way. Based on $u_x$ in (4), the dynamics of the closed-loop error $e_x = x_o - x_q$ are:

\ddot{e}_x = \dot{v}_{xo} - \dot{v}_{xq} = -\lambda_x \dot{e}_x - \dot{\phi}_x - (\eta_x + \beta_x)\,\mathrm{sat}\!\left(\frac{s_x}{\epsilon_x}\right) \qquad (5)

Within the boundary layer, $\mathrm{sat}(s_x/\epsilon_x) = s_x/\epsilon_x$. From (5) and (3):

\ddot{e}_x = -\left(\lambda_x + \frac{\zeta_x}{\epsilon_x}\right)\dot{e}_x - \frac{\zeta_x \lambda_x}{\epsilon_x}\, e_x - \dot{\phi}_x - \frac{\zeta_x}{\epsilon_x}\,\phi_x \qquad (6)

where $\zeta_x = \eta_x + \beta_x$, and the closed-loop error dynamics are:

\begin{bmatrix} \dot{e}_x \\ \ddot{e}_x \\ \dot{\phi}_x \end{bmatrix}
= \begin{bmatrix} 0 & 1 & 0 \\ -\frac{\zeta_x \lambda_x}{\epsilon_x} & -\lambda_x - \frac{\zeta_x}{\epsilon_x} & -\frac{\zeta_x}{\epsilon_x} \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} e_x \\ \dot{e}_x \\ \phi_x \end{bmatrix}
+ \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix} w \qquad (7)

where $w = \dot{\phi}_x$. The continuous-time system is discretized, and the cost function for the linear MPC is defined as follows:

J = \min_{U}\; E_N^T P E_N + \sum_{i=0}^{N-1} \left( E_i^T Q E_i + w_i^T R w_i \right) \qquad (8)

where $U = \mathrm{col}(w_0, \dots, w_{N-1})$, $E_i = \begin{bmatrix} e_x(i) & \dot{e}_x(i) & \phi_x(i) \end{bmatrix}^T$, and $N$ is the prediction horizon. The cost function (8) is rewritten into a Quadratic Program (QP) as follows:

J = \min_{U}\; U^T 2(\tilde{R} + \tilde{S}^T \tilde{Q} \tilde{S}) U + x^T 2 \tilde{T}^T \tilde{Q} \tilde{S}\, U, \quad \text{s.t.}\; U_{min} \le U \le U_{max} \qquad (9)

An optimal control sequence $U^*$ is generated by solving (9), and $w = \dot{\phi}_x = U^*(0)$. The matrices are defined as follows:

\tilde{Q} = \mathrm{diag}(Q_1, \dots, Q_{N-1}, P), \quad \tilde{R} = \mathrm{diag}(R, \dots, R),

\tilde{S} = \begin{bmatrix} B & 0 & \dots & 0 & 0 \\ AB & B & \dots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ A^{N-1}B & A^{N-2}B & \dots & AB & B \end{bmatrix}, \quad
\tilde{T} = \begin{bmatrix} A \\ \vdots \\ A^{N} \end{bmatrix}, \qquad (10)

A = \begin{bmatrix} 1 & dt & 0 \\ -\frac{\zeta_x \lambda_x}{\epsilon_x} dt & 1 - \lambda_x dt - \frac{\zeta_x}{\epsilon_x} dt & -\frac{\zeta_x}{\epsilon_x} dt \\ 0 & 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 \\ -dt \\ dt \end{bmatrix}
where $dt$ is the sampling time. This constrained MPC was implemented using the qpOASES library [24].
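To make the condensed formulation concrete, the sketch below assembles the matrices in (10) and solves the box-constrained QP (9), using cvxpy as a stand-in for qpOASES; the gains match the simulation values in Section IV-C, while the sampling time is an assumption based on the 50 Hz vision-control rate in Fig. 3.

import numpy as np
import cvxpy as cp

# zeta, lambda, eps, R, P, N and the input bounds follow Sec. IV-C; dt is assumed (50 Hz).
zeta, lam, eps, dt, N = 5.0, 5.0, 0.1, 0.02, 10
A = np.array([[1.0, dt, 0.0],
              [-(zeta * lam / eps) * dt, 1.0 - lam * dt - (zeta / eps) * dt, -(zeta / eps) * dt],
              [0.0, 0.0, 1.0]])
B = np.array([[0.0], [-dt], [dt]])
Q, R, P = np.eye(3), 0.1 * np.eye(1), 250.0 * np.eye(3)

# Condensed prediction matrices: X = T_tilde x0 + S_tilde U, per (10).
S_tilde = np.zeros((3 * N, N))
T_tilde = np.zeros((3 * N, 3))
for i in range(N):
    T_tilde[3 * i:3 * i + 3, :] = np.linalg.matrix_power(A, i + 1)
    for j in range(i + 1):
        S_tilde[3 * i:3 * i + 3, j:j + 1] = np.linalg.matrix_power(A, i - j) @ B
Q_tilde = np.kron(np.eye(N), Q)
Q_tilde[-3:, -3:] = P                      # terminal weight on the last predicted state
R_tilde = np.kron(np.eye(N), R)

# Initial state from the simulation: x_q(0) = 2, x_o(0) = 0, so e_x(0) = -2 and phi_x(0) = -lam*e_x(0).
x0 = np.array([-2.0, 0.0, 10.0])
M = R_tilde + S_tilde.T @ Q_tilde @ S_tilde
M = 0.5 * (M + M.T)                        # symmetrize against floating-point asymmetry
f = 2.0 * (x0 @ T_tilde.T @ Q_tilde @ S_tilde)

U = cp.Variable(N)
prob = cp.Problem(cp.Minimize(cp.quad_form(U, M) + f @ U),
                  [U >= -10.0, U <= 10.0])  # Umin / Umax bounds from Sec. IV-C
prob.solve()
phi_x_dot = U.value[0]                     # w = phi_dot_x = U*(0), applied in (4)
print(phi_x_dot)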
Fig. 6: Simulation results: comparison of the multirotor and object states and the control input along the X-axis (X-axis position, X-axis velocity, and control input panels) for the proposed BLSMC+MPC and a standard SMC.
C. Simulation Results
The performance of the proposed controller is compared
with that of a standard sliding mode controller (SMC) in Fig. 6.
Due to space limitations, comparisons are provided only along the x-axis. The following gains were used: $\zeta_x = 5$, $\lambda_x = 5$, $\epsilon_x = 0.1$, $R = 0.1$, $P = 250 I_3$, and $N = 10$. $\zeta_x$ and $\lambda_x$ are selected such that the state errors converge to the origin quickly while avoiding oscillations due to high gains. $\epsilon_x$ is chosen such that chatter is eliminated while minimally compromising robustness. $N$ is chosen such that the transient response is captured while the computation remains low. The lower and upper bounds $U_{min}$ and $U_{max}$ are $-10$ m/s$^2$ and $10$ m/s$^2$. The initial positions are $x_q(0) = 2$ and $x_o(0) = 0$. The initial velocities are zero. It can be noted that the multirotor can track
the position and velocity of the object with both approaches.
However, for the standard SMC, both positions continue to evolve unbounded with time, whereas in the proposed approach, both positions are bounded after they converge, as shown in the first
subplot. This can be attributed to the high terminal weight
P, imposed on the terminal state. Moreover, chattering in the
control signal occurs in the SMC, but not in the BLSMC+MPC
approach due to the boundary layer design.
V. FIELD EXPERIMENTS
The efficacy of the proposed system is validated through
a series of field experiments. The experimental setup, results
and discussions on the flight tests are presented in this section.
A. Experimental Setup
The outdoor field experiments were conducted in a lake park in Gilbert, Arizona (lat: 33.3589, lon: -111.7685). The weather conditions were mostly sunny with some clouds in the late afternoons. The wind speeds varied between 0 and 5 mph with sporadic wind gusts. Twenty-four (24) experimental trials were conducted on sunny days with three objects of different shapes and sizes, namely a juice carton (cuboidal, 400 g), a white bottle
(cylindrical with a slight taper at the neck, 250 g), and a silver aluminum can (cylindrical, 150 g), as shown in the following video: https://youtu.be/Yj0 LIz027s.

Fig. 7: Results of one successful flight test on a sunny day (X- and Y-axis positions and velocities of the object and multirotor, control inputs, and height above the water surface, with the open-loop constant-velocity descent marked).

For each object, 8 trials
were conducted which included 4 trials during the morning
and 4 trials during the afternoon to study the efficacy of the
developed system in varying outdoor conditions like illumi-
nation, random water currents and wind gusts. Additionally,
experimental trials were conducted with a dark aluminum can
(cylindrical, 200g) on a cloudy day to demonstrate the system’s
potential. Due to the very limited instances of cloudy days in Arizona, object collection experiments could not be conducted with a dark carton (cuboidal, 300 g).
B. Experimental Results and Discussion
The proposed aerial manipulation system achieved 22 suc-
cessful attempts and 2 failed attempts for experiments con-
ducted on sunny days. Due to the unavailability of outdoor
ground truth data, the error between the final position of the
multirotor and object was utilized to analyze the performance
of the system. The origin for all the experiments was set where
the GPS lock was acquired. The vision feedback is utilized to estimate the object's states on the high-level computer. The vision-based control generates desired inertial accelerations, which
are sent over UART to the low-level controller. Subsequently,
the attitude control module generates the desired torques.
Additionally, the high torque servo is activated after the
multirotor lands on water and the object is collected. The
experimental results are summarized in Table II, including
the landing time, battery capacity consumed for performing
landing, and the norm of the error between the final positions of the floating object and the multirotor.

TABLE II. Experimental results for 22 successful trials and 2 failed attempts on sunny days.

                      | Success        | 1st failure | 2nd failure
Final distance (m)    | 0.15 ± 0.06    | 0.38        | 0.85
Landing duration (s)  | 7.41 ± 1.10    | 7.26        | 6.43
Used capacity (mAh)   | 80.50 ± 11.63  | 76.5        | 72
Fig. 8: Results of one failed flight test on a sunny day (X- and Y-axis positions and velocities, control inputs, and height above the water surface, with the open-loop constant-velocity descent marked).
Due to space constraints, one successful and one failed trial are described thoroughly. $\zeta_x = \zeta_y = 2$, $\lambda_x = \lambda_y = 1.0$, and $\eta_x = \eta_y = 0.5$ were used
for all the experiments. The experimental trials demonstrated
a high success rate for object collection using the proposed
net mechanism and vision-based detection algorithm. Table II
shows that the battery capacity consumed during autonomous
landing is 80.50 mAh, which is 14% of 575 mAh (the average
battery consumption during one trial). The two failed attempts
happened in the late afternoon; one with the can and the other
with the bottle. One successful attempt and one failed attempt are demonstrated in Figs. 7 and 8, respectively. From Fig. 7, it can be noted that the multirotor reliably tracks the position of the object along the $X$ and $Y$ axes. The aerial system starts the descent from a height of 1.5-1.7 m above the water surface. The LiDAR is operative above 0.5 m, so once the multirotor is within 0.5 m of the water surface, it is programmed to descend at a constant, low velocity for about 1 s without LiDAR data, after which it drops onto the water surface. This ensures that the multirotor continues to track the object when it is in proximity to the water surface without causing water splash. In the 22 successful trials, the average total time taken, from object detection to autonomously landing on it, is 7.41 s.
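A simplified sketch of this descent-and-landing sequence, written as a small decision function, is given below; the 0.5 m LiDAR cutoff and the roughly 1 s open-loop phase follow the text, while the descent speed and update rate are illustrative.

DESCENT_SPEED = -0.3      # m/s, illustrative constant descent rate
LIDAR_CUTOFF = 0.5        # m, below this the LiDAR reading is unreliable
OPEN_LOOP_TIME = 1.0      # s, open-loop descent before touchdown

def descent_command(height, open_loop_start, now):
    """Return (vertical velocity setpoint, open-loop start time, landed flag)."""
    if open_loop_start is None:
        if height is not None and height > LIDAR_CUTOFF:
            return DESCENT_SPEED, None, False            # closed-loop constant-velocity descent
        return DESCENT_SPEED, now, False                 # LiDAR cutoff reached: go open loop
    if now - open_loop_start < OPEN_LOOP_TIME:
        return DESCENT_SPEED, open_loop_start, False     # open-loop descent, keep tracking the object
    return 0.0, open_loop_start, True                    # touchdown: cut descent, trigger the net servo

# Example: simulate a descent from 1.5 m at 20 Hz with the LiDAR masked below the cutoff.
h, start, landed, t = 1.5, None, False, 0.0
while not landed:
    vz, start, landed = descent_command(h if h > LIDAR_CUTOFF else None, start, t)
    h += vz * 0.05
    t += 0.05
print(round(t, 2), "s to touchdown")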
Figure 8 illustrates one failed attempt. Similar to a suc-
cessful attempt, the multirotor reliably tracks the position and
velocity of the object along the $X$ and $Y$ axes until 5.407 s. At
that time, the multirotor is within 0.5 m from the water surface.
Right at this time, the object goes out of the frame along the
multirotor body’s X-axis. As a result, the multirotor pitches
forward, in an attempt to track the object, while continuing to
descend. Despite the pitching maneuver, the object is outside
the workspace of the net system due to erratic motion caused
by turbulent water flow. The final distance between the object
and multirotor’s position is 0.85 m which is outside the net
system’s workspace. Furthermore, both failures occurred in
the late afternoon when collecting objects with a cylindrical
surface, which can be attributed to partial object visibility.
Fig. 9: Results of one successful flight test with a dark-shade object on a cloudy day (X- and Y-axis positions and velocities, control inputs, and height above the water surface).
Fig. 10: State and control input plots for the standard SMC with a white bottle on a sunny day.
Some potential methods to improve object collection include
usage of optical flow methods and a camera on a gimbal to
have a flexible field of view. Experimental trials were also
conducted on a cloudy day with a dark aluminum can. Fig.
9 demonstrates one successful attempt at dark-shade object collection on a cloudy day. The multirotor system successfully lands on the object, and the final distance between the object's and the multirotor's positions is 0.12 m. For comparison, a standard SMC was implemented and the results are shown in Fig. 10. The flight test for the standard SMC was conducted with $\zeta_x = \zeta_y = 1$, $\lambda_x = \lambda_y = 0.5$, as the system was extremely aggressive with $\zeta_x = \zeta_y = 2$, $\lambda_x = \lambda_y = 1.0$. During this flight test, the final distance between the object and the multirotor was 0.39 m, and the object was outside the net's workspace. The failure can be attributed to the jitter in the control inputs.
VI. CONCLUSION AND FUTURE WORK
In this paper, we designed a vision-based robust motion
control structure for a customized multirotor system to au-
tonomously collect objects from water surfaces. The proposed
control approach was insensitive to modeling uncertainties
from aerodynamic forces. The developed vision system was
resilient to extreme reflections and glare from the water
surfaces. Experimental trials were conducted under various
conditions to demonstrate the efficacy of the proposed system.
Future work includes the design of a high-fidelity motion model of the floating object for improved estimation of the object's position. The system will be extended to collect multiple floating objects in flowing water currents.
REFERENCES
[1] H. Shakhatreh et al., "Unmanned aerial vehicles (UAVs): A survey on civil applications and key research challenges," IEEE Access, vol. 7, pp. 48572–48634, 2019.
[2] D. Yang et al., "Design, planning, and control of an origami-inspired foldable quadrotor," in American Control Conference. IEEE, 2019, pp. 2551–2556.
[3] K. Patnaik et al., "Design and Control of SQUEEZE: A Spring-augmented QUadrotor for intEractions with the Environment to squeeZE-and-fly," in Int. Conf. Intelligent Robots and Systems. IEEE, 2020, pp. 1364–1370.
[4] J.-P. Ore et al., "Autonomous aerial water sampling," Journal of Field Robotics, vol. 32, no. 8, pp. 1095–1113, 2015. [Online]. Available: https://onlinelibrary.wiley.com/doi/abs/10.1002/rob.21591
[5] G. Heredia et al., "Control of a multirotor outdoor aerial manipulator," in 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014, pp. 3417–3422.
[6] P. E. I. Pounds et al., "Grasping from the air: Hovering capture and load stability," in ICRA. IEEE, 2011, pp. 2491–2498.
[7] S. Mishra et al., "Design and control of a hexacopter with soft grasper for autonomous object detection and grasping," in Dynamic Systems and Control Conference, vol. 51913. ASME, 2018, p. V003T36A003.
[8] N. Zhao et al., "The deformable quad-rotor enabled and wasp-pedal-carrying inspired aerial gripper," in 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, pp. 1–9.
[9] S.-J. Kim et al., "An origami-inspired, self-locking robotic arm that can be folded flat," Science Robotics, vol. 3, no. 16, 2018. [Online]. Available: https://robotics.sciencemag.org/content/3/16/eaar2915
[10] A. Rodriguez-Ramos et al., "A deep reinforcement learning strategy for UAV autonomous landing on a moving platform," Journal of Intelligent & Robotic Systems, vol. 93, no. 1-2, pp. 351–366, 2019.
[11] D. Falanga et al., "Vision-based autonomous quadrotor landing on a moving platform," in 2017 IEEE International Symposium on Safety, Security and Rescue Robotics (SSRR). IEEE, 2017, pp. 200–207.
[12] H. Lee, S. Jung, and D. H. Shim, "Vision-based UAV landing on the moving vehicle," in 2016 International Conference on Unmanned Aircraft Systems (ICUAS). IEEE, 2016, pp. 1–7.
[13] D. Tzoumanikas et al., "Fully autonomous micro air vehicle flight and landing on a moving target using visual-inertial estimation and model-predictive control," Journal of Field Robotics, vol. 36, no. 1, pp. 49–77, 2019.
[14] J. Thomas et al., "Autonomous flight for detection, localization, and tracking of moving targets with a small quadrotor," IEEE Robotics and Automation Letters, vol. 2, no. 3, pp. 1762–1769, 2017.
[15] O. Araar, N. Aouf, and I. Vitanov, "Vision based autonomous landing of multirotor UAV on moving platform," Journal of Intelligent & Robotic Systems, vol. 85, no. 2, pp. 369–384, 2017.
[16] P. Vlantis et al., "Quadrotor landing on an inclined platform of a moving ground vehicle," in 2015 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2015, pp. 2202–2207.
[17] D. Zheng et al., "Image-based visual servoing of a quadrotor using virtual camera approach," IEEE/ASME Transactions on Mechatronics, vol. 22, no. 2, pp. 972–982, 2017.
[18] P. KaewTraKulPong and R. Bowden, An Improved Adaptive Background Mixture Model for Real-time Tracking with Shadow Detection. Boston, MA: Springer US, 2002, pp. 135–144.
[19] A. B. Godbehere et al., "Visual tracking of human visitors under variable-lighting conditions for a responsive audio art installation," in 2012 American Control Conference (ACC), 2012, pp. 4305–4312.
[20] A. M. McIvor, "Background subtraction techniques," Proc. of Image and Vision Computing, vol. 4, pp. 3099–3104, 2000.
[21] H.-L. Shen et al., "Simple and efficient method for specularity removal in an image," Applied Optics, vol. 48, no. 14, pp. 2711–2719, 2009.
[22] T.-Y. Lin et al., "Microsoft COCO: Common objects in context," in European Conference on Computer Vision. Springer, 2014, pp. 740–755.
[23] T. Lee, M. Leok, and N. H. McClamroch, "Geometric tracking control of a quadrotor UAV on SE(3)," in 49th IEEE Conference on Decision and Control (CDC). IEEE, 2010, pp. 5420–5425.
[24] H. Ferreau, C. Kirches, A. Potschka, H. Bock, and M. Diehl, "qpOASES: A parametric active-set algorithm for quadratic programming," Mathematical Programming Computation, vol. 6, no. 4, pp. 327–363, 2014.