Oloading Monocular Visual Odometry with Edge Computing:
Optimizing Image ality in Multi-Robot Systems
Li Qingqing
qingqli@utu.fi
University of Turku
Turku, Finland
Jorge Peña Queralta
jopequ@utu.fi
University of Turku
Turku, Finland
Tuan Nguyen Gia
tunggi@utu.fi
University of Turku
Turku, Finland
Tomi Westerlund
tovewe@utu.fi
University of Turku
Turku, Finland
ABSTRACT
Fleets of autonomous mobile robots are becoming ubiquitous in industrial environments such as logistic warehouses. This ubiquity has pushed the Internet of Things field towards more distributed network architectures, which have crystallized under the rising edge and fog computing paradigms. In this paper, we propose the combination of an edge computing approach with computational offloading for mobile robot navigation. As smaller and relatively simpler robots become more capable, their penetration in different domains rises. These large multi-robot systems are often characterized by constrained computational and sensing resources. An efficient computational offloading scheme has the potential to bring multiple operational enhancements. However, with visual-inertial odometry being the most cost-effective autonomous navigation method, streaming high-quality images can increase latency, with a consequent negative impact on operational performance. In this paper, we analyze the impact that image quality and compression have on state-of-the-art visual-inertial odometry. Our results indicate that image size, and hence network bandwidth, can be reduced by over an order of magnitude without compromising the accuracy of the odometry methods, even in challenging environments. This opens the door to further optimization by dynamically assessing the trade-off between image quality, network load, latency, and the performance and localization accuracy of the visual-inertial odometry.
CCS CONCEPTS
• Computing methodologies → Vision for robotics; Tracking.
KEYWORDS
Visual Odometry; Visual-Inertial Odometry; Monocular Visual Odometry; Multi-Robot Systems; Edge Computing; Computational Offloading; Internet of Robots; Internet of Vehicles; Image Compression; Image Quality
1 INTRODUCTION
Accurate localization and mapping are two of the pillars behind
fully autonomous systems [1, 2]. Over the past two decades, much attention has been put into solving the simultaneous localization and mapping (SLAM) problem [3-5]. What was mostly an offline or offloaded method due to its computational complexity is now a widely implemented real-time algorithm that runs on the on-board computers of mobile robots [6]. Among the different sensors that can provide motion estimation, a visual-inertial system offers one of the best price-accuracy ratios [7], with cameras and inertial measurement units being several orders of magnitude cheaper than 3D lidars [8].
In recent years, research into visual-inertial odometry (VIO) as part of the SLAM problem has attracted increasing interest due to its low price and ease of cross-platform implementation, among other benefits [5]. Visual-inertial odometry has potential for multiple applications, including augmented reality (AR) [9, 10] and aerial robot navigation. The current state of the art can achieve very high accuracy even in dynamic and challenging environments, both for monocular [11] and stereo vision [12]. Mature algorithms such as ROVIO [13] or VINS-Mono [14] have raised the level of autonomy of drones and small robots with existing hardware, and multiple open-source datasets such as EuRoC have been published, pushing the research in this area forward [15].
While visual-inertial odometry enables low-cost and accurate autonomous operation for small mobile robots, it still requires robots to have a minimum of computational resources available on their on-board computers. Most of the current research efforts are focused on algorithm-level optimization to achieve higher levels of accuracy and reliability in visual odometry on different hardware platforms. This has led to high-accuracy methods enabling long-term autonomy with efficient loop-closure mechanisms [14]. However, small units such as flying robots usually have constrained resources, including limited power and computational capabilities or reduced storage. In this situation, an aspect to consider is how to
reduce the robots’ computational burden while maintaining the VIO
algorithm’s high performance. If multiple cameras are utilized to
reduce the blind angles for obstacle avoidance, path planning, and
mapping, then the computational burden can increase considerably.
This can have a significant impact on the performance and on the ability of small mobile robots, including aerial drones, to autonomously navigate complex environments. If, additionally, multiple robots are operating in the same environment, accurate localization is essential to secure their operation and avoid collisions. In a multi-robot system where robots have equivalent sensing capabilities, offloading part of the data processing can be a solution that not only increases the reliability of the system but also reduces the unit cost of each robot, as the hardware can be simplified. In an industrial
environment with large numbers of autonomous robots operating
within a controlled area, reducing the cost of each robot can have a
direct impact on the industrial ecosystem as a whole.
In recent years, some researchers have introduced the cloud robotics concept, in which the capabilities of small mobile robots can be enhanced by moving part or most of the computationally intensive data analysis tasks to a cloud environment [16, 17]. Nonetheless, streaming data to the cloud has the potential to significantly reduce the overall system reliability under uncontrolled latency or unstable network connections [18, 19]. We extend the recent trend in the IoT towards more decentralized network architectures with the fog and edge computing paradigms [20-22]. Edge computing crystallizes the idea of keeping the data processing as close as possible to where the data originates. With this approach, raw data is processed at the local network level instead of in the cloud, decreasing the latency and optimizing the network load [23]. Furthermore, hardware cost and overall power consumption can be reduced with proper integration of edge computing [24]. In this work, we have moved the VIO computation to a smart edge gateway to open the possibility of more intelligent, yet simple, large teams of autonomous robots that rely on edge services for offloading most of their computationally intensive operations.
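As a rough illustration of such an offloading scheme (a minimal sketch under our own assumptions, not the exact implementation evaluated in this paper), a robot can compress each camera frame to JPEG at a configurable quality and stream it to the edge gateway, which runs the odometry and returns pose estimates. The gateway address, port and quality value below are placeholders; note also that a single UDP datagram only fits the low-quality settings studied here (a few kilobytes), while raw or high-quality frames would require fragmentation or a TCP stream.

```python
# Hypothetical robot-side sender: JPEG-compress grayscale frames and stream
# them to an edge gateway over UDP. Address, port and quality are placeholders.
import socket
import struct
import time

import cv2

EDGE_GATEWAY = ("192.168.1.10", 5005)  # assumed address of the smart edge gateway
JPEG_QUALITY = 10                      # 1-100, the parameter studied in this paper


def stream_camera(device: int = 0) -> None:
    cap = cv2.VideoCapture(device)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    frame_id = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            ok, buf = cv2.imencode(".jpg", gray,
                                   [cv2.IMWRITE_JPEG_QUALITY, JPEG_QUALITY])
            if not ok:
                continue
            # Prepend a frame id and timestamp so the gateway can associate
            # the returned pose estimate with the correct frame.
            header = struct.pack("!Id", frame_id, time.time())
            sock.sendto(header + buf.tobytes(), EDGE_GATEWAY)
            frame_id += 1
    finally:
        cap.release()
        sock.close()


if __name__ == "__main__":
    stream_camera()
```

In a complete system the IMU stream would be forwarded alongside the images, since the visual-inertial estimator fuses both.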
The main motivation behind this paper is to study the optimal
relationship between image quality and accuracy of a monocular
visual odometry algorithm in a computational offloading scheme. Finding the proper trade-off between accuracy and image size has a direct impact on the computational resource consumption, algorithm runtime, network latency and, in consequence, the number of robots that can be supported simultaneously by a single smart edge gateway. Our goal is to provide a benchmark of the compression rate's influence on the VIO algorithm. To address these issues, we employ the state-of-the-art VIO algorithm VINS-Mono [14] and analyze its performance on an open dataset, the EuRoC MAV dataset [15], with varying image compression rates and picture quality. Our results show that the computational offloading scheme can be optimized in terms of bandwidth usage without compromising the accuracy of the visual odometry algorithm. Furthermore, decreasing the image quality reduces the processing time at the edge gateway. Therefore, finding the appropriate compression rate not only optimizes the network load but also enables a single gateway to handle the odometry for a larger number of connected robots.
The main contribution of this paper is an analysis of the performance of state-of-the-art monocular visual odometry under varying image quality and compression settings. We utilize the
JPEG standard and examine the performance of a monocular visual odometry algorithm with the JPEG image compression quality setting varying from 1% to 100%. The implications of this study can be significant in a computational offloading scheme; an image size reduction of up to two orders of magnitude can be achieved without a significant compromise on odometry accuracy.

Figure 1: EuRoC dataset samples. Subfigure (a) shows a sample of the easier environment for odometry, while (b) shows the harder dataset.
The remainder of this paper is organized as follows. In Section 2, we overview related work utilizing computational offloading for visual odometry in mobile robots, mostly with a cloud-based approach. In Section 3, we introduce the basic concepts behind visual odometry, as well as the specific algorithm utilized in this paper (VINS-Mono). Section 4 then introduces the methodology, experimental setup and results, which provide insight into the optimal image quality to be chosen to minimize the network load. Finally, Section 5 concludes the paper and discusses possible future work.
2 RELATED WORK
The problem of SLAM has been traditionally considered either as
an oine problem, where all accumulated data is utilized to rebuild
the path, or an online problem for real-time image analysis with an
on-board computer. However, if a large eet of robots is considered,
then a computational ooading scheme can considerably bring the
cost down. To the best of our knowledge, computational ooading
has been considered for mobile robot navigation a mapping only
from the cloud computing point of view with cloud-centric architec-
tures and data processing in powerful servers where the algorithms
can be easily run in parallel at maximum eciency. Yun et al. pro-
posed a robotics platform to be deployed in cloud servers, RSE-PF,
for distribution visual SLAM where data from dierent robots was
aggregated and combined in the cloud [
16
]. An average network
latency of approximately 150 ms was reported (round trip). Even
with almost instantaneous data processing at the cloud servers, this
either limits the image analysis rate to around 6 frames/second or
induces a delay when parallel RX/TX channels are utilized. In the
rst case, an on-board computer such as a Raspberry Pi 4 or an
NVIDIA TX2 could be able to provide a similar o better frame rate,
while in the second case an accurate estimation of network latency
must be available at the robot in order to interpret properly the
processed information that the cloud servers return. The maximum
number of robotic units that could be supported simultaneously
was not reported; however, the authors utilized WebSockets in order
to save bandwidth compared to HTTP. Dey et al. proposed a similar
ooading scheme in which a multi-tier edge+cloud architecture
was introduced [
17
]. Rather than concentrating on analyzing the
Image ality in Visual-Interial Odometry ICSCC, December 21-23, 2019, Wuhan, China
0 2 4 6 8 10
6
4
2
0
x (m)
y (m)
GT
1%
5%
10%
50%
80%
100%
Figure 2: Ground truth and odometry reconstructed paths
with the easier dataset.
performance, the authors shifted the research focus towards den-
ing and solving an optimization problem in order to maximize the
performance of the multi-tier architecture by ooading dierent
processes to dierent layers. Their approach was to utilize integer
linear programming for optimization of ooading design decisions
utilizing the network bandwidth as a variable and adding latency
constraints.
In this paper, we extend our previous work-in-progress report, where we analyzed the effect of image compression on the performance of visual odometry [25]. In contrast with the cloud-centric approach that can be found in the literature, we propose offloading at the local network level, following the main design ideas of edge computing. With this method, we are able to keep the benefits of cloud-based offloading (optimization of energy consumption and simplification of on-board hardware) while reducing the latency and increasing the network reliability, since a single connection is used, which also allows for tighter bandwidth and network management control. We focus on finding the right trade-off between odometry accuracy and performance in terms of frame rate and latency.
3 MONOCULAR VISUAL INERTIAL ODOMETRY
Visual odometry (VO) is a part of visual SLAM (VSLAM). VO focuses on the local consistency of the robot's movement trajectory, using real-time data to estimate the robot's egomotion. The goal of SLAM, in contrast, is to achieve global consistency between the odometry and the map. VO can therefore be used as a building block for VSLAM, before all of the camera's historical data is tracked to detect loop closures and optimize the map.
3.1 The SLAM problem
SLAM is an abbreviation for Simultaneous Localization And Mapping. The term was first used in the field of robotics but has since been applied in many other fields, mostly involving computer vision, virtual reality or augmented reality. It enables
robots to construct a map of the surrounding environment in real time, based on sensor data and without any prior knowledge, and to estimate their own location relative to this map.

Figure 3: Ground truth and odometry reconstructed paths with the harder dataset.
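In its standard probabilistic formulation (a textbook statement included here for clarity, not a contribution of this paper), the SLAM problem described above consists of estimating the joint posterior over the robot trajectory and the map, given all observations and control inputs:

```latex
p\!\left(x_{1:t},\, m \mid z_{1:t},\, u_{1:t}\right)
```

where x_1:t denotes the robot trajectory, m the map, z_1:t the sensor observations and u_1:t the control or odometry inputs.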
3.2 Visual Odometry: Monocular vs. Binocular
Visual-Inertial Odometry (VIO) is an algorithm that combines camera and IMU data to implement SLAM or state estimation. The advantage of binocular VO is that it can accurately estimate the motion trajectory and is able to recover exact physical units. In monocular VO, only the direction of motion and its magnitude in relative units can be recovered, while binocular VO is able to map these relative units to a metric scale representing real lengths and sizes. However, for objects that are far away, the binocular system degenerates into a monocular system. Monocular visual odometry has gained increasing attention in recent years because of its lower price and ease of automatic calibration. However, the data processing is more challenging.
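To make the scale ambiguity mentioned above explicit (a standard relation, not specific to this work): a monocular pipeline recovers the translation only up to an unknown positive scale factor, which a stereo baseline or the fusion of IMU measurements can resolve,

```latex
t_{\text{metric}} = s\,\hat{t}, \qquad s > 0 \ \text{unknown from a single camera.}
```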
3.3 VINS-Mono
VINS-Mono adopts a non-linear optimization-based sliding window estimator to predict a robot's position and orientation. The approach begins with measurement preprocessing, which collects sensor data to detect features and to pre-integrate IMU measurements. Through the initialization procedure, all values required for bootstrapping the subsequent nonlinear optimization-based VIO are calculated. The VIO with relocalization modules tightly fuses the pre-integrated IMU measurements, feature observations, and re-detected features from a loop closure scheme. Finally, the pose graph module performs global optimization to reduce drift.
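In schematic form (following the sliding-window formulation of the VINS-Mono paper [14]; the notation here is indicative rather than exhaustive), the estimator solves a nonlinear least-squares problem over the window states combining a marginalization prior, IMU pre-integration residuals and visual reprojection residuals:

```latex
\min_{\mathcal{X}}\;
\left\| r_p - H_p\,\mathcal{X} \right\|^2
\;+\; \sum_{k\in\mathcal{B}} \left\| r_{\mathcal{B}}\!\left(\hat{z}^{\,b_k}_{b_{k+1}},\, \mathcal{X}\right) \right\|^2_{P^{b_k}_{b_{k+1}}}
\;+\; \sum_{(l,j)\in\mathcal{C}} \rho\!\left( \left\| r_{\mathcal{C}}\!\left(\hat{z}^{\,c_j}_{l},\, \mathcal{X}\right) \right\|^2_{P^{c_j}_{l}} \right)
```

where the first term is the marginalization prior, the second sums the IMU pre-integration residuals between consecutive frames in the window (weighted by their covariances), and the third sums the visual reprojection residuals of the tracked features under a robust Huber loss.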
4 EXPERIMENT AND RESULTS
We have utilized an open-source dataset, the EuRoC dataset, in order to evaluate how the performance of the VINS-Mono algorithm varies when the image quality is reduced [15]. As an initial approach, we have utilized the standard JPEG compression algorithm, since it provides a wide range of possible compression
rates through its image quality parameter. For instance, a sample from the EuRoC dataset that has a size of 362 kB in PNG format is reduced to 6.7 kB at the 1% JPEG quality setting and to 226 kB at the 100% setting.

Figure 4: VINS-Mono error in the easier dataset.

Figure 5: VINS-Mono error in the harder dataset.

Figure 6: Execution times: feature extraction (red) and pose estimation (blue).
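The quality sweep described above can be reproduced, for illustration, with OpenCV; the sketch below is our own and assumes a locally available EuRoC image (the file name is a placeholder).

```python
# Hypothetical quality sweep: re-encode one EuRoC frame at several JPEG
# quality settings and report the resulting payload sizes.
import cv2

QUALITIES = [1, 5, 10, 30, 50, 80, 100]


def jpeg_sizes(image_path: str) -> dict:
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        raise FileNotFoundError(image_path)
    sizes = {}
    for q in QUALITIES:
        ok, buf = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, q])
        if ok:
            sizes[q] = len(buf)  # size in bytes after compression
    return sizes


if __name__ == "__main__":
    # Placeholder path to one cam0 image of the EuRoC MAV dataset.
    for q, size in jpeg_sizes("mav0/cam0/data/1403636579763555584.png").items():
        print(f"quality {q:3d}%: {size / 1024:.1f} kB")
```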
The EuRoC dataset is a binocular + IMU dataset for indoor micro aerial vehicles (MAVs). It contains two scenes: a machine hall and a normal room. The dataset uses the flying robot AscTec Firefly as the data acquisition platform, equipped with an MT9V034 binocular camera and an ADIS16448 IMU. The camera frame rate is 20 Hz, and the IMU frequency is 200 Hz. The authors utilize a Vicon motion capture system and a Leica Nova MS50 as ground truth for benchmarking odometry algorithms. Due to the stable and reliable data provided, it has become a popular dataset [26, 27].

Figure 7: Average round-trip latency with a UDP server.

Table 1: Execution time of the different processes and network latency for a subset of image qualities.

                         Image Quality
                         1%       5%       10%      50%      100%
Image size (kB)          5.7      7.9      11.2     28.3     202.2
Network latency (ms)     1.99     2.11     1.35     77.81    545.71
Feature extraction (ms)  11.603   10.669   9.786    9.124    8.890
Pose estimation (ms)     23.074   37.261   44.792   58.553   61.105
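As a back-of-the-envelope reading of Table 1 (our own arithmetic on the reported means, not a figure stated elsewhere in the paper), the per-frame budget at the gateway, i.e., network latency plus feature extraction plus pose estimation, is roughly

```latex
t_{10\%}  \approx 1.35 + 9.79 + 44.79 \approx 55.9\ \text{ms} \;\Rightarrow\; \approx 18\ \text{frames/s},
\qquad
t_{100\%} \approx 545.71 + 8.89 + 61.11 \approx 615.7\ \text{ms} \;\Rightarrow\; \approx 1.6\ \text{frames/s},
```

which illustrates how strongly the quality setting drives the number of frames, and hence robots, that a single gateway can serve.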
Our experiments have focused on the analysis of two parameters: the latency of the network and the accuracy of the odometry
algorithm. We have also analyzed the processing time required
for the feature extraction process and the pose estimation process
for each of the image compression ratios. We have utilized two
subsets of the EuRoC dataset which are considered easy and hard
for visual-inertial odometry algorithms, depending on how many features can be extracted from the images. Samples from these two subsets are shown in Figure 1, where it can be seen that the image corresponding to the harder set is much darker, so fewer features can be detected. Even in the easier dataset, reducing the image quality to 5% increases the error at the end of the sample path by around 25% (0.8 m with 5% quality versus 0.65 m with 100% quality), and 1% quality yields a final error of around 1.1 m. In the harder dataset, however, only image qualities of 5% and above allow for a convergent path: with 1% quality the algorithm is unable to calibrate the camera and IMU, and the path diverges from the start. The errors accumulated with the VINS-Mono odometry algorithm over the easier and harder paths are shown in Figures 4 and 5, respectively. These indicate that the image quality can be reduced to as little as 10% without compromising the performance, while 50% quality gives the best performance in the harder environment. In the easier case, a 10% quality image matches the best performance with minimal odometry error while achieving a two-order-of-magnitude reduction in network latency with respect to broadcasting a raw image.
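For reference, the roughly 25% figure quoted above is the relative increase of the final error on the easier path:

```latex
\frac{0.80\ \text{m} - 0.65\ \text{m}}{0.65\ \text{m}} \approx 0.23 \approx 25\%.
```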
The two main processes into which an odometry algorithm can be divided are feature extraction and pose estimation. The distributions of the execution times of these processes for a range of image qualities (1% to 100%) are shown in the boxplot in Figure 6, and have been obtained on a 64-bit Intel Core i7-4710MQ CPU with 8 cores at 2.50 GHz. Each distribution has been calculated over 1000 images to which the different compression rates have been applied. While the execution time of the feature extraction process remains constant with increasing image quality, the pose estimation time increases as more features are found in higher-quality images. The network latency adds an overhead that varies from under 1% (image qualities under 10%) to over 700% (100% image quality) when compared to the data processing
time (feature extraction and pose estimation). The distribution of
round-trip latency for a subset of image qualities is shown in the
boxplot in Figure 7, where samples of 100 images have been utilized
to calculate each of the distributions.
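The upper end of that overhead range can be recovered directly from the means in Table 1 (our own arithmetic):

```latex
\frac{545.71\ \text{ms}}{8.890\ \text{ms} + 61.105\ \text{ms}} \approx 7.8
\quad\Rightarrow\quad
\text{network overhead} \approx 780\% \ \text{of the processing time at the 100\% quality setting.}
```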
5 CONCLUSION AND FUTURE WORK
We have evaluated the impact of image compression and quality on a visual-inertial odometry algorithm. Our results show that image quality can be reduced up to a certain threshold, which depends on the ability of the algorithm to extract features from the environment, without a significant impact on odometry accuracy. This opens the door to the utilization of an efficient computational offloading scheme with edge computing. In turn, this enables the simplification of the hardware on board the robots, a consequent reduction of power consumption, and the ability to utilize a single edge gateway to offload the odometry computation of multiple robots. The latency of the network adds an overhead of between 0.3% and 780% with respect to the processing time. In both datasets considered, a low accuracy loss could be achieved by reducing the image quality to as little as 10%, where the network overhead is below 1%. In consequence, the offloading scheme does not induce significant delays in the odometry and has the potential to even improve the performance in terms of frame rate with more powerful edge gateways. The proposed edge computing offloading scheme can bring multiple benefits to large multi-robot systems, from cost reduction and energy efficiency to increased performance and reliability.
In future work, we will evaluate a wider range of odometry algorithms and image compression methods. We will also compare the
execution time of the odometry algorithms on typical on-board computers utilized in aerial robots with multiple instances of the same algorithm running in parallel, giving support to multiple robots.
REFERENCES
[1] G. Bresson et al. Simultaneous localization and mapping: A survey of current trends in autonomous driving. IEEE Trans. on Intelligent Vehicles, 2017.
[2] L. Qingqing et al. Multi Sensor Fusion for Navigation and Mapping in Autonomous Vehicles: Accurate Localization in Urban Environments. In The 9th IEEE CIS-RAM, 2019.
[3] H. Durrant-Whyte et al. Simultaneous localization and mapping: part I. IEEE Robotics & Automation Magazine, 13(2):99-110, 2006.
[4] T. Bailey et al. Simultaneous localization and mapping (SLAM): part II. IEEE Robotics & Automation Magazine, 13(3):108-117, 2006.
[5] J. Fuentes-Pacheco et al. Visual simultaneous localization and mapping: a survey. Artificial Intelligence Review, 43(1):55-81, 2015.
[6] W. Hess et al. Real-time loop closure in 2D lidar SLAM. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pages 1271-1278. IEEE, 2016.
[7] Sherif A. S. Mohamed et al. A survey on odometry for autonomous navigation systems. IEEE Access, 2019.
[8] J. Zhang et al. Visual-lidar odometry and mapping: Low-drift, robust, and fast. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pages 2174-2181. IEEE, 2015.
[9] T. Oskiper et al. CamSLAM: Vision aided inertial tracking and mapping framework for large scale AR applications. In IEEE ISMAR, pages 216-217. IEEE, 2017.
[10] S. Cortés et al. ADVIO: An authentic dataset for visual-inertial odometry. In Proceedings of the European Conference on Computer Vision (ECCV), pages 419-434, 2018.
[11] A. Hardt-Stremayr et al. Towards fully dense direct filter-based monocular visual-inertial odometry. In IEEE ICRA. IEEE, 2019.
[12] K. Sun et al. Robust stereo visual inertial odometry for fast autonomous flight. IEEE Robotics and Automation Letters, 3(2):965-972, 2018.
[13] M. Bloesch et al. Robust visual inertial odometry using a direct EKF-based approach. In IEEE/RSJ IROS. IEEE, 2015.
[14] T. Qin et al. VINS-Mono: A robust and versatile monocular visual-inertial state estimator. IEEE Transactions on Robotics, 34(4):1004-1020, 2018.
[15] M. Burri et al. The EuRoC micro aerial vehicle datasets. The International Journal of Robotics Research, 35(10):1157-1163, 2016.
[16] P. Yun et al. Towards a cloud robotics platform for distributed visual SLAM. In Computer Vision Systems. Springer, 2017.
[17] S. Dey et al. Robotic SLAM: A review from fog computing and mobile edge computing perspective. In MOBIQUITOUS. ACM, 2016.
[18] L. Qingqing et al. Edge Computing for Mobile Robots: Multi-Robot Feature-Based Lidar Odometry with FPGAs. In The 12th ICMU, 2019.
[19] V. K. Sarker et al. Offloading SLAM for indoor mobile robots with edge-fog-cloud computing. In 1st ICASERT, 2019.
[20] A. Metwaly et al. Edge computing with embedded AI: Thermal image analysis for occupancy estimation in intelligent buildings. In INTelligent Embedded Systems Architectures and Applications, INTESA@ESWEEK 2019. ACM, 2019.
[21] T. N. Gia et al. Artificial Intelligence at the Edge in the Blockchain of Things. In 8th EAI MobiHealth, 2019.
[22] A. Nawaz et al. Edge AI and Blockchain for Privacy-Critical and Data-Sensitive Applications. In The 12th ICMU, 2019.
[23] T. N. Gia et al. Edge AI in Smart Farming IoT: CNNs at the Edge and Fog Computing with LoRa. In 2019 IEEE AFRICON, 2019.
[24] J. Peña Queralta et al. Edge-AI in LoRa based healthcare monitoring: A case study on fall detection system with LSTM Recurrent Neural Networks. In 2019 42nd International Conference on Telecommunications and Signal Processing (TSP), 2019.
[25] L. Qingqing et al. Visual Odometry Offloading in Internet of Vehicles with Compression at the Edge of the Network. In The 12th ICMU, 2019.
[26] C. Cadena et al. Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Trans. on Robotics, 2016.
[27] R. Mur-Artal et al. Visual-inertial monocular SLAM with map reuse. IEEE Robotics and Automation Letters, 2(2):796-803, 2017.