Towards Adapting Autonomous Vehicle Technology for the
Improvement of Personal Mobility Devices
Maleen Jayasuriya1, Janindu Arukgoda1, Ravindra Ranasinghe1, and Gamini Dissanayake1
Abstract— Personal Mobility Devices (PMDs) incorporating autonomy have great potential to become an essential building block of the smart transportation infrastructure of the future. However, autonomous vehicle technologies currently employ large and expensive sensors / computers and resource intensive algorithms, which are not suitable for low cost, small form factor PMDs. In this paper, a mobility scooter is retrofitted with a low cost sensing and computing package with the aim of achieving autonomous driving capability. As a first step, a novel, real time, low cost and resource efficient vision only localisation framework based on Convolutional Neural Network (CNN) based feature extraction and Extended Kalman Filter (EKF) based state estimation is presented. Real world experiments in a suburban environment demonstrate the effectiveness of the proposed localisation framework.
I. INTRODUCTION
The term “Personal Mobility Devices” (PMDs) refers
to compact electric vehicles that enable individual human
transportation [1], with examples ranging from mobility scooters to Segways and motorised wheelchairs. These devices typically come in small form factors and function at comparatively low speeds (under 15 km/h) in pedestrian spaces, such as pavements and footpaths. The popularity
of such devices has risen dramatically in recent times and
is predicted to skyrocket within the next decade [2]. This
boom in the PMD market is strongly correlated with rapid
urbanisation, developments in motor and battery technology,
interest in environmentally friendly transport, and a growing
ageing population in developed nations.
This growth in PMD usage has been met with rising safety concerns and calls for stringent regulation in many nations [3]–[5], and has been viewed as an invasion of spaces traditionally reserved for pedestrians. However, PMDs offer many benefits, especially to individuals with mobility restrictions. They also show great promise as an essential building block of a more eco-friendly and smart transportation infrastructure, especially in terms of first / last-mile transportation [6], [7]. Within this context, self-driving vehicle technology applied to the reasonably low tech space of PMDs has tremendous potential to balance the trade-off between their legitimate safety concerns and numerous advantages. However, despite the interest in self-driving vehicles, research into autonomously driven PMDs operating in outdoor pedestrian environments has been comparatively rare.
1Maleen Jayasuriya, Janindu Arukgoda, Ravindra Ranasinghe and Gamini Dissanayake are with the Faculty of Engineering and Information Technology, University of Technology Sydney, Australia.
Autonomous vehicles in general consist of many complex subsystems, of which five primary functional systems essential for autonomy can be identified [8] (a minimal code sketch of this decomposition follows the list):
1) Localisation: Responsible for identifying the location
of the vehicle on a map.
2) Perception: Obtains information from the environment
and identifies elements required for localisation and
obstacle avoidance.
3) Motion planning and decision making: Uses inputs
from the localisation and perception systems to decide
the optimum trajectories the vehicle should follow.
4) Control: Converts the decisions taken by the planning
system, to the vehicle’s control commands (e.g. steer-
ing, accelerating and braking).
5) System Management: Manages all of the above sys-
tems and provides the Human-Machine Interface
(HMI).
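One possible way to picture this decomposition in software is sketched below; the class and method names are purely illustrative assumptions and do not describe the platform's actual implementation.

```python
# Illustrative decomposition only; names and interfaces are assumptions,
# not the actual software architecture of the platform described here.

class Localisation:
    def pose(self, observations, odometry):
        """Estimate the vehicle pose on the map."""

class Perception:
    def observe(self, sensor_data):
        """Extract landmarks, ground boundaries and obstacles from raw sensor data."""

class MotionPlanner:
    def plan(self, pose, observations, goal):
        """Decide the optimum trajectory the vehicle should follow."""

class Controller:
    def command(self, trajectory, pose):
        """Convert the planned trajectory into steering / accelerating / braking."""

class SystemManager:
    """Supervises the subsystems above and provides the Human-Machine Interface."""
    def __init__(self, localisation, perception, planner, controller):
        self.localisation = localisation
        self.perception = perception
        self.planner = planner
        self.controller = controller
```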
The broad goal of our research has been to adapt autonomous vehicle technology, which generally targets large-scale, high-speed vehicles, to the unique challenges associated with the resource constrained domain of PMDs. To this
end, our initial focus has been to address the localisation
and perception problems as they are fundamental building
blocks of autonomy, essential to the functioning of all other
subsystems. Most large-scale autonomous vehicles rely on
a combination of 2D / 3D LIDARs and vision sensors in
conjunction with high definition 3D point cloud maps for
this purpose, making localisation a resource intensive and
costly task. To address this issue, we propose a low cost,
computationally efficient vision only localisation framework
that utilises a sparse 2D map representation, which suits low
cost resource constrained applications such as PMDs [9],
[10]. It does so by fusing Convolutional Neural Network (CNN) based detections of environmental landmarks (such as trees, lampposts, street signs and parking meters) and ground surface boundaries (such as curbs, pavement edges and manhole covers) with odometry information in an Extended Kalman Filter (EKF) based back-end. Localisation is carried
out on a 2D map consisting of the locations of the landmarks
and a Vector Distance Transform (VDT) representation of the
ground surface boundaries.
Testing of all concepts and algorithms has been carried out on a modified mobility scooter platform (see figure 1),
retrofitted with a low cost computing and sensor package.
The purpose of this paper is to provide an overview of this
system and an insight into the unique challenges associated
with adapting autonomous technology to PMDs.
Fig. 1: Autonomous Mobility Scooter Platform
The remainder of this paper is structured as follows:
Section II discusses related work. Section III provides a
detailed system-level overview of the mobility scooter platform.
Section IV provides a summary of the developments made in
the localisation and perception fronts. Section V details the
next steps and future work, with concluding remarks offered
in section VI.
II. RELATED WORK
Navigating in outdoor urban settings remains a challenging task due to the poor precision and unreliability of Global Navigation Satellite Systems (GNSS), compelling most autonomous vehicles to rely on combinations of additional sensors such as 2D / 3D LIDARs and vision in conjunction with point cloud based high definition 3D maps [11]. This is important in the context of fast moving vehicles in order to provide the robustness and accuracy required for safe transportation [8], [12], even though these sensors are expensive and constructing / maintaining 3D maps demands heavy resources in terms of data collection, storage, and processing [13]. However, we posit that given the operating parameters of a typical PMD, such as low speeds (less than 15 km/h) and constrained pedestrian environments such as footpaths and pavements, this task can be accurately and
safely carried out with a low cost sensing and computing
package.
The unique challenges associated with autonomy in the space of PMDs have been largely under-examined. However, noteworthy research in this space can be found in the work carried out by the Massachusetts Institute of Technology (MIT) in conjunction with the National University of Singapore (NUS), where an autonomous mobility scooter was developed to investigate the concept of “Mobility on Demand (MoD)” services such as car sharing or on demand taxi services [14]. This is an extension of the work on the design of an autonomous golf cart created by the same group [15]. Notable as this work is, its primary sensing unit consists of expensive LIDARs, which are difficult to justify for low-cost, resource constrained PMDs.
For the specific application of autonomous PMDs, vision
sensors offer a superior alternative to laser based sensors due
to their low cost, low power and small form factor. Many
strides in vision only navigation have been made in the last
decade, with noteworthy state of the art techniques such as
ORB-SLAM [16], [17], VINS [18], Kimera [19], and RTAB-Map [20]. These techniques generally employ
low level handcrafted feature descriptors such as FAST,
SIFT, SURF, BRIEF and ORB, to track features required for
location estimation. However, these features are neither invariant to illumination changes nor persistent in highly dynamic environments such as urban outdoor settings, which means
building a map that can be reused over different time periods
becomes a difficult task. Thus all of these techniques rely
on Simultaneous Localisation and Mapping (SLAM) where
both the map of the environment and pose of the robot are
estimated simultaneously. SLAM becomes a computationally
expensive task as the scale of the environment increases
and is also prone to inevitable drift in location estimates.
This drift is often addressed through “loop closure” when
traversing regions that have already been visited. However,
this cannot always be guaranteed and attempting to always
close loops is not the most efficient mode of transportation
between two given points.
In contrast, given a known environment, localising with a pre-built map is preferable to SLAM. The challenge
lies in building a map which contains features that are per-
sistent over long time frames. This is fairly straightforward
in laser based localisation systems as persistent geometrical
structures can be fitted to point cloud data and used in a
pre-built map. Examples include fitting cylinders to pole like structures such as trees and lampposts [21] and planes to building facades and walls [22].
This is less straightforward in the case of vision data
and handcrafted feature detectors and descriptors. How-
ever, recent developments in Convolutional Neural Networks
(CNNs) provide opportunities to detect and recognise high
level semantic information of an environment through se-
mantic segmentation [23]–[25] and object detection [26]–
[29].
Utilising these developments as the primary perception front-end of our system, our proposed localisation framework is capable of carrying out vision only localisation on a pre-built sparse 2D map of an environment, in a manner that is resource efficient and drift free. Further details of this framework are provided in Section IV.
III. PLATFORM OVERVIEW
This section provides a system-level overview of the proposed autonomous mobility scooter platform and its low cost sensor
package (see figure 2).
A. Mobility Scooter
Selecting the base mobility scooter platform was an important decision, since it would be the foundation on which all other systems would be built. One major factor considered was
selecting a reliable platform that is representative of a typical
scooter used by the general public, as the autonomous system
to be developed should be easily transferable and adaptable
to any platform.
Fig. 2: Hardware overview of the retrofitted mobility scooter
After many consultations with Independent Living Specialists (ILS) [30], Australia’s largest provider of mobility scooters, it was decided that the Pride Pathrider 10 would make an ideal research platform. The physical specifications of the mobility scooter are provided in Table I.
TABLE I: Pathrider 10 Physical Specifications
Dimensions (L x W x H): 1.9 x 0.56 x 1.65 m
Weight: 105 kg
Maximum speed: 8.85 km/h
Turning clearance circle: 1.575 m (turning radius)
B. Vision Sensors
The primary vision based sensing unit comprises of three
Intel®Realsense™D435 cameras. These affordable sensors
are capable of providing depth information obtained through
a stereo infrared camera pair augmented with a structured
light projector.
Two of these sensors are mounted at 45° angles to the
heading of the scooter and are used to detect environmental
landmark features such as lampposts, street signs, trees and
parking meters. No depth information is used (as discussed in
section IV) since only the bearings to these landmark based
observations are required. The angled mounting ensures that
a larger field of view (FoV) is dedicated to the feature rich
sides of the path, and that greater parallax to these features is observed. In later iterations, we have been investigating the
possibility of replacing these two cameras with a consumer
grade omnidirectional camera which provides a 360° FoV.
For this purpose, a Ricoh Theta S omnidirectional camera is
also mounted on the platform.
The third Realsense sensor is mounted in the rear, facing
the ground in order to obtain ground surface boundary infor-
mation such as curbs, pavement edges and manhole covers.
Here, depth information is exploited, as the operational range to the surface boundaries provides less noisy depth data compared with the typical ranges associated with landmark features.
C. Odometry
The system is capable of obtaining odometry data using the vision sensors outlined in the previous section. However, the platform also contains two rotary wheel encoders and an MPU-9250 IMU. The comparative advantages of vision based odometry versus wheel encoder / IMU based odometry are currently being investigated.
D. Computing
An NVIDIA® Jetson AGX Xavier embedded system is responsible for the primary perception based computational tasks. The Jetson Xavier possesses a 512-core Volta GPU with Tensor Cores, an 8-core ARM v8.2 64-bit CPU and 16 GB of
RAM. Its high power efficiency, GPU processing capabilities,
and memory make it suitable for carrying out the CNN
based perceptual tasks while being affordable and resource
efficient.
An Intel® UP2 board handles the back-end processing
associated with the EKF. The affordable price point and small
form factor of both these computing units make them ideal
candidates for a small scale PMD such as a mobility scooter.
The system uses the Robot Operating System (ROS Melodic) for operation, interfacing and networking.
E. Sensors for ground truth readings
Two sensors are utilised to obtain ground truth read-
ings during experiments. The first is a Piksi Multi Real
Time Kinematic (RTK) GPS unit. RTK positioning enhances
the precision of position data derived from satellite-based
positioning systems (Global Navigation Satellite Systems
/ GNSS) such as GPS, GLONASS, Galileo, and BeiDou.
The Piksi Multi is a multi-band, multi-constellation RTK
GNSS receiver. RTK GPS readings are available at one of three levels of precision, in descending order: RTK fixed, RTK float, and SPP (Single Point Positioning, which acts as regular GPS). Centimetre level accuracy is only available in RTK fixed readings. Continuous RTK fixed GPS readings are
generally difficult to observe in large urban environments due
to building and tree cover.
The second sensor is a Hokuyo UTM-30LX 2D laser.
Both these sensors are used only to obtain ground truth for
evaluation and map building.
IV. LOCALISATION OVERVIEW
As a fundamental step towards a fully autonomous PMD
platform, a novel low cost, resource efficient vision based
localisation system was developed. This system consists of
a CNN based visual processing front-end and an Extended
Kalman Filter based back-end that fuses observations with
odometry information in order to provide a pose estimate
and its associated uncertainty. The following subsections
summarise this process. In-depth details can be found in our
previous work [9], [10], [31].
A. Perceptual Front-End
The front-end of the proposed framework is responsible
for the acquisition and pre-processing of sensory informa-
tion. The first is odometry data capturing the linear and angular velocities of the PMD. These can be acquired using the wheel encoders, the IMU, Visual Odometry techniques, or a combination of these approaches.
In order to correct the inherent drift associated with the
odometry information, two CNN based high level visual
observations are made.
The first category of observation involves the detection
of reasonably persistent environmental landmarks consisting
of pole like structures such as trees, lampposts, parking
meters and street signs. This detection is carried out using the
YOLOv2 [28] object detection framework. This network was
trained through transfer learning utilising a custom database
of images curated while driving the platform through the
streets of Sydney. RGB images acquired from the side facing
Realsense cameras are then fed to this trained network
which outputs bounding boxes and labels when landmarks of
interest are observed. Bearings to these landmarks are then
extracted, and together with the corresponding semantic label
form the first category of CNN based observations used by
the framework. See figure 3a.
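To make the bearing extraction concrete, the following minimal sketch (not the authors' implementation) computes the bearing to a detected landmark from the horizontal centre of its bounding box using a simple pinhole model; the intrinsic parameters and the 45° mounting offset are illustrative assumptions only.

```python
import math

# Illustrative pinhole intrinsics (assumed values, not the calibrated D435 parameters)
FX = 615.0                        # focal length in pixels (horizontal)
CX = 320.0                        # principal point x (pixels)
CAMERA_YAW = math.radians(45.0)   # assumed mounting angle w.r.t. the scooter heading

def bearing_from_bbox(bbox):
    """Bearing (rad) to a landmark in the vehicle frame from a detection box.

    bbox: (x_min, y_min, x_max, y_max) in pixels.
    """
    u = 0.5 * (bbox[0] + bbox[2])          # horizontal centre of the box
    bearing_cam = math.atan2(CX - u, FX)   # angle in the camera frame (left positive)
    return bearing_cam + CAMERA_YAW        # rotate into the vehicle frame

# Example: a lamppost detected left of the image centre
print(bearing_from_bbox((150.0, 80.0, 210.0, 400.0)))
```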
Fig. 3: (a) YOLO Output Image (b) Ground Surface Image and (c) the corresponding HED Output Image
The second category of visual observations comprises ground surface boundaries corresponding to artefacts such as pavement edges, curbs and manhole covers. These
features are detected using a CNN based edge detection
framework known as Holistically Nested Edge Detection
(HED) [32]. An RGB image of the ground surface is fed
to this network which outputs a binary image consisting of
the edge information observed on the ground surface. The
pixels corresponding to the edges are then projected to the
3D world relative to the PMD’s reference frame, using the
depth data acquired by the Realsense. This forms the second
category of CNN based observations used by the framework.
See figure 3c.
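The projection of edge pixels into 3D can be sketched as a standard pinhole back-projection using the aligned depth image; the intrinsics below are assumed placeholder values, and the fixed camera-to-vehicle transform that would follow is omitted.

```python
import numpy as np

# Assumed intrinsics of the downward-facing camera (illustrative values only)
FX, FY = 615.0, 615.0
CX, CY = 320.0, 240.0

def project_edge_pixels(edge_mask, depth_m):
    """Back-project HED edge pixels into 3D points in the camera frame.

    edge_mask: HxW boolean array from the edge detector.
    depth_m:   HxW array of depths in metres, aligned with the RGB image.
    Returns an Nx3 array of points; a fixed camera-to-vehicle transform
    (not shown) would then map these into the PMD's reference frame.
    """
    v, u = np.nonzero(edge_mask)      # pixel coordinates of edge points
    z = depth_m[v, u]
    valid = z > 0                     # discard pixels with no depth reading
    u, v, z = u[valid], v[valid], z[valid]
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    return np.stack([x, y, z], axis=1)
```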
B. EKF based Back-End
The back-end of the framework which consists of an
Extended Kalman Filter utilises the odometry data to carry
out a prediction of the pose of the PMD using an odometry
motion model. Then updates are carried out using two
observation models that correspond to the two types of visual
observations described in section IV-A.
In the case of the landmark based observations, a straight-
forward bearing only observation model is utilised based on
a pre-built 2D map of the locations of these landmarks. For
the ground surface observations, a novel Vector Distance
Transform (VDT) based representation of the ground sur-
face boundaries provides a map that implicitly captures the
geometry of the edges without the need for curve/line fitting
or explicit data association. The projected edge data is related
to this map using a range bearing observation model.
The EKF asynchronously corrects the odometry-based prediction as the two types of observations described above become available, providing a 3 Degrees of Freedom (3-DOF) pose estimate of the PMD platform and its associated uncertainty.
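For concreteness, the prediction and landmark update steps can be summarised as below (our notation, a simplified sketch of the models detailed in [9], [10]). The ground surface update similarly relates each projected edge point to the boundary map through the VDT, which, roughly speaking, stores the offset to the nearest boundary point and thereby avoids explicit data association.

```latex
% Simplified sketch of the filter models (our notation; see [9], [10] for the full derivation).
% State: planar pose (x_k, y_k, \theta_k); odometry inputs (v_k, \omega_k) over a step \Delta t.
\begin{align}
  \begin{bmatrix} \hat{x}_{k+1} \\ \hat{y}_{k+1} \\ \hat{\theta}_{k+1} \end{bmatrix}
  &=
  \begin{bmatrix}
    x_k + v_k \Delta t \cos\theta_k \\
    y_k + v_k \Delta t \sin\theta_k \\
    \theta_k + \omega_k \Delta t
  \end{bmatrix}
  && \text{(prediction with odometry)} \\
  z_k^{(i)} &= \operatorname{atan2}\!\left(y_i - y_k,\; x_i - x_k\right) - \theta_k + w_k,
  \quad w_k \sim \mathcal{N}(0, \sigma_b^2)
  && \text{(bearing to landmark $i$ at $(x_i, y_i)$)}
\end{align}
```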
C. Active vision
Subsequent developments to the front-end of the system
were made to improve the field of view associated with the
landmark observations. This was a critical step forward as
these landmarks are generally sparse and a larger FoV increases the possibility of acquiring more observation data. As outlined in section III, an increase in the field of view associated with the landmark observations was achieved by using an affordable consumer grade omnidirectional camera.
However, directly using the raw distorted dual fisheye images
provided by the camera proved to reduce the accuracy of the
detection drastically. Thus, the dual fisheye images were re-
projected onto multiple perspective images. However, this
introduced a processing bottleneck, as only a single image can be processed by the network while maintaining the required accuracy and real-time detection performance. As a solution, the dual fisheye images were first
projected onto a unit sphere with a virtual perspective camera
projection at the centre, capable of panning around the
sphere. An active vision strategy was then adopted utilising
the trace of the covariance matrix as an information metric
to determine the best viewpoint within this sphere that
reduces the overall covariance of the pose estimate. Once
this viewpoint is decided, the virtual perspective camera pans
to this viewpoint and creates a single perspective image
which is then fed to the YOLO framework (see figure 4).
Results outlined in [31] indicate that this approach leverages
the advantage of having a higher field of view without
compromising on performance.
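The viewpoint selection logic can be sketched as follows; select_viewpoint and its expected_update_for helper are hypothetical names used purely for illustration and are not the implementation reported in [31]. The idea is to evaluate, for each candidate pan angle, the trace of the pose covariance that an EKF update with the landmarks expected in that view would yield, and to pan the virtual camera to the angle giving the smallest trace.

```python
import numpy as np

def select_viewpoint(P, candidate_pans, expected_update_for):
    """Pick the virtual-camera pan angle whose expected observations
    most reduce the trace of the 3x3 pose covariance P.

    expected_update_for(pan) should return (H, R): the stacked observation
    Jacobian and noise covariance for the landmark bearings expected to be
    visible from that pan angle (hypothetical helper supplied by the caller),
    or (None, None) if no landmarks are expected in that view.
    """
    best_pan, best_trace = None, np.inf
    for pan in candidate_pans:
        H, R = expected_update_for(pan)
        if H is None:                      # no expected observations: covariance unchanged
            trace = np.trace(P)
        else:
            S = H @ P @ H.T + R            # innovation covariance
            K = P @ H.T @ np.linalg.inv(S) # Kalman gain
            trace = np.trace((np.eye(P.shape[0]) - K @ H) @ P)
        if trace < best_trace:
            best_pan, best_trace = pan, trace
    return best_pan
```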
D. Performance
Experiments were carried out in an approximately 8000 m² area in Glebe, Sydney, Australia (Figure 5), which is representative of a typical suburban neighbourhood, in order to demonstrate the performance of the localisation framework. As discussed in section III, continuous RTK fixed readings for ground truth measurements are difficult to come by. Thus, the accuracy of the system is calculated at sporadic intervals when fixed RTK GPS readings are available. The overall reported root mean square error (RMSE) of the trajectory based on these measurements is 0.22 m, which is sufficient for safe navigation.
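Here, RMSE is the standard root mean square position error over the N timestamps at which RTK fixed ground truth was available:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{k=1}^{N}
  \left\lVert \hat{p}_k - p_k^{\mathrm{RTK}} \right\rVert^{2}}
```

where \hat{p}_k is the estimated planar position and p_k^{RTK} the corresponding RTK fixed reading.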
Fig. 4: Projection Pipeline
Fig. 5: Localisation results at Glebe, Sydney
The environmental landmark map and ground surface map for this environment were constructed 6 months prior to the experiments depicted in figure 5. The maps were built from the vehicle poses obtained using the open-source implementation of the Real-Time Appearance-Based Mapping (RTAB-Map) RGB-D and LIDAR Graph-Based SLAM framework [20]. These poses were used to locate the environmental landmarks and stitch together the ground surface boundaries to form the required maps.
V. FUTURE WORK
Although the current localisation framework is promising,
we hope to improve on some of the possible failure cases.
The first possible failure case involves extended areas with very sparse or no landmarks and ground surface boundaries.
An example of such an area can be observed in section A-
H-C in figure 5. The use of Visual Odometry mitigates this
problem to a certain extent. We also hope to constrain the
estimator to known traversable areas along a path, which
we posit will improve the overall localisation estimate.
Furthermore, fusing cellular assisted GPS, which initial
investigations have shown to be superior to regular GPS in
urban settings, is another avenue being explored. Although
the landmark and ground surface boundaries are relatively
persistent in comparison to low level handcrafted features,
a less likely but possible failure case is a drastic change to
the infrastructure that may alter the landmarks and ground
surface boundaries in the environment. Thus, potential map
management strategies are also being investigated in order
to tackle this potential scenario.
Of the five subsystems outlined in the introduction, we next hope to implement and test control systems for the autonomous PMD. Great care will have to be taken in this aspect, as the perceived independence of the
driver is an important factor to consider, since mobility
scooters have important psychological therapeutic value for
those with mobility difficulties. Thus a shared control or
parallel autonomy paradigm is being investigated, where the
autonomous system acts more as a supervisory unit that
ensures safe operation instead of completely removing the
sense of independence and agency of the driver.
VI. CONCLUSION
Although there has been immense research and commercial interest in self-driving vehicle technology of late, its applicability in the domain of PMDs requires more attention. A level of autonomy can enhance both the functionality and safety of this diverse class of electric vehicles, which have value both as therapeutic assistive technologies and through their wider adoption within the urban transport infrastructure of tomorrow.
To this end, we have been rethinking and adapting au-
tonomous vehicle technology to the small form factor, re-
source constrained, low-cost PMD platform. As a first step towards this, an off-the-shelf mobility scooter platform was retrofitted with a low cost sensing and computing package. A novel, resource efficient vision only localisation system was proposed that can function over long time frames utilising only a sparse 2D map, which we hope can serve as the backbone for future developments.
Fig. 6: Autonomous Mobility Scooter Team
ACKNOWLEDGMENT
This work is supported by the Centre for Autonomous Systems at the University of Technology Sydney. Special thanks
to Mr. Nathanael Gandhi and Mr. Peter Morris for their
assistance with the presented work.
REFERENCES
[1] “Personal Mobility Device,” 2019. [Online]. Available: https://medical-dictionary.thefreedictionary.com/personal+mobility+device
[2] PR-Newswire, “Personal mobility devices market worldwide 2014-2023,” 2015. [Online]. Available: https://www.statista.com/statistics/485524/global-personal-mobility-devices-market-size/
[3] R. Mealey, “Brisbane Proposes 6kph Speed Limit for Mobility Scooters to Protect Pedestrians,” 2018. [Online]. Available: http://www.abc.net.au/news/2018-06-12/brisbane-city-council-proposes-mobility-scooter-speed-limit/9860676
[4] S. Chin, “Singapore Reins in Personal Mobility Devices,” 2018. [Online]. Available: https://theaseanpost.com/article/singapore-reins-personal-mobility-devices
[5] S. O’Kane, “2017 Will Be an Important Year for Personal Electric Vehicles of All Sizes,” 2017. [Online]. Available: https://www.theverge.com/2016/12/31/14134924/electric-skateboards-boosted-bikes-vehicles-hoverboards
[6] T. Birtchnell, G. Waitt, and T. Harada, “Don’t Ignore the Mobility Scooter. It May Just Be the Future of Transport,” The Conversation, 2017. [Online]. Available: https://theconversation.com/dont-ignore-the-mobility-scooter-it-may-just-be-the-future-of-transport-85170
[7] R. Dowling, J. D. Irwin, I. J. Faulks, and R. Howitt, “Use of Personal
Mobility Devices for First-And-Last Mile Travel: The Macquarie-
Ryde Trial,” in 2015 Australasian Road Safety Conference, 2015,
p. 13.
[8] S. Kuutti, S. Fallah, K. Katsaros, M. Dianati, F. Mccullough, and
A. Mouzakitis, “A Survey of the State-Of-The-Art Localization Tech-
niques and Their Potentials for Autonomous Vehicle Applications,”
IEEE Internet of Things Journal, vol. 5, no. 2, pp. 829–846, Apr.
2018.
[9] M. Jayasuriya, G. Dissanayake, R. Ranasinge, and N. Gandhi, “Lever-
aging Deep Learning Based Object Detection for Localising Au-
tonomous Personal Mobility Devices in Sparse Maps,” in 2019 IEEE
Intelligent Transportation Systems Conference (ITSC), Oct. 2019, pp.
4081–4086.
[10] M. Jayasuriya, J. Arukgoda, R. Ranasinge, and G. Dissanayake, “Localising PMDs through CNN Based Perception of Urban Streets,” in (Accepted) 2020 International Conference on Robotics and Automation (ICRA), 2020. [Online]. Available: https://www.researchgate.net/publication/339615610_Localising_PMDs_through_CNN_Based_Perception_of_Urban_Streets
[11] B. Templeton, “Many Different Approaches to Robocar Mapping,” 2017. [Online]. Available: http://robohub.org/many-different-approaches-to-robocar-mapping/
[12] F. Poggenhans, J. Pauls, J. Janosovits, S. Orf, M. Naumann, F. Kuhnt,
and M. Mayr, “lanelet2: A High-Definition Map Framework for the
Future of Automated Driving,” in 2018 21st International Conference
on Intelligent Transportation Systems (ITSC), Nov. 2018, pp. 1672–
1679.
[13] H. G. Seif and X. Hu, “Autonomous driving in the iCity: HD
maps as a key challenge of the automotive Industry,” Engineering,
vol. 2, no. 2, pp. 159–162, Jun. 2016. [Online]. Available:
http://www.sciencedirect.com/science/article/pii/S2095809916309432
[14] H. Andersen, Y. H. Eng, W. K. Leong, C. Zhang, H. X. Kong, S. Pendleton, M. H. Ang, and D. Rus, “Autonomous personal mobility scooter for multi-class mobility-on-demand service,” in 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC). IEEE, Nov. 2016, pp. 1753–1760.
[15] S. Pendleton, T. Uthaicharoenpong, Z. J. Chong, G. M. J. Fu, B. Qin, W. Liu, X. Shen, Z. Weng, C. Kamin, M. A. Ang, L. T. Kuwae, K. A. Marczuk, H. Andersen, M. Feng, G. Butron, Z. Z. Chong, J. Ang, E. Frazzoli, and D. Rus, “Autonomous golf cars for public trial of Mobility on Demand Service,” Sep. 2015.
[16] R. Mur-Artal, J. M. M. Montiel, and J. D. Tardos, “ORB-SLAM: A
Versatile and Accurate Monocular SLAM System,” IEEE Transactions
on Robotics, vol. 31, no. 5, pp. 1147–1163, Oct. 2015.
[17] R. Mur-Artal and J. D. Tardós, “ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras,” IEEE Transactions on Robotics, vol. 33, no. 5, pp. 1255–1262, Oct. 2017.
[18] T. Qin, P. Li, and S. Shen, “VINS-Mono: A Robust and Versatile
Monocular Visual-Inertial State Estimator,” IEEE Transactions on
Robotics, vol. 34, no. 4, pp. 1004–1020, Aug. 2018.
[19] A. Rosinol, M. Abate, Y. Chang, and L. Carlone, “Kimera: an
Open-Source Library for Real-Time Metric-Semantic Localization
and Mapping,” arXiv:1910.02490 [cs], Dec. 2019, arXiv: 1910.02490.
[Online]. Available: http://arxiv.org/abs/1910.02490
[20] M. Labbe and F. Michaud, “Rtab-Map as an Open-Source Lidar and
Visual Simultaneous Localization and Mapping Library for Large-
Scale and Long-Term Online Operation,” Journal of Field Robotics,
vol. 36, no. 2, pp. 416–446, Mar. 2019.
[21] M. Sefati, M. Daum, B. Sondermann, K. D. Kreiskother, and
A. Kampker, “Improving vehicle localization using semantic and
pole-like landmarks,” in 2017 IEEE Intelligent Vehicles Symposium
(IV). Los Angeles, CA, USA: IEEE, Jun. 2017, pp. 13–19. [Online].
Available: http://ieeexplore.ieee.org/document/7995692/
[22] J. Kummerle, M. Sons, F. Poggenhans, T. Kuhner, M. Lauer, and
C. Stiller, “Accurate and Efficient Self-Localization on Roads using
Basic Geometric Primitives,” in 2019 International Conference on
Robotics and Automation (ICRA). Montreal, QC, Canada: IEEE,
May 2019, pp. 5965–5971.
[23] K. He, G. Gkioxari, P. Dollár, and R. Girshick, “Mask R-CNN,” arXiv:1703.06870 [cs], Jan. 2018, arXiv: 1703.06870. [Online]. Available: http://arxiv.org/abs/1703.06870
[24] A. Milioto and C. Stachniss, “Bonnet: An Open-Source Training and
Deployment Framework for Semantic Segmentation in Robotics using
CNNs,” in 2019 International Conference on Robotics and Automation
(ICRA). Montreal, QC, Canada: IEEE, May 2019, pp. 7094–7100.
[Online]. Available: https://ieeexplore.ieee.org/document/8793510/
[25] D. Bolya, C. Zhou, F. Xiao, and Y. J. Lee, “YOLACT: Real-time
Instance Segmentation,” arXiv:1904.02689 [cs], Oct. 2019, arXiv:
1904.02689. [Online]. Available: http://arxiv.org/abs/1904.02689
[26] R. Girshick, “Fast R-CNN,” arXiv:1504.08083 [cs], Sep. 2015, arXiv:
1504.08083. [Online]. Available: http://arxiv.org/abs/1504.08083
[27] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,” arXiv:1506.02640 [cs], May 2016, arXiv: 1506.02640. [Online]. Available: http://arxiv.org/abs/1506.02640
[28] J. Redmon and A. Farhadi, “YOLO9000: Better, Faster, Stronger,” in
2017 IEEE Conference on Computer Vision and Pattern Recognition
(CVPR). Honolulu, HI: IEEE, Jul. 2017, pp. 6517–6525.
[29] J. Redmon and A. Farhadi, “YOLOv3: An Incremental Improvement,”
arXiv:1804.02767 [cs], Apr. 2018, arXiv: 1804.02767. [Online].
Available: http://arxiv.org/abs/1804.02767
[30] “Mobility Scooters, Lift Chairs, Aged Care Equipment Solutions.”
[Online]. Available: https://ilsau.com.au/
[31] M. Jayasuriya, R. Ranasinge, and G. Dissanayake, “Active Perception for Outdoor Localisation with an Omnidirectional Camera,” in (Accepted) 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020. [Online]. Available: https://www.researchgate.net/publication/343788282_Active_Perception_for_Outdoor_Localisation_with_an_Omnidirectional_Camera
[32] S. Xie and Z. Tu, “Holistically-Nested Edge Detection,” in 2015 IEEE
International Conference on Computer Vision (ICCV). Santiago,
Chile: IEEE, Dec. 2015, pp. 1395–1403.