ORIGINAL ARTICLE
YOLO-SLAM: A semantic SLAM system towards dynamic environment
with geometric constraint
Wenxin Wu¹ · Liang Guo¹ · Hongli Gao¹ · Zhichao You¹ · Yuekai Liu¹ · Zhiqiang Chen¹

Received: 25 February 2021 / Accepted: 15 November 2021 / Published online: 8 January 2022
© The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2021
Abstract
Simultaneous localization and mapping (SLAM), one of the core prerequisite technologies for intelligent mobile robots, has attracted much attention in recent years. However, traditional SLAM systems rely on the static-environment assumption, become unstable in dynamic environments, and are therefore limited in real-world applications. To address this problem, this paper presents a dynamic-environment-robust visual SLAM system named YOLO-SLAM. In YOLO-SLAM, a lightweight object detection network named Darknet19-YOLOv3 is designed, which adopts a low-latency backbone to quickly generate the essential semantic information for the SLAM system. Then, a new geometric constraint method is proposed to filter dynamic features in the detected regions, where dynamic features are distinguished by applying Random Sample Consensus (RANSAC) to depth differences. YOLO-SLAM combines the object detection approach and the geometric constraint method in a tightly coupled manner, effectively reducing the impact of dynamic objects. Experiments are conducted on the challenging dynamic sequences of the TUM and Bonn datasets to evaluate the performance of YOLO-SLAM. The results demonstrate that the RMSE of the absolute trajectory error is reduced by 98.13% compared with ORB-SLAM2 and by 51.28% compared with DS-SLAM, indicating that YOLO-SLAM effectively improves stability and accuracy in highly dynamic environments.

Keywords Visual SLAM · Dynamic environment · Object detection · Geometric constraint
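The abstract summarizes the core mechanism: semantic detection isolates image regions likely to contain moving objects, and a depth-based geometric check with RANSAC separates features belonging to the moving object from background features inside those regions. The following Python sketch is a minimal, illustrative interpretation of that filtering step, not the authors' published implementation; the function names, the inlier threshold, and the assumption that the dominant depth consensus inside a detection box corresponds to the dynamic object are all hypothetical.

```python
import numpy as np

def ransac_depth_consensus(depths, iters=100, thresh=0.1, rng=None):
    """Find the dominant depth inside a detection box via RANSAC.

    Returns the consensus depth and a boolean inlier mask. This is a
    simplified stand-in for the paper's depth-difference constraint.
    """
    rng = np.random.default_rng() if rng is None else rng
    best_inliers = np.zeros(len(depths), dtype=bool)
    best_depth = None
    for _ in range(iters):
        candidate = depths[rng.integers(len(depths))]     # sample one depth hypothesis
        inliers = np.abs(depths - candidate) < thresh     # depth-difference test
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_depth = inliers, candidate
    return best_depth, best_inliers

def filter_dynamic_features(keypoints, depth_map, boxes):
    """Drop keypoints lying on the dominant (assumed dynamic) surface
    inside any detected bounding box; keep background points for tracking.

    keypoints : (N, 2) array of (u, v) pixel coordinates
    depth_map : (H, W) depth image in metres
    boxes     : list of (u_min, v_min, u_max, v_max) boxes for dynamic classes
    """
    keep = np.ones(len(keypoints), dtype=bool)
    for (u0, v0, u1, v1) in boxes:
        u, v = keypoints[:, 0], keypoints[:, 1]
        in_box = (u >= u0) & (u <= u1) & (v >= v0) & (v <= v1)
        idx = np.flatnonzero(in_box)
        if len(idx) < 5:                                  # too few points for a consensus
            continue
        d = depth_map[keypoints[idx, 1].astype(int), keypoints[idx, 0].astype(int)]
        valid = d > 0                                     # ignore missing depth readings
        if valid.sum() < 5:
            continue
        _, inliers = ransac_depth_consensus(d[valid])
        keep[idx[valid][inliers]] = False                 # consensus depth -> dynamic object
    return keypoints[keep]
```

Under this reading, the features that survive the filter would feed the standard tracking and mapping back-end (ORB-SLAM2 in the paper), which is what the abstract means by composing detection and the geometric constraint in a tightly coupled manner rather than as a separate preprocessing pass.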
1 Introduction
In recent years, mobile robots and autonomous driving technology have made significant progress. Simultaneous localization and mapping (SLAM), as a prerequisite technology for many robotic applications, has attracted widespread interest in this field [1–3]. SLAM plays an important role in robot pose estimation and map building: it allows a robot to position itself in an unknown environment without any prior information while simultaneously creating a map of its surroundings [4, 5]. The position information tells the robot where it is as it moves, even in places it has never visited before. The map preserves essential environmental information that can be used for relocalization when the robot returns to the same place.
SLAM can be divided into laser-based SLAM and visual SLAM according to the sensor used [6]. Visual SLAM, whose main sensor is a camera, commonly a monocular, stereo, or RGB-D camera, has been extensively explored [7], because images capture richer scene information than laser sensors. After appropriate processing, this information can be widely used for object detection, semantic segmentation, or disease diagnosis [8, 9]. Visual SLAM has been developed for over thirty years and has become quite mature in some specific scenarios. Advanced visual SLAM systems such as ORB-SLAM2 [10], LSD-SLAM [11], and RGBD-SLAM-V2 [12] have achieved solid performance.
However, most traditional SLAM systems are fragile when confronted with extreme environments, such as dynamic or rough environments. A dynamic environment [13] refers to a scene in which moving objects
Corresponding author: Liang Guo, guoliang@swjtu.edu.cn

¹ School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China
Neural Computing and Applications (2022) 34:6011–6026
https://doi.org/10.1007/s00521-021-06764-3