The Normal Distributions Transform: A New Approach to Laser Scan
Matching
Peter Biber (biber@gris.uni-tuebingen.de)
Wolfgang Straßer
WSI/GRIS
University of Tübingen
Sand 14, 72070 Tübingen, Germany
Abstract— Matching 2D range scans is a basic component of many localization and mapping algorithms. Most scan matching algorithms require finding correspondences between the features used, i.e., points or lines. We propose an alternative representation for a range scan, the Normal Distributions Transform. Similar to an occupancy grid, we subdivide the 2D plane into cells. To each cell, we assign a normal distribution, which locally models the probability of measuring a point. The result of the transform is a piecewise continuous and differentiable probability density that can be used to match another scan using Newton's algorithm. Thereby, no explicit correspondences have to be established.
We present the algorithm in detail and show its application to relative position tracking and simultaneous localization and map building (SLAM). First results on real data demonstrate that the algorithm is capable of mapping unmodified indoor environments reliably and in real time, even without using odometry data (see video).
I. INTRODUCTION
Undoubtedly, simultaneous localization and mapping
(SLAM) is a basic capability of a mobile robot system.
Laser range scanners are popular sensors to get the needed
input, mainly for their high reliability and their low noise
in a broad class of situations. Many SLAM algorithms are
based on the ability to match two range scans or to match
a range scan to a map. Here we present a new approach
to the low-level task of scan matching and show, how we
use it to build maps.
The main purpose of this paper is the introduction of the
Normal Distributions Transform (in the following called
NDT) and its application to the matching of one scan
to another scan or of one scan to several other scans.
The NDT transforms the discrete set of 2D points recon-
structed from a single scan into a piecewise continuous
and differentiable probability density defined on the 2D
plane. This probability density consists of a set of normal
distributions that can easily be calculated. Matching a second scan to the NDT is then defined as maximizing the sum that the aligned points of the second scan score on this density.
We also present a simple algorithm for the SLAM problem that fits nicely into our matching scheme. However, the proposed matching scheme does not depend on this algorithm. For this reason, the related work in section II reviews only methods concerning the matching of two scans, and not methods for building whole maps like the approaches of Thrun [15] or Gutmann [6]. Nevertheless, one component of such approaches is a perceptual model, that is, the likelihood of a single scan given a map and a pose estimate. Since our method yields exactly such a measure, we believe that our scan matcher could also be integrated into more sophisticated SLAM algorithms.
The rest of this paper is organized as follows: Section III introduces the NDT, section IV gives an overview of the scan matching approach and defines the measure for comparing a scan to an NDT, which is optimized in section V using Newton's algorithm. The scan matcher is applied to position tracking in section VI and to a simple SLAM approach in section VII. We finally show some results on real data and conclude the paper with an outlook on future work.
II. PREVIOUS WORK
The goal of matching two range scans is to find the
relative pose between the two positions, at which the scans
were taken. The basis of most successful algorithms is the
establishment of correspondences between primitives of
the two scans. Out of this, an error measure can be derived
and minimized. Cox used points as primitives and matched
them to lines, which were given in an a priori model [3].
In the Amos Project [7], these lines were extracted from
the scans. Gutmann matched lines extracted from a scan to
lines in a model [8]. The most general approach, matching
points to points, was introduced by Lu and Milios [9]. This
is essentially a variant of the ICP (Iterative Closest Point) algorithm ([1], [2], [18]) applied to laser scan matching. We
share with Lu and Milios our mapping strategy. As in [10],
we do not build an explicit map, but use a collection of
selected scans with their recovered poses as an implicit
map.
In all of these approaches, explicit correspondences
have to be established. Our approach differs in this point,
as we never need to establish a correspondence between
primitives. There are also other approaches that avoid solving the correspondence problem. In [12], Mojaev combines the correlation of local polar occupancy grids with a probabilistic odometry model for pose determination (using laser scanner and sonar). Weiss and von Puttkamer [17] used angular histograms to recover the rotation between two poses. Then x- and y-histograms, which were calculated after finding the most common direction, were used to recover the translation. This approach can be extended by using a second main direction [7].
Our work was also inspired by computer vision techniques. If the words probability density are replaced by image intensity, our approach has a structure similar to feature tracking [13] or the composition of panoramas [14]. These techniques use the image gradient at each relevant position to estimate the parameters. Here, derivatives of normal distributions are used. As opposed to image gradients, these can be calculated analytically.
III. THE NORMAL DISTRIBUTIONS TRANSFORM
This section describes the Normal Distributions Trans-
form (NDT) of a single laser scan. This is meant to be the
central contribution of the paper. The use of the NDT for
position tracking and SLAM, described in the following
sections, is then relatively straightforward.
The NDT models the distribution of all reconstructed 2D points of one laser scan by a collection of local normal distributions. First, the 2D space around the robot is subdivided regularly into cells of constant size. Then, for each cell that contains at least three points, the following is done:
1) Collect all 2D points $x_i$, $i = 1..n$, contained in this box.
2) Calculate the mean $q = \frac{1}{n} \sum_i x_i$.
3) Calculate the covariance matrix $\Sigma = \frac{1}{n} \sum_i (x_i - q)(x_i - q)^t$.
The probability of measuring a sample at a 2D point $x$ contained in this cell is now modeled by the normal distribution $N(q, \Sigma)$:
$$p(x) \sim \exp\left(-\frac{(x - q)^t \Sigma^{-1} (x - q)}{2}\right). \quad (1)$$
Similar to an occupancy grid, the NDT establishes a
regular subdivision of the plane. But where the occupancy
grid represents the probability of a cell being occupied, the
NDT represents the probability of measuring a sample for
each position within the cell. We use a cell size of 100
cm by 100 cm.
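For concreteness, the per-cell computation can be sketched as follows. This is a minimal Python sketch under our own naming (the paper's implementation was in Java); for brevity it builds a single grid and omits the four shifted grids and the covariance regularization described below.

```python
import numpy as np

CELL = 100.0  # cell side length in cm, as stated above

def build_ndt(points, cell=CELL):
    """Map each occupied cell index to (mean q, covariance Sigma).

    points: (n, 2) array of reconstructed 2D scan points.
    """
    idx = np.floor(points / cell).astype(int)
    ndt = {}
    for key in set(map(tuple, idx)):
        pts = points[(idx == key).all(axis=1)]  # step 1: points in this box
        if len(pts) < 3:        # cells with fewer than 3 points are skipped
            continue
        q = pts.mean(axis=0)                    # step 2: mean
        d = pts - q
        ndt[key] = (q, d.T @ d / len(pts))      # step 3: covariance
    return ndt
```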
What is the use of this representation? We now have a
piecewise continuous and differentiable description of the
2D plane in the form of a probability density. Before we
show an example, we have to note two implementation
details.
To minimize effects of discretization, we decided to use four overlapping grids. That is, one grid with cell side length $l$ is placed first, then a second one, shifted by $\frac{l}{2}$ horizontally, a third one, shifted by $\frac{l}{2}$ vertically, and finally a fourth one, shifted by $\frac{l}{2}$ horizontally and vertically. Now each 2D point falls into four cells. This will not be taken into account explicitly in the rest of the paper, and we will describe our algorithm as if there were only one cell per point. So whenever the probability density of a point is calculated, it is done with the tacit understanding that the densities of all four cells are evaluated and the results are summed up.

Fig. 1. An example of the NDT: the original laser scan and the resulting probability density.
A second issue is that for a noise-free measured world line, the covariance matrix becomes singular and cannot be inverted. In practice, the covariance matrix can sometimes become near-singular. To prevent this effect, we check whether the smaller eigenvalue of $\Sigma$ is at least 0.001 times the larger eigenvalue. If not, it is set to this value.
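A sketch of this regularization; the clamping follows the rule just stated, while rebuilding $\Sigma$ from its eigendecomposition is our own implementation choice.

```python
import numpy as np

def regularize(sigma, ratio=0.001):
    """Ensure the smaller eigenvalue is at least `ratio` times the larger."""
    w, v = np.linalg.eigh(sigma)   # eigenvalues ascending: w[0] <= w[1]
    if w[0] < ratio * w[1]:
        w[0] = ratio * w[1]
    return (v * w) @ v.T           # rebuild V diag(w) V^t
```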
Fig. 1 shows an example laser scan and a visualization of the resulting NDT. The visualization is created by evaluating the probability density at each point; bright areas indicate high probability densities. The next section shows how this transformation is used to align two laser scans.
IV. SCAN ALIGNMENT
The spatial mapping T between two robot coordinate
frames is given by
$$T: \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} t_x \\ t_y \end{pmatrix}, \quad (2)$$
where $(t_x, t_y)^t$ describes the translation and $\phi$ the rotation between the two frames. The goal of the scan alignment is to recover these parameters using the laser scans taken at two positions. The outline of the proposed approach, given two scans (the first one and the second one), is as follows:
1) Build the NDT of the first scan.
2) Initialize the estimate for the parameters (by zero or
by using odometry data).
3) For each sample of the second scan: Map the
reconstructed 2D point into the coordinate frame of
the first scan according to the parameters.
4) Determine the corresponding normal distributions
for each mapped point.
5) The score for the parameters is determined by
evaluating the distribution for each mapped point
and summing the result.
6) Calculate a new parameter estimate by trying to
optimize the score. This is done by performing one
step of Newton’s Algorithm.
7) Goto 3 until a convergence criterion is met.
The first four steps are straightforward: Building the
NDT was described in the last section. As noted above,
odometry data could be used to initialize the estimate.
Mapping the second scan is done using T and finding the
corresponding normal distribution is a simple lookup in
the grid of the NDT.
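A minimal sketch of steps 3-5, reusing the hypothetical build_ndt() from section III; for brevity only a single grid is looked up, whereas the full algorithm sums the densities of all four shifted grids.

```python
import numpy as np

def transform(points, p):
    """Apply T of eq. (2); p = (t_x, t_y, phi) as a length-3 array."""
    c, s = np.cos(p[2]), np.sin(p[2])
    R = np.array([[c, -s], [s, c]])
    return points @ R.T + p[:2]

def score(ndt, points, p, cell=100.0):
    """Map each point, look up its cell, and sum the evaluated densities."""
    total = 0.0
    for x in transform(points, p):          # step 3: map into first frame
        key = tuple(np.floor(x / cell).astype(int))
        if key in ndt:                      # step 4: lookup
            q, sigma = ndt[key]
            d = x - q
            total += np.exp(-d @ np.linalg.solve(sigma, d) / 2.0)  # step 5
    return total
```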
The rest is now described in detail using the following
notation:
- $p = (p_i)^t_{i=1..3} = (t_x, t_y, \phi)^t$: the vector of the parameters to estimate.
- $x_i$: the reconstructed 2D point of laser scan sample $i$ of the second scan, in the coordinate frame of the second scan.
- $x'_i$: the point $x_i$ mapped into the coordinate frame of the first scan according to the parameters $p$, that is, $x'_i = T(x_i, p)$.
- $\Sigma_i$, $q_i$: the covariance matrix and the mean of the normal distribution corresponding to point $x'_i$, looked up in the NDT of the first scan.
The mapping according to $p$ can be considered optimal if the sum of the normal distributions of all points $x'_i$, evaluated with parameters $\Sigma_i$ and $q_i$, is a maximum. We call this sum the score of $p$. It is defined as:
$$\text{score}(p) = \sum_i \exp\left(-\frac{(x'_i - q_i)^t \Sigma_i^{-1} (x'_i - q_i)}{2}\right). \quad (3)$$
This score is optimized in the next section.
V. OPTIMIZATION USING NEWTON'S ALGORITHM
Since optimization problems are normally described as minimization problems, we adapt our notation to this convention. Thus the function to be minimized in this section is $-\text{score}$. Newton's algorithm iteratively finds the parameters $p = (p_i)^t$ that minimize a function $f$.
Each iteration solves the following equation:
$$H \Delta p = -g \quad (4)$$
where $g$ is the transposed gradient of $f$ with entries
$$g_i = \frac{\partial f}{\partial p_i} \quad (5)$$
and $H$ is the Hessian of $f$ with entries
$$H_{ij} = \frac{\partial^2 f}{\partial p_i \partial p_j}. \quad (6)$$
The solution of this linear system is an increment $\Delta p$, which is added to the current estimate:
$$p \leftarrow p + \Delta p \quad (7)$$
If $H$ is positive definite, $f(p)$ will initially decrease in the direction of $\Delta p$. If this is not the case, $H$ is replaced by $H' = H + \lambda I$, with $\lambda$ chosen such that $H'$ is safely positive definite. Practical details on the minimization algorithm itself can be found for example in [4].
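A sketch of such a safeguarded Newton step; the schedule for growing $\lambda$ is our assumption, as the paper only requires that $H'$ be safely positive definite.

```python
import numpy as np

def newton_step(H, g, lam0=1e-3):
    """Solve (H + lam*I) dp = -g, raising lam until the matrix is
    positive definite (tested via the Cholesky factorization)."""
    lam = 0.0
    n = len(g)
    while True:
        try:
            L = np.linalg.cholesky(H + lam * np.eye(n))
            break  # success: H + lam*I is positive definite
        except np.linalg.LinAlgError:
            lam = max(2.0 * lam, lam0)
    y = np.linalg.solve(L, -g)      # forward substitution
    return np.linalg.solve(L.T, y)  # back substitution yields dp
```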
This algorithm is now applied to the function $-\text{score}$. The gradient and the Hessian are built by collecting the partial derivatives of all summands of equation 3. For a shorter notation, and to avoid confusing the parameter number $i$ with the index of the laser scan sample $i$, the index $i$ for the sample number is dropped. Additionally, we write
$$q = x'_i - q_i. \quad (8)$$
As can easily be verified, the partial derivatives of $q$ with respect to $p$ equal the partial derivatives of $x'_i$. One summand $s$ of $-\text{score}$ is then given by
$$s = -\exp\left(-\frac{q^t \Sigma^{-1} q}{2}\right). \quad (9)$$
For one such summand, the entries of the gradient are then (using the chain rule):
$$\tilde{g}_i = \frac{\partial s}{\partial p_i} = \frac{\partial s}{\partial q} \frac{\partial q}{\partial p_i} = q^t \Sigma^{-1} \frac{\partial q}{\partial p_i} \exp\left(-\frac{q^t \Sigma^{-1} q}{2}\right). \quad (10)$$
The partial derivatives of $q$ with respect to $p_i$ are given by the Jacobi matrix $J_T$ of $T$ (see equation 2):
$$J_T = \begin{pmatrix} 1 & 0 & -x\sin\phi - y\cos\phi \\ 0 & 1 & x\cos\phi - y\sin\phi \end{pmatrix}. \quad (11)$$
A summand's entries in the Hessian $H$ are given by:
$$\tilde{H}_{ij} = \frac{\partial^2 s}{\partial p_i \partial p_j} = -\exp\left(-\frac{q^t \Sigma^{-1} q}{2}\right) \left( \left(-q^t \Sigma^{-1} \frac{\partial q}{\partial p_i}\right) \left(-q^t \Sigma^{-1} \frac{\partial q}{\partial p_j}\right) + \left(-q^t \Sigma^{-1} \frac{\partial^2 q}{\partial p_i \partial p_j}\right) + \left(-\frac{\partial q^t}{\partial p_j} \Sigma^{-1} \frac{\partial q}{\partial p_i}\right) \right) \quad (12)$$
The second derivatives of $q$ are (see eq. 11):
$$\frac{\partial^2 q}{\partial p_i \partial p_j} = \begin{cases} \begin{pmatrix} -x\cos\phi + y\sin\phi \\ -x\sin\phi - y\cos\phi \end{pmatrix} & i = j = 3 \\ \begin{pmatrix} 0 \\ 0 \end{pmatrix} & \text{otherwise} \end{cases} \quad (13)$$
As can be seen from these equations, the computational
costs to build the gradient and the Hessian are low. There
is only one call to the exponential function per point
and a small number of multiplications. The trigonometric
functions only depend on φ (the current estimate for the
angle) and must therefore be called only once per iteration.
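Collecting equations (10)-(13), the contribution of one sample can be sketched as follows (illustrative names, not the paper's code); the returned gradient and Hessian of $-\text{score}$ are accumulated over all samples and passed to the Newton step.

```python
import numpy as np

def summand_grad_hess(x, q, sigma_inv, phi):
    """x: sample in the second scan's frame; q = x'_i - q_i;
    sigma_inv: inverted cell covariance; phi: current angle estimate."""
    c, s = np.cos(phi), np.sin(phi)
    # Jacobian of T, eq. (11); columns are the derivatives w.r.t. p_i
    J = np.array([[1.0, 0.0, -x[0] * s - x[1] * c],
                  [0.0, 1.0,  x[0] * c - x[1] * s]])
    e = np.exp(-q @ sigma_inv @ q / 2.0)
    a = q @ sigma_inv @ J               # entries q^t Sigma^-1 dq/dp_i
    g = a * e                           # gradient entries, eq. (10)
    # Hessian, eq. (12): outer-product and first-derivative terms ...
    H = e * (-np.outer(a, a) + J.T @ sigma_inv @ J)
    # ... plus the second-derivative term, nonzero only for i = j = 3
    d2 = np.array([-x[0] * c + x[1] * s, -x[0] * s - x[1] * c])  # eq. (13)
    H[2, 2] += e * (q @ sigma_inv @ d2)
    return g, H
```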
The next two sections will now use this algorithm for
position tracking and for SLAM.
VI. POSITION TRACKING
This section describes how the scan matching algorithm can be applied to tracking the current position from a given time $t = t_{start}$. The next section then extends this approach to SLAM.
The global reference coordinate frame is defined by the local robot coordinate frame at this time. The respective laser scan is called the keyframe in the following. Tracking is performed with respect to this keyframe. At time $t_k$, the algorithm performs the following steps:
1) Let $\delta$ be the estimate for the movement between time $t_{k-1}$ and $t_k$ (for example from the odometry).
2) Map the position estimate of time $t_{k-1}$ according to $\delta$.
3) Perform the optimization algorithm using the current scan, the NDT of the keyframe, and the new position estimate.
4) Check whether the keyframe is "near" enough to the current scan. If yes, iterate. Otherwise take the last successfully matched scan as the new keyframe.
The decision whether a scan is still near enough is based on a simple empiric criterion involving the translational and the angular distance between the keyframe and the current frame, as well as the resulting score. To be useful for position tracking, the algorithm has to run in real time: building the NDT of a scan takes around 10 ms on a 1.4 GHz machine. For small movements between scans, the optimization algorithm typically needs around 1-5 iterations (and rarely more than ten). One iteration takes around 2 ms, so real-time operation is no problem.
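The loop can be sketched as follows; build_ndt() is the hypothetical helper from section III, match() stands for the optimization of section V, and the pose composition in step 2 is simplified (a full implementation would rotate $\delta$ into the keyframe frame).

```python
import numpy as np

def track(scans, deltas, match, near_enough):
    """scans: list of (n, 2) point arrays; deltas[k]: movement estimate
    between t_{k-1} and t_k, e.g. from odometry."""
    ndt = build_ndt(scans[0])    # the first scan is the keyframe
    pose = np.zeros(3)           # pose relative to the current keyframe
    for scan, delta in zip(scans[1:], deltas):
        pose = pose + delta                   # step 2 (simplified)
        pose, s = match(ndt, scan, pose)      # step 3: optimize the score
        if not near_enough(pose, s):          # step 4: keyframe switch
            ndt = build_ndt(scan)
            pose = np.zeros(3)   # (global pose bookkeeping omitted)
        yield pose
```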
VII. APPLICATION TO SLAM
We define a map as a collection of keyframes together
with their global poses. This section describes how to
localize with respect to this map, and how to extend and optimize it when the robot reaches unknown territory.
A. Localizing with respect to multiple scans
With each scan $i$ in the map, an angle $\phi_i$ (or a rotation matrix $R_i$) and a translation vector $(t_x, t_y)^t_i = T_i$ are associated. These describe the pose of scan $i$ in the global coordinate frame. The current robot pose is denoted by a rotation matrix $R$ and a translation vector $T$. The mapping $T'$ from the robot coordinate frame to the coordinate frame of scan $i$ is then given by:
$$T': \begin{pmatrix} x' \\ y' \end{pmatrix} = R_i^t \left( R \begin{pmatrix} x \\ y \end{pmatrix} + T - T_i \right) \quad (14)$$
Only small changes are required to adapt the algorithm of section V to this situation. The mapping of a 2D point of scan $i$ is now calculated by applying $T'$. Further, the Jacobian and the second partial derivatives of $T'$ become slightly more complicated. The Jacobian of the mapping is now given by:
$$J_{T'} = R_i^t J_T \quad (15)$$
and the second partial derivatives of $T'$ are now given by:
$$\frac{\partial^2 x'}{\partial p_i \partial p_j} = \begin{cases} R_i^t \begin{pmatrix} -x\cos\phi + y\sin\phi \\ -x\sin\phi - y\cos\phi \end{pmatrix} & i = j = 3 \\ \begin{pmatrix} 0 \\ 0 \end{pmatrix} & \text{otherwise} \end{cases} \quad (16)$$
The gradient and the Hessian for the optimization algorithm could be built by summing over all overlapping scans. But we found an alternative that is faster and yields equally good results: for each sample of the scan taken at the robot position, determine the scan for which the result of evaluating the probability density is maximal. Only this scan is used for this sample in the current iteration. This way, the operations needed to build the gradient and the Hessian for the optimization algorithm are independent of the number of overlapping keyframes, except for finding the mentioned maximum.
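A sketch of this per-sample selection; the names and data layout are our assumptions. density() evaluates equation 1 on a single grid, and map_into() applies $T'$ of equation 14.

```python
import numpy as np

def density(ndt, x, cell=100.0):
    """Evaluate eq. (1) at point x, given one keyframe's NDT grid."""
    key = tuple(np.floor(np.asarray(x) / cell).astype(int))
    if key not in ndt:
        return 0.0
    q, sigma = ndt[key]
    d = x - q
    return float(np.exp(-d @ np.linalg.solve(sigma, d) / 2.0))

def map_into(scan_pose, robot_pose, x):
    """T' of eq. (14): keyframe pose (R_i, T_i), robot pose (R, T)."""
    R_i, T_i = scan_pose
    R, T = robot_pose
    return R_i.T @ (R @ x + T - T_i)

def best_keyframe(keyframes, robot_pose, x):
    """keyframes: list of ((R_i, T_i), ndt); pick the maximal density."""
    return max(keyframes,
               key=lambda kf: density(kf[1], map_into(kf[0], robot_pose, x)))
```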
B. Adding a new keyframe and optimizing the map
At each time step, the map consists of a set of keyframes
with their poses in the global coordinate frame. If the
overlap of the current scan with the map is too small, the
map is extended by the last successfully matched scan.
Then every overlapping scan is matched separately to the
new keyframe, yielding the relative pose between the two
scans. A graph is maintained that holds the information from the pairwise matching results.
In this graph, every keyframe is represented by a node.
A node holds the estimate for the pose of the keyframe
in the global coordinate frame. An edge between two
nodes indicates that the corresponding scans have been
pairwise matched and holds the relative pose between the
two scans.
After a new keyframe is added, the map is refined by
optimizing an error function defined over the parameters
of all keyframes. The results of the pairwise registration
are used to define a quadratic error model for each
matched pair as follows: The global parameters of two
scans also define the relative pose between two scans. Let
$\Delta p$ be the difference between the relative pose defined by the global parameters and the relative pose defined by the result of the pairwise matching. Then we model the score of these two scans as a function of $\Delta p$, using the quadratic model
$$\text{score}'(\Delta p) = \text{score} + \frac{1}{2} (\Delta p)^t H (\Delta p). \quad (17)$$

Fig. 2: A map that was built using our scan matcher. Lengths are given in cm. Shown are the set of keyframes and the estimated trajectory (see video). More videos can be found on the authors' homepage [11].
Thereby score is the final score, when the pairwise
matching had converged and H is the thereby obtained
Hessian. This model is derived by a Taylor expansion of
score around ∆p = 0 up to the quadratic term. Notice,
that the linear term is missing, because we expanded about
an extreme point. This score is now summed over all edges
and optimized.
If the number of keyframes gets large, this minimization can no longer be performed under real-time conditions (the number of free parameters is $3N - 3$, where $N$ is the number of keyframes). We therefore optimize only over a subgraph of the map. This subgraph is built by collecting all keyframes that can be reached from the node of the new keyframe by traversing no more than three edges. We then optimize the error function above only with respect to the parameters belonging to the keyframes contained in this subgraph. Of course, if a cycle had to be closed, we would have to optimize over all keyframes.
VIII. RESULTS
The results we present now (and also the example in section VI) were obtained without using odometry. This should demonstrate the robustness of the approach. Of course, as Thrun already noted in [15], this is only possible as long as some 2D structure is present in the world.
The map presented in fig. 2 was acquired by driving the robot out of its lab, up the corridor, down the corridor, and then back into the lab. So the situation requires both the extension of the map and localization with respect to the map. The robot collected 28430 laser scans during a travel of 20 minutes, in which it traversed approximately 83 meters. The scans were taken with a SICK laser scanner covering 180 degrees with an angular resolution of one degree. To simulate a higher speed, only every fifth scan was used. The simulated speed is then around 35 cm/s and the number of scans per second is around 23. The map was built using a combined strategy. The position tracker of section VI was applied to every scan, where we initialized the parameters by linearly extrapolating the result of the last time step. Every tenth scan, the procedure of section VII was applied.
Fig. 2 shows the resulting map. Shown are the 33 keyframes that the final map consisted of. A closer look also reveals that our scan matching algorithm is tolerant to small changes in the environment, like opened or closed doors. Processing all frames offline takes 58 seconds on a 1.4 GHz machine, that is, 97 scans per second. More speed could perhaps be gained by porting our current implementation from Java to a faster language.
IX. CONCLUSION AND FUTURE WORK
We have presented a new representation of range scans,
the Normal Distributions Transform (NDT). This trans-
form can be used to derive analytic expressions for match-
ing another scan. We also showed how our scan matcher can be incorporated into position tracking and into the SLAM problem. The major advantages of our method are:
- No explicit correspondences between points or features have to be established. Since this is the most error-prone part in most approaches, we are more robust without correspondences.
- All derivatives can be calculated analytically. This is both fast and correct.
The question is of course: Can everything be modeled well
enough by local normal distributions?
Up to now, our tests were performed in indoor envi-
ronments, where this was never a problem. Further tests
in less structured environments, preferably outdoors, are
planned. We also intend to systematically compare the
radius of convergence of our method with the method of
Lu and Milios.
X. ACKNOWLEDGMENTS
Peter Biber is funded by a grant of the Land Baden-
Wuerttemberg, which is gratefully acknowledged. We
would also like to thank Prof. Andreas Zell and Achim
Lilienthal for the possibility to use the robot and the data
acquisition software. We would also like to thank Sven
Fleck for many fruitful discussions and help with the
video.
XI. REFERENCES
[1] P.J. Besl and N.D. McKay. A method for registration
of 3d shapes. IEEE Transactions on Pattern Analysis
and Machine Intelligence, 14(2):239–256, 1992.
[2] Y. Chen and G.G. Medioni. Object modeling by
registration of multiple range images. Image and
Vision Computing, 10(3):145–155, 1992.
[3] I.J. Cox. Blanche: An experiment in guidance and
navigation of an autonomous robot vehicle. IEEE
Transactions on Robotics and Automation, 7(2):193–
204, 1991.
[4] J.E. Dennis and R. B. Schnabel. Numerical Meth-
ods for Unconstrained Optimization and Nonlinear
Equations. SIAM Classics in Applied Mathematics,
1996.
[5] J. Gutmann, W. Burgard, D. Fox, and K. Konolige.
An experimental comparison of localization meth-
ods. In Proc. International Conference on Intelligent
Robots and System, 1998.
[6] J. Gutmann and K. Konolige. Incremental mapping
of large cyclic environments. In Proceedings of the
IEEE International Symposium on Computational
Intelligence in Robotics and Automation (CIRA),
2000.
[7] J. Gutmann and C. Schlegel. Amos: Comparison
of scan matching approaches for self-localization
in indoor environments. In Proceedings of the 1st
Euromicro Workshop on Advanced Mobile Robots.
IEEE Computer Society Press, 1996.
[8] J. Gutmann, T. Weigel, and B. Nebel. Fast, accurate,
and robust self-localization in polygonal environ-
ments. In Robocup workshop at IJCAI-99, 1999.
[9] F. Lu and E. Milios. Robot pose estimation in
unknown environments by matching 2d range scans.
In CVPR94, pages 935–938, 1994.
[10] F. Lu and E. Milios. Globally consistent range scan
alignment for environment mapping. Autonomous
Robots, 4:333–349, 1997.
[11] Peter Biber’s homepage.
www.gris.uni-tuebingen.de/~biber.
[12] Alexander Mojaev. Umgebungswahrnehmung, Selbstlokalisation und Navigation mit einem mobilen Roboter. PhD thesis, University of Tübingen, 2001.
[13] Jianbo Shi and Carlo Tomasi. Good features to
track. In IEEE Conference on Computer Vision
and Pattern Recognition (CVPR’94), pages 593–600,
Seattle, June 1994.
[14] R. Szeliski and H.Y. Shum. Creating full view
panoramic image mosaics and environment map. In
Computer Graphics (SIGGRAPH 97), pages 251–
258, 1997.
[15] S. Thrun, W. Burgard, and D. Fox. A real-time al-
gorithm for mobile robot mapping with applications
to multi-robot and 3d mapping. In Proceedings
of IEEE International Conference on Robotics and
Automation, 2000.
[16] Sebastian Thrun, Dieter Fox, Wolfram Burgard, and
Frank Dellaert. Robust monte carlo localization for
mobile robots. Artificial Intelligence, 128(1-2):99–
141, 2001.
[17] G. Weiss and E. von Puttkamer. A map based
on laserscans without geometric interpretation. In
Proceedings of Intelligent Autonomous Systems 4
(IAS-4), pages 403–407, 1995.
[18] Z. Zhang. Iterative point matching for registration of
free-form curves and surfaces. International Journal
of Computer Vision, 13(2):119–152, 1994.