All content in this area was uploaded by Daniel Adolfsson on Sep 02, 2021

CorAl – Are the point clouds Correctly Aligned?

Daniel Adolfsson, Martin Magnusson, Qianfang Liao, Achim J. Lilienthal, Henrik Andreasson

Abstract: In robotics perception, numerous tasks rely on point cloud registration. However, there is currently no method that can automatically detect misaligned point clouds reliably and without environment-specific parameters. We propose “CorAl”, an alignment quality measure and alignment classifier for point cloud pairs, which makes it possible to introspectively assess the performance of registration. CorAl compares the joint and the separate entropy of the two point clouds. The separate entropy provides a measure of the entropy that can be expected to be inherent to the environment. The joint entropy should therefore not be substantially higher if the point clouds are properly aligned. Computing the expected entropy makes the method sensitive even to small alignment errors, which are particularly hard to detect, and applicable in a range of different environments. We found that CorAl is able to detect small alignment errors in previously unseen environments with an accuracy of 95% and achieves a substantial improvement over previous methods.

I. INTRODUCTION

In order to create safe and efficient mobile robots, introspective and reliability-aware capabilities are required to assess and recover from perception failures. Many perception tasks, including localization [1], scene understanding and sensor calibration [2], rely on point cloud registration. However, registration may provide incorrect estimates due to local minima of the registration cost function [3], uncompensated motion distortion [4], noise, or when the registration problem is geometrically under-constrained [5], [6]. Consequently, it is essential to measure alignment quality and to reject or re-estimate alignment when quality is low. In the past, a large number of methods have been proposed to assess the alignment quality of point cloud pairs [7]–[17]. These metrics can typically be used to measure a relative alignment error in the process of registration, but provide limited information on whether the point clouds are correctly aligned once registration has been carried out [18]. To date, few studies have targeted the measurement of alignment correctness after registration [18], [19], and previous work reports that the accuracy of alignment correctness classification based on AdaBoost and the NDT score function decreases when applied to point clouds acquired from new environments [19].

The authors are with the MRO lab of the AASS research centre at Örebro University, Sweden. E-mail: Daniel.Adolfsson@oru.se

This work has received funding from the Swedish Knowledge Foundation (KKS) project “Semantic Robots” and European Union’s Horizon 2020 research and innovation programme under grant agreement No 732737 (ILIAD) and 101017274 (DARKO).

978-1-6654-1213-1/21/$31.00 © 2021 IEEE

Fig. 1: CorAl, depicted in blue, operates on a pair of point clouds $\mathcal{P}_a$, $\mathcal{P}_b$ and can classify misalignment ($y_{pred}$) by comparing the differential entropy in the point clouds separately and jointly. Additionally, CorAl outputs a per-point quality measure $Q(\mathcal{P}_a \cup \mathcal{P}_b)$ that highlights misaligned parts.

Fig. 2: Top: Differential entropy in the point clouds separately, with (a) $\mathcal{P}_a$ colored by entropy and (b) $\mathcal{P}_b$ colored by entropy. Bottom: The joint point cloud ($\mathcal{P}_a \cup \mathcal{P}_b$) colored by the per-point quality measure $q_k(\mathcal{P}_a, \mathcal{P}_b)$ when aligned (c) and when misaligned (d). Blue and red indicate alignment and misalignment respectively.

In this paper, we propose “CorAl” (Correctly Aligned?): a method to introspectively measure and detect misalignment between previously registered point cloud pairs. CorAl specifically aims to bridge the gap in classification performance when applied to new unseen environments.

Our method is well grounded in information theory and gives an intuitive alignment correctness measure. CorAl measures the difference between the average differential entropy in the joint and separate point clouds. For well-aligned point clouds, the joint and the separate point clouds have similar entropy. In contrast, misaligned point clouds tend to “blur” the scene, which can be measured as an increase in joint entropy as depicted in Fig. 3. By using the separate point clouds to estimate the entropy inherent in the scene, our proposed method can assess quality in a range of different environments.

Fig. 3: Example of how uncertainty (entropy) is preserved when joining aligned point clouds $\mathcal{P}_a \cup \mathcal{P}_b$ (left), but increases when joining misaligned point clouds (right). The entropy for aligned point clouds should be similar to the entropy in the separate point clouds and can be used when quantifying alignment quality.

The contribution of this paper is an intuitive and simple measure of alignment correctness between point cloud pairs. We demonstrate how to use this quality measure to train a simple model that detects small alignment errors between point clouds; large errors are not considered in this paper. To train our model, we use previously corrected scans that are assumed to have no alignment error.

We make the following claims: (i) Our proposed method

CorAl measures the correctness of point cloud alignment by

accounting for the expected scene entropy. (ii) Our method is

accurate in a wide range of environments and can generalize

well to new environments without retraining.

II. RELATED WORK

Several methods have been used to assess alignment quality in the literature. However, in most cases these methods are used in an ad-hoc manner. Few systematic evaluations of their general ability to be used as a classifier to detect aligned vs. misaligned point clouds have been made.

One well-used alignment measure is the root-mean-squared (RMS) point-to-point distance, truncated by some outlier rejection threshold. This is also the function that is minimized by iterative closest point registration [7]. However, this measure has been shown to be highly sensitive to the environment and the choice of the outlier threshold [19], [20]. Consequently, this is a poor measure for alignment correctness classification.

One family of methods instead attempts to estimate the alignment uncertainty between point cloud pairs in the form of a covariance matrix [21]–[24]. Some use Monte Carlo strategies to estimate uncertainty by sampling registrations in a region [22]. This exhaustive search is impractical in mobile robotics. Others attempt to estimate uncertainty in closed form using the Hessian [19], [25], [26], representing the steepness of the alignment score function around a minimum. These methods assume that the registration has reached the global optimum, which is not necessarily true. To date, alignment classification based on uncertainty covariance has been less accurate than classification based on the matching score [19].

Almqvist et al. [19] explored alignment classifiers based on RMS as well as other existing methods [16], [20], [26]–[29], including the NDT score function [26], and investigated how to combine the measures with AdaBoost into a stronger classifier. The classifiers were evaluated on two outdoor data sets, and although their classifiers reached almost 90% accuracy for the hardest cases on each data set individually, accuracy drops to around 80% when cross-evaluating between the data sets. In their evaluations, the NDT score function proved to be the best individual measure for alignment assessment. The combined AdaBoost classifier did not have significantly higher accuracy, but reduced parameter sensitivity.

Liao et al. [11] recently proposed a registration method based on fuzzy clusters, which involves a registration quality assessment. This fuzzy cluster-based quality assessment (FuzzyQA) compares the similarity of dispersion and disposition of points around fuzzy cluster centers. It has been used to detect whether point clouds are coarsely aligned, whereas CorAl attempts to detect small alignment errors.

Nobili et al. [5] proposed a method to predict alignment risk prior to registration by combining overlap information and an alignment metric. The alignment metric quantifies the geometric constraints in the registration problem. The alignment metric is based on point-to-plane residuals and has been evaluated in structured scenes with planar surfaces, while our method can operate well even in unstructured environments. Additionally, our method seeks to estimate the alignment after registration has been completed to introspectively measure the registration success, as opposed to predicting the risk prior to registration.

Bogoslavskyi et al. [18] defined a quality metric based on positive and negative point information, and used it to measure alignment error and cluster three known object types in a controlled experiment. Rather than focusing on objects, our method aims to classify alignment quality of observed scenes in different environments. Additionally, their method operates on range images, which might not be available, while our method operates on unorganized point clouds.

To the best of our knowledge, there is no method for binary point cloud alignment classification that performs accurately and transfers well to new environments without parameter tuning or retraining.

III. CORAL METHOD

Our work is inspired by the Mean-Map-Entropy (MME) measure proposed by Droeschel and Behnke [30] for map quality assessment. MME is based on differential entropy [31] and measures the randomness of multivariate Gaussian distributions. Droeschel and Behnke used MME in the absence of accurate ground truth when evaluating map refinement. As shown in our evaluation, MME cannot be used as a general alignment quality measure as it is also affected by measurement noise, sample density and environment geometry. MME is more affected by changes in the environment compared to CorAl. Hence, the measure is not expected to generalize between, e.g., a structured warehouse and an unstructured outdoor forest environment. We overcome this effect using dual entropy measurements computed 1) in both point clouds separately and 2) in the joint point cloud. The intuition is that joining two well-aligned point clouds should not introduce additional uncertainty, and entropy should remain constant if the point clouds overlap sufficiently.

A. Computing joint and separate entropy

Our method operates on the dense point clouds $\mathcal{P}_a$, $\mathcal{P}_b$, given in a common fixed world frame, that contain a set of points in Cartesian space $p_k = (x, y, z)$. For later use, we define the joint point cloud $\mathcal{P}_j = \mathcal{P}_a \cup \mathcal{P}_b$; i.e., all points in $\mathcal{P}_a$ and $\mathcal{P}_b$ together.

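From all points within a radius $r$ around each point $p_k$, we compute the sample covariance $\Sigma(p_k)$. From the determinant of the sample covariance $\det(\Sigma(p_k))$ we can then compute the differential entropy as:

$$h_i(p_k) = \frac{1}{2} \ln\big(2\pi e \det(\Sigma(p_k))\big) \qquad (1)$$

for the point cloud $i = a, b$ that contains $p_k$. An example of point clouds colored according to eq. (1) can be seen in Fig. 2a and 2b. The sum of differential entropy for $\mathcal{P}_i$ can then be computed as

$$H_i(\mathcal{P}_i) = \sum_{k=1}^{|\mathcal{P}_i|} h_i(p_k), \qquad (2)$$

where $|\mathcal{P}_i|$ is the number of points in the point cloud $\mathcal{P}_i$. Using eq. (2) we can derive measures of the separate and joint average differential entropy of two point clouds $\mathcal{P}_a$, $\mathcal{P}_b$:

$$H_{sep} = \frac{H_a(\mathcal{P}_a) + H_b(\mathcal{P}_b)}{|\mathcal{P}_a| + |\mathcal{P}_b|}, \qquad (3)$$

$$H_{joint} = \frac{H_j(\mathcal{P}_j)}{|\mathcal{P}_j|} = \frac{H(\mathcal{P}_a \cup \mathcal{P}_b)}{|\mathcal{P}_a| + |\mathcal{P}_b|}. \qquad (4)$$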

Our first alignment quality measure uses the difference between the joint and the separate average differential entropy:

$$Q(\mathcal{P}_a, \mathcal{P}_b) = H_{joint}(\mathcal{P}_j) - H_{sep}(\mathcal{P}_a, \mathcal{P}_b), \qquad (5)$$

which can also be given per-point by

$$q_k(p_k) = h_j(p_k) - h_i(p_k), \qquad (6)$$

where the point entropy is evaluated on the joint point cloud $j$ and the separate point cloud $i$ ($a$ or $b$) from which $p_k$ originates. An example of point clouds colored by per-point entropy difference according to eq. (6) is depicted in Fig. 2c and 2d. Typically, $Q(\mathcal{P}_a, \mathcal{P}_b)$ is close to zero for well-aligned point clouds and increases with the alignment error as depicted in Fig. 4, which visualizes the function’s surface for position and angular alignment errors around the correct alignment.
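As an illustration of eqs. (1) to (5), the following is a minimal sketch rather than the authors' implementation. It uses only the Python standard library with a brute-force neighbor search; the function names, the `min_pts` cutoff, and the choice to average only over points whose joint and separate entropies are both defined are assumptions made for this example. The synthetic planar patches are likewise hypothetical test data.

```python
import math
import random


def neighbors(cloud, center, radius):
    """Return all points of `cloud` within `radius` of `center` (brute force)."""
    r2 = radius * radius
    return [p for p in cloud
            if sum((a - b) ** 2 for a, b in zip(p, center)) <= r2]


def covariance_det(pts):
    """Determinant of the 3x3 sample covariance of a list of (x, y, z) points."""
    n = len(pts)
    mean = [sum(p[i] for p in pts) / n for i in range(3)]
    c = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in pts) / (n - 1)
          for j in range(3)] for i in range(3)]
    return (c[0][0] * (c[1][1] * c[2][2] - c[1][2] * c[2][1])
            - c[0][1] * (c[1][0] * c[2][2] - c[1][2] * c[2][0])
            + c[0][2] * (c[1][0] * c[2][1] - c[1][1] * c[2][0]))


def point_entropy(cloud, p, radius, min_pts=5):
    """Eq. (1) around p; None when the neighborhood is too small or degenerate."""
    nb = neighbors(cloud, p, radius)
    if len(nb) < min_pts:  # assumption: hypothetical cutoff, not from the paper
        return None
    det = covariance_det(nb)
    if det <= 0.0:
        return None
    return 0.5 * math.log(2.0 * math.pi * math.e * det)


def coral_quality(cloud_a, cloud_b, radius):
    """Return (Q, H_sep, H_joint) in the spirit of eqs. (3)-(5)."""
    joint = cloud_a + cloud_b
    h_sep, h_joint = [], []
    for cloud in (cloud_a, cloud_b):
        for p in cloud:
            hi = point_entropy(cloud, p, radius)   # separate entropy h_i(p_k)
            hj = point_entropy(joint, p, radius)   # joint entropy h_j(p_k)
            if hi is not None and hj is not None:  # assumption: skip undefined points
                h_sep.append(hi)
                h_joint.append(hj)
    mean_sep = sum(h_sep) / len(h_sep)
    mean_joint = sum(h_joint) / len(h_joint)
    return mean_joint - mean_sep, mean_sep, mean_joint


def noisy_plane(rng, z_offset=0.0, sigma=0.01):
    """Synthetic 1 m x 1 m planar patch with Gaussian noise along z."""
    return [(0.1 * i, 0.1 * j, z_offset + rng.gauss(0.0, sigma))
            for i in range(11) for j in range(11)]


rng = random.Random(0)
cloud_a = noisy_plane(rng)
q_aligned, _, _ = coral_quality(cloud_a, noisy_plane(rng), radius=0.3)
q_misaligned, _, _ = coral_quality(cloud_a, noisy_plane(rng, z_offset=0.1), radius=0.3)
print(q_aligned, q_misaligned)
```

On this toy data the second value should be clearly larger than the first, mirroring the behavior described for $Q$. A practical implementation would replace the brute-force search with a spatial index such as a k-d tree.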

Fig. 4: Example of the CorAl measure $Q(\mathcal{P}_a, \mathcal{P}_b)$ for various induced $(x, y, \theta)$ alignment errors. (a) Top view of the first two aligned point clouds $\mathcal{P}_a$ (blue) and $\mathcal{P}_b$ (red) in the ETH “stairs” dataset. (b) CorAl measure $Q(\mathcal{P}_a, \mathcal{P}_b)$ of eq. (5), visualized by color for various $(x, y)$ displacements. (c) CorAl measure $Q(\mathcal{P}_a, \mathcal{P}_b)$ for various $(x, \theta)$ displacements. $Q$ has a minimum at the true position. The steepness of the surface around the true alignment indicates that CorAl is sensitive to small misalignments.

Fig. 5: Probability distribution of per-point entropy, eq. (1), for joint and separate point clouds when (a) aligned, where the differential entropy distributions are similar, and (b) misaligned, where the joint differential entropy is higher. Aligned point clouds have similar joint and separate entropy distributions, while joint entropy is higher compared to separate entropy when point clouds are misaligned. Joining misaligned point clouds blurs the scene, which can be observed as an entropy increase. An exception can be seen in (a) in the region $(-12, -8)$, where the entropy increases by joining aligned point clouds.

Well-aligned point clouds $\mathcal{P}_a \cup \mathcal{P}_b$ acquired in structured environments have low differential entropy for most query points $p_k$. This is reflected by low values for the determinant of the sample covariance. As the determinant can be expressed as the product of the eigenvalues of the sample covariance, $\det(\Sigma(p_k)) = \lambda_1 \lambda_2 \lambda_3$, we see that the measure is sensitive to an increase in the lowest of the eigenvalues when the larger eigenvalues are constant. For example, the entropy of points on a planar surface is represented with a flat distribution with two large ($\lambda_1$, $\lambda_2$) and one small ($\lambda_3$) eigenvalue. Misalignment changes the point distribution in the joint point cloud from flat to ellipsoidal, which can be observed as an increase of the smallest eigenvalue $\lambda_3$. This makes the measure sensitive to misalignment of planar surfaces, while it still generalizes well to other geometries. As shown in the evaluation, the measure can capture discrepancies between point clouds regardless of whether these are due to rigid misalignments or distortions, which can occur when scanning while moving, e.g. because of vibrations or sensor velocity estimation errors. That means that the method can be overly sensitive when used together with a registration method or odometry framework that does not compensate for movement distortion or has low accuracy.
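To make the planar-surface argument concrete, the following worked example estimates the per-point entropy increase under simplifying assumptions that are not taken from the paper: both point clouds contribute equally many points to the neighborhood, each with the same covariance $\Sigma$, and the misalignment is a pure offset of length $\delta$ along the surface normal $\hat{n}$ (the eigenvector belonging to $\lambda_3$). The numbers are illustrative only.

```latex
\begin{align}
\Sigma_j &= \Sigma + \tfrac{1}{4}\,\delta^2\,\hat{n}\hat{n}^T
  \quad\Rightarrow\quad \lambda_3' = \lambda_3 + \tfrac{\delta^2}{4}, \\
q_k &= h_j(p_k) - h_i(p_k)
   = \tfrac{1}{2}\ln\frac{\lambda_1\lambda_2\lambda_3'}{\lambda_1\lambda_2\lambda_3}
   = \tfrac{1}{2}\ln\!\left(1 + \frac{\delta^2}{4\lambda_3}\right), \\
\lambda_3 &= (0.01\,\mathrm{m})^2,\ \delta = 0.1\,\mathrm{m}:
  \quad q_k = \tfrac{1}{2}\ln(26) \approx 1.63 .
\end{align}
```

The constant $2\pi e$ of eq. (1) cancels in the difference, so only the ratio of determinants matters. With the same surface noise, an offset of 1 cm gives $\tfrac{1}{2}\ln(1.25) \approx 0.11$, which illustrates why low-noise planar surfaces make the measure responsive even to small offsets.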

Overlap is required between point clouds to produce evidence of alignment. For that reason, we classify point clouds with less than 10% overlap as misaligned. By defining the overlap as all points with a neighbor within $r$ in the other point cloud, non-overlapping points have no effect on the quality measure in eq. (5).
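The overlap test can be sketched as follows. This is an illustrative reading of the description above, not the authors' code: the text states the 10% threshold and the neighbor-within-$r$ definition, while the function names and the choice of denominator (all points of both clouds) are assumptions.

```python
def has_neighbor(p, other, radius):
    """True if some point of `other` lies within `radius` of p (brute force)."""
    r2 = radius * radius
    return any(sum((a - b) ** 2 for a, b in zip(p, q)) <= r2 for q in other)


def overlap_ratio(cloud_a, cloud_b, radius):
    """Fraction of all points that have a neighbor within `radius` in the other cloud."""
    hits = sum(has_neighbor(p, cloud_b, radius) for p in cloud_a)
    hits += sum(has_neighbor(q, cloud_a, radius) for q in cloud_b)
    return hits / (len(cloud_a) + len(cloud_b))


MIN_OVERLAP = 0.10  # 10% threshold stated in the text


def enough_overlap(cloud_a, cloud_b, radius):
    """False means the pair is classified as misaligned without evaluating Q."""
    return overlap_ratio(cloud_a, cloud_b, radius) >= MIN_OVERLAP
```

Only the points counted as overlapping here would then enter the entropy averages of eq. (5).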

B. Dynamic radius selection and outlier rejection

For well-aligned point clouds, the quality measure is close to zero, meaning that the joint and separate point clouds have similar mean and probability distributions of per-point entropy as depicted in Fig. 5a. Unfortunately, the entropy in eq. (1) is ill-posed when the determinant $\det(\Sigma(p_k))$ is close to zero, where a small increase of the determinant causes a large increase of the entropy. Accordingly, the lowest measured entropies can increase (which indicates misalignment) even when joining well-aligned point clouds, as depicted in Fig. 5a. The ill-posed entropies are found where point density is low, typically for solitary points or far from the sensor, where the radius $r$ is not large enough to include points that represent the geometry in the environment. The effect of the ill-posed entropies can be mitigated by maximizing the ratio $Q_s = Q_{misaligned}(\mathcal{P}_a, \mathcal{P}_b) / Q_{aligned}(\mathcal{P}_a, \mathcal{P}_b)$. A larger ratio indicates that the measure is able to discriminate between aligned and misaligned point clouds. We propose three strategies to address the ill-posed entropies due to variations in sampling density originating from the sensor.

(1): Eq. (1) is modified to $h_i(p_k) = \frac{1}{2} \ln\big(2\pi e \det(\Sigma(p_k)) + \epsilon\big)$, where $\epsilon$ limits the lowest possible entropy. This makes sure that entropy is similar for points distributed along a line and a plane. The improvement can be seen by comparing Fig. 6(a-b).

(2): The radius $r$ is chosen based on the distance $d$ between the point $p_k$ and the sensor location, to account for the fact that point density decreases over distance. The radius is hence selected as $r = d \sin(\alpha)$ in the range $r_{min} < r < r_{max}$, where $\alpha$ is the vertical resolution of the sensor. For other sensor types, e.g. RGB-D, the resolution could be chosen similarly according to the angular sensing resolution. A dynamic radius enables the quality measure to include more points far from the sensor and correctly detect alignment and misalignment for these, as seen in Fig. 6(c).

(3): Remove $E_{reject}$ percent of the points $p_k$ with the lowest entropies. The effect is depicted in Fig. 6(d).
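The three strategies can be sketched as small helper functions. This is a hedged illustration only: the function names are hypothetical, clamping the radius to $[r_{min}, r_{max}]$ is one reading of "in the range", and the text does not specify which entropy values are ranked for rejection, so the last helper simply operates on a plain list.

```python
import math


def regularized_entropy(det_cov, eps):
    """Strategy (1): entropy with a small constant eps inside the logarithm."""
    return 0.5 * math.log(2.0 * math.pi * math.e * det_cov + eps)


def dynamic_radius(point, sensor_origin, alpha_rad, r_min, r_max):
    """Strategy (2): r = d * sin(alpha), clamped to [r_min, r_max] (assumption)."""
    d = math.dist(point, sensor_origin)
    return min(max(d * math.sin(alpha_rad), r_min), r_max)


def reject_lowest(entropies, reject_fraction):
    """Strategy (3): drop the given fraction of values with the lowest entropy."""
    n_drop = int(len(entropies) * reject_fraction)
    return sorted(entropies)[n_drop:]


# Example values for illustration only
alpha = math.radians(1.33)
print(dynamic_radius((20.0, 0.0, 0.0), (0.0, 0.0, 0.0), alpha, 0.3, 0.7))
print(regularized_entropy(0.0, 1e-8))
print(reject_lowest([0.5, -9.0, 1.2, -3.0, 0.1], 0.2))
```

With `eps > 0` the entropy stays finite even for a degenerate neighborhood whose covariance determinant is zero, which is the failure case that motivates strategy (1).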

C. Classification

We use logistic regression as a model for classification:

$$p = \frac{1}{1 + e^{-z}}, \qquad z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 \qquad (7)$$

$$y_{pred} = \begin{cases} \text{aligned} & \text{if } p \geq t_h \\ \text{misaligned} & \text{if } p < t_h, \end{cases} \qquad (8)$$

where $x_1$, $x_2$ are input variables (described for each method in section IV-A). Instead of passing the quality measure $x_1 = Q(\mathcal{P}_a, \mathcal{P}_b)$, $H_{sep}$ and $H_{joint}$ are passed separately as $x_1 = H_{joint}$ and $x_2 = H_{sep}$. $\beta_0$, $\beta_1$, $\beta_2$ are learned model parameters, $p$ is the class probability and $t_h$ is a class probability threshold that can be adjusted to the needs of the application. For example, in mobile robotics, it is desired that misaligned point clouds are not accidentally reported as aligned (false positives), potentially causing a system failure. In contrast, aligned point clouds classified as misaligned are typically harmless. For that reason, $t_h$ can be increased to reject false positives and hence improve robustness. We used the default threshold $t_h = 0.5$.
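A minimal sketch of the decision rule in eqs. (7) and (8) is given below. The coefficient values are hypothetical placeholders chosen for illustration, not learned parameters from the paper, and the function name is an assumption.

```python
import math


def predict_alignment(h_joint, h_sep, beta, threshold=0.5):
    """Apply eqs. (7)-(8) with x1 = H_joint and x2 = H_sep."""
    b0, b1, b2 = beta
    z = b0 + b1 * h_joint + b2 * h_sep
    p = 1.0 / (1.0 + math.exp(-z))  # class probability of "aligned"
    label = "aligned" if p >= threshold else "misaligned"
    return label, p


# Hypothetical coefficients: z = 0.5 - 4 * (H_joint - H_sep)
beta = (0.5, -4.0, 4.0)
print(predict_alignment(-3.98, -4.0, beta))                 # small entropy increase
print(predict_alignment(-3.0, -4.0, beta))                  # large entropy increase
print(predict_alignment(-3.98, -4.0, beta, threshold=0.7))  # stricter threshold
```

Raising `threshold` makes the classifier more conservative: borderline pairs are reported as misaligned, which trades some false negatives for fewer false positives as discussed above.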

IV. EVALUATION

We evaluate an equal portion of aligned and misaligned point clouds. Misaligned point clouds are created by adding an offset for each point cloud pair: an angular offset ($e_\theta = 0.57^\circ$) around the sensor’s vertical axis and a random translational $(x, y)$ offset at a distance ($e_d = 0.1$ m) from the ground truth. These errors are large enough to be meaningful to detect in various environments, yet challenging to classify.
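One way to induce such an error is sketched below. It assumes the point cloud is expressed in the sensor frame, so that the vertical axis passes through the origin, and it fixes the sign of the rotation; neither detail is specified in the text. The function name and the `rng` parameter are hypothetical.

```python
import math
import random


def induce_misalignment(cloud, angle_deg=0.57, distance=0.1, rng=random):
    """Rotate about the z axis and translate by `distance` in a random (x, y) direction."""
    theta = math.radians(angle_deg)
    phi = rng.uniform(0.0, 2.0 * math.pi)  # random direction of the offset
    tx, ty = distance * math.cos(phi), distance * math.sin(phi)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty, z) for x, y, z in cloud]


rng = random.Random(1)
moved = induce_misalignment([(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)], rng=rng)
print(moved)
```

The transform is rigid, so distances between points are preserved, and the origin is displaced by exactly `distance` in the horizontal plane.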

A. Evaluated methods

The evaluated methods are summarized here together with their most important parameters.

a) MME: Mean Map Entropy as proposed by Droeschel and Behnke [30], summarized in eq. (2). The parameter is the radius $r$ for associating points.

b) CorAl (proposed in the paper): Separate and registered entropy $H_s$, $H_j$ as described in eqs. (3) and (4). Parameters are $r_{min}$, $r_{max}$ and $\alpha$ to determine the radius for nearby points, $E_{reject}$ to set the outlier rejection ratio, and $\epsilon$.

c) CorAl-Median (proposed in the paper): $H_s$, $H_j$ are modified to calculate the median entropy rather than the mean entropy. We hypothesize that this modification can be more robust. The parameters are unchanged.

d) NDT (point-to-distribution normal-distributions transform): The method uses the 3D NDT [32] representation similarly to Almqvist et al. [19] (NDT3), which constructs a voxel grid over one point cloud, and computes a Gaussian function based on the points in each voxel. The likelihood of finding the points in $\mathcal{P}_b$, given the NDT representation of $\mathcal{P}_a$, is computed as

$$s = \frac{\sum_{k=1}^{n} \tilde{p}(p_k)}{n}, \qquad (9)$$

where $n$ is the number of overlapping points, defined as those points which fall in an occupied NDT voxel or in a voxel that is a direct neighbor of an occupied voxel, and $\tilde{p}$ is the probability density function associated with the nearest overlapping NDT cell. The most important parameter for NDT is the voxel size $v$, which is set equal to $2r$ in our evaluation so that the sample covariances of the NDT cells and the entropy are computed from points in similarly large volumes.

e) Rel-NDT (proposed in the paper): We wanted to investigate whether entropy can be used to improve the generalization of NDT to different environments. The idea is that the environment type is reflected in the average entropy of the scene and can be combined with the NDT score to improve classification. We did this by computing the average entropy of all NDT covariances associated with $p_k$ in the point-likelihood terms and feeding that together with the NDT score (9) to the classifier. No additional parameters to NDT are required.

Fig. 6: Joint point cloud colored by per-point quality $q_k$, ranging from blue (aligned) to red (misaligned). The location of the point cloud origin is highlighted in Fig. 7. Columns depict the same parameters for aligned (top, a-d) and misaligned (bottom, e-h) point clouds. (a,e): A fixed radius $r = 0.3$ m gives $Q_s = 1.46$. (b,f): Radius dynamically adjusted with $r_{min} = 0.3$ m, $\alpha = 1.33$, $r_{max} = 0.7$ gives $Q_s = 2.93$. (c,g): $\epsilon = 10^{-8}$ is added ($Q_s = 4.06$). (d,h): $E_{reject} = 10\%$ is added ($Q_s = 4.3$).

Fig. 7: Data acquired by a truck in a warehouse environment. The sensor trajectory is drawn in red. The environment in the figure is 50 m × 50 m and the sequence length is 102 m. In the first segment of the trajectory, starting at the bottom left, the walls are clearly visible. The final segment is located between aisles where walls are typically out of sight and the sensor observes complex structures such as shelves. The truck traverses a rough floor with height differences, causing vibrations on the sensor.

f) FuzzyQA: FuzzyQA [11] measures the alignment quality by a ratio $\rho = \frac{AFCCD}{AFPCD}$, where AFCCD and AFPCD are two indexes describing the points’ disposition and dispersion around fuzzy cluster centers. The two point clouds are coarsely aligned if $\rho < 1$. However, AFCCD and AFPCD are passed separately to the classifier inputs $x_1$, $x_2$.

g) Input to the classifier: CorAl, FuzzyQA and Rel-NDT output two decision variables that are passed as input variables $x_1$, $x_2$ to the classifier (III-C). The other evaluated methods output a single variable $x_1$, and $x_2 = 0$ is fixed.

B. Qualitative evaluation, live robot data

First, we present qualitative results from real-world data in a structured warehouse environment. A forklift equipped with a Velodyne HDL-32E spinning laser scanner was manually driven at fast walking speed in the environment depicted in Fig. 7. The environment in the sequence varies from large and open with visible walls, to small and narrow between aisles of pallets. To generate ground-truth alignments for the warehouse dataset, we first aligned the point clouds using a scan-to-map approach [6]. We then inspected the alignment between subsequent scans and found that at least 40/484 (8.3%) point clouds were impaired by rigid misalignments or non-rigid distortions from vibrations and motion to the extent that these could be easily located visually.

Alignment classification was then performed on the remaining scans by inducing errors as described in section IV. We used the following parameters as they provided a relatively high value of $Q_s$ for the first scan pair in the dataset: $\alpha = 0.92^\circ$, $E_{reject} = 0.2$, $r_{min} = 0.2$, $r_{max} = 1.0$ and voxel size $v = 2 r_{min} = 0.4$. We found that CorAl-mean, MME and NDT reached an accuracy of 96%, 70% and 99% respectively. In this case, NDT performs slightly better than CorAl. We believe that CorAl is more sensitive to the typical alignment noise that is still present in the aligned scans. This typical alignment noise introduces a variance in the CorAl score and makes it hard to train a classifier that is sensitive to small misalignments. Whether this is desired behavior depends on the application.

C. Quantitative evaluation, ETH benchmark data set

Our main quantitative evaluation is done using the public ETH registration dataset [33]. This dataset includes 3 sequences in structured (blue) environments (Apartments, ETH Hauptgebaude, Stairs), 3 sequences in semi-structured (brown) environments (Gazebo in summer, Gazebo in winter, Mountain plain) and 2 challenging sequences in unstructured (green) environments (Wood in summer, Wood in autumn). Each sequence contains between 31 and 47 scans acquired from stationary positions. The dataset contains accurate ground truth positions, required to evaluate the different methods. In order to make the evaluation fairer, more realistic and applicable to real applications, we downsample the original, dense, point clouds using a voxel grid of 0.08 m. As the dataset has less variation in sampling density compared to the warehouse dataset, we used a fixed radius $r = 0.3$ and set $E_{reject} = 20\%$, $\epsilon = 0$. The NDT voxel size was set equal to the diameter, $v = 2r = 0.6$, to create a fair comparison.

Fig. 8: Separate training. The overall accuracy was CorAl: 98%, CorAl-median: 98%, FuzzyQA: 53%, MME: 77%, NDT: 78%, Rel-NDT: 80%.

a) Performance: CorAl has an overall run-time of 0.246 ± 0.095 seconds per point cloud pair on an Intel Core i7; the run-time depends on the point cloud density.

1) Separate training: The first test evaluates the capability to learn classification in a specific type of environment and serves as a reference for further evaluations. The classifiers were trained and evaluated on each sequence separately, using 5-fold cross validation.
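For readers unfamiliar with the protocol, 5-fold cross validation can be sketched as below. This is a generic illustration, not the evaluation code used in the paper; the function name, the shuffling, and the seed are assumptions.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists so that each sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


for train, test in k_fold_indices(10, k=5):
    print(len(train), len(test))
```

The classifier is fitted on each `train` split and scored on the matching `test` split, and the reported accuracy is the average over the five folds.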

Results are shown in Fig. 8. We found that all methods except FuzzyQA performed well on the structured environments. We did not expect FuzzyQA to handle this, as it is specifically designed to classify coarse alignment. Surprisingly, even MME scored 90–100% on the structured environments. This indicates that even naive methods can assess alignment quality in a highly structured environment. In the semi-structured and unstructured sequences, only CorAl and CorAl-median performed well, with consistently >90% accuracy, even in the most challenging sequences. All other methods are only slightly better than random, except for the gazebo sequences. Rel-NDT improves on NDT in most cases, although not consistently. We believe this is because entropy alone provides little information about the environment. This is supported by the low overall accuracy of MME. Both NDT methods performed decently (77–90%) in the gazebo sequences, indicating that NDT requires at least some structure or surfaces free from foliage to be effective as an alignment correctness measure.

2) Joint training: The second test evaluates how well the methods are able to learn alignment classification when trained in a variety of environments. To do that, the methods need to be versatile. Training was performed on all the ETH sequences; evaluation was then performed on each sequence individually. The results are shown in Fig. 9. The accuracy of all classifiers decreased compared to the previous test. CorAl performed best, with accuracy 85–100% in all cases. CorAl-median reached a slightly lower accuracy compared to CorAl. Rel-NDT performed better than NDT in most cases, although not consistently. The generally high accuracy of CorAl indicates that it is possible to find general parameters that make the method valid in various environments.

Fig. 9: Joint training. Overall accuracy: CorAl: 96%, CorAl-median: 95%, FuzzyQA: 52%, MME: 60%, NDT: 75%, Rel-NDT: 78%.

Fig. 10: Evaluation on unseen environments. Overall accuracy: CorAl: 83%, CorAl-median: 79%, FuzzyQA: 50%, MME: 54%, NDT: 72%, Rel-NDT: 72%. In structured and semi-structured environments: CorAl: 95%, CorAl-median: 88%, FuzzyQA: 50%, MME: 56%, NDT: 78%, Rel-NDT: 79%.

3) Generalization to unseen environments: The final test evaluates how classifiers perform in environments with different characteristics than those observed in the training set. We trained and evaluated on different sequences and environments. The 3 structured environments were used for training and the remaining 5 (semi-structured and unstructured) were used for evaluation, and vice versa. The classification accuracy is depicted in Fig. 10. When trained on structured and evaluated on semi-structured environments, CorAl performed accurately (85–98%) and other methods performed close to random, except NDT for Gazebo summer (74%). No method generalized well from structured to unstructured environments. On the other hand, learning from semi-structured and unstructured environments was enough to afford very high accuracy in structured environments with CorAl, very close to what was attained with joint training on all sequences. The previous joint evaluation shows that it is possible to train a model that is simultaneously accurate in all environment types. For that reason, we believe that the classifier trained in a structured environment does not generalize to an unstructured environment because the model overfits when not using sufficiently diverse and challenging data.

V. CONCLUSIONS

In this paper we introduced CorAl, a principled and intuitive measure of alignment correctness between point clouds. Using a dual entropy measurement that compares the expected entropy found in the separate point clouds with the actual entropy, CorAl can measure point cloud alignment correctness and substantially outperforms previous methods when evaluated on a public data set. Specifically, we were able to use CorAl to train a classifier based on logistic regression that is simultaneously accurate in a diverse range of environments. Our experiments show that our method generalizes well (i) from unstructured and semi-structured to structured environments, and (ii) from structured to semi-structured environments. None of the evaluated methods generalized well from structured to unstructured environments. Therefore we conclude it is possible to train a general and accurate alignment classifier given that training data is sufficiently diverse. A relatively modest accuracy of 96% was achieved on live data. We think that the poor quality of the ground truth (obtained by lidar odometry and manual inspection) causes high variance in the CorAl score. The score is sensitive to small misalignments; therefore, a higher quality ground truth is required to make a fair evaluation. We believe that CorAl per-point quality and classification can be a useful tool for alignment evaluation and can improve robustness in various perception tasks by serving as a fault detection step.

In the future we will investigate how to automatically learn sensor-specific parameters or use the range image to find neighbouring points for covariance computation. This could address variations in point density owing to different sensors and environment scales.

REFERENCES

[1] D. Adolfsson, S. Lowry, M. Magnusson, A. Lilienthal, and H. An-

dreasson, “A Submap per Perspective - Selecting Subsets for SuPer

Mapping that Afford Superior Localization Quality,” in 2019 European

Conference on Mobile Robots (ECMR), Sept. 2019, pp. 1–7.

[2] B. Della Corte, H. Andreasson, T. Stoyanov, and G. Grisetti, “Uniﬁed

Motion-Based Calibration of Mobile Multi-Sensor Platforms With

Time Delay Estimation,” IEEE Robotics and Automation Letters,

vol. 4, no. 2, pp. 902–909, Apr. 2019.

[3] A. C. M. Tavares, F. J. Lawin, and P. Forss´

en, “Assessing losses for

point set registration,” IEEE Robotics and Automation Letters, vol. 5,

no. 2, pp. 3360–3367, 2020.

[4] J. Zhang and S. Singh, “LOAM: Lidar odometry and mapping in real-

time,” in Robotics: Science and Systems, 2014.

[5] S. Nobili, G. Tinchev, and M. Fallon, “Predicting alignment risk to

prevent localization failure,” in 2018 IEEE International Conference

on Robotics and Automation (ICRA), 2018, pp. 1003–1010.

[6] H. Andreasson, D. Adolfsson, T. Stoyanov, M. Magnusson, and A. J.

Lilienthal, “Incorporating ego-motion uncertainty estimates in range

data registration,” in 2017 (IROS), Sep. 2017, pp. 1389–1395.

[7] P. J. Besl and N. D. McKay, “A method for registration of 3-d shapes,”

IEEE TPAMI, vol. 14, no. 2, pp. 239–256, Feb 1992.

[8] A. Segal, D. Haehnel, and S. Thrun, “Generalized-ICP,” in Robotics:

Science and Systems V. Robotics: Science and Systems Foundation,

jun 2009. [Online]. Available: https://doi.org/10.15607%2Frss.2009.v.021

[9] S. Billings and R. Taylor, “Generalized iterative most likely oriented-

point (g-imlop) registration,” International journal of computer as-

sisted radiology and surgery, vol. 10, 05 2015.

[10] N. Tustison, S. Awate, G. Song, T. Cook, and J. Gee, “Point set

registration using havrda–charvat–tsallis entropy measures,” IEEE

transactions on medical imaging, vol. 30, pp. 451–60, 10 2010.

[11] Q. Liao, D. Sun, and H. Andreasson, “Point set registration for 3d

range scans using fuzzy cluster-based metric and efﬁcient global

optimization,” IEEE Transactions on Pattern Analysis and Machine

Intelligence, vol. 43, no. 9, pp. 3229–3246, 2021.

[12] Y. Aoki, H. Goforth, R. A. Srivatsan, and S. Lucey, “Pointnetlk: Robust

& efﬁcient point cloud registration using pointnet,” in Proceedings of

the IEEE Conference on Computer Vision and Pattern Recognition,

2019, pp. 7163–7172.

[13] G. D. Evangelidis and R. Horaud, “Joint alignment of multiple point

sets with batch and incremental expectation-maximization,” IEEE

Transactions on Pattern Analysis and Machine Intelligence, vol. 40,

no. 6, pp. 1397–1410, 2018.

[14] S. Bouaziz, A. Tagliasacchi, and M. Pauly, “Sparse iterative closest

point,” in Proceedings of the Eleventh Eurographics/ACMSIGGRAPH

Symposium on Geometry Processing. Eurographics Association,

2013, pp. 113–123.

[15] T. Stoyanov, M. Magnusson, H. Andreasson, and A. J. Lilienthal, “Fast

and accurate scan registration through minimization of the distance

between compact 3D NDT representations,” The International Journal

of Robotics Research, vol. 31, no. 12, pp. 1377–1393, 2012.

[16] S. Rusinkiewicz and M. Levoy, “Efficient variants of the ICP algorithm,” in The

Third International Conference on 3D Digital Imaging and Modeling,

2001, pp. 145–152.

[17] B. Eckart, K. Kim, A. Troccoli, A. Kelly, and J. Kautz, “Mlmd:

Maximum likelihood mixture decoupling for fast and accurate point

cloud registration,” in 2015 International Conference on 3D Vision,

2015, pp. 241–249.

[18] I. Bogoslavskyi and C. Stachniss, “Analyzing the quality of matched

3d point clouds of objects,” 2017 IEEE/RSJ International Conference

on Intelligent Robots and Systems (IROS), pp. 6685–6690, 2017.

[19] H. Almqvist, M. Magnusson, T. P. Kucner, and A. J. Lilienthal, “Learn-

ing to detect misaligned point clouds,” Journal of Field Robotics,

vol. 35, no. 5, pp. 662–677, 2018.

[20] L. Silva, O. R. Bellon, and K. L. Boyer, “Precision range image regis-

tration using a robust surface interpenetration measure and enhanced

genetic algorithms,” TPAMI, vol. 27, no. 5, pp. 762–776, May 2005.

[21] D. Landry, F. Pomerleau, and P. Giguère, “CELLO-3D: Estimating

the covariance of ICP in the real world,” in IEEE (ICRA), May 2019,

pp. 8190–8196.

[22] O. Bengtsson and A.-J. Baerveldt, “Robot localization based on scan-

matching—estimating the covariance matrix for the IDC algorithm,”

Robotics and Autonomous Systems, vol. 44, no. 1, pp. 29–40, 2003.

[23] J. Nieto, T. Bailey, and E. Nebot, “Scan-SLAM: Combining EKF-

SLAM and scan correlation,” in Field and Service Robotics, ser.

Springer Tracts in Advanced Robotics, P. Corke and S. Sukkariah,

Eds. Springer, 2006, pp. 167–178.

[24] S. M. Prakhya, L. Bingbing, Y. Rui, and W. Lin, “A closed-form

estimate of 3d ICP covariance,” in 2015 14th IAPR (MVA), 2015, pp.

526–529.

[25] A. Censi, “An accurate closed-form estimate of ICP’s covariance,” in

IEEE (ICRA), apr 2007, pp. 3167–3172.

[26] M. Magnusson, “The three-dimensional normal-distributions transform

— an efﬁcient representation for registration, surface analysis, and

loop detection,” Ph.D. dissertation, Örebro University, Dec. 2009,

Örebro Studies in Technology 36.

[27] P. Biber and W. Strasser, “The normal distributions transform: a new

approach to laser scan matching,” in Proceedings 2003 IEEE/RSJ

(IROS 2003), vol. 3, Oct. 2003, pp. 2743–2748 vol.3.

[28] M. Chandran-Ramesh and P. Newman, “Assessing map quality and

error causation using conditional random ﬁelds,” IFAC Proceedings

Volumes, vol. 40, no. 15, pp. 463–468, Jan. 2007.

[29] A. Makadia, A. Patterson, and K. Daniilidis, “Fully automatic registra-

tion of 3d point clouds,” in 2006 IEEE Computer Society Conference

on Computer Vision and Pattern Recognition (CVPR’06), vol. 1, June

2006, pp. 1297–1304.

[30] D. Droeschel and S. Behnke, “Efﬁcient continuous-time slam for 3d

lidar-based online mapping,” in ICRA, May 2018, pp. 1–9.

[31] G. Darbellay and I. Vajda, “Entropy expressions for multivariate

continuous distributions,” Information Theory, IEEE Transactions on,

vol. 46, pp. 709 – 712, 04 2000.

[32] M. Magnusson, A. J. Lilienthal, and T. Duckett, “Scan registration

for autonomous mining vehicles using 3D-NDT,” Journal of Field

Robotics, vol. 24, no. 10, pp. 803–827, Oct. 2007.

[33] F. Pomerleau, M. Liu, F. Colas, and R. Siegwart, “Challenging data

sets for point cloud registration algorithms,” IJRR, vol. 31, no. 14, pp.

1705–1711, Dec. 2012.