Robust SLAM in large-scale environments requires fault resilience and awareness at multiple stages, from sensing and odometry estimation to loop closure. In this work, we present TBV (Trust But Verify) Radar SLAM, a method for radar SLAM that introspectively verifies loop closure candidates. TBV Radar SLAM achieves a high correct-loop-retrieval rate by combining multiple place-recognition techniques: tightly coupled place similarity and odometry uncertainty search, creating loop descriptors from origin-shifted scans, and delaying loop selection until after verification. Robustness to false constraints is achieved by carefully verifying and selecting the most likely ones from multiple loop constraints. Importantly, the verification and selection are carried out after registration, when additional sources of loop evidence can easily be computed. We integrate our loop retrieval and verification method with a fault-resilient odometry pipeline within a pose graph framework. By evaluating on public benchmarks, we found that TBV Radar SLAM achieves 65% lower error than the previous state of the art. We also show that it generalizes across environments without any parameter changes.
TBV Radar SLAM: trust but verify loop candidates
Daniel Adolfsson, Mattias Karlsson, Vladimír Kubelka, Martin Magnusson, Henrik Andreasson
Robust localization is key to enabling safe and reliable
autonomous systems. Achieving robustness requires careful
design at multiple stages of a localization pipeline, from
environment-tolerant sensing and pose estimation, to place
recognition and pose refinement. At each stage, a localization
and mapping pipeline should be designed for fault awareness
to detect failures and fault resilience to mitigate failures as
they inevitably occur. Today, active exteroceptive sensors
such as lidar and radar are suitable when robust uninterrupted
localization is required. Of these two, radar perception is
significantly less affected when operating within dusty envi-
ronments or under harsh weather conditions. It is currently
debated how the sensing properties affect localization and
mapping performance [1], [2]. Compared to lidar, limited
work focuses on robust and accurate radar SLAM, and none
of the existing methods include introspective fault detection.
The authors are with the MRO lab of the AASS research centre at ¨
University, Sweden. E-mail:
This work has received funding...
Fig. 1: Overview and demonstration of TBV Radar SLAM
(legend: ground truth, estimated trajectory, loop closure).
In this letter, we propose TBV (Trust But Verify) Radar
SLAM, a 2D pose-graph localization and mapping pipeline
that integrates fault resilience and fault-aware robustness
at multiple stages. A radar-only odometry front-end adds
pose nodes to the graph. In parallel, a robust loop-detection
module adds loop closure constraints such that the SLAM
back-end can optimize the graph to correct drift. TBV Radar
SLAM uses radar odometry (hereafter simply odometry), place
descriptors, and scan alignment measures to retrieve, verify,
and select between loop constraints. We achieve a high
correct-loop-retrieval rate by combining a tightly coupled
place similarity and odometry uncertainty search, place
descriptors computed from origin-shifted scans, and delayed
loop selection until after verification. Robustness,
with a high loop detection rate, is achieved by unifying
the process of place recognition and verification. Rather
than rejecting candidates early during place recognition, we
register and verify multiple loop constraints in parallel. Our
verification combines place similarity, odometry consistency,
and an alignment quality assessment automatically learned
from odometry and scans. We integrate our loop retrieval and
verification module with a robust method for radar odometry
into a full-fledged SLAM pipeline, visualized in Fig. 1. The
contributions of this letter are:
• A combination of techniques for a high rate of correct
  loop retrievals, including: coupled place similarity and
  odometry uncertainty search, creating place descriptors
  from origin-shifted scans, and selection between multiple
  competing loop constraints.
• A unified loop retrieval and verification step that jointly
  considers place similarity, odometry uncertainty, and
  alignment quality after registration. Multiple loop
  candidates are retrieved, registered, and verified.
• We integrate these techniques with a robust odometry
  estimator into a SLAM framework that pushes the state
  of the art in radar SLAM accuracy while generalizing
  between environments without parameter tuning.
arXiv:2301.04397v1 [cs.RO] 11 Jan 2023
Fig. 2: Overview of TBV Radar SLAM. The main contribution is
the Loop retrieval and verification module. Blocks shown: input
radar data, odometry estimation (Sec. III-A), place recognition
(Sec. III-B), verification of loop candidates (Sec. III-D), and
pose graph optimization.
Early methods for radar SLAM employ filtering [3] and
landmark graph SLAM [4]. We instead use pose-graph
SLAM based on odometry and loop constraints obtained by
registering scans. Holder et al. [5] proposed a pose graph
SLAM based on automotive radar, using GLARE [6] for
place recognition. Their method uses gyroscope and wheel
odometry to provide an initial alignment guess to Iterative
Closest Point (ICP) for scan matching. Our method uses a
pose graph back-end but relies solely on a spinning radar.
Recent progress in the development of spinning 2D radar
has quickly inspired numerous works on radar odometry
estimation [1], [7]–[19], place recognition [11], [20]–[25],
topological localization [2], [21], [25], localization in
overhead imagery [26], [27], and SLAM [1], [10], [28]. Most
similar to ours is the preliminary work by Wang et al. [28],
and the seminal work on radar SLAM by Hong et al. [1].
Hong et al. [1] estimate odometry and register loop
constraints using KLT tracker keypoints. For place recognition,
they use adaptive thresholding to filter data, and M2DP [29]
to compute and match descriptors. The trajectory is corrected
using pose graph optimization. Our work brings a larger
focus on the introspective verification of loop constraints.
Martini et al. [21] proposed a teach-and-repeat localization
framework using a hierarchical approach. First, a place
candidate is retrieved via nearest neighbor search within
a learned metric space using the method by Saftescu et
al. [20]. Second, sensor pose is estimated (without the need
for an initial guess) by finding and registering the (globally)
optimal set of landmark matches as described by Cen et
al. [7]. Unlike [21], we use a fast local registration method,
motivated by the availability of an initial alignment guess
from the place recognition module. However, we carefully verify
registration success before accepting new loop constraints.
Verification of pose estimates is essential for safety. For
example, such tests could have prevented a reported accident
that occurred during an automated driving test [30]. Holder
et al. [5] verify the detected loop candidates using radar
data by the condition that ICP residuals cannot exceed a
set threshold. Rather than verifying loops as a final step
(separated from loop detection), we unify loop retrieval
and verification. Some works [21], [31] use the landmark
matching compatibility matrix [7] to assess quality. Martini
et al. [21] reject place candidates based on the quality
score of the matrix. Aldera et al. [31] learn detection of
pose estimate failures from features extracted from the
eigenvectors of the compatibility matrix. Training labels are
provided by an external ground truth system. We build on
the method CorAl [32] to detect false loop candidates or
constraints. In the lidar domain, Behley et al. [33] accept
only reappearing loop candidates that are consistent with
odometry over multiple consecutive scans. We instead focus
on verification using only individual scans.
Some methods attempt to measure how observations
would constrain registration in the current scene; a low
level of constraints in one direction suggests registration
could be unstable. The measure has been used to predict the
risk of registration failure [34], [35], abstain from closing
loops when risk is high [1], or prioritize the inclusion of
loop closures accordingly [36]. We instead use odometry
uncertainty as a prior, which we combine with additional
loop evidence computed after registration. Finally, a range of
methods aims at verifying pose estimates by detecting point
cloud misalignment [30], [32], [37]. We fuse misalignment
detection [32] together with place similarity and odometry
uncertainty to achieve robust unified loop detection and
verification.
An overview of TBV Radar SLAM is presented in Fig. 2.
In this section, we detail the components including CFEAR
Radar Odometry (Sec. III-A), place recognition (Sec. III-
B), verification of loop candidates (Sec. III-D), and pose
graph optimization (Sec. III-E). The self-supervised training
of alignment verification (Sec. III-C) runs once during a
learning phase and is not required for online SLAM.
A. CFEAR Radar Odometry
We use the radar odometry pipeline CFEAR Radar Odometry [17]
(specifically the CFEAR-3 configuration). This method takes
raw polar radar sweeps as input and estimates sensor pose and
velocity. As an intermediate step, the method computes a set
of radar peaks $\mathcal{P}^t$ and a set of oriented surface
points $\mathcal{M}^t$ at time $t$. These representations are
referred to in the later stages of our pipeline. CFEAR filters
the radar data by keeping the k-strongest range bins per
azimuth. The filtered point set is compensated for motion
distortion and transformed into Cartesian space; this is also
the point set from which we extract radar peaks
($\mathcal{P}^t$). A grid method is then used to compute a set
of oriented surface points $\mathcal{M}^t$ from the filtered
set. Odometry is estimated by finding the pose $\mathbf{x}^t$
in SE(2) that minimizes the sum of surface point distances
between the current scan $\mathcal{M}^t$ and the
$|\mathcal{K}|$ latest keyframes $\mathcal{M}^{\mathcal{K}}$
jointly as
$$f(\mathcal{M}^{\mathcal{K}}, \mathcal{M}^t, \mathbf{x}^t) = \sum_{k \in \mathcal{K}} \sum_{\forall \{i,j\} \in \mathcal{C}_{corr}} w_{i,j}\, \rho\big(g(m_j^k, m_i^t, \mathbf{x}^t)\big), \tag{1}$$
where $w_{i,j}$ are surface point similarity weights, $\rho$ is
the Huber loss function, and $g$ is the pairwise distance
between surface points in the correspondence set
$\mathcal{C}_{corr}$. A keyframe $k \in \mathcal{K}$ is created
when the distance to the previous keyframe exceeds 1.5 m. This
technique reduces drift and removes excessive scans acquired
when the robot remains stationary. Upon creation of a new
keyframe, odometry constraints are created from the relative
alignment between the current pose and the latest keyframe.
Odometry constraints are added to $\mathcal{C}_{odom}$ and
used to correct the trajectory (via pose graph optimization)
as detailed in Sec. III-E.
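The k-strongest filtering step can be illustrated with a minimal sketch. It is not the CFEAR implementation: the function name, the default `k`, and the intensity floor `z_min` are assumptions chosen for illustration only.

```python
import numpy as np

def k_strongest(polar, k=12, z_min=60.0):
    """Keep the k strongest range bins per azimuth that reach z_min.

    polar: (n_azimuths, n_range_bins) array of radar intensities.
    k and z_min are illustrative defaults (assumptions).
    Returns a list of (azimuth_index, range_bin, intensity) tuples.
    """
    polar = np.asarray(polar, dtype=float)
    kept = []
    for a, row in enumerate(polar):
        # Indices of the k largest intensities in this azimuth.
        top = np.argsort(row)[::-1][:k]
        for r in sorted(top):
            if row[r] >= z_min:
                kept.append((a, int(r), float(row[r])))
    return kept
```

The surviving bins would then be motion-compensated and converted to Cartesian coordinates before peak and surface point extraction.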
B. Place recognition
We attempt to unify the stages of loop closure, from
detection to constraint verification. Accordingly, we retrieve,
register, and verify multiple candidates, of which one is
selected. For that reason, we do not discard potential loop
candidates based on place similarity. Instead, we trust
multiple candidates to be meaningful until verified. At this
point, we select the verifiably best loop candidate, if one
exists.
1) Scan Context: We build upon the place recognition
method Scan Context by Giseop Kim et al. [38], [39]. Their
method detects loops and relative orientation by matching
the query ($q$) scan with candidates ($\{c\}$) stored in a
database using scan descriptors. Scenes are encoded by their
polar representations into 2D descriptors
$\mathbf{I}_{ring \times sec}$. The 2D descriptor is in turn
used to create a rotation-invariant 1D descriptor (ring key)
via a ring encoding function. Loops are detected via a
two-step search: First, the query 1D ring key is matched with
the top 10 candidates via a fast nearest neighbor (NN) search.
Second, for each of the top candidates, a sparse optimization
finds the relative rotation that minimizes a distance metric
$d_{sc}(\mathbf{I}^q, \mathbf{I}^c)$: the sum of cosine
distances between descriptor columns. The candidate $c$ with
the lowest score ($d_{sc}$) that does not exceed the fixed
threshold $\tau$ is selected as the loop candidate:
$$c = \underset{c \in \{c\}}{\operatorname{argmin}}\; d_{sc}(\mathbf{I}^q, \mathbf{I}^c), \quad \text{s.t. } d_{sc} < \tau. \tag{2}$$
In our implementation, query descriptors are created,
matched, and stored in the database for each keyframe rather
than for each scan.
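As a rough illustration of the two descriptors involved, the sketch below computes a ring key and a column-wise cosine distance minimized over sector shifts. It is a simplification, not the Scan Context reference code: it searches all shifts exhaustively instead of using a sparse optimization, and it averages rather than sums the column distances. Function names are hypothetical.

```python
import numpy as np

def ring_key(desc):
    """Rotation-invariant 1D descriptor: mean of each ring (row)."""
    return desc.mean(axis=1)

def sc_distance(desc_q, desc_c):
    """Mean column-wise cosine distance, minimized over sector shifts.

    Returns (distance, shift), where shift is the number of sectors the
    candidate was rolled by to best match the query.
    """
    best_d, best_shift = np.inf, 0
    for shift in range(desc_c.shape[1]):
        rolled = np.roll(desc_c, shift, axis=1)
        num = (desc_q * rolled).sum(axis=0)
        den = np.linalg.norm(desc_q, axis=0) * np.linalg.norm(rolled, axis=0)
        valid = den > 0  # skip all-zero columns
        if not valid.any():
            continue
        d = float(np.mean(1.0 - num[valid] / den[valid]))
        if d < best_d:
            best_d, best_shift = d, shift
    return best_d, best_shift
```

The returned shift corresponds to a relative rotation in units of one sector, which is what makes it usable as an initial orientation guess for registration.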
2) Descriptor generation: As described in [28], [40],
a raw polar representation, such as that produced by a
spinning 2D radar, can be used directly as a Scan Context
descriptor. However, we believe that doing so poses multiple
challenges, including sensor noise, motion distortion, scene
dynamics, and translation sensitivity. Thus, we create our
descriptor from multiple filtered and motion-compensated
scans. Conveniently, such processed scans are already
provided by CFEAR. We aggregate the peak detections from
keyframe $q$ and its two neighbours in the odometry frame.
Having the radar scan represented as a sparse point cloud
in Cartesian space allows us to address translation sensitivity
in place recognition by applying the data augmentation step
(Augmented PC) from [39] before computing place descriptors.
We perform data augmentation by shifting the sensor
origin, i.e. by transforming $\hat{\mathcal{P}}^q$ with ±2 and
±4 m lateral translation offsets. The original and the 4
augmented point clouds are each used to compute and match
descriptors, after which the best matching pair of
query/candidate is selected. Note that by using the aggregated
sparse point cloud, rather than the dense raw radar scan, we
can efficiently compute all augmentations and corresponding
descriptors. As such, the main computational load from the
augmentation technique is due to the matching of additional
descriptors and not their computation. The descriptor itself
is created by populating the Scan Context $\mathbf{I}$ with
radar intensity readings. Specifically, for each grid cell
$\mathbf{I}(i, j)$ we sum the intensities of all coinciding
radar peaks and divide by 1000. Empty cells are set to
$\mathbf{I}(i, j) = 1$, which we found increased
descriptiveness compared to $\mathbf{I}(i, j) = 0$.
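A minimal sketch of descriptor population and origin-shift augmentation follows. The number of sectors, the maximum range, and the choice of the y-axis as the lateral direction are assumptions; only the ring count (40), the division by 1000, the empty-cell value, and the ±2/±4 m offsets come from the text.

```python
import numpy as np

def make_descriptor(points, intensities, n_ring=40, n_sec=120,
                    max_range=80.0, empty_value=1.0):
    """Fill a ring x sector grid with summed peak intensities / 1000.

    n_sec and max_range are illustrative assumptions.
    """
    points = np.asarray(points, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    r = np.hypot(points[:, 0], points[:, 1])
    theta = np.mod(np.arctan2(points[:, 1], points[:, 0]), 2 * np.pi)
    inside = r < max_range
    ring = np.minimum((r[inside] / max_range * n_ring).astype(int), n_ring - 1)
    sec = np.minimum((theta[inside] / (2 * np.pi) * n_sec).astype(int), n_sec - 1)
    sums = np.zeros((n_ring, n_sec))
    counts = np.zeros((n_ring, n_sec), dtype=int)
    # Unbuffered accumulation so repeated cells are summed correctly.
    np.add.at(sums, (ring, sec), intensities[inside] / 1000.0)
    np.add.at(counts, (ring, sec), 1)
    return np.where(counts > 0, sums, empty_value)

def origin_shifted(points, offsets=(0.0, -4.0, -2.0, 2.0, 4.0)):
    """Copies of the cloud with the sensor origin shifted laterally.

    Assumes the lateral direction is the sensor y-axis. Moving the origin
    by +d along y moves every point by -d along y.
    """
    points = np.asarray(points, dtype=float)
    return [points - np.array([0.0, d]) for d in offsets]
```

Because the input is a sparse point cloud, each augmented descriptor costs only one more binning pass, which is consistent with the observation that matching, not descriptor computation, dominates the augmentation cost.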
3) Coupled odometry/appearance matching: When retrieving
loop candidates, odometry information can be used
to filter unlikely candidates. This could be done by rejecting
unlikely loop constraints, for example, if the likelihood
of the loop constraint $\mathbf{x}^{q,c}_{loop}$ (given the
estimated odometry trajectory $\mathbf{v}^{c:q}_{odom}$ between
$c$ and $q$) is close to zero:
$$p(\mathbf{x}^{q,c}_{loop} \mid \mathbf{v}^{c:q}_{odom}) \approx 0. \tag{3}$$
While this strategy may provide higher tolerance to spatial
aliasing by rejecting false positives, it does not provide
means to detect the correct candidate under such
circumstances. For that reason, we propose a coupled place
similarity / odometry uncertainty search, which combines Eq. 2
and Eq. 3. Candidates are thus selected jointly by the
similarity of appearance $d_{sc}(\mathbf{I}^q, \mathbf{I}^c)$
and the similarity of odometry $d^{q,c}_{odom}$:
$$c = \underset{c \in \{c\}}{\operatorname{argmin}}\; d_{sc}(\mathbf{I}^q, \mathbf{I}^c) + d^{q,c}_{odom}, \quad d^{q,c}_{odom} = 1 - p(\mathbf{x}^{q,c}_{loop} \mid \mathbf{v}^{c:q}_{odom}). \tag{4}$$
We estimate
$$p(\mathbf{x}^{q,c}_{loop} \mid \mathbf{v}^{c:q}_{odom}) = \exp\left(-\frac{t_{err}^2}{2\sigma^2}\right), \quad t_{err} = \frac{\max(\lVert transl(\mathbf{x}^q) - transl(\mathbf{x}^c) \rVert - \epsilon,\ 0)}{dist(\mathbf{v}^{c:q}_{odom})}. \tag{5}$$
Here, $transl$ is the translation component, $\epsilon$ is the
expected maximum spacing between loop candidates (fixed to
$\epsilon = 5$, i.e. slightly higher than the lateral
augmentation distance), and $dist(\mathbf{v}^{c:q}_{odom})$ is
the traversed distance between the query and loop candidate
estimated by the odometry. Note that $t_{err}$ quantifies the
relative final position error; thus $\sigma$ can be chosen
according to expected odometry quality to penalize unlikely
loops. We refrained, however, from making strong assumptions
on odometry quality, and fixed $\sigma = 0.05$, i.e., a
pessimistic assumption of 5% relative translation error.
Note that the two-step search of Scan Context requires
that odometry uncertainty is integrated already in the 1D
NN search. We do this by extending all 1D descriptors (of
size ring = 40) with odometry similarity scores ($d_{odom}$)
as an extra element. $d_{odom}$ is scaled with a factor
(ring/4) to balance odometry uncertainty and appearance
similarity.
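The odometry likelihood and the extended ring key can be sketched as follows. The Gaussian form with a 2σ² denominator and the function names are assumptions made for illustration; the tolerance of 5, σ = 0.05, and the ring/4 scaling are taken from the text.

```python
import math

def loop_likelihood(pos_q, pos_c, traversed, eps=5.0, sigma=0.05):
    """Likelihood that q and c are the same place, given odometry.

    pos_q, pos_c: (x, y) odometry positions of query and candidate.
    traversed: odometry path length between c and q (must be > 0).
    """
    gap = math.dist(pos_q, pos_c)
    # Relative position error beyond the tolerated candidate spacing.
    t_err = max(gap - eps, 0.0) / traversed
    return math.exp(-t_err ** 2 / (2.0 * sigma ** 2))

def odom_distance(pos_q, pos_c, traversed, eps=5.0, sigma=0.05):
    """Odometry dissimilarity: one minus the loop likelihood."""
    return 1.0 - loop_likelihood(pos_q, pos_c, traversed, eps, sigma)

def extended_ring_key(ring_key, d_odom):
    """Append the scaled odometry score as one extra element."""
    n_ring = len(ring_key)
    return list(ring_key) + [d_odom * n_ring / 4.0]
```

For example, a 55 m position gap after 1000 m of travel gives a relative error of 0.05, i.e. exactly one σ, so the likelihood is exp(-0.5).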
C. Automatic learning of alignment verification
To improve loop closure verification, we build upon the
system CorAl [32], which learns to detect alignment errors
between two registered scans. CorAl allows us to determine
if a loop constraint is correct by formulating loop constraint
verification as a misalignment detection problem.
Specifically, a loop (between scans $q$ and $c$) is verified
as correct only if the scans are correctly aligned via the
relative alignment $\mathbf{x}^{q,c}_{loop}$. During the
learning phase, CorAl automatically generates labeled training
data. The input to CorAl is pairs of odometry estimates, radar
peak detections ($\mathcal{P}$), and computed sets of oriented
surface points ($\mathcal{M}$). These entities are used
to extract alignment quality residuals from which alignment
quality can be assessed. After the learning phase, CorAl can
verify loops by detecting alignment errors (caused e.g. by
heavy motion distortion or incorrect convergence). CorAl
also aids in distinguishing between places that differ by small
geometric details. We generate training data by repeating the
following process for each pair of consecutive keyframes.
Positive training labels ($y_{aligned} = true$) and training
data $\mathbf{X}_{quality}$ are computed using the scan
alignment provided by the odometry. For each pair, the
alignment quality measures in Eq. 6 are extracted. Negative
training labels ($y_{aligned} = false$) and training data are
extracted similarly. However, before extracting the alignment
quality, an error is induced in the alignment in both
translation and rotation. This allows us to learn the detection
of different types of errors. Specifically, we distribute 12
translation errors symmetrically along the positive and
negative x- and y-axes. We use 4 small (±0.5 m), 4 medium
(±1 m), and 4 large (±2 m) errors. To these errors, we
additionally add a clockwise rotation of matching magnitude:
small (0.5°), medium (2°), or large (15°). Note that the class
ratio of 1:12 between positive and negative training examples
is compensated for during learning by assigning weights
according to the inverse of class frequency.
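The set of induced errors and the class weighting can be written out explicitly. This is a sketch under assumptions: rotations are taken as counter-clockwise positive (so clockwise is negative), and the weights are normalized so that the mean sample weight is 1; neither convention is stated in the text.

```python
import math

def induced_errors():
    """Return 12 (dx, dy, dtheta) offsets for negative training examples."""
    levels = [(0.5, 0.5), (1.0, 2.0), (2.0, 15.0)]  # (metres, degrees)
    errors = []
    for trans, rot_deg in levels:
        for dx, dy in [(trans, 0.0), (-trans, 0.0), (0.0, trans), (0.0, -trans)]:
            # Clockwise rotation is negative in a CCW-positive convention.
            errors.append((dx, dy, -math.radians(rot_deg)))
    return errors

def class_weights(n_pos, n_neg):
    """Inverse-class-frequency weights with a mean sample weight of 1."""
    total = n_pos + n_neg
    return {True: total / (2.0 * n_pos), False: total / (2.0 * n_neg)}
```

With one positive and twelve negatives per keyframe pair, the positive example receives weight 6.5 and each negative 13/24, so both classes contribute equally to the loss.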
1) Alignment measure: We extract the following alignment
measures between each pair of scans:
$$\mathbf{X}_{quality} = [H_j\ H_s\ H_o\ C_f\ C_o\ C_a\ 1]^T. \tag{6}$$
The joint entropy ($H_j$) and separate entropy ($H_s$) are
average per-point differential entropies, extracted from point
cloud pairs of radar peak detections
($\mathcal{P}^q, \mathcal{P}^c$). These metrics are
described in depth in [32]. We complement these measures
with a measure of overlap $H_o$ ($H_{overlap}$), defined as
the portion of peaks in $\mathcal{P}^q$ or $\mathcal{P}^c$ with
a neighboring point within the radius $r$ in the other point
cloud.
In this work, we combine these CorAl measures with
additional ones, extracted from
($\mathcal{M}^q, \mathcal{M}^c$), i.e. from pairs of
scans represented as oriented surface points. The measures
are obtained from the registration cost (Eq. 1), but with a
single keyframe and with the point-to-line cost function
($g_{P2L}$ [17]). Note that these measures are already computed
during the final iteration of the registration, so this step
brings little computational overhead. Specifically, from Eq. 1
we reuse $C_f$: $f(\mathcal{M}^q, \mathcal{M}^c, \mathbf{x}^{q,c})$,
the number of correspondences (overlap) $C_o$:
$|\mathcal{C}_{corr}|$, and the average surface point set
size $C_a$: $\frac{1}{2}(|\mathcal{M}^q| + |\mathcal{M}^c|)$.
The intuition of combining these quality measures is that the
CorAl measures (which use small-region data association) are
specialized in detecting small errors, whereas the CFEAR
measures are more suitable for larger alignment errors. We
refer to [32] for details.
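The overlap measure is simple enough to sketch directly. The brute-force distance matrix and the default radius are assumptions for illustration; a spatial index would be the practical choice for large point sets.

```python
import numpy as np

def overlap(p_q, p_c, r=1.0):
    """Fraction of peaks with a neighbor within r in the other cloud.

    p_q, p_c: (n, 2) and (m, 2) arrays of already-aligned peak positions.
    r is an illustrative radius (assumption).
    """
    p_q = np.asarray(p_q, dtype=float)
    p_c = np.asarray(p_c, dtype=float)
    # Pairwise distances, shape (n, m); quadratic memory, fine for a sketch.
    d = np.linalg.norm(p_q[:, None, :] - p_c[None, :, :], axis=2)
    hits_q = int((d.min(axis=1) <= r).sum())
    hits_c = int((d.min(axis=0) <= r).sum())
    return (hits_q + hits_c) / (len(p_q) + len(p_c))
```

A value near 1 means almost every peak has a counterpart in the other scan, while a low value indicates little shared structure at the proposed alignment.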
2) Assessing alignment: Once training data has been
computed, we train a logistic regression classifier
$$p_{align} = 1/(1 + e^{-d_{align}}), \tag{7a}$$
$$d_{align} = \boldsymbol{\beta}\mathbf{X}_{quality}, \tag{7b}$$
where $\boldsymbol{\beta}_{1 \times 7}$ are the learned model
parameters. We train on discrete alignment classification here
as we consider all visible errors to be undesired. However,
$d_{align}$ is passed to our loop verification module rather
than $p_{align}$. We found $d_{align}$ to be more suitable, as
the sigmoid output $p_{align}$ is insensitive to alignment
change close to 0 or 1.
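The saturation argument can be checked numerically with a small sketch of Eq. 7. The function names are hypothetical, and the sigmoid is written in a numerically stable form.

```python
import math

def d_align(beta, x_quality):
    """Linear score of Eq. 7b: dot product of parameters and features."""
    return sum(b * x for b, x in zip(beta, x_quality))

def p_align(d):
    """Sigmoid of Eq. 7a, evaluated without overflow for large |d|."""
    if d >= 0:
        return 1.0 / (1.0 + math.exp(-d))
    e = math.exp(d)
    return e / (1.0 + e)
```

Moving the linear score from 6 to 8 changes the sigmoid output by less than 0.003, so two clearly different alignment qualities become nearly indistinguishable after the sigmoid, whereas the raw score keeps them apart.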
D. Verification of loop candidates
We allow for multiple competing loop candidates $c_k$ per
query $q$, as illustrated in Fig. 1. Each of the
$N_{cand} = 3$ best matching pairs $\{(q, c_k)\}$ provided by
the place recognition module is used to compute and verify
potential loop constraints. A constraint is computed by
finding the relative alignment $\mathbf{x}^{q,c_k}_{loop}$ that
minimizes Eq. 1, i.e. the distance between correspondences,
similarly to the odometry module. As an initial guess, we use
the relative orientation provided by the place recognition
module. If the loop candidate was retrieved from a match with
an augmented query scan, the corresponding augmented lateral
offset is used together with the rotation as an initial guess.
Note that the local registration method is motivated by the
access to an initial guess, required for convergence. After
registration, we extract and assess the alignment quality
$d_{align} = \boldsymbol{\beta}\mathbf{X}^{q,c_k}_{quality}$
following the procedure in Sec. III-C.1 and III-C.2. Each
constraint is finally verified by combining the Scan Context
distance ($d_{sc}$) with odometry uncertainty ($d_{odom}$) and
alignment quality ($d_{align}$) with a logistic regression
classifier
$$y^{q,c_k}_{loop} = \frac{1}{1 + e^{-\boldsymbol{\Theta}\mathbf{X}^{q,c_k}_{loop}}}, \quad \text{s.t. } y^{q,c_k}_{loop} > y_{th}, \quad \mathbf{X}^{q,c_k}_{loop} = [d_{odom}\ d_{sc}\ d_{align}\ 1]^T. \tag{8}$$
The model parameters $\boldsymbol{\Theta}$ can be learned via
ground truth loop labels, or simply tuned, as the 4 parameters
have intuitive meaning. $y_{th}$ is the sensitivity threshold;
we rarely observe false positives when it is fixed to 0.9.
We investigated two strategies for selecting loop closures
after a successful verification: (i) We select the first
candidate retrieved from the place recognition module
($N_{cand} = 1$), i.e. the one with the lowest Scan Context
distance score $d_{sc}$. (ii) We use $N_{cand} = 3$ candidates
and select the best candidate according to our verifier, i.e.
the one with the highest probability $y^{q,c_k}_{loop}$. The
intuition for strategy (i) is that the first retrieved place
candidate is often the best guess, without considering
registration. However, there are cases where one of the later
candidates is preferred. For example, registration of the query
and the first candidate may fail, or subtle scene differences
that distinguish places can be hard to detect until a more
thorough local analysis of alignment has been carried out.
Thus, selecting the verifiably better loop constraint
candidate is desired. We compare these two strategies in
Sec. IV-B (T.6 and T.8). Once a loop constraint has been
verified, the loop is added to $\mathcal{C}_{loop}$.
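Strategy (ii) amounts to scoring every registered candidate with the verifier and keeping the most probable one above the threshold. The sketch below assumes hand-picked, purely illustrative parameters; it is not the learned model.

```python
import math

def loop_probability(theta, d_odom, d_sc, d_align):
    """Logistic verifier over [d_odom, d_sc, d_align, 1] (cf. Eq. 8)."""
    x = (d_odom, d_sc, d_align, 1.0)
    z = sum(t * v for t, v in zip(theta, x))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def select_loop(candidates, theta, y_th=0.9):
    """Pick the most probable verified candidate, or None.

    candidates: list of (candidate_id, d_odom, d_sc, d_align).
    """
    best = None
    for cid, d_odom, d_sc, d_al in candidates:
        y = loop_probability(theta, d_odom, d_sc, d_al)
        if y > y_th and (best is None or y > best[1]):
            best = (cid, y)
    return best

# Hypothetical parameters: distances lower the score, alignment raises it.
THETA_EXAMPLE = (-4.0, -6.0, 1.0, 2.0)
```

With these example parameters, a candidate that ranks second by Scan Context distance but registers well can still win over the first-ranked candidate whose registration failed, which is the case strategy (ii) is meant to cover.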
E. Pose Graph Optimization
We correct the odometry by solving a sparse least squares
optimization problem. We do so by minimizing Eq. 9, using
the odometry and loop constraints
$\mathcal{C}_{odom}, \mathcal{C}_{loop}$:
$$J(\mathbf{Y}) = \underbrace{\sum_{(i,j) \in \mathcal{C}_{odom}} \mathbf{e}_{i,j}^T \mathbf{C}_{odom_{i,j}}^{-1} \mathbf{e}_{i,j}}_{\text{odometry constraints}} + \underbrace{\sum_{(i,j) \in \mathcal{C}_{loop}} \big(\mathbf{e}_{i,j}^T \mathbf{C}_{loop_{i,j}}^{-1} \mathbf{e}_{i,j}\big)}_{\text{loop constraints}}. \tag{9}$$
$\mathbf{Y} = [\mathbf{y}_1\ \mathbf{y}_2\ \ldots\ \mathbf{y}_n]$
is the vector of optimization parameters,
$\mathbf{e} = \mathbf{y}_{i,j} - \mathbf{x}_{i,j}$ is the
difference between parameter and constraint, and $\mathbf{C}$
is the covariance matrix. We used two strategies to
compute the covariance. Fixed: computed from registration error
statistics, $\mathbf{C} = diag([v_{xx} = 1e\text{-}2,\ v_{yy} = 1e\text{-}2,\ v_{\theta\theta} = 1e\text{-}3])$.
Dynamic: obtained from the Hessian, approximated from the
Jacobian ($\mathbf{C} = \mathbf{H}^{-1} \approx (\mathbf{J}^T\mathbf{J})^{-1}$)
of the registration cost (Eq. 1). Note that the dynamically
obtained $\mathbf{C}$ was tuned by a factor $\gamma$ to provide
realistic uncertainties, as discussed in [9].
Loop constraints are optionally scaled by an additional factor
(5e-5). This factor retains the high odometry quality
through pose graph optimization, and alleviates the need for
a more accurate but computationally more expensive multi-scan
loop registration.
Finally, we solve $\operatorname{argmin}_{\mathbf{Y}} J(\mathbf{Y})$
using Levenberg-Marquardt. Note that we do not need a robust
back-end to mitigate outlier constraints (such as dynamic
covariance scaling [41] or switchable loop constraints [42]).
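To make the cost concrete, the sketch below evaluates a sum of Mahalanobis errors over odometry and loop constraints for 2D poses. It only evaluates the objective and does not perform the Levenberg-Marquardt optimization; the SE(2) residual convention and all names are assumptions.

```python
import numpy as np

def wrap(angle):
    """Wrap an angle to [-pi, pi)."""
    return (angle + np.pi) % (2 * np.pi) - np.pi

def relative_pose(y_i, y_j):
    """Pose of node j expressed in the frame of node i: (x, y, theta)."""
    c, s = np.cos(y_i[2]), np.sin(y_i[2])
    dx, dy = y_j[0] - y_i[0], y_j[1] - y_i[1]
    return np.array([c * dx + s * dy, -s * dx + c * dy, wrap(y_j[2] - y_i[2])])

def graph_cost(poses, odom, loops, cov_odom, cov_loop):
    """Sum of e^T C^-1 e over odometry and loop constraints.

    odom, loops: lists of (i, j, measured_relative_pose).
    """
    def term(constraints, cov):
        info = np.linalg.inv(cov)
        total = 0.0
        for i, j, x_ij in constraints:
            e = relative_pose(poses[i], poses[j]) - np.asarray(x_ij, dtype=float)
            e[2] = wrap(e[2])
            total += float(e @ info @ e)
        return total
    return term(odom, cov_odom) + term(loops, cov_loop)

# Fixed covariance values quoted in the text.
COV_FIXED = np.diag([1e-2, 1e-2, 1e-3])
```

A loop measurement that disagrees with the current poses by 0.1 m along x contributes (0.1)^2 / 1e-2 = 1 to the cost under the fixed covariance.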
We evaluate our method on the Oxford [43] and
MulRan [40] datasets. Both datasets were collected by
driving a car with a roof-mounted radar. The Oxford dataset
contains 30 repetitions of a 10 km urban route. The MulRan
dataset has a wider mix of routes, from structured urban to
partly feature-poor areas such as open fields and bridge
crossings. In MulRan, places are generally revisited in the
same driving direction. This is not the case in Oxford, where
loop closure is significantly harder. From these datasets,
we selected the previously most evaluated sequences, see
Tab. I and II. The radars used are a Navtech CTS350-X,
configured with 4.38 cm resolution in the Oxford dataset, and
a CIR204-H, with 5.9 cm resolution in the MulRan dataset.
We use standard parameters for CFEAR-3 [17], CorAl [32],
and Scan Context [38], except where explicitly mentioned.
A. Run-time performance
After the initial generation of training data, the full
pipeline, including odometry and loop closure, runs in real
time. Pose graph optimization runs in a separate thread, either
continuously as new verified loop constraints are computed,
or once at the end. Run-time performance (measured on an
i7-6700K CPU) is as follows:
• Odometry: 37 ms
• Generation of training data: 236 ms/keyframe pair
• Pose graph optimization (once at the end, with only
  odometry as prior): 992 ms @ 4k keyframes
• Loop closure: 128 ms @ $N_{cand} = 3$
• Descriptor generation: 1 ms/keyframe
• Detect loops: 25 ms/keyframe
• Registration: 6 ms/candidate
• Verification: 19 ms/candidate
B. Ablation study loop detection
In this section, we evaluate the effect of the various aspects
of our loop detection pipeline. Loops are classified as correct
if the difference between estimated alignment and ground
truth does not exceed 4 m or 2.5°. This error limit was set
slightly higher than the largest pose error that we found in
the ground truth at loop locations. While we would like to
demonstrate the full capability of our system by analysis
of smaller errors, we found ground truth accuracy to be
a limiting factor. Note that the ground truth quality has
previously been discussed as a limitation [13].
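The correctness criterion can be expressed as a small predicate. The sketch assumes that both the translation and the rotation limit must hold for a loop to count as correct, and that alignments are given as (x, y, theta) with theta in radians; these are interpretations, not details given in the text.

```python
import math

def is_correct_loop(est, gt, max_trans=4.0, max_rot_deg=2.5):
    """True if the estimated alignment is within both error limits."""
    trans_err = math.hypot(est[0] - gt[0], est[1] - gt[1])
    # Wrap the angular difference to [-pi, pi) before taking its magnitude.
    rot_err = abs((est[2] - gt[2] + math.pi) % (2 * math.pi) - math.pi)
    return trans_err <= max_trans and math.degrees(rot_err) <= max_rot_deg
```

Wrapping the angular difference matters for revisits with headings near ±180°, where a naive subtraction would report an error close to 360°.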
We compare the impact of each technique summarized
below. Note that a later technique in the list either adapts or
replaces the former technique.
T.1 Radar Scan Context: Polar radar data is, without
    preprocessing, directly downsampled into a Scan Context
    descriptor [40]. We found the OpenCV interpolation
    option INTER_AREA to yield the best results. Verification
    includes only place similarity $d_{sc}$.
T.2 Aggregated point cloud map: We instead create the Scan
    Context descriptor by aggregating motion-compensated
    peak detections from multiple scans (Sec. III-B.2).
T.3 Origin augmentation: Additional augmentations
    (origin-shifted copies with lateral offsets) are matched.
    Out of these, the best match is selected (Sec. III-B.2).
T.4 Alignment loop verification: Verification includes
    alignment quality $d_{align}$ from Sec. III-D.
T.5 Odometry decoupled: Verification includes $d_{odom}$.
T.6 Odometry coupled: $d_{odom}$ is embedded into the loop
    retrieval search as described in Sec. III-B.3.
T.7 Separate verification: Instead of unified verification,
    loops are verified separately by alignment ($d_{align}$).
T.8 Multiple candidate selection: Based on item T.6.
    $N_{cand} = 3$ competing candidates $\{(q, c_k)\}$ are used
    (previously 1). The most likely loop
    ($y^{q,c_k}_{loop}$) is selected.
Fig. 3: Detected loops in the Oxford dataset using the methods
in Sec. IV-B; panels (a) T.1, (b) T.2, (c) T.3, (d) T.4,
(e) T.5, (f) T.6, (g) T.8. All potential loops are colored and
ordered according to success: red (dangerous failure, false
positive), orange (safe failure, false negative), blue (safe
failure, false negative), green (success, true positive).
Blue-colored loops differ from orange ones in that the
suggested candidates are actually correct, but no loop is
predicted as the likelihood does not exceed the decision
boundary ($y^{q,c_k}_{loop} < y_{th}$).
Fig. 4: Loop closure performance over all sequences for
techniques T.1 to T.8; panels (a) Oxford, (b) MulRan.
The impact is presented in Fig. 4a for the eight Oxford
sequences, and Fig. 4b for the nine MulRan sequences. Loop
detections are visualized in Fig. 3 for the Oxford sequence
16-13-09 with $y_{th} = 0.9$. The raw Scan Context (T.1)
achieves higher recall compared to the sparse local map (T.2).
The difference is larger in MulRan, where scans are acquired
in the same moving direction, and motion distortion is less
of a challenge. Additionally, we noted that the difference
is highest within the feature-poor Riverside sequences.
This suggests that maintaining information-rich radar data is
largely advantageous compared to using a sparse local map,
especially when features are scarce. Note however that our
local mapping technique is primarily motivated by the need
for a sparse Cartesian point cloud for efficient augmentation.
Oxford is more challenging compared to MulRan as
a majority of the revisits occur in opposite road lanes
and directions. However, the augmentation technique (T.3)
allows the detection of additional candidates with higher
lateral displacement and, as expected, increases the maximum
recall, albeit at the cost of lower precision. This loss of
precision can, however, be alleviated via alignment loop
verification (T.4). The improvement is larger in Oxford;
we believe the more structured scenes are favorable
for alignment analysis. The decoupled odometry approach
(T.5), which extends verification by including odometry
uncertainty, gives a higher tolerance to false positives. At this
point, the decision boundary can be chosen such that almost
all candidates are correctly classified. Unified verification is
preferred over separate verification (T.7). Selecting the can-
didate with the highest probability (T.8), rather than the first
place recognition candidate (T.6) yields a clear improvement
in Oxford. We believe this improvement is because our
alignment quality aids in distinguishing between places and
detecting registration failures, especially in structured scenes.
C. SLAM performance - comparative evaluation
We compare TBV Radar SLAM to previous meth-
ods for radar, lidar, and visual SLAM within the Ox-
ford and Mulran dataset. We primarily compare methods
over full trajectories i.e. Absolute Trajectory Error (ATE)
i=1 ||transl(xest
Additionally, we provide the KITTI odometry metric [48],
which computes the relative error between 100-800 m,
e.g. error over a shorter distance. ATE metrics for method
Type/modality | Method | Evaluation | 10-12-32 (training) | 16-13-09 | 17-13-26 | 18-14-14 | 18-15-20 | 10-11-46 | 16-11-53 | 18-14-46 | Mean
Top: ATE [m]
SLAM/camera | ORB-SLAM2 [44] | [45] | 7.96 | 7.59 | 7.61 | 24.63 | 12.17 | 7.30 | 3.54 | 9.72 | 10.07
Odometry/Radar | CFEAR-3 (odometry used) | | 7.29 | 23.32 | 15.58 | 20.95 | 20.02 | 16.87 | 15.47 | 28.58 | 18.51
SLAM/Radar | RadarSLAM-Full [1] | [45] | 9.59 | 11.18 | 5.84 | 21.21 | 7.74 | 13.78 | 7.14 | 6.01 | 10.31
SLAM/Radar | MAROAM [28] | From author | 13.76 | 6.95 | 8.36 | 10.34 | 10.96 | 12.42 | 12.51 | 7.71 | 10.38
SLAM/Radar | TBV Radar SLAM-dyn-cov-T.8 (ours) | | 4.22 | 4.30 | 3.37 | 4.04 | 4.27 | 3.38 | 4.98 | 4.27 | 4.10
SLAM/Radar | TBV Radar SLAM-T.8 (ours) | | 4.07 | 4.04 | 3.58 | 3.79 | 3.83 | 3.14 | 4.39 | 4.33 | 3.90
Bottom: drift (% translation error / deg per 100 m)
SLAM/Lidar | SuMa (Lidar - SLAM) [46] | [45] | 1.1/0.3 p | 1.2/0.4 p | 1.1/0.3 p | 0.9/0.1 p | 1.0/0.2 p | 1.1/0.3 p | 0.9/0.3 p | 1.0/0.1 p | 1.03/0.3 p
Odometry/Radar | CFEAR-3-S50 [17] | [17] | 1.05/0.34 | 1.08/0.34 | 1.07/0.36 | 1.11/0.37 | 1.03/0.37 | 1.05/0.36 | 1.18/0.36 | 1.11/0.36 | 1.09/0.36
Odometry/Radar | CFEAR-3 (odometry used) | | 1.20/0.36 | 1.24/0.40 | 1.23/0.39 | 1.35/0.42 | 1.24/0.41 | 1.22/0.39 | 1.39/0.40 | 1.39/0.44 | 1.28/0.40
SLAM/Radar | RadarSLAM-Full [1] | [45] | 1.98/0.6 | 1.48/0.5 | 1.71/0.5 | 2.22/0.7 | 1.77/0.6 | 1.96/0.7 | 1.81/0.6 | 1.68/0.5 | 1.83/0.6
SLAM/Radar | MAROAM-Full [28] | [28] | 1.63/0.46 | 1.83/0.56 | 1.49/0.47 | 1.54/0.47 | 1.61/0.50 | 1.55/0.53 | 1.78/0.54 | 1.55/0.50 | 1.62/0.50
SLAM/Radar | TBV Radar SLAM-T.8 (ours) | | 1.17/0.35 | 1.15/0.35 | 1.06/0.35 | 1.12/0.37 | 1.09/0.36 | 1.18/0.35 | 1.32/0.36 | 1.10/0.36 | 1.15/0.36
TABLE I: Top: Absolute Trajectory Error (ATE) [m], Bottom: Drift (% translation error / deg per 100 m) on the Oxford
Radar RobotCar dataset [43]. Methods marked with p finished prematurely. Methods for Radar SLAM are shadowed.
[Figure panels, axes x (m) and y (m): (a) 10-12-32, (b) 16-13-09, (c) 17-13-26, (d) 10-11-46, (e) 18-14-14, (f) 18-15-20, (g) 16-11-53, (h) 18-14-46]
Fig. 5: Oxford trajectories using the proposed method TBV Radar SLAM, compared with CFEAR-3 [17] (odometry only)
and ground truth. The initial pose is marked with × and the final pose with a second marker. Trajectories can be directly compared to [10], [17].
ATE
| Type/Modality | Method | Evaluation | KAIST01 | KAIST02 | KAIST03 | DCC01 | DCC02 | DCC03 | RIV.01 | RIV.02 | RIV.03 | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SLAM/Lidar | SuMa Full [46] | [1] | 38.70 | 31.90 | 46.00 | 13.50 | 17.80 | 29.60 | - | - | - | 22.90 |
| Odometry/Lidar | KISS-ICP (odometry) [47] | | 17.40 | 17.40 | 17.40 | 15.16 | 15.16 | 15.16 | 49.02 | 49.02 | 49.02 | 27.2 |
| Odometry/Radar | CFEAR-3 [17] (odometry used) | | 7.53 | 7.58 | 12.21 | 6.39 | 3.67 | 5.40 | 6.45 | 3.87 | 19.44 | 8.06 |
| SLAM/Radar | RadarSLAM Full [45] | [1] | 6.90 | 6.00 | 4.20 | 12.90 | 9.90 | 3.90 | 9.00 | 7.00 | 10.70 | 7.80 |
| SLAM/Radar | MAROAM Full [28] | [28] | - | - | - | - | 5.81 | - | - | 4.85 | - | - |
| SLAM/Radar | TBV SLAM-T.8-dyn-cov (ours) | | 1.71 | 1.42 | 1.52 | 5.41 | 3.29 | 2.61 | 2.66 | 2.49 | 2.52 | 2.63 |
| SLAM/Radar | TBV SLAM-T.8-cov (ours) | | 1.66 | 1.39 | 1.50 | 5.44 | 3.32 | 2.66 | 2.61 | 2.36 | 1.48 | 2.49 |

Drift (% translation error / deg/100 m)
| Type/Modality | Method | Evaluation | KAIST01 | KAIST02 | KAIST03 | DCC01 | DCC02 | DCC03 | RIV.01 | RIV.02 | RIV.03 | Mean |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SLAM/Lidar | SuMa Full [46] | | 2.9/0.8 | 2.64/0.6 | 2.17/0.6 | 2.71/0.4 | 4.07/0.9 | 2.14/0.6 | 1.66/0.6P | 1.49/0.5P | 1.65/0.4P | 2.38/0.5 |
| Odometry/Lidar | KISS-ICP (odometry) [47] | | 2.28/0.68 | 2.28/0.68 | 2.28/0.68 | 2.34/0.64 | 2.34/0.64 | 2.34/0.64 | 2.89/0.64 | 2.89/0.64 | 2.89/0.64 | 2.5/0.65 |
| Odometry/Radar | CFEAR-3-s50 [17] | [17] | 1.48/0.65 | 1.51/0.63 | 1.59/0.75 | 2.09/0.55 | 1.38/0.47 | 1.26/0.47 | 1.62/0.62 | 1.35/0.52 | 1.19/0.37 | 1.50/0.56 |
| Odometry/Radar | CFEAR-3 [17] (odometry used) | | 1.59/0.66 | 1.62/0.66 | 1.73/0.78 | 2.28/0.54 | 1.49/0.46 | 1.47/0.48 | 1.59/0.63 | 1.39/0.51 | 1.41/0.40 | 1.62/0.57 |
| SLAM/Radar | RadarSLAM-Full [45] | [1] | 1.75/0.5 | 1.76/0.4 | 1.72/0.4 | 2.39/0.4 | 1.90/0.4 | 1.56/0.2 | 3.40/0.9 | 1.79/0.3 | 1.95/0.5 | 2.02/0.4 |
| SLAM/Radar | TBV SLAM-T.8 (ours) | | 1.01/0.30 | 1.03/0.30 | 1.08/0.35 | 2.01/0.27 | 1.35/0.25 | 1.14/0.22 | 1.25/0.35 | 1.09/0.30 | 0.99/0.18 | 1.22/0.28 |

TABLE II: Top: Absolute Trajectory Error (ATE), Bottom: Drift (% translation error / deg/100 m) on the MulRan dataset [40].
[Figure panels, axes x (m) and y (m): (a) KAIST01, (b) KAIST02, (c) KAIST03, (d) DCC01, (e) DCC02, (f) DCC03, (g) RIV.01, (h) RIV.02, (i) RIV.03, (j) VolvoCE - forest, (k) Kvarntorp - mine]
Fig. 6: TBV Radar SLAM, CFEAR-3 [17] (odometry only), and ground truth. First (×) and final poses are indicated.
MAROAM [28] were kindly computed and provided by Wang
et al. for this letter. We tuned our parameters on the Oxford
sequence 10-12-32 and evaluated the performance of
SLAM on all other Oxford and MulRan sequences.
The estimated trajectories are depicted in Fig. 5 and
Fig. 6(a-i). We found that TBV closes loops
and corrects odometry in all sequences. Relative to the CFEAR-3
odometry, the mean ATE drops from 18.51 m to 3.90 m on Oxford and
from 8.06 to 2.49 on MulRan, with slightly reduced drift
(Tab. I & Tab. II). TBV outperforms previous methods for
radar SLAM in terms of ATE and drift over all sequences.
Hence, we conclude that our method improves the state
of the art in radar SLAM. Surprisingly, we did not observe
any improvement from using dynamic covariance (dyn-cov)
compared to a fixed covariance. The Hessian-approximated covariance
occasionally under- or over-estimates the odometry uncertainty
[49] and thus degrades the optimization process.
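The drift figures in Tab. I and Tab. II follow the KITTI odometry metric, which averages relative errors over sub-sequences of 100-800 m. The sketch below is a simplified 2D (x, y, heading) illustration of that idea. The function names, the start-index `step`, and the restriction to SE(2) are assumptions made for brevity and do not reproduce the reference KITTI evaluation code.

```python
import math
from bisect import bisect_left


def wrap(angle):
    """Wrap an angle to (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def relative(a, b):
    """Pose b expressed in the frame of pose a, for SE(2) poses (x, y, theta)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    c, s = math.cos(a[2]), math.sin(a[2])
    return (c * dx + s * dy, -s * dx + c * dy, wrap(b[2] - a[2]))


def kitti_drift(est, gt, lengths=(100, 200, 300, 400, 500, 600, 700, 800), step=10):
    """Return (translation error in %, rotation error in deg/100 m)."""
    # Cumulative travelled distance along the ground truth.
    dist = [0.0]
    for p, q in zip(gt, gt[1:]):
        dist.append(dist[-1] + math.hypot(q[0] - p[0], q[1] - p[1]))

    t_errs, r_errs = [], []
    for i in range(0, len(gt), step):
        for length in lengths:
            # First pose at least `length` metres after pose i.
            j = bisect_left(dist, dist[i] + length, lo=i)
            if j >= len(gt):
                continue
            # Error between the estimated and true relative motion.
            err = relative(relative(gt[i], gt[j]), relative(est[i], est[j]))
            t_errs.append(math.hypot(err[0], err[1]) / length)
            r_errs.append(abs(err[2]) / length)

    if not t_errs:
        raise ValueError("trajectory is shorter than the shortest sub-sequence")
    t_pct = 100.0 * sum(t_errs) / len(t_errs)
    r_deg_per_100m = 100.0 * math.degrees(sum(r_errs) / len(r_errs))
    return t_pct, r_deg_per_100m


# Toy example: 1 km straight line, estimate with a 1 % scale error.
gt = [(float(k), 0.0, 0.0) for k in range(1001)]
est = [(1.01 * k, 0.0, 0.0) for k in range(1001)]
t_pct, r_deg = kitti_drift(est, gt)
```

For the toy trajectory the sketch yields a translation drift of about 1 % and zero rotation drift. Because the metric only looks at relative motion over sub-sequences, loop closures that strongly reduce ATE can leave drift almost unchanged.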
D. Generalization to off-road environments
Finally, we tested TBV on the sequences Kvarntorp
and VolvoCE from the Diverse ORU Radar Dataset [16];
see footnote 1 for a demo. Kvarntorp is an underground
mine with partly feature-poor sections, while VolvoCE is a
mixed environment with forest and open fields. Trajectories
are visualized in Fig. 6(j-k). We found that TBV was able
to produce smooth and globally consistent maps through
substantially different environments, including challenging
road conditions, without any parameter changes.
1 ORU dataset download:
Demo video:
We proposed TBV Radar SLAM, a real-time method
for robust and accurate large-scale SLAM using a spinning
2D radar. We showed that loop candidate retrieval
can be greatly improved by origin-shifting, coupled place
similarity/odometry uncertainty search, and selecting the
most likely loop constraint as proposed by our verification
model. A high level of loop robustness was achieved by
carefully verifying loop constraints based on multiple sources
of information, such as place similarity, consistency with
odometry uncertainty, and alignment quality assessed after
registration. We evaluated TBV on two public datasets and
demonstrated a substantial improvement to the state of the art
in radar SLAM, making radar an attractive alternative to lidar for
robust and accurate localization. Quantitative and qualitative
experiments demonstrated a high level of generalization
across environments. Some findings in our ablation study
suggest that heavy filtering is undesirable as it discards details
that are important for place recognition. Thus, in the future,
we will explore building detailed and dense representations
of scenes, fully utilizing the rich geometric information
uniquely provided by spinning FMCW radar.
