Augmenting Sparse Laser Scans with
Virtual Scans to Improve the Performance
of Alignment Algorithms
Department of Computer and Information Science Temple University, Philadelphia, USA
We present a system to increase the performance of feature correspondence based alignment
algorithms for laser scan data. Alignment approaches for robot mapping, like ICP or FFS,
perform successfully only under the condition of sufficient feature overlap between single
scans. This condition is often not met, e.g. in sparsely scanned environments or disaster
areas for search and rescue robot tasks. Assuming mid level world knowledge (in the
presented case the weak presence of noisy, roughly linear or rectangular-like objects) our
system augments the sensor data with hypotheses ('Virtual Scans') about ideal models of
these objects, based on analysis of a current estimated map of the underlying iterative
alignment algorithm. Feedback between the data alignment and the data analysis confirms,
modifies, or discards the Virtual Scan data in each iteration. Experiments with a simulated
scenario and real world data from a rescue robot scenario show the applicability and
advantages of the approach.
1. Introduction
Robot mapping based on laser range scans is a major field of research in robotics in the
recent years. The basic task of mapping is to combine spatial data usually gained from laser
range devices, called 'scans', to a single data set, the 'global map'. The global map represents
the environment scanned from different locations, even possibly scanned by different robots
('multi robot mapping'), usually without knowledge of their pose (= position and heading).
One class of approaches to tackle this problem, i.e. to align single scans, is based on feature
correspondences between the single scans to find optimal correspondence configurations.
Techniques like ICP (Iterative Closest Point, e.g. [2, 24]) or FFS (Force Field
Simulation based alignment) belong to this class. They show impressive results, but are
naturally restricted: first since they are feature correspondence based, they require the
presence of a sufficient amount of common, overlapping features in scans belonging
together. Second, since the feature correspondence function is based on a state describing
the relation of the single scans (e.g. the robots' poses), these algorithms are depending on
sufficiently good state initialization to avoid local minima. In this paper, we suggest a
solution to the first problem: correct alignment in the absence of sufficient feature
correspondences. This problem can arise, e.g., in search and rescue environments (which
typically contain only a small number of landmarks) or when multiple robots team up to
build a joint global map. In this situation, single scans, acquired from different
views, do not necessarily reveal the entire structure of the scanned object. The motivation to
our approach is that even if the optimal relation between single scans is not known, it is
possible to infer hypotheses of underlying structures from the non-optimal combination of
single scans based on the assumption of certain real world knowledge. Figure 1 illustrates
this idea.
Fig. 1. Motivation of the Virtual Scan approach (a-f in reading order): a) a rectangular object
is scanned from two positions (red/blue robots); b) correspondences between the single
scans (red/blue) do not reveal the scanned structure; c) misalignment due to wrong
correspondences; d) analysis of the estimated global map detects the structure; e) the
structure is added as a Virtual Scan; f) correct alignment is achieved due to
correspondences between real world scans and the Virtual Scan.
The figure shows a situation where the relation between features of single scans cannot reveal
the real world structure, and therefore leads to misalignment. Analysis from a global view
estimates the underlying structure. This hypothesis then augments the real world data set,
to achieve a correct result.
The motivational example shows the ideal case: it does not assume any error in the global
map estimation (the relative pose between the red and blue scans), hence it is trivial to
detect the correct structure. Our system also handles the non-ideal situation, including pose errors. It
utilizes a feedback structure between hypothesis generation and real data alignment
response. The feedback iteratively adjusts the hypotheses to the real data (and vice versa).
This will be discussed in more detail below. We first want to explain our approach in a more
general spatial cognition context.
Feature correspondence algorithms, e.g. in ICP or FFS, can be seen as low level spatial
cognition processes (LLSC), since they operate based on low level geometric information.
The feature analysis of the global map, which is suggested in this paper, can be described as
mid level spatial cognition process (MLSC), since we aim at analysis of features like lines,
rectangles, etc. Augmenting real world data with ideal models of expected data can be seen
as an example of integration of LLSC and MLSC processes to improve the performance of
spatial recognition tasks in robotics. We are using the area of robot perception for mobile
rescue robots, specifically alignment of 2D laser scans, as a showcase to demonstrate the
advantages of these processes.
In robot cognition, MLSC processes infer the presence of mid level features from low level
data based on regional properties of the data. In our case, we detect the presence of simple
mid level objects, i.e. line segments and rectangles. The MLSC processes model world
knowledge, or assumptions about the environment. In our setting for search and rescue
environments, we assume the presence of (collapsed) walls and other man made structures.
If possible wall-like elements or elements somewhat resembling rectangular structures are
detected, our system generates the most likely ideal model as a hypothesis, called 'Virtual
Scan'. Virtual Scans are generated from the ideal, expected model in the same data format as
the raw sensor data; hence they can be added to the original scan data and are
indistinguishable to the low level alignment process. The alignment is then performed on
the augmented data set.
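To make the shared data format concrete, the sketch below (a minimal illustration under our own assumptions, not the system's actual detector; `fit_line_tls` and `virtual_scan_from_line` are hypothetical names) fits a total-least-squares line to a noisy, roughly linear cluster of map points and samples ideal points along it, in the same (N, 2) point format as raw scan data:

```python
import numpy as np

def fit_line_tls(points):
    """Total-least-squares line fit: returns (centroid, unit direction)."""
    centroid = points.mean(axis=0)
    # The principal direction of the centered points is the first right
    # singular vector of the point matrix.
    _, _, vt = np.linalg.svd(points - centroid)
    return centroid, vt[0]

def virtual_scan_from_line(points, spacing=0.05):
    """Render an ideal 'Virtual Scan': evenly spaced points along the
    fitted line segment, in the same (N, 2) format as real scan points."""
    centroid, direction = fit_line_tls(points)
    # Project the real points onto the line to bound the segment.
    t = (points - centroid) @ direction
    ts = np.arange(t.min(), t.max(), spacing)
    return centroid + ts[:, None] * direction

# A noisy, roughly linear wall fragment as it might appear in the map.
rng = np.random.default_rng(0)
wall = np.column_stack([np.linspace(0.0, 2.0, 40),
                        0.5 + 0.02 * rng.standard_normal(40)])
vs = virtual_scan_from_line(wall)   # ideal points near the line y = 0.5
```

A rectangle hypothesis would be rendered the same way, as four such ideal segments.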
In robot cognition, LLSC processes usually describe feature extraction based on local
properties like spatial proximity, e.g. based on metric inferences on data points, like edges in
images or laser reflection points. In our system laser scans (virtual or real) are aligned to a
global map using mainly features of local proximity using the LLSC core process of 'Force
Field Simulation' (FFS). FFS was recently introduced to robotics. In FFS, each data point
can be assigned a weight, or value of certainty. It also makes a soft rather than a hard
decision about data correspondences as the basis for the alignment. Both features make
FFS a natural choice over its main competitor, ICP [2, 24], for the combination with Virtual
Scans. The weight parameter can be utilized to indicate the strength of hypotheses,
represented by the weight of virtual data.
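These two properties can be illustrated by a small sketch (our own simplified model, not the published FFS formulation): the force on each point is a certainty-weighted soft sum over all potential correspondences, so a Virtual Scan point's weight directly scales the pull of the hypothesis:

```python
import numpy as np

def soft_forces(src, dst, w_src, w_dst, sigma=0.5):
    """Soft correspondence forces: every src point is attracted to every
    dst point, damped by a Gaussian of distance and scaled by the
    per-point certainty weights (hypothesis strength for virtual points)."""
    diff = dst[None, :, :] - src[:, None, :]            # (Ns, Nd, 2)
    d2 = (diff ** 2).sum(axis=-1)
    w = w_src[:, None] * w_dst[None, :] * np.exp(-d2 / (2 * sigma ** 2))
    # Normalize so each src point feels a bounded net force; no hard
    # nearest-neighbor decision is ever made.
    w /= w.sum(axis=1, keepdims=True) + 1e-12
    return (w[:, :, None] * diff).sum(axis=1)           # (Ns, 2)

src = np.array([[0.0, 0.0]])
dst = np.array([[1.0, 0.0], [-1.0, 0.0]])
# Two equally likely correspondences cancel: the soft decision keeps the
# point in place instead of snapping it to a single nearest neighbor.
f = soft_forces(src, dst, np.ones(1), np.ones(2))
```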
FFS is an iterative alignment algorithm. The two levels (LLSC: data alignment by FFS,
MLSC: data augmentation) are connected by a feedback structure, which is repeated in each
iteration:
• The FFS-low-level-instances pre-process the data. They find correspondences based
on low level features. The low level processing builds a current version of the global
map, which assists the mid level feature detection.
• The mid level cognition module analyzes the current global map, detects possible mid
level objects, and models ideal hypothetical sources that are possibly
present in the real world. These can be seen as suggestions, fed back into the
low level system as Virtual Scans. The low level system in turn adjusts its processing
for re-evaluation by the mid level system.
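A minimal skeleton of this loop (our own sketch; `align_step`, `detect_models`, and `render_virtual` are hypothetical placeholders standing in for the FFS iteration, the line/rectangle detector, and the Virtual Scan rendering) could look like:

```python
import math

def transform(scan, pose):
    """Apply a 2D rigid transform pose = (x, y, heading) to (x, y) points."""
    x, y, th = pose
    c, s = math.cos(th), math.sin(th)
    return [(c * px - s * py + x, s * px + c * py + y) for px, py in scan]

def pose_delta(p, q):
    return max(abs(a - b) for a, b in zip(p, q))

def align_with_virtual_scans(scans, align_step, detect_models,
                             render_virtual, max_iters=50, tol=1e-3):
    """LLSC/MLSC feedback loop around the iterative alignment."""
    poses = [(0.0, 0.0, 0.0)] * len(scans)
    virtual = []                       # current Virtual Scan points
    for _ in range(max_iters):
        # LLSC: one alignment iteration on real data augmented by the VS.
        new_poses = align_step(scans, virtual, poses)
        # MLSC: rebuild the global map, re-detect models, re-render the VS.
        global_map = [p for s, pose in zip(scans, new_poses)
                      for p in transform(s, pose)]
        virtual = render_virtual(detect_models(global_map))
        # Converged once the scans no longer relocate.
        if max(pose_delta(p, q) for p, q in zip(poses, new_poses)) < tol:
            return new_poses
        poses = new_poses
    return poses
```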
Fig. 3. Feedback between Virtual Scans (VS) and FFS. From left to right: a) Initial state of real
data. b) Real data augmented by VS (red). c) After one iteration using real and virtual scans.
d) new hypothesis (red) based on (c). e) next iteration. Since this result resembles an ideal
rectangle, adding a VS would not relocate the scans. The system has converged.
The following example will illustrate the feedback: Figure 3 assumes two scans, e.g. taken
from robots in two different positions (compare to fig. 1). An MLSC process detects a
rectangular structure (the assumed world knowledge) and adds an optimal generating model
to the data set. The LLSC module aligns the augmented data. The hypothesis now directs
the scans to a better location. In each iteration, the relocated real scans are analyzed to adjust
the MLSC hypothesis: LLSC and MLSC assist each other in a feedback loop.
2. Related Work in Spatial Cognition and Robot Mapping
The potential of MLSC has been largely unexplored in robotics, since recent research mainly
addressed LLSC systems. These systems show astonishing performance: especially advances in
statistical inferences [5, 10, 13] in connection with geometric modeling of human perception
[6, 9, 25] and the usage of laser range scanners contributed to a breakthrough in robot
applications, with the most spectacular results achieved in the 2005 DARPA Grand
Challenge, where several autonomous vehicles were able to successfully complete the race.
But although mapping approaches built on sophisticated statistical and geometrical models
like extended Kalman Filters (EKF), Particle Filters, and ICP (Iterative Closest Point)
[2, 24] show impressive results, their limits are clearly
visible, e.g. in the aforementioned rescue scenarios. These systems are still based on low
level cognitive features, since they construct metric maps using correspondences between
sensor data points. However, having these well-engineered low level systems at hand, it is
natural to connect them to MLSC processes to mutually assist each other.
The knowledge in the area of MLSC in humans, in particular in spatial intelligence and
learning, is advancing rapidly [7, 14, 27]. Research in AI models such results to generate
generic representations of space for mobile robots, using both symbolic and non-symbolic
approaches; each tries to identify various aspects of the cognitive mapping process.
Fig. 2. LLSC/MLSC feedback. The LLSC module works on the union of real scans and the
Virtual Scan. The MLSC module in turn re-creates a new Virtual Scan based on the result of
the LLSC module.
Naturally, SLAM (Simultaneous Localization and Mapping) is often used as an application
example; one such approach generates a spatial cognition based map based on High Level
Objects. Representation of space is mostly based on the notion of a
hierarchical representation of space. Kuipers suggests a general framework for a Spatial
Semantic Hierarchy (SSH), which organizes spatial knowledge representations into levels
according to an ontology from sensory to metrical information. SSH is an attempt to
understand and conceptualize the cognitive map, the way we believe humans
understand space. More recently, Yeap and Jefferies trace the theories of early cognitive
mapping. They classify representations as being space-based or object-based. Compared
to our framework, these classifications can be related to LLSC and High
Level Spatial Cognition (HLSC); hence the proposed LLSC/MLSC system relates more
closely to space-based systems.
The importance of 'Mental Imagery' in (Spatial) Cognition has also been emphasized, and
basic requirements for modeling it have been stated. Mental Images invent or recreate
experiences to resemble actually perceived events or objects. This is closely related to the
"Virtual Scans" described in this work. Recently, Chang et al. presented a predictive mapping
approach (P-SLAM), which analyzes the environment for repetitive structures on the LLSC
level (lines and corners) to generate a "virtual map". This map is either used as a hypothesis
in unexplored regions to speed up the mapping process or as an initialization help for the
utilized particle filters when a region is first explored. In the second case the approach has
principles similar to the presented Virtual Scans. The impressive results of P-SLAM can also
be seen as proof of concept of integrating prediction into robot perception.
The problem of geometric robot mapping amounts to aligning a set of scans. On the LLSC
level, the problem of simultaneously aligning scans has been treated as estimating sets of
poses. The underlying framework for such a technique is to optimize a constraint
graph, in which nodes are features or poses and edges are constraints built from various
observations and measurements.
There are numerous image registration techniques, the most famous being
Iterative Closest Point (ICP) and its numerous variants, which improve speed and
convergence basins. Basically, all of these techniques search transformation space, trying
to find the set of pair-wise transformations of scans by optimizing some function defined
on that space. The techniques vary in the optimization functions they define, which range
from error metrics like 'sum of least square distances' to quality metrics like 'image
distance'. 'Force Field Simulation' (FFS) minimizes a potential derived from forces
between corresponding data points. The Virtual Scan technique presented in this paper will
interact with FFS as underlying alignment technique.
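The optimization-over-transformation-space view can be made concrete with a toy error metric (a generic ICP-style sum of squared nearest-neighbor distances; our own sketch, not the FFS potential):

```python
import math

def apply_pose(pose, points):
    """Apply a 2D rigid transform pose = (x, y, heading) to (x, y) points."""
    x, y, th = pose
    c, s = math.cos(th), math.sin(th)
    return [(c * px - s * py + x, s * px + c * py + y) for px, py in points]

def sum_sq_error(pose, scan, ref):
    """'Sum of least square distances' error metric over transformation
    space (brute-force nearest neighbors, for clarity only)."""
    total = 0.0
    for p in apply_pose(pose, scan):
        total += min((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 for q in ref)
    return total

scan = [(0.0, 0.0), (1.0, 0.0)]
ref = [(1.0, 0.0), (2.0, 0.0)]
# A search over candidate translations picks the zero-error pose.
best = min(((dx, 0.0, 0.0) for dx in (0.0, 0.5, 1.0, 1.5)),
           key=lambda pose: sum_sq_error(pose, scan, ref))
```

FFS replaces this hard nearest-neighbor metric with a potential over soft correspondences, but the search space is the same.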
3. Scan Alignment using Force Field Simulation
The understanding of FFS is crucial to the understanding of the presented extension of the
FFS alignment using Virtual Scans. We will give an overview here. FFS aligns single scans Si
obtained by robots, typically from different positions. We assume the scans to be roughly
pre-aligned (see fig. 11), e.g. by odometry or shape based pre-alignment. This is in accord
with the performance comparison between FFS and ICP. FFS alignment, in