Collaborative Uncertainty-aware Navigation for Vision based Multirotor Swarms


Abstract

In this paper, we present a framework for collaborative uncertainty-aware navigation for swarms of vision based multirotor micro aerial vehicles (MAV). We assume that each MAV in a swarm is equipped with a forward-facing monocular camera, and that the vehicles are capable of using feature data to map the environment and perform vision based localization. Additionally, the vehicles are also capable of computing relative poses between each other in order to improve accuracy of pose estimation. For this scenario, we develop a navigation framework which seeks to improve the reconstructed maps and plan trajectories such that localization uncertainty is minimized. Within this framework, we first utilize an evolutionary algorithm that generates better viewpoints for the MAVs from which the map of the environment can be improved. This generated map is subsequently used as a source of information to perform path planning for each vehicle using a rapidly exploring random belief tree. This algorithm, while connecting start and goal poses with collision-free trajectories, ensures that the vehicles prioritize observing feature-rich areas and never lose sight of features, thus improving localization accuracy. Additionally, the algorithm is also capable of estimating when corrections would be required through relative poses and where these observations should be obtained, such that one vehicle can improve the accuracy of its neighbors. Through these approaches, we generate smooth, uncertainty-aware paths that are suitable for MAV navigation.
Sai Vemprala
PhD Candidate
Texas A&M University
College Station, TX, USA
Srikanth Saripalli
Associate Professor
Texas A&M University
College Station, TX, USA
Introduction

Micro aerial vehicles (MAVs) have become a popular choice in various robotic applications such as aerial surveying, photography and mapping. Using MAVs in complex scenarios creates a need for accurate localization, in which regard,
cameras are a good sensor choice. In recent years, cameras
have become ubiquitous on many MAV platforms, and their
decreasing size makes them more suitable for installation on-
board small platforms. On the other hand, small MAVs are
also suitable for deployment as a group, which is referred to
as an MAV swarm. For MAV platforms that are intended to be part of a swarm, there are certain constraints placed on size and payload, which translate into a need for small sensors and low-power computational platforms.
Vision based localization is also advantageous in the realm of swarms: both because of the small size and low power requirements of cameras, and because one of the common ways of localizing with vision, the feature-based approach, provides a simple way to represent information about the environment in view, and to transmit and receive this information between different vehicles. At the same time, this
data can also be used for collaboration within the group of
MAVs, which enables task distribution and more robust lo-
calization. Collaboration also helps reduce the computational
load and data processing on each platform.
Presented at the AHS International 74th Annual Forum &
Technology Display, Phoenix, Arizona, USA, May 14–17, 2018.
Copyright © 2018 by AHS International, Inc. All rights reserved.
Given information from the localization, MAV navigation involves solving the path planning problem, which is responsible for generating trajectories for a vehicle from a start pose to a goal pose. For safe and accurate navigation, these trajectories also need to take localization uncertainty into account. For example, in a camera based localization approach, highly textured or well-lit areas offer higher fidelity of measurements than others. For improved confidence during navigation, this information should also be a part of the navigation framework.
In this paper, we present a framework that performs collab-
orative uncertainty-aware path planning for multiple MAVs
which use vision based localization (figure 1). We assume that each MAV has access to a forward-facing camera, and the images from these cameras are used to build 3D maps and to track features. This information enables localization in full
3D space through two approaches: one is performed onboard
each vehicle by tracking features from the 3D map, and an-
other is performed between vehicles occasionally, that fuses
relative and individual measurements. Similarly, our planning
framework also uses a multi-step strategy: the two main func-
tions of the planning framework are improving existing 3D
maps using multi-vehicle constraints, and secondly, planning
trajectories for all MAVs that ensure sufficient feature visi-
bility, while also using multiple vehicles for pose correction
if needed. The planning framework is capable of generating
smooth paths that are suitable for MAV navigation, and is also
able to use occupancy grids for collision avoidance. This paper is an extension of our previous work on collaborative path planning (Ref. 1).

Fig. 1. Concept of uncertainty-aware collaborative planning. Three MAVs have individual goals but need to observe feature points to localize. The shortest path (red) is faster but results in higher covariance at the goal due to compromised feature visibility. In contrast, the path in blue affords better measurements and lower covariances at the expense of higher cost.
Related Work

The idea of using uncertainty information within the planning domain has attracted interest only recently. In
the context of sampling based planning, recent research has
examined fusing uncertainty with pose data, thus perform-
ing ‘belief’ space planning. One of the earliest works in
this area is the belief road map (BRM) algorithm, presented
in (Ref. 2), that builds a roadmap of samples along with sim-
ulated measurements along candidate paths. In (Ref. 3), be-
lief space planning is performed using maximum likelihood
observations. The rapidly exploring random belief tree algorithm (Ref. 4), which we use in this paper, combines a rapidly exploring random graph with belief information to minimize the uncertainty of the vehicle. Other belief space planning methods include direct trajectory optimization (Ref. 5)
which calculates locally optimal control policies, incorporat-
ing motion uncertainty in a roadmap framework (Ref. 6), a
localization-aware version of the RRBT algorithm (Ref. 7)
and a multi-robot belief space planning framework by mod-
eling future inferences (Ref. 8).
Recent work has also investigated belief space planning in
the challenging domain of aerial vehicles. In (Ref. 9), the
authors use RRBTs for uncertainty aware MAV navigation,
where the aerial vehicle is equipped with a downward facing
camera. In (Ref. 10), a perception aware planning approach
is discussed for aerial vehicles, where dense, direct methods
are used and photometric information of a scene is used as
a metric. Next-best-view approaches have been developed in
the literature for model improvement through surface texture
appearance optimization (Ref. 11), for multiple collaborative
sensors to perform dense reconstruction (Ref. 12) and more
recently, Bircher et al (Ref. 13) presented a receding horizon
next best view planner that can perform 3D exploration for a
micro aerial vehicle equipped with a stereo camera.
Fig. 2. Collaborative localization: Each MAV is capable
of individual localization through feature tracking (blue),
but also through relative pose estimation between each
other (green).
In this paper, as our application is aimed at vision based localization, we present an approach that combines next-best-view planning and belief space planning, thus attempting to perform both mapping and navigation in a way that reduces localization uncertainty. At the same time, our framework is mainly applicable to groups of vehicles, as it accounts for multi-robot constraints by allowing for relative pose measurements between vehicles.
System Description and Collaborative Localization
In this section, we briefly describe the collaborative localization framework for MAVs. This idea has been discussed in our previous work (Ref. 14), (Ref. 15). The system of
MAVs containing forward facing cameras is initialized with
an estimate of the scale of the environment. Using the images
from all the cameras, feature matching is performed to build a
map that contains a partial representation of the surroundings;
which is shared between all vehicles, acting as a ‘global map’.
The two types of localization possible are illustrated in fig-
ure 2. At every time step, MAVs capture images and detect
salient features in each image. In the first type of localiza-
tion, each MAV isolates the features still visible and tracked
from the global map and uses the perspective-N-point algo-
rithm (PNP) to compute its own position and orientation. This
step is known as intra-MAV localization. The second way of
localization is called inter-MAV localization, where one MAV
computes the relative pose between itself and another MAV,
and the latter fuses it with its own onboard estimate. This way
of localizing is especially useful when one MAV is able to observe more features than the other, so that it can use its higher confidence to correct the other's pose. Periodically,
the MAVs use each other’s images and overlapping feature
information (keypoint locations and descriptors) to update the
environment and the scale. Hence, the quality of localization
depends on the quality of the 3D map as well as the areas tra-
versed by the MAVs and whether or not they afford sufficient
feature visibility.
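The inter-MAV fusion step described above can be sketched as an inverse-covariance (information-form) weighted average of the two estimates. The paper does not specify the exact fusion rule, so the function and variable names below are illustrative, assuming both inputs are Gaussian estimates of the same pose:

```python
import numpy as np

def fuse_estimates(x_own, P_own, x_rel, P_rel):
    """Fuse an onboard pose estimate with one derived from a relative
    measurement by another MAV. Inverse-covariance weighting: the more
    confident estimate dominates the result."""
    I_own = np.linalg.inv(P_own)       # information matrices
    I_rel = np.linalg.inv(P_rel)
    P_fused = np.linalg.inv(I_own + I_rel)
    x_fused = P_fused @ (I_own @ x_own + I_rel @ x_rel)
    return x_fused, P_fused
```

With this rule, fusing an uncertain onboard estimate with a confident neighbor-derived one pulls the result toward the latter, and the fused covariance is always smaller than either input, matching the intended correction behavior.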
Uncertainty-aware Path Planning
As can be seen from the description of the localization algorithm, there are two modes of operation: one involving only a single MAV (intra-MAV) and the other involving multiple MAVs (inter-MAV localization, map updates etc.). All of these approaches essentially depend on the quality of the viewpoints and images captured: the number of features visible, the amount of usable texture in the images etc. To reflect this in the planning stage, we break the planning step down into two stages as well: one that is responsible for planning paths for a single MAV while ensuring feature visibility and attempting to reduce uncertainty, and another that attempts to identify the best viewpoints for a combination of MAVs to perform feature matching.
In order to plan uncertainty-aware paths, it’s necessary to de-
velop a way to ‘model’ the uncertainty beforehand, i.e, during
the planning phase. Because the vehicles always have access
to a 3D map of features, and the intrinsics of the cameras
are known, it is trivial to compute the projections of these
points on the image plane for any arbitrary pose of the ve-
hicles, thereby the cameras, in six degrees of freedom. The
actual localization algorithms also use this projection data for
computing the poses, but we note here that the algorithms that
are responsible for estimating actual poses and uncertainties
are usually computationally intensive and are not appropriate
for application during the planning phase, where thousands
of random poses need to be evaluated. Hence, we attempt to
model the uncertainties using a combination of heuristic pa-
rameters. Each of these heuristics describes a certain param-
eter relating to the viewpoint and how the features are being
projected onto the image plane; and thereby helps model a
factor in the accuracy of pose estimation given this viewpoint
and the corresponding image projection.
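The projection check underlying these heuristics follows the standard pinhole model; the sketch below counts which map points land inside the image for a candidate pose. `project_points` and its arguments are our illustrative names, assuming a known intrinsic matrix K and a world-to-camera rotation R and translation t:

```python
import numpy as np

def project_points(points_w, R, t, K, width, height):
    """Project world-frame 3D points into an image with intrinsics K.
    Returns pixel coordinates of the points that lie in front of the
    camera and inside the image bounds."""
    pts_c = (R @ points_w.T + t.reshape(3, 1)).T   # world -> camera frame
    in_front = pts_c[:, 2] > 0                     # keep points ahead of camera
    pts_c = pts_c[in_front]
    uv_h = (K @ pts_c.T).T
    uv = uv_h[:, :2] / uv_h[:, 2:3]                # perspective divide
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < width) & \
             (uv[:, 1] >= 0) & (uv[:, 1] < height)
    return uv[inside]
```

Running this for a candidate pose yields the visible-feature pixel locations from which the heuristics below (visibility ratio, span, overlap) can be computed cheaply, without invoking the full localization pipeline.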
Viewpoint Heuristics
1. Visibility: This factor calculates the ratio of the number of features visible from an MAV to the number of features that are part of the global map. This is the most important factor for localization: the more good-quality features are visible, the better the localization accuracy.
2. Span: This factor calculates the ratio of the area occupied by features to the total pixel area of the image. This factor is considered because, for both PNP and relative pose computation, large evenly distributed regions of salient features result in accurate pose computation, whereas features located in one corner of the image or along a line would result in inaccuracy or degenerate cases.
3. Overlap: This factor calculates the ratio of the number of features that are common between the fields of view of two MAVs to the total number of features in the global map. Overlap is essential while performing relative pose computation, as it decides the number of common features that can be matched between multiple views.
Fig. 3. Next best view: the best set of viewpoints is the one that has the potential to result in the most accurate map after reconstruction. This depends on factors discussed in the list of heuristics, which are then encoded in an objective function J for minimization.
4. Baseline: According to stereo geometry, the depth error while observing a distant scene is inversely related to the distance between a pair of cameras. Hence, the camera baselines must be scaled in a way that takes into account their distance to the observed scene geometry.
5. Vergence angle: For a given binocular view, the vergence
angle defines the angle between the projected rays from
the cameras at their intersection point. The best value
of vergence angle is a trade-off between matching error
and feature overlap, as high vergence angle creates more
overlap but at the expense of matching difficulties.
6. Collisions and occlusion: Configurations that can poten-
tially result in collisions between the MAVs or the scene
and occlusion (one MAV blocking the view of another)
are penalized by observing the distance/angle between
the MAVs and the environment.
Only factors 1 and 2 are used to evaluate viewpoints when planning for a single vehicle, whereas all the factors are considered when dealing with multiple vehicles. In both cases, the factors are computed and combined in such a way that the lower the resulting value, the lower the uncertainty expected from that viewpoint or set of viewpoints. It is also possible to weight these factors individually, but in our implementation, we weight them equally.
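For the single-vehicle case, one plausible way to fold the visibility and span heuristics into a single lower-is-better cost is sketched below. The exact combination is not specified in the paper; the (1 - ratio) form and the equal default weights are assumptions:

```python
import numpy as np

def viewpoint_cost(uv, n_map_features, width, height, weights=(1.0, 1.0)):
    """Combine visibility and span heuristics into a scalar cost.
    `uv` holds the projected pixel locations of the visible map features;
    lower cost suggests better expected localization."""
    if len(uv) == 0:
        return float('inf')                      # no features: unusable view
    visibility = len(uv) / n_map_features        # fraction of the map seen
    # span: bounding-box area of the features over the total image area
    bbox = (uv[:, 0].max() - uv[:, 0].min()) * (uv[:, 1].max() - uv[:, 1].min())
    span = bbox / (width * height)
    w_vis, w_span = weights
    return w_vis * (1.0 - visibility) + w_span * (1.0 - span)
```

A view covering all map features across the full image gets cost 0, while a view of few clustered features is penalized on both terms; a view with no features is rejected outright, mirroring the planner's discarding of zero-visibility samples.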
Viewpoint selection for mapping: next best view
After initialization, the MAVs have access to a possibly sparsely built 3D map. The more accurate and feature-dense this map is, the better the subsequent localization will be. Hence, as the first step, the planning algorithm attempts to compute the 'next best view' (NBV): the viewpoints that would be an improvement over the current ones (figure 3). If this map can be improved by moving the MAVs to new poses, the algorithm generates those new poses as viewpoints.
First, for all the vehicles that are being considered, a joint
state of poses is constructed in the search space of position
and yaw (we assume roll and pitch to be constant), and ran-
dom samples are chosen in this space. For each sample, the
above heuristics are computed through the image projections
and are combined into an objective function. This objective function takes a set of candidate poses for all vehicles as its input and outputs a value that represents the quality of that set of viewpoints. The configuration of poses for the vehicles that results in the least value of the objective function in a given search space is the one that can yield the best possible improvement over the existing map.
Yet, for a given sampling space where each sample is a com-
bination of candidate poses for multiple vehicles, the value
of the objective function depends heavily on scene geom-
etry and is sensitive to small changes in pose parameters:
which is why it is hard to estimate an algebraic model for
this function. Due to this constraint, we treat it as a black-box optimization problem and use a derivative-free method to solve it. In our framework, we utilize the CMA-ES (Covariance Matrix Adaptation Evolution Strategy) algorithm (Ref. 16). CMA-ES is an evolutionary algorithm that is well suited for high dimensional black-box optimization problems.
CMA-ES works by sampling a distribution of candidate solutions at every iteration, referred to as a 'population'. The algorithm determines the strongest samples by evaluating the objective function at all of them; once these values are available, the mean of the distribution is updated such that the likelihood of strong samples is maximized in the next step. The covariance matrix of this distribution is also updated at every iteration, such that the direction of the sampling is biased towards stronger solutions. We initialize the algorithm with the initial poses of the MAVs acting as the guess for the first iteration. By iterating through this process of sampling, evaluation and adaptation, the NBV planner converges to the best set of viewpoints.
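In practice a full CMA-ES implementation (e.g. the `cma` Python package) would be used here; the simplified evolution-strategy loop below only mirrors the sample-evaluate-adapt cycle described above, with illustrative names, and is not the actual CMA-ES update:

```python
import numpy as np

def evolve_viewpoints(objective, x0, sigma=1.0, popsize=16, n_elite=4,
                      iters=60, seed=0):
    """Minimize a black-box viewpoint objective with a toy evolution
    strategy: sample a population, rank by objective, re-fit the search
    mean and covariance to the elite samples."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(x0, dtype=float)
    dim = len(mean)
    cov = (sigma ** 2) * np.eye(dim)
    for _ in range(iters):
        pop = rng.multivariate_normal(mean, cov, size=popsize)
        scores = np.array([objective(x) for x in pop])
        elite = pop[np.argsort(scores)[:n_elite]]      # strongest samples
        mean = elite.mean(axis=0)                      # shift mean toward them
        centered = elite - mean
        # smooth the covariance update so the search does not collapse too fast
        cov = 0.5 * cov + 0.5 * (centered.T @ centered / n_elite) \
              + 1e-9 * np.eye(dim)
    return mean
```

Here `x0` would be the stacked initial (position, yaw) poses of all MAVs and `objective` the viewpoint-quality function built from the heuristics; the real CMA-ES additionally adapts the step size and uses weighted recombination.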
Path planning with intra-MAV localization
After the NBV planner produces new viewpoints, an updated map is reconstructed from those poses. At this point, the vehicles have access to a sufficiently informative 3D map, and the next step is to connect the provided start and goal states through collision-free paths that pass through areas assuring localization throughout navigation, preferably such that the uncertainty is reduced during path execution (figure 4).
We recall here that this localization, in the context of a single
vehicle, is performed using the PNP algorithm and 3D-2D fea-
ture tracking with the reconstructed map. In order to generate
plans such that localization is improved during trajectory exe-
cution, we use the sampling-based approach known as rapidly
exploring random belief trees (RRBT) where the planner at-
tempts to minimize localization uncertainty.
The foundation of the RRBT (Ref. 4) is to combine graph con-
struction and uncertainty estimation. The RRBT constructs a
graph of randomly sampled poses with the poses acting as the
vertices, and possible valid movements between the poses be-
ing the edges. At every iteration, a random pose is sampled
and the algorithm computes the image projection of the available 3D map from this pose. If this vertex is valid (collision free), we use the visibility and span heuristics from the previous list to quantify how good this viewpoint would be in terms of localization. The sampling of the algorithm is biased by discarding any poses that result in fewer than a minimum number of features being visible. Once this check is performed, the vertex is connected to its nearest existing neighbor.

Fig. 4. Uncertainty-aware planning for one MAV. The shortest path in orange does not afford enough feature tracking, which causes the covariance to increase. In contrast, an 'uncertainty-aware' path modifies the route such that better measurements are obtained, thus lowering the covariance.
In order to represent the quality of these vertices, each vertex stores two values: an estimated covariance Σ and a path cost c. The estimated covariance at any vertex is unique to the path taken to reach that vertex, and encodes a representation of the uncertainty of the vehicle. This uncertainty is a combination of both the quality of the viewpoint and the path taken to reach it. It is important to note here that this covariance is purely indicative of the quality of the viewpoint for the planner's own purposes, and does not actually represent the pose covariance one would observe through localization. However, the covariance matrices are expressed in such a way that the change in covariance over a path does match what is expected during localization. The path cost c is the sum of the Euclidean distances of all the edges traversed to reach the vertex. For each vertex added, these two values are combined into an objective function J. For any vertex vi, and a possible way nj to reach it, the effect of the estimated covariance is represented through the trace of the matrix nj.Σ, and the path cost is represented as nj.c.
J(vi.nj) = wc (vi.nj.c) + wΣ Tr(vi.nj.Σ)    (1)
During the course of the algorithm, new samples help discover
new ways to reach the same vertex. Some of these new paths
could take the MAV through texture-rich areas, thus affording
better localization. Hence, to help the planner drive the vehi-
cle to regions that can improve the belief, the RRBT algorithm
examines the best way of reaching a vertex by comparing the
objective functions between various possible paths. The path
that results in the least value of this objective function is con-
sidered to be the best until a new one is discovered.
The parameters wc and wΣ in equation 1 are tuning parameters. If the value of this objective function for one path n1 is higher than the value from a different path n2, the edge responsible for n1 is pruned, making the path creating n2 the preferred one. If this happens, the vertex that was modified is added to a queue, and the belief domination check repeats for its neighbors. This queue ensures that the discovery of areas affording good measurements and higher reduction in uncertainty is propagated recursively through the entire existing graph, rewiring the whole graph. The objective function in equation 1 is also responsible for a cost vs. uncertainty-reduction trade-off. Setting wΣ to zero would cause the algorithm to behave similarly to an RRT*, choosing paths solely based on minimizing the travel distance. Increasing the value of wΣ makes the algorithm prefer covariance minimization over path cost.
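The comparison driven by equation 1 can be sketched as follows; `belief_cost` and `best_belief` are our names for this sketch, and the full RRBT queue and rewiring logic is omitted:

```python
import numpy as np

def belief_cost(c, Sigma, w_c=1.0, w_sigma=1.0):
    """Objective of equation (1): weighted sum of the path cost and the
    trace of the estimated covariance at a vertex."""
    return w_c * c + w_sigma * np.trace(Sigma)

def best_belief(beliefs, w_c=1.0, w_sigma=1.0):
    """Among candidate (cost, covariance) pairs for reaching the same
    vertex, keep the one minimizing J; the others would be pruned."""
    return min(beliefs, key=lambda b: belief_cost(b[0], b[1], w_c, w_sigma))
```

With w_sigma = 0 this reduces to pure shortest-path selection, as in an RRT*; increasing w_sigma makes longer but lower-covariance paths win the comparison, which is exactly the trade-off described above.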
Viewpoint selection for inter-MAV estimation
In this subsection, we discuss how the navigation framework
is able to identify situations where inter-MAV estimation is
necessary, and attempts to come up with the best locations to
command these relative measurements. We recall that, as mentioned previously, the inter-MAV estimation step consists of one vehicle generating a relative measurement to another vehicle, which in turn fuses this estimate with its own internal estimate for a corrected version of its pose. From
the perspective of the planning framework, the relative mea-
surement’s accuracy depends on the same factors that map-
ping would usually depend upon: such as overlap, baseline
etc. In order to implement inter-vehicle estimation within the
planning framework, we combine the RRBT based individual
path planning with the next-best-view planner that can iden-
tify potentially good viewpoints for multiple vehicles to share
information at.
Once the vehicles have access to an initial path plan generated
from the RRBT algorithm, all the paths are sampled equally
in order to have a rough estimate of the locations of vehicles
over time. Then, the estimated values of covariances are an-
alyzed to find out at which locations the covariances actually
deteriorate. In our algorithm, we consider samples at which
the trace of the covariance becomes twice the initial value as
a marker for deterioration: and at these samples, a search is
conducted over a certain radius to check whether any other
vehicles would be present near that location. If another ve-
hicle is found (based on the estimated paths), a slice of the
3D volume the vehicles are navigating in is isolated and the
NBV planner is executed in order to find a pair of viewpoints
that optimize the multi-vehicle heuristic factors. We note here that because the slice of the environment considered is considerably smaller than the usual search space, the NBV planner converges more quickly. Through this application of
the NBV planner, a potential pair of viewpoints is generated,
through which an inter-vehicle estimation would be possible,
and subsequently would result in a lower covariance.
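The deterioration check described above, using the doubling of the covariance trace as a marker, might be sketched as follows; the function name, data layout, and the assumption that all paths are sampled at matching indices are ours:

```python
import numpy as np

def find_correction_sites(path_samples, covariances, other_paths, radius):
    """Scan one vehicle's sampled path for points where trace(Sigma) has
    grown to twice its initial value, then check whether another
    vehicle's sampled path passes within `radius` at the same sample
    index. Returns (sample index, other vehicle id) pairs."""
    tr0 = np.trace(covariances[0])
    sites = []
    for i, (p, S) in enumerate(zip(path_samples, covariances)):
        if np.trace(S) >= 2.0 * tr0:                 # covariance deteriorated
            for vid, other in other_paths.items():
                if i < len(other) and np.linalg.norm(other[i] - p) <= radius:
                    sites.append((i, vid))           # candidate rendezvous
    return sites
```

Each returned site would then seed the NBV planner on a local slice of the environment to generate the actual pair of measurement viewpoints.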
In figure 5(b), we show a simple test case. An existing map
containing 3D points is shown in black, and two MAVs are
located at two start positions (dark blue), and are required to
move to individual goals (in green). Normally, if the RRBT
Fig. 5. Combining NBV and RRBT planning: In case the
constraints on RRBT do not allow for sufficient reduction
in covariance, it is balanced by attempting inter-MAV lo-
calization wherever possible.
algorithm were applied to one of these MAVs, it would re-
sult in a path as in 5(a), where it moves closer to the features
to improve the covariance, and then to the goal. To demon-
strate a requirement for inter-MAV localization, we constrain
the weight on the covariance part of the objective function:
which means the MAVs are not allowed to move forwards
closer to the map to obtain better measurements. In such a
case, the existence of the other MAV can be exploited to im-
prove the individual measurements. The algorithm isolates the
node where the covariance becomes twice its original value,
isolates a slice of area around the MAVs and attempts to generate viewpoints for an inter-MAV measurement. In 5(c), we see that both RRBT paths are modified to account for this newly generated pair of points (indicated in yellow), which also tries to maximize feature overlap by moving the MAVs closer to each other. After this measurement, the MAVs proceed to their goals.
Map updates
During navigation in sparsely populated areas, the visibility
of a mapped area could decrease over time due to long dis-
tance motion of the vehicles. This is also especially common
in indoor spaces where a constant scene might not be visi-
ble throughout a trajectory, thus necessitating updating of the
map. Such cases are identified during RRBT path planning by
evaluating the profile of the factor ‘visibility’ throughout the
planned path. As mentioned before, any random pose sample
that results in zero visibility is immediately discarded, hence
by examining the path it is possible to identify the last location at which some features are still visible. Similar to inter-vehicle estimation, we attempt to isolate nearby vehicles and build a map. If a new map is built with additional features, the RRBT algorithm is run again on each vehicle in order to update the existing path and provide a new way to reach the goal from the current location.
Smooth trajectory generation
As a final step during path planning, it is important to gener-
ate smooth plans that are suitable for navigation with an aerial
vehicle. Algorithms such as the RRBT in this application are only responsible for generating coarse paths that essentially identify 'good' areas in the map, as well as necessary yaw angles etc. that result in sufficient visibility of feature-rich areas. In order to improve these plans before actual navigation, we use the information of connected samples from the path to generate a smoother trajectory. We use the polynomial trajectory generation approach presented in (Ref. 17) to generate these smooth paths. This approach considers a differentially flat representation of the vehicle model and connects the coarse path segments into a smooth path through an optimization that trades off the traversal time against the snap of the path.
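As a lightweight stand-in for the snap-optimal polynomials of (Ref. 17), the sketch below interpolates coarse planner waypoints with cubic Hermite segments (Catmull-Rom tangents); it illustrates only the smoothing step, not the time/snap optimization of the actual approach:

```python
import numpy as np

def smooth_path(waypoints, samples_per_seg=20):
    """Interpolate coarse waypoints with per-segment cubic Hermite
    curves, using central-difference tangents (one-sided at the ends)."""
    wp = np.asarray(waypoints, dtype=float)
    tan = np.gradient(wp, axis=0)                  # per-waypoint tangents
    out = []
    for i in range(len(wp) - 1):
        p0, p1, m0, m1 = wp[i], wp[i + 1], tan[i], tan[i + 1]
        t = np.linspace(0.0, 1.0, samples_per_seg, endpoint=False)[:, None]
        # cubic Hermite basis functions
        h00 = 2 * t**3 - 3 * t**2 + 1
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        out.append(h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1)
    out.append(wp[-1:])                            # include the final waypoint
    return np.vstack(out)
```

The resulting curve passes through every coarse waypoint with continuous position and tangent direction; the actual min-snap formulation additionally optimizes segment times and higher derivatives for dynamic feasibility.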
Fig. 6. NBV planner attempting to improve a map. An initial point cloud of 3D features (a) is provided as input to the algorithm, which optimizes an objective function and results in new viewpoints (b). Once the MAVs move to these new poses and reconstruct a map again, a significantly better map results (c).
Results

In this section, we demonstrate the application of the navigation framework to a group of MAVs and discuss how it results in uncertainty-aware planning and thereby more accurate localization. We perform our experiments in the recently released UAV simulator Microsoft AirSim (Ref. 18). AirSim
is written atop Unreal Engine, a widely used videogame en-
gine that provides photorealistic textures, lighting, soft shad-
ows and various other high fidelity visual features that re-
sult in close-to-real-life settings to test computer vision al-
gorithms. AirSim contains basic flight control algorithms
and position/velocity control APIs, while providing access to
camera streams onboard the drones. We record front camera
views at an approximate frequency of 5 Hz as the vehicles move through various trajectories.

Fig. 7. Working of the RRBT planner. Intra-MAV localization can be improved by driving the MAV towards feature-rich areas if the weight on the covariance in the objective function is higher (a). On the other hand, if the weight on the path cost is higher, the algorithm returns shorter paths, while still ensuring feature visibility (b).

Throughout this section,
we present various test cases aimed at each feature of the nav-
igation framework described in the previous section.
Next best view planning
Here, we show a sample result from the next best view plan-
ner, which, when given a sparse map and initial locations of
two MAVs, generates viewpoints for both MAVs that can re-
sult in a better map. The initial map and the locations of the
MAVs are shown in figure 6(a). The NBV planner is run for a
maximum of 150 iterations and computes the pair of viewpoints that results in the least value of the objective function (figure 6(b)). We then obtain new images from these viewpoints for both MAVs, and it can be seen that the reconstruction from this pair of viewpoints results in a significantly improved map (figure 6(c)).
RRBT planning for intra-MAV estimation
In this section, we show sample results from the RRBT plan-
ner. As mentioned before, this planner requires a 3D map
as well as user-defined start and goal states. The planner is
also capable of producing collision-free paths as samples in
proximity to mapped points are discarded. In figure 7, we
show the working of the RRBT planner and how it changes
with changes in the weighting factors.

Fig. 8. NBV+RRBT planner working on a pair of MAVs. (a) shows the planner result with markers pointing out where the relative measurements are commanded. (b) compares the RMS errors of inter-MAV localization vs. intra-MAV localization, showing that for the same images, the former is more accurate. (c) shows that adding inter-MAV localization data improves the overall position error.

Fig. 9. RRBT planner working on three MAVs individually.

In figure 7(a), more weight was given to the covariance of the poses in the objective function, and this results in a path that acquires more
‘good’ measurements with proximity to the texture, at the ex-
pense of higher path cost. In 7(b), we see the opposite, be-
cause of higher weight on the path cost parameter. This results
in a path that tries to minimize path cost, but still ensures some
visibility of the features, without which localization would not
be possible.
In figure 9, we show the same RRBT algorithm now applied
to multiple MAVs. Two of the three MAVs start at locations
with bad visibility of features (paths in blue and red), but
quickly fly towards the center to observe more features, be-
fore reaching the goal. A balanced choice of cost and covari-
ance weights results in a equal trade-off between path cost and
feature visibility.
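The effect of the weighting factors can be illustrated with a scalar belief-propagation sketch. This is a simplified stand-in for the RRBT, which propagates full covariance matrices; here, variance grows with motion noise along each edge and shrinks through a scalar Kalman update per visible feature, and candidate paths are scored by a weighted sum of path cost and terminal variance. All names, noise values, and weights below are illustrative assumptions.

```python
def propagate_belief(var0, edge_lengths, n_visible, q=0.05, r=0.5):
    """Propagate a scalar position variance along a path. Each edge adds
    motion noise q per unit length; at each vertex, one measurement update
    (measurement variance r) is applied per visible feature."""
    var = var0
    for length, n in zip(edge_lengths, n_visible):
        var += q * length                 # prediction: noise accumulates
        for _ in range(n):                # correction: features shrink variance
            var = var * r / (var + r)     # scalar Kalman update
    return var

def path_score(edge_lengths, n_visible, var0=0.1, w_cost=1.0, w_cov=1.0):
    """RRBT-style objective: weighted sum of path cost and terminal variance."""
    return w_cost * sum(edge_lengths) + w_cov * propagate_belief(
        var0, edge_lengths, n_visible)

if __name__ == "__main__":
    short_path = ([4.0, 4.0], [0, 0])   # short, but no features observed
    long_path = ([6.0, 6.0], [5, 5])    # longer detour through feature-rich area
    # A high covariance weight favors the feature-rich detour:
    print(path_score(*short_path, w_cov=50.0) > path_score(*long_path, w_cov=50.0))
```

With a high covariance weight the longer, feature-rich path wins; with a low weight the short path does, matching the behavior shown in figures 7(a) and 7(b).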
Next best view planning for map updating
In figure 10, we show the results from a sample map update
scenario using the NBV planner. During the planning stage,
the MAVs keep track of the last possible vertex where they
had enough visibility of features, and these are used as seed
poses for the NBV planner to optimize. At each such update
step, a new map is built and added to the existing 3D map.
In (a)-(d), we show the incremental updates to the map from
each NBV optimization and subsequent reconstruction. In (f),
the final full map is shown.
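The seed-pose bookkeeping described above can be sketched as a scan over the planned path. The minimum-feature threshold and the (pose, visible-count) layout below are assumptions for illustration:

```python
def last_good_seed(path, visible_counts, min_features=20):
    """Scan a planned path and return the last vertex at which the MAV
    still observed at least `min_features` map points; this pose seeds
    the next NBV optimization. Returns None if no vertex qualifies."""
    seed = None
    for pose, n in zip(path, visible_counts):
        if n >= min_features:
            seed = pose
    return seed

if __name__ == "__main__":
    path = [(0, 0, 2), (1, 0, 2), (2, 0, 2), (3, 0, 2)]
    counts = [40, 25, 12, 3]   # feature visibility degrades along the path
    print(last_good_seed(path, counts))  # -> (1, 0, 2)
```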
RRBT+NBV planning for inter-MAV estimation
In this section, we show some test cases that demonstrate the
working of the extension to the RRBT that allows for identi-
fying the requirement of inter-MAV localization.
A sample map generated from AirSim is used as the 3D map, and path planning is attempted for two MAVs moving backwards by a distance of 25 m. With constraints placed on the path cost allowed during the trajectory, the planner is forced to compute a low-cost path, which naturally increases the covariances of the MAV poses during navigation. Hence, the planner attempts to generate locations at which inter-MAV measurements would be both necessary (based on the increase in covariance) and possible. At these locations, the images captured by the MAVs are used to generate relative pose measurements from MAV 0 to MAV 1, thus correcting the pose of MAV 1. The final plan is shown in figure 8(a), with the inter-MAV measurement locations marked as yellow spheres.
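The decision of where inter-MAV measurements are necessary and possible can be sketched as a per-vertex test: the MAV's predicted variance has grown past a bound, a neighbor is better localized, and the two can observe each other. The threshold and data layout below are illustrative assumptions, not the paper's exact criterion.

```python
def schedule_relative_measurements(pred_vars, neighbor_vars, mutual_view,
                                   max_var=1.0):
    """Return indices of path vertices where an inter-MAV measurement
    should be commanded: the MAV's predicted variance exceeds max_var,
    the neighbor is better localized, and the pair can see each other."""
    commands = []
    for i, (v, nv, seen) in enumerate(zip(pred_vars, neighbor_vars, mutual_view)):
        if v > max_var and nv < v and seen:
            commands.append(i)
    return commands

if __name__ == "__main__":
    pred = [0.2, 0.8, 1.4, 2.0, 2.5]   # MAV 1's variance grows while flying backwards
    neigh = [0.1, 0.1, 0.2, 0.2, 3.0]  # MAV 0 stays well localized until the end
    view = [True, True, True, False, True]
    print(schedule_relative_measurements(pred, neigh, view))  # -> [2]
```

Vertex 3 is skipped because the MAVs cannot see each other there, and vertex 4 because the neighbor's own estimate is worse; a relative measurement would not help in either case.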
In figure 8(c), we see that generating inter-MAV measurements does indeed result in improved localization for MAV 1. These plans, along with the inter-MAV measurements, were executed within AirSim to obtain actual localization results. We notice that without inter-intra fusion, there is more drift in the position of MAV 1, but with the fused measurements, the error between the ground truth and the estimated position is lower. At the same time, we plot the RMS reprojection errors from MAV 1 in order to compare the accuracy of vision based localization for inter- vs. intra-MAV measurements. For the same image, intra-MAV localization results in more reprojection error (the farther the camera is from the scene, the higher this error can be), whereas incorporating the additional source of information from MAV 0 results in a lower RMS error. This is due to the inherent robustness of the feature-matching based pose estimation algorithm compared to that of the PnP algorithm.

Fig. 10. Map updates commanded during trajectory execution. (a) to (d) show the incremental development of the map. (e) and (f) compare the actual scene from AirSim with the final map generated.
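The benefit of fusing the inter-MAV measurement with the intra-MAV estimate can be illustrated with inverse-variance weighting, the scalar analogue of covariance-weighted pose fusion (the actual system fuses full poses; the numbers below are illustrative):

```python
def fuse(est_a, var_a, est_b, var_b):
    """Inverse-variance (information) fusion of two independent estimates.
    The fused variance is never larger than the smaller input variance."""
    w_a = 1.0 / var_a
    w_b = 1.0 / var_b
    fused_var = 1.0 / (w_a + w_b)
    fused_est = fused_var * (w_a * est_a + w_b * est_b)
    return fused_est, fused_var

if __name__ == "__main__":
    # Intra-MAV estimate of MAV 1's x position vs. inter-MAV estimate via MAV 0:
    intra = (10.4, 0.9)   # drifted, high variance
    inter = (10.05, 0.1)  # relative measurement, low variance
    x, v = fuse(*intra, *inter)
    print(round(x, 3), round(v, 3))
```

The fused variance is always at most the smaller of the two input variances, which is why adding the relative measurement from MAV 0 can only tighten MAV 1's estimate under this independence assumption.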
Another test case is presented in figure 11. In this case, three MAVs exist in a group observing a building and the associated 3D map. Only one MAV, the one in the center, has full visibility of the features, whereas the others can only observe a small subset. Similar to before, we set the weight on the covariance parameter to zero, thus forcing the system to rely on inter-MAV measurements for effective navigation because the path cost needs to be minimized. Because of the low visibility and span of the features (note figures 11(a) and 11(b)), the algorithm results in a plan that requires inter-MAV localization to be performed at every step, as this results in lower covariance than individual localization. In figure 11(c), we show the resultant plan and the instants at which the inter-MAV measurements are commanded (all the steps, displayed as small spheres for two of the MAVs). In order to evaluate this, we executed these trajectories within AirSim, but with both inter- and intra-MAV localization for the sake of comparison. As predicted by the planning algorithm, performing solely intra-MAV localization, while it still works because of non-zero feature visibility, is highly erratic (figure 11(d)), whereas inter-MAV localization results in a better position profile that is closer to the ground truth.
Smooth trajectory generation
As mentioned before, the planner is only responsible for generating coarse paths that connect the start and goal states through pose samples that afford better visibility or help reduce covariance. The connections between these rough locations might not always be feasible for aerial navigation, nor are they smooth, which could result in undesirable maneuvers for the vehicle. In figure 12, we show an example of how a coarse path from the planner is converted into a smooth path. All the locations returned by the planner remain part of the final smoothed path. Another sample case is presented in figure 13, where it can be noticed that the smoothing is performed over all four dimensions of position and yaw (figure 13(b)), thus ensuring the cameras are always pointed at the desired texture-rich areas.
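As a minimal sketch of why the smoothing must treat yaw specially, the upsampler below interpolates a coarse (x, y, z, yaw) path, blending yaw along the shortest arc so the camera never swings the long way around the ±π seam. This is plain linear interpolation for illustration only, not the polynomial trajectory generation used for the actual paths; all names are assumptions.

```python
import math

def lerp_angle(a, b, t):
    """Interpolate between two yaw angles along the shortest arc."""
    d = (b - a + math.pi) % (2.0 * math.pi) - math.pi
    return a + t * d

def densify(waypoints, steps=10):
    """Upsample a coarse (x, y, z, yaw) path by linear interpolation.
    Positions are interpolated per-axis; yaw uses shortest-arc blending.
    All original waypoints remain part of the output path."""
    out = []
    for p0, p1 in zip(waypoints, waypoints[1:]):
        for k in range(steps):
            t = k / steps
            out.append((
                p0[0] + t * (p1[0] - p0[0]),
                p0[1] + t * (p1[1] - p0[1]),
                p0[2] + t * (p1[2] - p0[2]),
                lerp_angle(p0[3], p1[3], t),
            ))
    out.append(waypoints[-1])
    return out

if __name__ == "__main__":
    coarse = [(0, 0, 2, 3.0), (5, 0, 2, -3.0)]  # yaw crosses the +/-pi seam
    fine = densify(coarse, steps=4)
    # Mid-segment yaw passes near pi, not through zero:
    print(round(fine[2][3], 3))
```

Interpolating yaw as a plain number here would sweep the camera through almost a full rotation away from the texture, which is exactly the kind of maneuver the smoothing step has to avoid.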
Conclusions

In this paper, we presented a collaborative uncertainty-aware
navigation framework for micro aerial vehicle swarms. The
MAVs are equipped with forward facing monocular cam-
eras, using which they can localize through feature-based ap-
proaches. The localization algorithm has two modes of opera-
tion: pose estimation is performed either individually or using
multiple MAVs, and map building is performed using mul-
tiple MAVs. The navigation framework is also a multi-step
approach. We use an evolutionary algorithm (CMA-ES) as a
next-best-view (NBV) planner to identify best viewpoints for
map building and updates as well as relative pose estimation.
This allows the framework to improve the 3D map and generate accurate relative poses, thus increasing the accuracy of the localization as a whole. To perform path planning for each vehicle, we use a rapidly exploring random belief tree (RRBT) algorithm, which attempts to generate collision-free trajectories for the MAVs while also making sure that each MAV has a sufficient number of features visible throughout the path and reaches the goal with a lower covariance.
Our approach was validated through image datasets from the photorealistic UAV simulator Microsoft AirSim. We have demonstrated that through this navigation framework, MAVs were able to build improved maps, and that the paths resulting from the planner, when executed in the simulator, did result in improved localization accuracy compared to other methods such as shortest path. We have also demonstrated that the navigation framework is able to incorporate multi-robot constraints and command relative pose measurements when required, which in turn helps improve the localization of the vehicles through information fusion. Finally, we showed that the framework results in smooth paths that are suitable for MAV navigation.

Fig. 11. The planner attempts to generate inter-MAV localization viewpoints for a challenging scenario. The MAV that has the most visibility of features is commanded to compute relative poses for the others at every step (c), because intra-MAV localization would result in too high a covariance. (d) validates this by comparing position errors from inter- vs. intra-MAV localization.
In terms of future work, we aim to combine our localization
and navigation algorithms and apply them on real world im-
age datasets, thus validating the algorithm in realistic settings.
At the same time, we wish to implement the algorithms in
real time and onboard a group of MAVs, for which we aim
to optimize the algorithm’s performance and also to combine
all the features in a single, ROS-based framework. Further work would involve extensive testing and validation of the algorithm in different real world scenarios.

Fig. 12. Coarse path returned by the RRBT planner (a), converted into a smooth path (b). The scene is represented as an occupancy grid.

Fig. 13. Another sample case of smooth trajectory generation. Both position and yaw changes are converted into smooth transitions between vertices, while still adhering to constraints from the vision heuristics such as visibility.
Author contact: Sai Vemprala; Srikanth Saripalli.
1. Sai Vemprala and Srikanth Saripalli. Vision based collaborative path planning for micro aerial vehicles. In 2018 IEEE International Conference on Robotics and Automation (ICRA). 2018 (accepted, publication pending).
2. Sam Prentice and Nicholas Roy. The belief roadmap: Efficient planning in linear POMDPs by factoring the covariance. In Robotics Research, pages 293–305. Springer, 2010.
3. Robert Platt Jr., Russ Tedrake, Leslie Kaelbling, and Tomas Lozano-Perez. Belief space planning assuming maximum likelihood observations. In Robotics: Science and Systems (RSS), 2010.
4. Adam Bry and Nicholas Roy. Rapidly-exploring random belief trees for motion planning under uncertainty. In 2011 IEEE International Conference on Robotics and Automation (ICRA), pages 723–730. IEEE, 2011.
5. Jur Van Den Berg, Sachin Patil, and Ron Alterovitz. Motion planning under uncertainty using iterative local optimization in belief space. The International Journal of Robotics Research, 31(11):1263–1278, 2012.
6. Ali-Akbar Agha-Mohammadi, Suman Chakravorty, and Nancy M. Amato. FIRM: Sampling-based feedback motion-planning under motion uncertainty and imperfect measurements. The International Journal of Robotics Research, 33(2):268–304, 2014.
7. Vinay Pilania and Kamal Gupta. A localization aware sampling strategy for motion planning under uncertainty. In 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 6093–6099. IEEE, 2015.
8. Vadim Indelman. Towards cooperative multi-robot belief space planning in unknown environments. In Proc. of the Intl. Symp. of Robotics Research (ISRR), 2015.
9. Markus W. Achtelik, Simon Lynen, Stephan Weiss, Margarita Chli, and Roland Siegwart. Motion- and uncertainty-aware path planning for micro aerial vehicles. Journal of Field Robotics, 31(4):676–698, 2014.
10. Gabriele Costante, Christian Forster, Jeffrey Delmerico, Paolo Valigi, and Davide Scaramuzza. Perception-aware path planning. arXiv preprint arXiv:1605.04151, 2016.
11. Enrique Dunn and Jan-Michael Frahm. Next best view planning for active model improvement.
12. Oscar Mendez, S. J. Hadfield, Nicolas Pugeault, and Richard Bowden. Next-best stereo: Extending next best view optimisation for collaborative sensors. In Proceedings of BMVC 2016.
13. Andreas Bircher, Mina Kamel, Kostas Alexis, Helen Oleynikova, and Roland Siegwart. Receding horizon "next-best-view" planner for 3D exploration. In 2016 IEEE International Conference on Robotics and Automation (ICRA), pages 1462–1468. IEEE, 2016.
14. Sai Vemprala and Srikanth Saripalli. Monocular vision based collaborative localization for micro aerial vehicle swarms. arXiv preprint arXiv:1804.02510, 2018.
15. Sai Vemprala and Srikanth Saripalli. Vision based collaborative localization for multirotor vehicles. In 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1653–1658. IEEE, 2016.
16. Nikolaus Hansen and Andreas Ostermeier. Adapting arbitrary normal mutation distributions in evolution strategies: The covariance matrix adaptation. In Proceedings of the 1996 IEEE International Conference on Evolutionary Computation, pages 312–317. IEEE, 1996.
17. Charles Richter, Adam Bry, and Nicholas Roy. Polynomial trajectory planning for aggressive quadrotor flight in dense indoor environments. In Robotics Research, pages 649–666. Springer, 2016.
18. Shital Shah, Debadeepta Dey, Chris Lovett, and Ashish Kapoor. AirSim: High-fidelity visual and physical simulation for autonomous vehicles. In Field and Service Robotics. Springer, 2018.