Figure 3 - uploaded by Kevin Doherty

# 3D dynamic scene graphs (129). These are a recent application of scene graphs (previously common in the computer graphics community) to the SLAM problem and provide a substantial step toward linking scene understanding and spatial perception methods. Figure courtesy of A. Rosinol.

Source publication

Simultaneous localization and mapping (SLAM) is the process of constructing a global model of an environment from local observations of it; this is a foundational capability for mobile robots, supporting such core functions as planning, navigation, and control. This article reviews recent progress in SLAM, focusing on advances in the expressive cap...

## Context in source publication

**Context 1**

... particular, 3D scene graph models (130,129) present a promising representational direction toward capturing object-level semantics, environment dynamics, and multiple spatial and semantic layers of abstraction (from the connectedness of unoccupied space, to rooms and buildings, and beyond). Scene graphs model the environment in terms of a directed graph where nodes can be entities such as objects or places and edges represent relationships between entities (depicted in Figure 3). The relationships modeled by a scene graph may be spatial or logical. ...
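The directed-graph structure described in this excerpt can be illustrated with a minimal sketch. It is not the data model of the cited scene graph systems: the layer names, the relation labels (`contains`, `on_top_of`), and all class and function names are hypothetical choices made only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """An entity in the scene graph, e.g. an object, place, room, or building."""
    node_id: str
    layer: str  # hypothetical layer names: "object", "place", "room", "building"


@dataclass
class SceneGraph:
    """Directed graph: nodes are entities, labelled edges are relationships."""
    nodes: dict[str, Node] = field(default_factory=dict)
    # Adjacency list: source id -> list of (relation label, target id).
    edges: dict[str, list[tuple[str, str]]] = field(default_factory=dict)

    def add_node(self, node_id: str, layer: str) -> None:
        self.nodes[node_id] = Node(node_id, layer)
        self.edges.setdefault(node_id, [])

    def add_edge(self, source: str, relation: str, target: str) -> None:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges[source].append((relation, target))

    def related(self, source: str, relation: str) -> list[str]:
        """Return targets reachable from `source` via edges labelled `relation`."""
        return [t for (r, t) in self.edges[source] if r == relation]


def build_example() -> SceneGraph:
    """Build a toy graph with logical (contains) and spatial (on_top_of) edges."""
    g = SceneGraph()
    g.add_node("building_1", "building")
    g.add_node("kitchen", "room")
    g.add_node("place_7", "place")
    g.add_node("table", "object")
    g.add_node("mug", "object")
    g.add_edge("building_1", "contains", "kitchen")
    g.add_edge("kitchen", "contains", "place_7")
    g.add_edge("kitchen", "contains", "table")
    g.add_edge("mug", "on_top_of", "table")
    return g
```

Calling `build_example().related("kitchen", "contains")` lists the entities one layer below the kitchen. Keeping the relation label on the edge, rather than in the node, lets spatial and logical relationships coexist in the same graph.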

## Similar publications

It is essential to promote the intelligence and autonomy of Maritime Autonomous Surface Ships (MASSs). This study proposed an automatic collision-avoidance method based on an improved Artificial Potential Field (APF) with the formation of MASSs (F-MASSs). Firstly, the navigation environment model was constructed by the S-57 Electronic Navigation Ch...

## Citations

... In this process, how to reduce the uncertainty caused by sensor errors is very important. However, the neural network model itself has certain uncertainty, so when deep learning is introduced into SLAM, the uncertainty brought by deep learning is a factor to be dealt with [6]. Sünderhauf et al. [7] believe that the perception, decision-making, and action of robots all depend on incomplete and uncertain prior knowledge. ...

In recent years, some researchers have combined deep learning methods such as semantic segmentation with visual SLAM to improve the performance of classical visual SLAM. However, this approach introduces the uncertainty of the neural network model. To address this problem, an improved feature selection method based on information entropy and feature semantic uncertainty is proposed in this paper. The former is used to obtain fewer and higher-quality feature points, while the latter is used to correct the uncertainty of the network in feature selection. In the initial stage of feature point selection, this paper first filters out the feature points of absolutely dynamic objects using the a priori information provided by the feature point semantic labels. Second, potentially static objects are detected using epipolar geometric constraints. Finally, the semantic uncertainty of features is corrected according to the semantic context. Experiments on the KITTI odometry dataset show that, compared with SIVO, the translation error is reduced by 12.63% and the rotation error is reduced by 22.09%, indicating that our method has better tracking performance than the baseline method.

... The vehicle can use the scanned information to construct a 2D map of its surroundings, including walls and pillars. The robot may determine its position using SLAM by comparing the local map to the factory map [61]. ...

Autonomy offers significant advantages for mobile robots by eliminating the need for human operators, thereby enhancing safety and cost-effectiveness. Path planning is an essential component of achieving autonomy, as it empowers robots to thoughtfully navigate between different areas. This study explores the most recent developments in automated guided vehicles (AGVs) and autonomous mobile robots during the previous ten years. It encompasses a wide range of AGV research topics from both historical and contemporary perspectives. AGVs play a vital role in modern logistics networks, offering time savings and the potential to minimize wear and capital costs through efficient path planning. Numerous approaches to aid in the path-planning procedure for mobile robotics have been suggested and documented in scholarly research. While perfection is not guaranteed, these methods have demonstrated impressive efficacy in practical applications. The study evaluates models, optimization benchmarks, and solution techniques employed for charting optimal courses for mobile robots. Both field researchers and AGV developers encounter challenges in navigating the expanding array of algorithms designed for diverse applications. Digital twins emerge as pivotal tools in AGV systems, contributing to the development and implementation of control algorithms. This research aims to do a comprehensive examination of various AGV-related control strategies and cutting-edge algorithms, including those used in early models and more recent AGV systems.

... We address this issue by leveraging a recent body of work in the robotics and vision communities that deals with so-called certifiably correct methods [58]. These methods use convex semidefinite relaxations of non-convex, polynomial optimization problems (POPs) to either directly find a global optimum or provide a certificate of global optimality for a given solution. ...

... In this section, we review the well-known procedure for deriving convex, SDP relaxations of a standard form of POP. This procedure was pioneered by Shor [59] and has become the cornerstone of certifiably correct methods in robotics and computer vision [58,10]. ...
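For readers unfamiliar with the procedure mentioned in this excerpt, the following is a sketch of the textbook Shor relaxation for a quadratically constrained problem. The symbols $Q$, $A_i$, $b_i$, and $X$ are notation assumed here for illustration and are not taken from the citing paper. The non-convex problem is

$$
\min_{x \in \mathbb{R}^n} \; x^\top Q x \quad \text{s.t.} \quad x^\top A_i x = b_i, \quad i = 1, \ldots, m.
$$

Writing $X = x x^\top$ gives $x^\top Q x = \mathrm{tr}(Q X)$, so the problem is linear in $X$ apart from the constraint $\mathrm{rank}(X) = 1$. Dropping that rank constraint yields the convex semidefinite program

$$
\min_{X \succeq 0} \; \mathrm{tr}(Q X) \quad \text{s.t.} \quad \mathrm{tr}(A_i X) = b_i, \quad i = 1, \ldots, m.
$$

Its optimal value is a lower bound on the original problem. If an optimal $X^\star$ has rank one, then $X^\star = x^\star x^{\star\top}$ and $x^\star$ is a global optimum of the original problem, which is the sense in which the relaxation is called tight.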

Differentiable optimization is a powerful new paradigm capable of reconciling model-based and learning-based approaches in robotics. However, the majority of robotics optimization problems are non-convex and current differentiable optimization techniques are therefore prone to convergence to local minima. When this occurs, the gradients provided by these existing solvers can be wildly inaccurate and will ultimately corrupt the training process. On the other hand, many non-convex robotics problems can be framed as polynomial optimization problems and, in turn, admit convex relaxations that can be used to recover a global solution via so-called certifiably correct methods. We present SDPRLayers, an approach that leverages these methods as well as state-of-the-art convex implicit differentiation techniques to provide certifiably correct gradients throughout the training process. We introduce this approach and showcase theoretical results that provide conditions under which correctness of the gradients is guaranteed. We demonstrate our approach on two simple-but-demonstrative simulated examples, which expose the potential pitfalls of existing, state-of-the-art, differentiable optimization methods. We apply our method in a real-world application: we train a deep neural network to detect image keypoints for robot localization in challenging lighting conditions. An open-source, PyTorch implementation of SDPRLayers will be made available upon paper acceptance.

... Products of orthogonal matrices appear in applications of orthogonal group synchronisation [29]. The simultaneous localization and mapping problem in robotics involves optimization over a product of Stiefel manifolds [34]. ...

We address the problem of minimizing a smooth function under smooth equality constraints. Under regularity assumptions on these constraints, we propose a notion of approximate first- and second-order critical point which relies on the geometric formalism of Riemannian optimization. Using a smooth exact penalty function known as Fletcher’s augmented Lagrangian, we propose an algorithm to minimize the penalized cost function which reaches $\varepsilon$-approximate second-order critical points of the original optimization problem in at most $\mathcal{O}(\varepsilon^{-3})$ iterations. This improves on current best theoretical bounds. Along the way, we show new properties of Fletcher’s augmented Lagrangian, which may be of independent interest.

... Terrestrial laser scanning (TLS) can capture the structural elements of a forest understory, provide a three-dimensional image of the assessed space, and quantify structural elements at a high resolution (Eichhorn et al., 2017) for use in ecological applications. Mobile laser scanning (MLS) is an emerging alternative to TLS which allows virtual reconstruction of the forest understory's key structural parameters through simultaneous localization and mapping, so-called SLAM, using automatic feature recognition to render a spatially accurate point cloud of the scanned area (Rosen et al., 2021). ...

Forest understory complexity is important for many species, from large herbivores such as deer to small mammals such as mice and voles. For species that utilize the forest understory on a very small scale, it is often impractical to conduct correspondingly fine‐grained manual surveys of the understory, and thus few studies consider this small‐scale variation in understory complexity and instead work with average values on a larger scale. We explored the use of a mobile laser scanning derived understory complexity measure—understory roughness—to predict the capture probability of two representative small mammal species, the yellow‐necked mouse ( Apodemus flavicollis ) and the bank vole ( Clethrionomys glareolus ). We found a positive relationship between capture probability and understory roughness for both bank voles and yellow‐necked mice. Our results suggest that mobile laser scanning is a promising technology for measuring understory complexity in an ecologically meaningful way.

... Existing surveys on SLAM have reviewed the fundamental challenges for accurate and robust large-scale applications [15], [16], [17], from early probabilistic approaches and data association [18], [19] to the potential use of deep learning [20]. SLAM components, including sensors to the embedded localization [21] have been intensively studied to provide a robust solution to many applications, like autonomous driving [22], search and rescue tasks, infrastructure inspection and 3D reconstruction in static and dynamic environments [23] with challenging conditions [24]. ...

... II. SLAM PIPELINE FROM SENSORS TO 3D RECONSTRUCTION The SLAM community has made remarkable improvement on the accuracy and robustness of large-scale applications in the recent years [16], [17], [15]. Figure 1 illustrates a conventional 3D scene reconstruction pipeline using imaging sensors and inertial measurements as inputs. ...

The 3D reconstruction of simultaneous localization and mapping (SLAM) is an important topic in the field for transport systems such as drones, service robots and mobile AR/VR devices. Compared to a point cloud representation, the 3D reconstruction based on meshes and voxels is particularly useful for high-level functions, like obstacle avoidance or interaction with the physical environment. This article reviews the implementation of a visual-based 3D scene reconstruction pipeline on resource-constrained hardware platforms. Real-time performances, memory management and low power consumption are critical for embedded systems. A conventional SLAM pipeline from sensors to 3D reconstruction is described, including the potential use of deep learning. The implementation of advanced functions with limited resources is detailed. Recent systems propose the embedded implementation of 3D reconstruction methods with different granularities. The trade-off between required accuracy and resource consumption for real-time localization and reconstruction is one of the open research questions identified and discussed in this paper.

... For applications like autonomous mobile robots (AMR) the WM is often a geometric map, which can be provided a-priori or created by the robot itself using Simultaneous Localization and Mapping (SLAM) [3]. New sensor data is compared to the geometric map for localization and object tracking, e.g. using particle filtering [4]. ...

Robots that have to robustly execute their task in an environment containing many variations need situational awareness to adapt at run-time. This work proposes a knowledge-centered software architecture with a world model (WM) as a first-class citizen, from which other software components can query information in order to infer predictions, configure skills, and monitor the progress of the task. This approach is demonstrated on the task of detecting tomato trusses hanging from a plant, with possible occlusions from leaves. A Labeled Property Graph is used to model a tomato plant, which can be queried to create predictions of truss locations. This information is used to configure two tomato detection skills. First, the plant is passively scanned for trusses. Association of the obtained information with the semantic objects in the model leads to multiple semantic hypotheses, which are explicitly modeled in the graph world model. If trusses are missing according to a hypothesis, the second skill actively looks at the inferred position of the undetected trusses. Tests show that this approach of context-aware active perception allows the robot to decide when to look for missing trusses, which improves the detection of occluded trusses. Moreover, by keeping the task, skill, and semantic association functionalities agnostic to the context, and relying instead on the answers to queries to the world model, the approach is composable and flexible. This is shown by a qualitative test on a different tomato plant.

... STATE-ESTIMATION is an integral component of modern robotics systems. Workhorse algorithms for state-estimation, such as localization and simultaneous localization and mapping (SLAM), are now capable of estimating hundreds of thousands of states on a single processor in real time [48] and are far from the computational bottleneck of robotic systems. To obtain such levels of performance, these algorithms typically rely on local optimization methods (e.g., Gauss-Newton), which often exhibit super-linear convergence. ...

... low-rank nature of its semidefinite program (SDP) relaxation. A series of extensions to this method have been and continue to be developed [48]. ...

... Some of these methods boast runtimes that even rival state-of-the-art, local methods (e.g., Gauss-Newton-based methods [21]), with the added guarantee of a global certificate [10], [36]. An excellent review of the current state of certifiable methods is provided in [48]. ...

In recent years, there has been remarkable progress in the development of so-called certifiable perception methods, which leverage semidefinite, convex relaxations to find global optima of perception problems in robotics. However, many of these relaxations rely on simplifying assumptions that facilitate the problem formulation, such as an isotropic measurement noise distribution. In this paper, we explore the tightness of the semidefinite relaxations of matrix-weighted (anisotropic) state-estimation problems and reveal the limitations lurking therein: matrix-weighted factors can cause convex relaxations to lose tightness. In particular, we show that the semidefinite relaxations of localization problems with matrix weights may be tight only for low noise levels. We empirically explore the factors that contribute to this loss of tightness and demonstrate that redundant constraints can be used to regain tightness, albeit at the expense of real-time performance. As a second technical contribution of this paper, we show that the state-of-the-art relaxation of scalar-weighted SLAM cannot be used when matrix weights are considered. We provide an alternate formulation and show that its SDP relaxation is not tight (even for very low noise levels) unless specific redundant constraints are used. We demonstrate the tightness of our formulations on both simulated and real-world data.

... Meta AI, 2 Imperial College London, 3 Reality Labs Research, 4 Northwestern University. (a) Ladybug (b) Dubrovnik (c) Final ...

... The literature on orthogonal synchronization is vast, appearing in multiple communities such as robotics, image processing, signal processing, and dynamical systems. We highlight a few salient references here; see also [22,23] for partial surveys. Many of the tools we use in our analysis have been used before. ...

... The second equality uses (19) and (23). Next, $\mathrm{SBD}(E\dot{Y}\dot{Y}^{\top}) = (p + r - 2) I_{rn}$ implies ...

Orthogonal group synchronization is the problem of estimating $n$ elements $Z_1, \ldots, Z_n$ from the orthogonal group $\mathrm{O}(r)$ given some relative measurements $R_{ij} \approx Z_i^{}Z_j^{-1}$. The least-squares formulation is nonconvex. To avoid its local minima, a Shor-type convex relaxation squares the dimension of the optimization problem from $O(n)$ to $O(n^2)$. Burer--Monteiro-type nonconvex relaxations have generic landscape guarantees at dimension $O(n^{3/2})$. For smaller relaxations, the problem structure matters. It has been observed in the robotics literature that nonconvex relaxations of only slightly increased dimension seem sufficient for SLAM problems. We partially explain this. This also has implications for Kuramoto oscillators. Specifically, we minimize the least-squares cost function in terms of estimators $Y_1, \ldots, Y_n$. Each $Y_i$ is relaxed to the Stiefel manifold $\mathrm{St}(r, p)$ of $r \times p$ matrices with orthonormal rows. The available measurements implicitly define a (connected) graph $G$ on $n$ vertices. In the noiseless case, we show that second-order critical points are globally optimal as soon as $p \geq r+2$ for all connected graphs $G$. (This implies that Kuramoto oscillators on $\mathrm{St}(r, p)$ synchronize for all $p \geq r + 2$.) This result is the best possible for general graphs; the previous best known result requires $2p \geq 3(r + 1)$. For $p > r + 2$, our result is robust to modest amounts of noise (depending on $p$ and $G$). When local minima remain, they still achieve minimax-optimal error rates. Our proof uses a novel randomized choice of tangent direction to prove (near-)optimality of second-order critical points. Finally, we partially extend our noiseless landscape results to the complex case (unitary group), showing that there are no spurious local minima when $2p \geq 3r$.
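The noiseless setup in this abstract can be made concrete with a small sketch. It is not the paper's algorithm or analysis: it only builds measurements $R_{ij} = Z_i Z_j^{-1}$ for rotations in $\mathrm{O}(2)$ on a cycle graph and evaluates a least-squares cost of the form $\sum \lVert Y_i - R_{ij} Y_j \rVert_F^2$, which is an assumed way of writing the cost. The function names, the cycle graph, and the zero-padding lift to $\mathrm{St}(r, p)$ are illustrative assumptions, and only the standard library is used to keep the example self-contained.

```python
import math
import random


def rotation(theta):
    """2x2 rotation matrix, an element of O(2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def make_problem(n=5, seed=0):
    """Ground truth Z_i and noiseless measurements R_ij on a cycle graph."""
    rng = random.Random(seed)
    truth = [rotation(rng.uniform(0.0, 2.0 * math.pi)) for _ in range(n)]
    measurements = {}
    for i in range(n):
        j = (i + 1) % n
        # For orthogonal matrices the inverse is the transpose.
        measurements[(i, j)] = matmul(truth[i], transpose(truth[j]))
    return truth, measurements


def sync_cost(estimates, measurements):
    """Sum over measured pairs of the squared Frobenius norm of Y_i - R_ij Y_j."""
    total = 0.0
    for (i, j), r_ij in measurements.items():
        pred = matmul(r_ij, estimates[j])
        total += sum((estimates[i][a][b] - pred[a][b]) ** 2
                     for a in range(len(pred)) for b in range(len(pred[0])))
    return total


def lift_to_stiefel(z, p):
    """Pad an r x r orthogonal matrix with zeros to an r x p matrix with orthonormal rows."""
    return [row + [0.0] * (p - len(row)) for row in z]
```

In this sketch the ground truth attains zero cost, as does any global right-multiplication of all $Z_i$ by a common orthogonal matrix, which is why the solution is only defined up to such a transformation. The zero-padded lift with $p = r + 2 = 4$ also attains zero cost, showing that the relaxed problem contains the original solutions.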