Michael Kaess's research while affiliated with Carnegie Mellon University and other places

Publications (161)

Preprint
Full-text available
We address the problem of map sparsification for long-term visual localization. For map sparsification, a commonly employed assumption is that the pre-built map and the later captured localization query are consistent. However, this assumption can be easily violated in the dynamic world. Additionally, the map size grows as new data accumulate throu...
Preprint
Spinning LiDAR data are prevalent for 3D perception tasks, yet their cylindrical image form is less studied. Conventional approaches regard scans as point clouds, and they either rely on expensive Euclidean 3D nearest neighbor search for data association or depend on projected range images for further processing. We revisit the LiDAR scan formation a...
Preprint
Full-text available
We address the problem of tracking 3D object poses from touch during in-hand manipulations. Specifically, we look at tracking small objects using vision-based tactile sensors that provide high-dimensional tactile image measurements at the point of contact. While prior work has relied on a priori information about the object being localized, we remo...
Preprint
Full-text available
We present ASH, a modern and high-performance framework for parallel spatial hashing on GPU. Compared to existing GPU hash map implementations, ASH achieves higher performance, supports richer functionality, and requires fewer lines of code (LoC) when used for implementing spatially varying operations from volumetric geometry reconstruction to diff...
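The central idea behind such GPU hash maps, spatial hashing, can be sketched on the CPU in a few lines. The snippet below is an illustrative Python sketch that maps 3D points to integer voxel keys and buckets them; it is not the ASH API, and the function name and voxel size are assumptions.

    import numpy as np

    # Minimal CPU illustration of spatial hashing: quantize points to integer
    # voxel coordinates and use the coordinate tuple as the hash key.
    def voxelize(points, voxel_size=0.05):
        keys = np.floor(points / voxel_size).astype(np.int64)
        table = {}
        for idx, key in enumerate(map(tuple, keys)):
            table.setdefault(key, []).append(idx)   # bucket of point indices per voxel
        return table

    points = np.random.rand(1000, 3)
    table = voxelize(points)
    print(len(table), "occupied voxels")

A GPU implementation such as ASH parallelizes exactly this insertion and lookup pattern over many threads; the dictionary here only conveys the indexing idea.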
Article
In this letter, we present a direct visual odometry (VO) method using points and lines. Direct methods generally choose pixels with sufficient gradients to minimize the photometric error for state estimation. Pixels on lines are generally involved in this process, but the collinear constraint among these points is generally ignored, which may result...
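For reference, direct methods of this kind minimize a photometric objective of roughly the following form (standard notation, not taken verbatim from the letter), where $\mathcal{P}$ is the set of selected pixels, $d_\mathbf{p}$ their depths, $\xi$ the relative pose, and $\pi$ the warp into the current frame:

    $E(\xi) = \sum_{\mathbf{p} \in \mathcal{P}} \big( I_k\big(\pi(\mathbf{p}, d_\mathbf{p}, \xi)\big) - I_{k-1}(\mathbf{p}) \big)^2$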
Preprint
Knowledge of 3-D object shape is of great importance to robot manipulation tasks, but may not be readily available in unstructured environments. While vision is often occluded during robot-object interaction, high-resolution tactile sensors can give a dense local perspective of the object. However, tactile sensors have limited sensing area and the...
Preprint
Full-text available
We address the problem of learning observation models end-to-end for estimation. Robots operating in partially observable environments must infer latent states from multiple sensory inputs using observation models that capture the joint distribution between latent states and observations. This inference problem can be formulated as an objective ove...
Preprint
Full-text available
There has been exciting recent progress in using radar as a sensor for robot navigation due to its increased robustness to varying environmental conditions. However, within these different radar perception systems, ground penetrating radar (GPR) remains under-explored. By measuring structures beneath the ground, GPR can provide stable features that...
Article
Planes ubiquitously exist in the indoor environment. This paper presents a real-time and low-drift LiDAR SLAM system using planes as the landmark for the indoor environment. Our algorithm includes three components: localization, local mapping and global mapping. The localization component performs real-time and global registration, instead of the s...
Preprint
Full-text available
We address the problem of robot localization using ground penetrating radar (GPR) sensors. Current approaches for localization with GPR sensors require a priori maps of the system's environment as well as access to approximate global positioning (GPS) during operation. In this paper, we propose a novel, real-time GPR-based localization system for u...
Chapter
Perception systems for autonomy are most useful if they can operate within limited/predictable computing resources. Existing algorithms in robot navigation—e.g. simultaneous localization and mapping—employ concepts from filtering, fixed-lag, or incremental smoothing to find feasible inference solutions. Using factor graphs as a probabilistic modeli...
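In the factor-graph formulation referred to here, the posterior over latent variables $X$ given measurements $Z$ factorizes into local potentials, and MAP inference becomes a sparse nonlinear least-squares problem. The symbols below are generic, not specific to this chapter:

    $p(X \mid Z) \propto \prod_i \phi_i(X_i), \qquad \hat{X} = \arg\min_X \sum_i \| h_i(X_i) - z_i \|_{\Sigma_i}^2$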
Article
Tactile perception is central to robot manipulation in unstructured environments. However, it requires contact, and a mature implementation must infer object models while also accounting for the motion induced by the interaction. In this work, we present a method to estimate both object shape and pose in real-time from a stream of tactile measureme...
Article
This paper presents a complete, accurate, and efficient solution for the Perspective-n-Line (PnL) problem. Generally, the camera pose can be determined from $N \geq 3$ 2D-3D line correspondences. The minimal problem $(N = 3)$ and the least-squares problem $(N > 3)$ are generally solved in different ways. This paper shows that a lea...
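For context, a common algebraic formulation of PnL (given here as background, not as the paper's exact cost) constrains each 3D line, represented by a point $\mathbf{P}_i$ and direction $\mathbf{V}_i$, to lie on the plane back-projected from its 2D image line with normal $\mathbf{n}_i$ in the camera frame:

    $\mathbf{n}_i^\top (R\,\mathbf{P}_i + \mathbf{t}) = 0, \qquad \mathbf{n}_i^\top R\,\mathbf{V}_i = 0, \qquad i = 1, \dots, N$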
Preprint
Full-text available
We address the problem of estimating object pose from touch during manipulation under occlusion. Vision-based tactile sensors provide rich, local measurements at the point of contact. A single such measurement, however, contains limited information and multiple measurements are needed to infer latent object state. We solve this inference problem us...
Preprint
Tactile perception is central to robot manipulation in unstructured environments. However, it requires contact, and a mature implementation must infer object models while also accounting for the motion induced by the interaction. In this work, we present a method to estimate both object shape and pose in real-time from a stream of tactile measureme...
Preprint
We present a fast, scalable, and accurate Simultaneous Localization and Mapping (SLAM) system that represents indoor scenes as a graph of objects. Leveraging the observation that artificial environments are structured and occupied by recognizable objects, we show that a compositional scalable object mapping formulation is amenable to a robust SLAM...
Preprint
This paper presents an efficient algorithm for the least-squares problem using the point-to-plane cost, which aims to jointly optimize depth sensor poses and plane parameters for 3D reconstruction. We call this least-squares problem Planar Bundle Adjustment (PBA), due to the similarity between this problem and the original Bundle Adjustmen...
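The point-to-plane cost referred to here has the standard form below; the notation is illustrative rather than the paper's. With plane parameters $\boldsymbol{\pi}_j = (\mathbf{n}_j, d_j)$ and sensor poses $(R_i, \mathbf{t}_i)$, each measured point $\mathbf{p}_{ijk}$ on plane $j$ in frame $i$ contributes its signed distance to that plane:

    $\min_{\{R_i, \mathbf{t}_i\}, \{\boldsymbol{\pi}_j\}} \sum_{i,j,k} \big( \mathbf{n}_j^\top (R_i\,\mathbf{p}_{ijk} + \mathbf{t}_i) + d_j \big)^2$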
Article
This letter presents a self-supervised framework for learning depth from monocular videos. In particular, the main contributions of this letter include: (1) We present a windowed bundle adjustment framework to train the network. Compared to most previous works that only consider constraints from consecutive frames, our framework increases the camer...
Conference Paper
Typically, the reconstruction problem is addressed in three independent steps: first, sensor processing techniques are used to filter and segment sensor data as required by the front end. Second, the front end builds the factor graph for the problem to obtain an accurate estimate of the robot’s full trajectory. Finally, the end product is obtained...
Article
High-frequency imaging sonar sensors have recently been applied to aid underwater vehicle localization, by providing frame-to-frame odometry measurements or loop closures over large timescales. Previous methods have often assumed a planar environment, thereby restricting the use of such algorithms mostly to seafloor mapping. We propose an algorithm...
Article
This paper proposes a novel algorithm to solve the pose estimation problem from 2D/3D line correspondences, known as the Perspective-n-Line (PnL) problem. It is widely known that minimizing the geometric distance generally yields more accurate results than minimizing an algebraic distance. However, the rational form of the reprojection distance...
Chapter
This paper proposes an algebraic solution for the problem of camera pose estimation using the minimal configurations of 2D/3D point and line correspondences, including three point correspondences, two point and one line correspondences, one point and two line correspondences, and three line correspondences. In contrast to the previous works that ad...
Preprint
Estimating pose from given 3D correspondences, including point-to-point, point-to-line and point-to-plane correspondences, is a fundamental task in computer vision with many applications. We present a complete solution for this task, including a solution for the minimal problem and the least-squares problem of this task. Previous works mainly focus...
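As a point of reference for the point-to-point case mentioned above, the classical closed-form SVD-based alignment can be written in a few lines of Python. This is the textbook Kabsch/Umeyama solution, not the paper's unified solver, and the point-to-line and point-to-plane cases require additional machinery.

    import numpy as np

    # Closed-form rigid alignment from point-to-point correspondences (Kabsch/SVD).
    def align_point_to_point(P, Q):
        """Return R, t minimizing sum_i || R @ P[i] + t - Q[i] ||^2."""
        cP, cQ = P.mean(axis=0), Q.mean(axis=0)
        H = (P - cP).T @ (Q - cQ)                                    # 3x3 cross-covariance
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # guard against reflection
        R = Vt.T @ D @ U.T
        t = cQ - R @ cP
        return R, t

    # Usage: recover a known transform from noiseless correspondences.
    rng = np.random.default_rng(0)
    P = rng.random((100, 3))
    a = 0.3
    R_true = np.array([[np.cos(a), -np.sin(a), 0],
                       [np.sin(a),  np.cos(a), 0],
                       [0, 0, 1]])
    Q = P @ R_true.T + np.array([1.0, 2.0, 3.0])
    R_est, t_est = align_point_to_point(P, Q)
    print(np.allclose(R_est, R_true), t_est)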
Article
Purpose: Robot-assisted intraocular microsurgery can improve performance by aiding the surgeon in operating on delicate micron-scale anatomical structures of the eye. In order to account for the eyeball motion that is typical in intraocular surgery, there is a need for fast and accurate algorithms that map the retinal vasculature and localize the re...
Article
In this work, we propose a novel method for underwater localization using natural visual landmarks above the water surface. High-accuracy, drift-free pose estimates are necessary for inspection tasks in underwater indoor environments, such as nuclear spent pools. Inaccuracies in robot localization degrade the quality of its obtained map. Our framew...
Preprint
We present a novel unsupervised learning framework for single view depth estimation using monocular videos. It is well known in 3D vision that enlarging the baseline can increase the depth estimation accuracy, and jointly optimizing a set of camera poses and landmarks is essential. In previous monocular unsupervised learning frameworks, only part o...
Article
From archaeology to the inspection of subsea structures, underwater mapping has become critical to many applications. Because of the balanced trade-off between range and resolution, multibeam sonars are often used as the primary sensor in underwater mapping platforms. These sonars output an image representing the intensity of the receive...
Article
This paper reports on a real-time SLAM algorithm for an underwater robot using an imaging forward-looking sonar and its application in the area of autonomous underwater ship hull inspection. The proposed algorithm overcomes specific challenges associated with deliverable underwater acoustic SLAM, including feature sparsity and false-positive data a...
Thesis
Full-text available
Virtually all robotics and computer vision applications require some form of pose estimation, such as registration, structure from motion, sensor calibration, etc. This problem is challenging because it is highly nonlinear and nonconvex. A fundamental contribution of this thesis is the development of fast and accurate pose estimation by formulating...
Article
In this paper, we present RKD-SLAM, a robust keyframe-based dense SLAM approach for an RGB-D camera that can robustly handle fast motion and dense loop closure, and run without time limitation in a moderate-size scene. It can not only be used to scan high-quality 3D models, but also satisfy the demands of VR and AR applications. First, we combin...
Article
Full-text available
Visual odometry can be augmented by depth information such as that provided by RGB-D cameras, or by lidars associated with cameras. However, such depth information can be limited by the sensors, leaving large areas in the visual images where depth is unavailable. Here, we propose a method to utilize the depth, even if sparsely available, in recovery o...
Book
Factor Graphs for Robot Perception reviews the use of factor graphs for the modeling and solving of large-scale inference problems in robotics. Factor graphs are a family of probabilistic graphical models, other examples of which are Bayesian networks and Markov random fields, well known from the statistical modeling and machine learning literature...
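As a concrete taste of this modeling style, the sketch below builds a tiny 2D pose-chain factor graph with the GTSAM Python bindings (a library closely associated with this line of work) and optimizes it with Levenberg-Marquardt. The poses and noise values are made up for illustration and do not come from the book.

    import numpy as np
    import gtsam

    # A three-pose chain: one prior factor plus two odometry (between) factors.
    graph = gtsam.NonlinearFactorGraph()
    prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.05]))
    odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.2, 0.2, 0.1]))
    graph.add(gtsam.PriorFactorPose2(1, gtsam.Pose2(0, 0, 0), prior_noise))
    graph.add(gtsam.BetweenFactorPose2(1, 2, gtsam.Pose2(2, 0, 0), odom_noise))
    graph.add(gtsam.BetweenFactorPose2(2, 3, gtsam.Pose2(2, 0, 0), odom_noise))

    # Rough initial guesses; the optimizer refines them to the MAP estimate.
    initial = gtsam.Values()
    initial.insert(1, gtsam.Pose2(0.5, 0.0, 0.2))
    initial.insert(2, gtsam.Pose2(2.3, 0.1, -0.2))
    initial.insert(3, gtsam.Pose2(4.1, 0.1, 0.1))

    result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
    print(result)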
Article
Feature descriptors are powerful tools for photometrically and geometrically invariant image matching. To date, however, their use has been tied to sparse interest point detection, which is susceptible to noise under adverse imaging conditions. In this work, we propose to use binary feature descriptors in a direct tracking framework without relying...
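For context, binary descriptors are compared with the Hamming distance over their packed bits. The minimal Python illustration below assumes 256-bit descriptors stored as 32 uint8 bytes; the function name and sizes are assumptions, not this paper's implementation.

    import numpy as np

    # Hamming distance between two packed binary descriptors: XOR, then count set bits.
    def hamming(d1, d2):
        return int(np.unpackbits(np.bitwise_xor(d1, d2)).sum())

    a = np.random.randint(0, 256, 32, dtype=np.uint8)
    b = np.random.randint(0, 256, 32, dtype=np.uint8)
    print(hamming(a, b))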