
Oliver Wasenmüller
Prof. Dr.-Ing., Hochschule Mannheim
About
79 Publications
33,273 Reads
1,007 Citations (since 2017)
Publications (79)
In scientific evaluation public datasets and benchmarks are indispensable to perform objective assessment. In this paper we present a new Comprehensive RGB-D Benchmark for SLAM (CoRBS). In contrast to state-of-the-art RGB-D SLAM benchmarks, we provide the combination of real depth and color data together with a ground truth trajectory of the camera...
Discrepancy check is a well-known task in industrial Augmented Reality (AR). In this paper we present a new approach consisting of three main contributions: First, we propose a new two-step depth mapping algorithm for RGB-D cameras, which fuses depth images with given camera pose in real-time into a consistent 3D model. In a rigorous evaluation wit...
SLAM with RGB-D cameras is a very active field in Computer Vision as well as Robotics. Dense methods using all depth and intensity information showed best results in the past. However, usually they were developed and evaluated with RGB-D cameras using Pattern Projection like the Kinect v1 or Xtion Pro. Recently, Time-of-Flight (ToF) cameras like th...
RGB-D cameras like the Microsoft Kinect had a huge impact on recent research in Computer Vision as well as Robotics. With the release of the Kinect v2 a new promising device is available, which will most probably be used in much future research. In this paper, we present a systematic comparison of the Kinect v1 and Kinect v2. We investigate the...
While most scene flow methods use either variational optimization or a strong rigid motion assumption, we show for the first time that scene flow can also be estimated by dense interpolation of sparse matches. To this end, we find sparse matches across two stereo image pairs that are detected without any prior regularization and perform dense inter...
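As a rough illustration of the sparse-to-dense idea behind this approach, the sketch below densifies sparse match values with inverse-distance weighting over the nearest seed points. This is a simplified stand-in: the paper's interpolation is edge-aware, and the function name, `k`, and the weighting rule here are illustrative assumptions only.

```python
import numpy as np

def densify(sparse_pts, sparse_vals, h, w, k=4):
    """Interpolate sparse matches to a dense (h, w) field via
    inverse-distance weighting over the k nearest seeds.
    sparse_pts: (n, 2) array of (y, x) seed positions.
    sparse_vals: (n, c) array of values at those seeds."""
    ys, xs = np.mgrid[0:h, 0:w]
    grid = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    # distance of every grid pixel to every seed
    d = np.linalg.norm(grid[:, None, :] - sparse_pts[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]          # k nearest seeds per pixel
    dk = np.take_along_axis(d, idx, axis=1)
    weights = 1.0 / (dk + 1e-6)                 # inverse-distance weights
    weights /= weights.sum(axis=1, keepdims=True)
    dense = (weights[..., None] * sparse_vals[idx]).sum(axis=1)
    return dense.reshape(h, w, -1)
```

A real edge-aware scheme would additionally restrict the neighborhood by image edges so that the interpolation does not bleed across object boundaries.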
While transformer architectures have dominated computer vision in recent years, these models cannot easily be deployed on hardware with limited resources for autonomous driving tasks that require real-time performance. Their computational complexity and memory requirements limit their use, especially for applications with high-resolution inputs. I...
Recently, transformer architectures have shown superior performance compared to their CNN counterparts in many computer vision tasks. The self-attention mechanism enables transformer networks to connect visual dependencies over short as well as long distances, thus generating a large, sometimes even a global receptive field. In this paper, we propo...
In order to ensure safe autonomous driving, precise information about the conditions in and around the vehicle must be available. Accordingly, the monitoring of occupants and objects inside the vehicle is crucial. In the state-of-the-art, single or multiple deep neural networks are used for either object recognition, semantic segmentation, or human...
Many road accidents are caused by drowsiness of the driver. While there are methods to detect closed eyes, it is a non-trivial task to detect the gradual process of a driver becoming drowsy. We consider a simple real-time detection system for drowsiness merely based on the eye blinking rate derived from the eye aspect ratio. For the eye detection w...
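The eye-aspect-ratio (EAR) signal mentioned above can be sketched as follows, using the common six-landmark eye model (p1..p6: outer corner, two upper-lid points, inner corner, two lower-lid points). The threshold and frame counts below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def eye_aspect_ratio(p):
    """EAR for a (6, 2) array of eye landmarks: ratio of the two
    vertical lid distances to the horizontal eye width. It drops
    toward 0 as the eye closes."""
    v1 = np.linalg.norm(p[1] - p[5])
    v2 = np.linalg.norm(p[2] - p[4])
    h = np.linalg.norm(p[0] - p[3])
    return (v1 + v2) / (2.0 * h)

def blink_rate(ear_series, fps, thresh=0.2, min_frames=2):
    """Count blinks as runs of at least min_frames consecutive
    frames with EAR below thresh; return blinks per minute."""
    blinks, run = 0, 0
    for ear in ear_series:
        if ear < thresh:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
            run = 0
    if run >= min_frames:
        blinks += 1
    minutes = len(ear_series) / fps / 60.0
    return blinks / minutes
```

A drowsiness monitor would then compare the blink rate (and blink duration) over a sliding window against a per-driver baseline rather than a fixed constant.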
LiDAR depth maps provide environmental guidance in a variety of applications. However, such depth maps are typically sparse and insufficient for complex tasks such as autonomous navigation. State-of-the-art methods use image-guided neural networks for dense depth completion. We develop a guided convolutional neural network focusing on gathering den...
Depth completion from sparse LiDAR and high-resolution RGB data is one of the foundations for autonomous driving techniques. Current approaches often rely on CNN-based methods with several known drawbacks: flying pixels at depth discontinuities, overfitting to a given data set as well as its error metric, and many more. Thus, we propose our novel P...
Common domain shift problem formulations consider the integration of multiple source domains, or the target domain during training. Regarding the generalization of machine learning models between different car interiors, we formulate the criterion of training in a single vehicle: without access to the target distribution of the vehicle the model wo...
Large labeled data sets are one of the essential basics of modern deep learning techniques. Therefore, there is an increasing need for tools that allow to label large amounts of data as intuitively as possible. In this paper, we introduce SALT, a tool to semi-automatically annotate RGB-D video sequences to generate 3D bounding boxes for full six De...
Ghost targets are targets that appear at wrong locations in radar data and are caused by the presence of multiple indirect reflections between the target and the sensor. In this work, we introduce the first point based deep learning approach for ghost target detection in 3D radar point clouds. This is done by extending the PointNet network architec...
In-the-wild human pose estimation has a huge potential for various fields, ranging from animation and action recognition to intention recognition and prediction for autonomous driving. The current state-of-the-art is focused only on RGB and RGB-D approaches for predicting the 3D human pose. However, not using precise LiDAR depth information limits...
Interpolation of sparse pixel information towards a dense target resolution finds its application across multiple disciplines in computer vision. State-of-the-art interpolation of motion fields applies model-based interpolation that makes use of edge information extracted from the target image. For depth completion, data-driven learning approaches...
Scene flow is the dense 3D reconstruction of motion and geometry of a scene. Most state-of-the-art methods use a pair of stereo images as input for full scene reconstruction. These methods depend heavily on the quality of the RGB images and perform poorly in regions with reflective objects, shadows, ill-conditioned lighting, and so on. LiDAR...
Dense pixel matching is required for many computer vision algorithms such as disparity, optical flow or scene flow estimation. Feature Pyramid Networks (FPN) have proven to be a suitable feature extractor for CNN-based dense matching tasks. FPN generates well localized and semantically strong features at multiple scales. However, the generic FPN is...
While short range 3D pedestrian detection is sufficient for emergency braking, long range detections are required for smooth braking and gaining trust in autonomous vehicles. The current state-of-the-art on the KITTI benchmark performs suboptimally in detecting the position of pedestrians at long range. Thus, we propose an approach specifically tar...
State-of-the-art scene flow algorithms pursue the conflicting targets of accuracy, run time, and robustness. With the successful concept of pixel-wise matching and sparse-to-dense interpolation, we shift the operating point in this field of conflicts towards universality and speed. Avoiding strong assumptions on the domain or the problem yields a m...
We present a test platform for visual in-cabin scene analysis and occupant monitoring functions. The test platform is based on a driving simulator developed at the DFKI, consisting of a realistic in-cabin mock-up and a wide-angle projection system for a realistic driving experience. The platform has been equipped with a wide-angle 2D/3D camera syst...
We release SVIRO, a synthetic dataset for sceneries in the passenger compartment of ten different vehicles, in order to analyze machine learning-based approaches for their generalization capacities and reliability when trained on a limited number of variations (e.g. identical backgrounds and textures, few instances per class). This is in contrast t...
The detection of pedestrians plays an essential part in the development of automated driver assistance systems. Many of the currently available datasets for pedestrian detection focus on urban environments. State-of-the-art neural networks trained on these datasets struggle in generalizing their predictions from one environment to a visually dissim...
We propose a new approach called LiDAR-Flow to robustly estimate a dense scene flow by fusing a sparse LiDAR with stereo images. We take advantage of the high accuracy of LiDAR to resolve the lack of information in some regions of the stereo images due to textureless objects, shadows, ill-conditioned lighting, and many more. Additionally, t...
Running time of the light field depth estimation algorithms is typically high. This assessment is based on the computational complexity of existing methods and the large amounts of data involved. The aim of our work is to develop a simple and fast algorithm for accurate depth computation. In this context, we propose an approach, which involves Semi...
Depth cameras are utilized in many applications. Recently, light field approaches are increasingly being used for depth computation. While these approaches demonstrate the technical feasibility, they cannot be brought into real-world applications, since they have both a high computation time as well as a bulky design. Exactly these two drawbacks are...
Dense pixel matching is important for many computer vision tasks such as disparity and flow estimation. We present a robust, unified descriptor network that considers a large context region with high spatial variance. Our network has a very large receptive field and avoids striding layers to maintain spatial resolution. These properties are achieve...
In the last few years, convolutional neural networks (CNNs) have demonstrated increasing success at learning many computer vision tasks including dense estimation problems such as optical flow and stereo matching. However, the joint prediction of these tasks, called scene flow, has traditionally been tackled using slow classical methods based on pr...
Training a deep neural network is a non-trivial task. Not only the tuning of hyperparameters, but also the gathering and selection of training data, the design of the loss function, and the construction of training schedules is important to get the most out of a model. In this study, we perform a set of experiments all related to these issues. The...
Most LiDAR odometry algorithms estimate the transformation between two consecutive frames by estimating the rotation and translation in an intervening fashion. In this paper, we propose our Decoupled LiDAR Odometry (DeLiO), which -- for the first time -- decouples the rotation estimation completely from the translation estimation. In particular, th...
State-of-the-art scene flow algorithms pursue the conflicting targets of accuracy, run time, and robustness. With the successful concept of pixel-wise matching and sparse-to-dense interpolation, we push the limits of scene flow estimation. Avoiding strong assumptions on the domain or the problem yields a more robust algorithm. This algorithm is fas...
Scene flow describes 3D motion in a 3D scene. It can either be modeled as a single task, or it can be reconstructed from the auxiliary tasks of stereo depth and optical flow estimation. While the second method can achieve real-time performance by using real-time auxiliary methods, it will typically produce non-dense results. In this representation...
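The reconstruction from the auxiliary tasks can be sketched as: back-project each pixel with its depth, follow the optical flow to the second frame, back-project there, and take the 3D difference. The pinhole intrinsics and the nearest-pixel lookup below are simplifying assumptions; note how flow targets that leave the image produce gaps, i.e. the non-dense result mentioned above.

```python
import numpy as np

def scene_flow_from_depth_and_flow(depth0, depth1, flow, fx, fy, cx, cy):
    """Per-pixel 3D motion from two depth maps and dense optical flow,
    given pinhole intrinsics (fx, fy, cx, cy). Pixels whose flow target
    falls outside the image are marked NaN."""
    h, w = depth0.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    # back-project frame-0 pixels to 3D
    X0 = (xs - cx) * depth0 / fx
    Y0 = (ys - cy) * depth0 / fy
    # follow the optical flow and back-project in frame 1
    xt = np.rint(xs + flow[..., 0]).astype(int)
    yt = np.rint(ys + flow[..., 1]).astype(int)
    valid = (xt >= 0) & (xt < w) & (yt >= 0) & (yt < h)
    xt_c, yt_c = np.clip(xt, 0, w - 1), np.clip(yt, 0, h - 1)
    z1 = depth1[yt_c, xt_c]
    X1 = (xt_c - cx) * z1 / fx
    Y1 = (yt_c - cy) * z1 / fy
    sf = np.stack([X1 - X0, Y1 - Y0, z1 - depth0], axis=-1)
    sf[~valid] = np.nan
    return sf
```

A real pipeline would additionally invalidate pixels where either depth map has no measurement, which further reduces density.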
Scene flow describes the 3D position as well as the 3D motion of each pixel in an image. Such algorithms are the basis for many state-of-the-art autonomous or automated driving functions. For verification and training, large amounts of ground truth data are required, which are not available for real data. In this paper, we demonstrate a technology to...
Vehicles of higher automation levels require the creation of situation awareness. One important aspect of this situation awareness is an understanding of the current risk of a driving situation. In this work, we present a novel approach for the dynamic risk assessment of driving situations based on images of a front stereo camera using deep learnin...
Optical Flow algorithms are of high importance for many applications. Recently, the Flow Field algorithm and its modifications have shown remarkable results, as they have been evaluated with top accuracy on different data sets. In our analysis of the algorithm we have found that it produces accurate sparse matches, but there is room for improvement...
Scene flow is a description of real world motion in 3D that contains more information than optical flow. Because of its complexity there exists no applicable variant for real-time scene flow estimation in an automotive or commercial vehicle context that is sufficiently robust and accurate. Therefore, many applications estimate the 2D optical flow i...
In the highly active research field of Simultaneous Localization And Mapping (SLAM), RGB-D images have been of major interest. Real-time SLAM for RGB-D images is of great importance, since dense methods using all the depth and intensity values showed superior performance in the past. Due to the development of GPU and CPU technologies, the real-tim...
The scale difference in driving scenarios is one of the essential challenges in semantic scene segmentation. Close objects cover significantly more pixels than far objects. In this paper, we address this challenge with a scale invariant architecture. Within this architecture, we explicitly estimate the depth and adapt the pooling field size accordi...
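The idea of adapting the pooling field size to depth can be sketched with a simple inverse-depth rule: close objects (small depth) appear large in the image, so they get a larger pooling window, keeping the covered metric extent roughly constant. The constants and the linear rule here are illustrative assumptions, not the paper's exact design.

```python
def adaptive_pool_size(depth, base_size=3, ref_depth=10.0, max_size=15):
    """Pooling window side length for a pixel at the given depth.
    Scales inversely with depth, clamped to [base_size, max_size]
    and forced to be odd so the window stays centered."""
    size = int(round(base_size * ref_depth / max(depth, 1e-6)))
    return max(base_size, min(size | 1, max_size))  # | 1 makes it odd
```

In a network this per-pixel size would select among a small bank of pooling branches rather than being evaluated independently for every pixel.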