Conference Paper

A probabilistic representation of LiDAR range data for efficient 3D object detection

Dept. of Electr., Comput., & Syst. Eng., Rensselaer Polytech. Inst., Troy, NY
DOI: 10.1109/CVPRW.2008.4563033 Conference: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW '08), 2008
Source: IEEE Xplore

ABSTRACT We present a novel approach to 3D object detection in scenes scanned by LiDAR sensors, based on a probabilistic representation of free, occupied, and hidden space that extends the concept of occupancy grids from robot mapping algorithms. This scene representation naturally handles LiDAR sampling issues, can be used to fuse multiple LiDAR data sets, and captures the inherent uncertainty of the data due to occlusions and clutter. Using this model, we formulate a hypothesis testing methodology to determine the probability that given 3D objects are present in the scene. By propagating uncertainty in the original sample points, we are able to measure confidence in the detection results in a principled way. We demonstrate the approach in examples of detecting objects that are partially occluded by scene clutter such as camouflage netting.
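The occupancy-grid idea underlying the abstract can be illustrated with a minimal sketch: a 1D slice of a grid is updated along a single LiDAR ray, so that cells traversed by the beam become more likely free, the return cell becomes more likely occupied, and cells beyond the return stay at their prior (hidden). The inverse sensor model probabilities below are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

P_FREE, P_OCC = 0.3, 0.7  # assumed inverse sensor model (illustrative)

def logodds(p):
    """Convert a probability to log-odds for additive updates."""
    return np.log(p / (1.0 - p))

def update_ray(grid_logodds, hit_index):
    """Update log-odds for cells along a ray that returns at hit_index.

    Cells before the hit are observed free, the hit cell is observed
    occupied, and cells beyond the hit are occluded (no update).
    """
    grid_logodds[:hit_index] += logodds(P_FREE)
    grid_logodds[hit_index] += logodds(P_OCC)
    return grid_logodds

def occupancy(grid_logodds):
    """Recover occupancy probabilities from log-odds."""
    return 1.0 / (1.0 + np.exp(-grid_logodds))

grid = np.zeros(10)          # log-odds 0 corresponds to the 0.5 prior
grid = update_ray(grid, 6)   # beam returns at cell 6
probs = occupancy(grid)      # cells 0-5 free-ish, 6 occupied-ish, 7-9 unchanged
```

Cells never observed keep probability 0.5, which is how hidden space retains its uncertainty rather than being forced into free or occupied.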

  • ABSTRACT: This paper presents methods for 3D object detection and multi-object (or multi-agent) behavior recognition using a sequence of 3D point clouds of a scene taken over time. This motion 3D data can be collected using different sensors and techniques such as flash LIDAR (Light Detection And Ranging), stereo cameras, time-of-flight cameras, or spatial phase imaging sensors. Our goal is to segment objects from the 3D point cloud data in order to construct tracks of multiple objects (i.e., persons and vehicles) and then classify the multi-object tracks as one of a set of known behaviors, such as “A person drives a car and gets out”. A track is a sequence of object locations changing over time and is the compact object-level information we use and obtain from the motion 3D data. Leveraging the rich structure of dynamic 3D data makes many visual learning problems better posed and more tractable. Our behavior recognition method combines Dynamic Time Warping-based behavior distances from the multiple object-level tracks, using a normalized car-centric coordinate system, to recognize the interactive behavior of those multiple objects. We apply our techniques for behavior recognition on data collected using a LIDAR sensor, with promising results.
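The Dynamic Time Warping distance mentioned above can be sketched for two object tracks (sequences of 2D positions). The Euclidean point cost and the lack of any normalization are simplifying assumptions for illustration; the paper combines such distances across multiple tracks.

```python
import numpy as np

def dtw_distance(track_a, track_b):
    """DTW distance between two tracks of (x, y) positions.

    D[i][j] holds the minimal cumulative cost of aligning the first
    i points of track_a with the first j points of track_b.
    """
    n, m = len(track_a), len(track_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(np.asarray(track_a[i - 1])
                                  - np.asarray(track_b[j - 1]))
            # step choices: insertion, deletion, or match
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

a = [(0, 0), (1, 0), (2, 0)]
b = [(0, 0), (1, 0), (1, 0), (2, 0)]  # same path, one repeated sample
d = dtw_distance(a, b)                # 0.0: warping absorbs the repeat
```

Because the warping path may repeat indices, tracks sampled at different rates or speeds still compare as similar, which is what makes DTW suitable for behavior distances.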
  • ABSTRACT: This work is related to the development of a markerless system for tracking elderly people at home. Microsoft Kinect is a low-cost 3D camera well suited to tracking human movements. We propose a method for fusing the information provided by several Kinects. The observed space is tessellated into cells forming a 3D occupancy grid, and a probability of occupation is calculated for each cell. From this probability we determine whether each cell is empty, occupied by a static object (e.g., a wall), or occupied by a mobile object (e.g., a chair or a human being). This categorization is realized in real time using a simple three-state HMM. The proposed method for discriminating between mobile and static objects in a room is the main contribution of this paper. The use of HMMs allows us to deal with an aliasing problem, since mobile objects result in the same observation as static objects. The approach is evaluated in simulation and in a real environment, showing efficient real-time discrimination between cells occupied by mobile objects and cells occupied by static objects.
    IEEE 23rd International Conference on Tools with Artificial Intelligence, ICTAI 2011, Boca Raton, FL, USA, November 7-9, 2011; 01/2011
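The three-state idea above can be sketched as a per-cell HMM over states empty/static/mobile, decoded with Viterbi from a sequence of binary occupancy observations. Static and mobile cells emit the same "occupied" observation (the aliasing problem); they are separated only by the transition dynamics. All transition and emission probabilities here are illustrative assumptions, not the paper's values.

```python
import numpy as np

STATES = ["empty", "static", "mobile"]

# Assumed transition matrix: static cells rarely change state,
# mobile cells frequently revert to empty.
A = np.array([[0.90, 0.01, 0.09],   # from empty
              [0.01, 0.98, 0.01],   # from static
              [0.30, 0.01, 0.69]])  # from mobile
# Assumed P(cell observed occupied | state): static and mobile alias.
B = np.array([0.05, 0.95, 0.90])
pi = np.array([0.8, 0.1, 0.1])      # assumed initial distribution

def viterbi(obs):
    """Most likely state sequence for binary occupancy observations."""
    emit = lambda o: B if o else (1.0 - B)
    V = np.log(pi) + np.log(emit(obs[0]))
    back = []
    for o in obs[1:]:
        scores = V[:, None] + np.log(A)   # rows: previous, cols: next
        back.append(scores.argmax(axis=0))
        V = scores.max(axis=0) + np.log(emit(o))
    path = [int(V.argmax())]
    for ptr in reversed(back):            # backtrack best predecessors
        path.append(int(ptr[path[-1]]))
    path.reverse()
    return [STATES[s] for s in path]
```

A persistently occupied cell decodes as static, a persistently free one as empty, while intermittent occupancy is explained by the mobile state, which is exactly the disambiguation the HMM provides.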
  • ABSTRACT: This paper proposes a novel method for reconstructing target three-dimensional (3D) objects based on motion vision. The main contribution of this paper is twofold: (1) we build a new mathematical model expressing the positional relationship between the camera, the image plane, the ground plane, and the target 3D objects at different times; (2) we propose and implement a 3D reconstruction algorithm based on this mathematical model using geometric theory. From this result, we obtain the distance from the camera to the objects and the height of the objects in the real world. Experimental results demonstrate the accuracy of the presented approach.
    Pervasive Computing, Signal Processing and Applications, International Conference on. 01/2010
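The distance and height recovery described above can be sketched with a textbook similar-triangles construction: a pinhole camera at known height over a flat ground plane, optical axis parallel to the ground. The paper's actual model also handles camera motion over time, which is omitted here; all numbers and the function itself are illustrative assumptions.

```python
def distance_and_height(f, cam_height, y_base, y_top):
    """Estimate ground distance and height of an object from one image.

    f          -- focal length in pixels
    cam_height -- camera height above the ground plane (m)
    y_base     -- image row of the object's base, measured downward
                  from the principal point (positive = below horizon)
    y_top      -- image row of the object's top (negative = above horizon)
    """
    Z = f * cam_height / y_base      # similar triangles: base lies on ground
    h = cam_height - y_top * Z / f   # project top row back to world height
    return Z, h

# Camera 1.5 m up, focal length 1000 px; base 300 px below the horizon,
# top 60 px above it.
Z, h = distance_and_height(f=1000.0, cam_height=1.5, y_base=300.0, y_top=-60.0)
# Z = 5.0 m, h = 1.8 m
```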

