2.5D Evidential Grids for Dynamic Object Detection

Hind Laghmara, Thomas Laurain, Christophe Cudel and Jean-Philippe Lauffenburger
IRIMAS EA7499, Université de Haute-Alsace
Mulhouse, France
firstname.name@uha.fr
Abstract—Perception is a crucial and challenging part of Intelligent Transportation Systems. One of the main issues is to keep up with the moving objects in complex and dynamic environments. This paper proposes a method for dynamic object detection using Evidential 2.5D Occupancy Grids. The approach is based on a map representation for occupancy modeling and navigable area definition. At each time step, a local grid is derived from the sensor data. Belief Theory is then retained to perform a grid fusion over time in order to keep track of the moving objects in the grid. The description of the dynamic behavior of objects in a scene is related to the conflict issued by the temporal fusion. Finally, the objects themselves are constructed by a segmentation based on Density-Based Spatial Clustering of Applications with Noise (DBSCAN). In order to validate the efficiency of the proposed approach, experimental results are provided based on the KITTI dataset. Performance is evaluated through comparison with the ground truth.
Index Terms—Occupancy Grids, Belief Theory, Dynamic Object Detection, Autonomous Vehicles, LiDAR.
I. INTRODUCTION
This paper focuses on the perception of Intelligent Transportation Systems, for which the objective is to model the vehicle's local environment from data issued by multiple sensors. This paper aims to achieve this task considering that the vehicle's pose is known. The surrounding environment representation is based on Occupancy Grids (OG), as they indicate two main features: the navigable space as well as the location of obstacles, which may be static or dynamic. OGs can be constructed from measures of the objects' distance to the ego-vehicle, which can be given by exteroceptive sensors such as LiDARs (Light Detection And Ranging), radars or stereo vision.

The main challenges regarding this issue are the uncertainty and imprecision of information, as well as the complexity that lies behind modeling a dynamic scene. Multiple Object Tracking (MOT) is the application handling this issue, which sequentially includes the detection, association and tracking of dynamic objects. OGs have proven to be effective, with a limited computational complexity, for local environment modeling taking temporal data into account. The aim of this paper is to improve OGs to ensure a robust detection of dynamic objects. Due to the high uncertainty in such a task, the use of Dempster-Shafer theory is convenient for autonomous systems' applications.
Some of the main references in the literature treating the detection of dynamic objects with grid-based solutions are reviewed in Section II-A. A survey is also presented in [1], where a 2.5D approach (two dimensions plus the elevation information obtained by averaging the height of all points that fall into a given cell) is used for the determination of moving cells.

This paper is based on the same representation to include a tri-dimensional modeling of the environment. The main contribution of this approach is that the classification of dynamic objects in a 2.5D grid is done according to an evidential fusion of multiple grids. The second contribution is the extraction of an object-level representation from the detected dynamic cells for tracking purposes: the objective is to achieve dynamic detection on an object level rather than on a cell level, as is commonly done in the literature. The objects are built by clustering the mobile cells using the DBSCAN algorithm. The third contribution of this work is a quantitative evaluation, according to a measure of average precision, of the detection results based on a KITTI dataset for comparison purposes.
The paper is structured as follows: Section II covers a survey on multiple object detection based on occupancy grids as well as the definition of a 2.5D representation. Section III introduces the different steps allowing the transition from a grid-level representation to the dynamic object detection. This includes the definition of an evidential grid as well as the labeling of mobile cells (cf. Fig. 1). The segmentation algorithm used to extract dynamic objects is also specified. Section IV presents the dataset used for evaluation as well as a quantitative result analysis. Section V concludes the paper.
II. 2.5D GRID MAPS
A. Related Work
An OG is a representation which employs a multidimensional tessellation of the space into cells, where each cell stores knowledge of its occupancy state [2]. Today, OGs are widely used due to the availability of powerful resources to handle their computational cost. The construction of a grid has been applied in multiple dimensions (2D, 2.5D and 3D) [1] using different sensor technologies such as 2D radars, 2D or 3D LiDARs and stereo vision. In this representation, each cell state is described according to a chosen formalism. The most common one is the Bayesian framework, first adopted by Elfes [2] and followed by many extensions such as the well-known Bayesian Occupancy Filter (BOF) [3, 4]. The latter estimates the dynamics of the grid cells using the Fast Clustering and Tracking Algorithm in order to ensure MOT [5].
Other works suggest a formalism based on Dempster-Shafer (or Evidence) Theory. It has been applied in [6] to a 2D occupancy grid built from a ring of ultrasonic transducers and a sensor scanner. Moras et al. proposed a similar approach, also used for mobile object detection, based on an inverse sensor model [7, 8]: mobile objects are detected through the analysis of the conflict arising from a temporal evidential fusion of multiple grids. Extending Moras et al.'s work, contextual discounting is applied in [9] to control cell remanence.
Some references study the dynamics of the environment at the cell level to avoid the inconsistencies of an object representation [10]. Tanzmeister et al. [11] also estimate the static and dynamic characteristics at the grid-cell level and use a particle filter to obtain the cell velocity distribution. Honer et al. [12] focus on the classification of stationary and dynamic elements based on an evidential semantic 2D grid map with a five-state cell configuration: each cell can either be free, dynamic, static, occupied or unknown. However, the update of the cells is done according to a heuristically determined combination table.
The above literature review, and especially [1], shows that most works consider a two-dimensional grid for the environment representation even when 3D sensors provide the data to build the map. In fact, 3D solutions like voxel grids or octomaps can generate a high complexity and computational load when applied to real-time applications like autonomous navigation. An interesting tradeoff remains in 2.5D occupancy grids, which are known to be memory efficient while still storing elevation data. In the particular context of autonomous driving and ITS, in which the elevation variation of the terrain is limited in the local area in which the vehicles are driving, 2.5D representations are of real interest and are retained here.
In this work, the objective is to consider an object-oriented tracking, which necessitates an efficient object detection module. The idea is to use the tri-dimensional sensor data issued by a Velodyne LiDAR to build, at each time step, a 2.5D grid in which an elevation is attributed to each cell. Sections II-B and III describe the approach illustrated in Fig. 1.
B. Building a 2.5D Grid
The pre-processing step from Fig. 1 required to build a 2.5D grid is derived from [1]. The grid is composed of discrete cells in which the object height above the ground level is stored. This representation can describe the objects elevated from the ground, which can correspond to dynamic or static objects.

Building the 2.5D grid includes defining the covered area as well as its resolution, which corresponds to the dimensions of each cell. An example of a 2.5D grid map is shown in Fig. 2, where the resolution is 0.4 × 0.4 m. The grid covers 40 m in front, 20 m behind and 20 m along the right and left sides of the vehicle.
Fig. 2. Top: 3D LiDAR point cloud from KITTI. Bottom: Corresponding 2.5D grid with 0.4 × 0.4 m cells.
In order to consider the elevation of objects, it is necessary to determine all measures that correspond to the ground. Several approaches, such as [13], address this point because it can induce errors when investigating the occupancy. Such cases are very frequent when the road is uneven or tilted. In this work, the method presented in [1] is employed. It consists in evaluating the variance of the height of the points which fall into a cell: if this variance is below a threshold and the average height remains below a defined value, the cell is considered to belong to the ground surface and not to any other planar surface. This is equivalent to the following statement:
G(i,j) = \begin{cases} 0 & \text{if } \sigma_{i,j}^{2} < tr_{\sigma} \text{ and } \mu_{i,j} < tr_{\mu} \\ \mu_{i,j} & \text{otherwise} \end{cases} \quad (1)
where µ_{i,j} and σ²_{i,j} are the average height and its variance in the cell with index (i, j). The thresholds tr_σ and tr_µ, defined in [1], are respectively equal to 2 cm and 30 cm. G is the resulting 2.5D grid, which is further used for object detection.
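For concreteness, a minimal sketch of this grid construction under the parameters above could look as follows (NumPy-based; the function and variable names are illustrative, not taken from the authors' implementation, and the point cloud is assumed to be expressed in the vehicle frame with z pointing upwards):

```python
import numpy as np

def build_25d_grid(points, x_range=(-20.0, 40.0), y_range=(-20.0, 20.0),
                   res=0.4, tr_sigma=0.02, tr_mu=0.30):
    """Bin LiDAR points into cells and apply Eq. (1): flat, low cells are
    classified as ground (G = 0), the others store the mean height."""
    nx = int((x_range[1] - x_range[0]) / res)
    ny = int((y_range[1] - y_range[0]) / res)
    sums, sqs, cnt = (np.zeros((nx, ny)) for _ in range(3))

    # Accumulate per-cell height statistics
    i = ((points[:, 0] - x_range[0]) / res).astype(int)
    j = ((points[:, 1] - y_range[0]) / res).astype(int)
    ok = (i >= 0) & (i < nx) & (j >= 0) & (j < ny)
    i, j, z = i[ok], j[ok], points[ok, 2]
    np.add.at(sums, (i, j), z)
    np.add.at(sqs, (i, j), z ** 2)
    np.add.at(cnt, (i, j), 1)

    hit = cnt > 0
    mu = np.where(hit, sums / np.maximum(cnt, 1), 0.0)
    var = np.where(hit, sqs / np.maximum(cnt, 1) - mu ** 2, 0.0)

    # Eq. (1) with the thresholds reported in the paper (2 cm, 30 cm)
    grid = np.where((var < tr_sigma) & (mu < tr_mu), 0.0, mu)
    return np.where(hit, grid, 0.0)
```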
III. FROM AN EVIDENTIAL GRID TO THE OBJECT LEVEL
A. Modeling an Evidential Grid
Extending probability theory, Belief Theory offers an adequate representation of data and source imperfections and is thus appropriate for perception in ITS. It also offers a wide range of fusion operators handling these properties according to the application.
In this work, the solution from [7] is adapted to the 2.5D grid. Moras et al. suggest an approach based on the conflict appearing during the temporal grid fusion for mobile object detection and navigable space determination. For that, a frame of discernment is defined to include the states of a cell, which can be Free (F) or Occupied (O). The frame of discernment is then Ω = {F, O}.

Fig. 1. Dynamic object detection with an evidential 2.5D grid.

Fig. 3. Polar representation of an occupancy map according to LiDAR data, where R is the range of a cell and θ its angular sector [14].
The referential power set contains all possible combinations of the discernment frame hypotheses: 2^Ω = {∅, F, O, {F, O}}. To express the belief in each state, a mass function m(·) is defined, expressing respectively the conflict m(∅), the Free state m(F), the Occupied state m(O) and the unknown state m({F, O}).
B. Inverse Sensor Model
Basically, a sensor model describes how the mass function of a state is computed from a measure. This basic belief assignment (bba) also includes the reliability of the source. In this application, the considered sensor is a 3D multi-echo LiDAR provided by Velodyne. The input data include the ranges r_i and angles θ_i of each laser beam or point p_i, as shown in Fig. 3.
According to this set of data, a Scan Grid (SG) in polar coordinates is constructed. Each row of this SG corresponds to an angular sector Θ = [θ−, θ+], and each cell is defined in R × Θ, where R = [r−, r+] is the range interval of the cell. Each cell is thus defined by a pair (Θ, R) to which a mass m{Θ, R} is attributed. The masses corresponding to each proposition A are given hereafter [7]:
m\{\Theta, R\}(\emptyset) = 0 \quad (2)

m\{\Theta, R\}(O) = \begin{cases} 1 - \mu_F & \text{if } \exists\, r_i \in R \\ 0 & \text{otherwise} \end{cases} \quad (3)

m\{\Theta, R\}(F) = \begin{cases} 1 - \mu_O & \text{if } r^{+} < \min(r_i) \\ 0 & \text{otherwise} \end{cases} \quad (4)

m\{\Theta, R\}(\Omega) = \begin{cases} \mu_F & \text{if } \exists\, r_i \in R \\ \mu_O & \text{if } r^{+} < \min(r_i) \\ 1 & \text{otherwise} \end{cases} \quad (5)
where µ_F and µ_O respectively correspond to the probabilities of false alarm and missed detection of the sensor. For simplicity, these mass functions will be noted m(∅), m(O), m(F) and m(Ω).
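As an illustration, the bba of a single polar cell according to (2)-(5) might be sketched as follows (a minimal reading of the model; the function name, the dictionary encoding and the µ_F, µ_O defaults are assumptions, not the authors' code):

```python
def scan_grid_cell_masses(ranges_in_sector, r_minus, r_plus, mu_f=0.1, mu_o=0.1):
    """Basic belief assignment of one polar cell [r-, r+] of the Scan Grid.
    ranges_in_sector lists the LiDAR ranges r_i falling in the angular sector."""
    m = {"conflict": 0.0, "O": 0.0, "F": 0.0, "Omega": 1.0}  # unknown by default
    if any(r_minus <= r <= r_plus for r in ranges_in_sector):
        m["O"], m["Omega"] = 1.0 - mu_f, mu_f      # an echo falls inside the cell
    elif ranges_in_sector and r_plus < min(ranges_in_sector):
        m["F"], m["Omega"] = 1.0 - mu_o, mu_o      # cell lies before the first echo
    return m
```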
C. Combination of Evidential Grids
The construction of a SG is done sequentially to translate the sensor's data. However, the temporal propagation of the knowledge and uncertainties provided by every point cloud given by the sensor requires a fusion process between the current SG and the result of the previous fusion. The complete description of the environment resulting from such a combination provides a Map Grid (MG). This update allows the detection of consistencies in the data as well as of some cases of conflict. Fig. 4 illustrates the process of building and updating a MG using the sensor point cloud provided at time t. The MG is the outcome of a combination of a SG built at t according to (2)-(5) and a transformed MG built at t−1.

Fig. 4. Map Grid Construction.
The grid transformation is applied with respect to the new pose of the vehicle at t in order to guarantee that the information is expressed in the current coordinate system of the vehicle. This operation is realized by a spatial transformation which associates new coordinates to each cell. Algorithm 1 describes the approach.
Algorithm 1 Grid transformation to the new vehicle coordinates
Require: Previous Map Grid MG_{t−1}, rotation matrix R, translation vector T.
Ensure: Build a transformed Map Grid MG_{t−1,tr}
  Initialize the MG_{t−1,tr} cells with m_{MG_{t−1,tr}}(Ω) = 1
  for each cell with index (p, q) do
    Apply the change of coordinates (p′, q′) = R × (p, q) + T
    Compute the new indices (p_new, q_new) = min(|⌈(p′, q′)⌉|, |⌊(p′, q′)⌋|) × sign(p′, q′)
    if (p_new × q_new) > 0 then
      MG_{t−1,tr}(p_new, q_new) = MG_{t−1}(p, q)
    end if
  end for
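A compact sketch of Algorithm 1 could be the following (the sparse dictionary representation of the grid is an assumption made here for brevity; R is a 2 × 2 rotation and T a translation expressed in cell units):

```python
import numpy as np

def transform_map_grid(mg_prev, R, T):
    """Re-express the previous Map Grid in the new vehicle frame.
    mg_prev maps a cell index (p, q) to its mass vector; cells that
    receive no antecedent implicitly keep m(Omega) = 1 (fully unknown)."""
    mg_tr = {}
    for (p, q), masses in mg_prev.items():
        x, y = R @ np.array([p, q]) + T
        # min(|ceil|, |floor|) * sign rounds the coordinates towards zero
        p_new = int(np.sign(x) * min(abs(np.ceil(x)), abs(np.floor(x))))
        q_new = int(np.sign(y) * min(abs(np.ceil(y)), abs(np.floor(y))))
        if p_new > 0 and q_new > 0:   # keep only indices inside the grid
            mg_tr[(p_new, q_new)] = masses
    return mg_tr
```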
This update is done according to an evidential multi-grid fusion. This is the crucial point of the grid-based object detection process, as it allows the temporal update of the map grid as well as the evaluation of the state of the cells. Among the various operators of Belief Theory, Dempster's rule of combination is used:
m_{MG_t} = m_{MG_{t-1},tr} \oplus m_{SG_t} \quad (6)

where m_{MG_{t-1},tr} and m_{SG_t} are respectively the mass functions of the transformed MG and of the SG at time t. The operator \oplus is defined as:

(m_1 \oplus m_2)(A) = K \sum_{B, C \in 2^{\Theta},\, B \cap C = A,\, A \neq \emptyset} m_1(B)\, m_2(C) \quad (7)

where

K^{-1} = 1 - \sum_{B, C \in 2^{\Theta},\, B \cap C = \emptyset} m_1(B)\, m_2(C) \quad (8)
The resulting masses m_{MG_t}(A) define the state of each cell, which depends on the previous state and the new measures. The resulting masses for each state are given as follows [7]:

\begin{aligned}
m_{MG_t}(O) &= m_{SG_t}(O)\, m_{MG_{t-1},tr}(O) + m_{SG_t}(\Omega)\, m_{MG_{t-1},tr}(O) + m_{SG_t}(O)\, m_{MG_{t-1},tr}(\Omega) \\
m_{MG_t}(F) &= m_{SG_t}(F)\, m_{MG_{t-1},tr}(F) + m_{SG_t}(\Omega)\, m_{MG_{t-1},tr}(F) + m_{SG_t}(F)\, m_{MG_{t-1},tr}(\Omega) \\
m_{MG_t}(\Omega) &= m_{SG_t}(\Omega)\, m_{MG_{t-1},tr}(\Omega) \\
m_{MG_t}(\emptyset) &= m_{SG_t}(O)\, m_{MG_{t-1},tr}(F) + m_{SG_t}(F)\, m_{MG_{t-1},tr}(O)
\end{aligned} \quad (9)
with m_{MG_t}(∅) being the combined mass expressing the conflict. Basically, this quantity shows the discordance between the knowledge expressed at t−1 and at t. A conflict appears when a cell changes its state from F to O or vice-versa. Therefore, the detection of this conflict leads to the identification of the dynamic cells. The conflict allows the labeling of the occupied cells which change their state, according to two types of conflict:

C_1 = m_{SG_t}(O)\, m_{MG_{t-1},tr}(F) \quad \text{from } F \text{ to } O
C_2 = m_{SG_t}(F)\, m_{MG_{t-1},tr}(O) \quad \text{from } O \text{ to } F \quad (10)

where m_{MG_t}(∅) = C_1 + C_2.
Dempster's operator implies a normalization of the conflict at fusion, given its absorbing property: if the conflict were included in the next combination, it would induce a loss of information because m_{MG_t}(∅) would increase at each fusion. Therefore, the updated grid contains no conflict; the conflict is only preserved to classify the mobile cells to be studied for dynamic object extraction.
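Putting (9) and (10) together, the per-cell fusion step could be sketched as follows (a minimal illustration assuming mass vectors stored as dictionaries and a non-degenerate combination, i.e. C_1 + C_2 < 1; names are illustrative):

```python
def fuse_cell(m_sg, m_mg):
    """Combine the Scan Grid and transformed Map Grid masses of one cell
    following Eqs. (9)-(10). Returns the normalized updated masses together
    with the partial conflicts C1 (F -> O) and C2 (O -> F)."""
    c1 = m_sg["O"] * m_mg["F"]   # cell newly occupied
    c2 = m_sg["F"] * m_mg["O"]   # cell newly freed
    o = (m_sg["O"] * m_mg["O"] + m_sg["Omega"] * m_mg["O"]
         + m_sg["O"] * m_mg["Omega"])
    f = (m_sg["F"] * m_mg["F"] + m_sg["Omega"] * m_mg["F"]
         + m_sg["F"] * m_mg["Omega"])
    omega = m_sg["Omega"] * m_mg["Omega"]
    k = 1.0 - (c1 + c2)          # normalization constant of Eq. (8)
    # The stored grid is conflict-free; C1 and C2 are kept aside for labeling
    fused = {"F": f / k, "O": o / k, "Omega": omega / k}
    return fused, c1, c2
```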
D. Clustering for Dynamic Object Detection
In order to attain an object-level representation from the mobile cells, a clustering is applied to group the cells related to the same object in the grid. For that, the partitioning method must be unsupervised, considering that the number of objects to be found is unknown. The well-known Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm is used [15]. It is based on the estimated density of the measures for partitioning the clusters. This algorithm uses two main parameters: the minimal distance ε and the minimum number of points minPts which must reside within a radius ε for a point to be included in a cluster. This algorithm is convenient because it is simple and can handle aberrant or noisy values while clustering. However, it can have some issues when clusters have different densities.
Fig. 5. Appearance of conflict due to the displacement of objects. Conflict C1 informs about newly occupied cells whereas C2 describes transitions from occupied to free.
The clustering algorithm is applied to the set of cells which are occupied (i.e. with non-zero elevation); the conflict m_{MG_t}(∅) is then used to classify the resulting clusters. The partial conflict C2 informs about cells changing state from occupied to free at time t. The cells affected by this conflict do not belong to a given object and hence do not provide any knowledge about the object's presence. That is why only the partial conflict C1 is considered to determine the location of an object at time t. However, exclusively clustering the C1-labeled cells is not informative enough to obtain a complete representation of the shape of a dynamic object: on the grid, the displacement of objects is only visible at their perimeter, so the conflict is mostly located at the boundaries of objects, as shown in Fig. 5. For that reason, clustering is applied to detect both static and dynamic objects according to the elevation measure on the 2.5D grid. Afterwards, these clusters are classified according to whether they partially contain conflicting cells, as mentioned in Fig. 1 and as sketched below.
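A minimal sketch of this clustering and labeling step could rely on scikit-learn's DBSCAN (the ε and minPts values echo those reported in Section IV; the conflict threshold and all names are assumptions made for illustration):

```python
import numpy as np
from sklearn.cluster import DBSCAN

def extract_dynamic_objects(grid, c1, eps=5.0, min_pts=4, conflict_thr=0.0):
    """Cluster the occupied cells (non-zero elevation) of the 2.5D grid and
    keep the clusters partially containing C1-conflicting cells, i.e. cells
    that switched from Free to Occupied. Returns a list of cell-index arrays."""
    cells = np.argwhere(grid > 0)                      # occupied cells (i, j)
    if len(cells) == 0:
        return []
    labels = DBSCAN(eps=eps, min_samples=min_pts).fit_predict(cells)
    objects = []
    for lab in set(labels) - {-1}:                     # -1 marks DBSCAN noise
        cluster = cells[labels == lab]
        if np.any(c1[cluster[:, 0], cluster[:, 1]] > conflict_thr):
            objects.append(cluster)                    # dynamic cluster
    return objects
```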
IV. EXPERIMENTAL RESULTS
The presented approach is applied to real data and has been tested offline. The validation is done at the grid level as well as at the object level according to the ground truth (GT), for qualitative and quantitative evaluation.
A. Dataset and Performance Evaluation
The data used are extracted from the KITTI database [16], a widely used dataset in autonomous driving research as it provides a large set of images, GPS/IMU recordings and raw laser scans as well as labeled scenes. In this study, sequence 17 from the raw data set is used, considering that the GT annotations of the detected objects as well as the vehicle pose are available. A total of 114 frames is used, of which 59 contain annotated moving cars. The data used for this approach are the point clouds recorded by a Velodyne HDL-64, which is characterized by 64 horizontal layers, a 360° horizontal field of view and a 26.9° vertical field of view. The GPS data are also used to obtain the vehicle's pose. The images are not exploited for this application but are used for visualization.
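For reference, a KITTI raw Velodyne scan is stored as a flat binary file of float32 (x, y, z, reflectance) tuples; a minimal loader (the function name is illustrative) is:

```python
import numpy as np

def load_velodyne_scan(bin_path):
    """Read one KITTI raw Velodyne scan: an Nx4 float32 array of
    (x, y, z, reflectance) in the sensor frame; only x, y, z are
    needed to build the 2.5D grid."""
    scan = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)
    return scan[:, :3]
```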
3D object detection benchmarks offer various criteria for performance evaluation purposes. The most common measure is the precision, i.e. the proportion of all examples above a given rank which are from the positive class [17]:

Precision = \frac{TP}{TP + FP} \quad (11)

where TP and FP respectively stand for True Positives and False Positives.
This metric is calculated according to the overlap of the candidate detections with the GT. For the computation, the bounding box of a detected object is compared to the GT bounding box. A detection is considered correct when its overlap area a_o exceeds 50%, with:

a_o = \frac{area(B_p \cap B_{gt})}{area(B_p \cup B_{gt})} \quad (12)

where B_p and B_{gt} are respectively the candidate bounding box and the GT bounding box.
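As a concrete reading of (11) and (12), an axis-aligned 2D version of the overlap test may be sketched as follows (a simplification: the paper evaluates 3D bounding boxes; boxes here are (x0, y0, x1, y1) tuples and all names are illustrative):

```python
def iou(bp, bgt):
    """Overlap a_o of Eq. (12) for two axis-aligned boxes (x0, y0, x1, y1)."""
    ix0, iy0 = max(bp[0], bgt[0]), max(bp[1], bgt[1])
    ix1, iy1 = min(bp[2], bgt[2]), min(bp[3], bgt[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(bp) + area(bgt) - inter
    return inter / union if union > 0 else 0.0

def precision(detections, ground_truth, thr=0.5):
    """Eq. (11): a detection counts as TP if it overlaps a GT box above thr."""
    tp = sum(1 for d in detections if any(iou(d, g) >= thr for g in ground_truth))
    return tp / len(detections) if detections else 0.0
```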
Fig. 6. Top: Frame 40 of Sequence 17. Middle: The corresponding 2.5D grid. Bottom: Comparison of the detection results on frame 40 with the ground truth.
B. Results
In this section, the perception results are illustrated through the 2.5D grid, the evidential occupancy measures and the detected objects. The results are compared with the available GT objects.
Fig. 6 shows an example of results together with the image captured by the camera facing the front of the vehicle. The corresponding 2.5D grid, found below the image, expresses the average height of the elevated objects in the scene. In this view, the ego-car's position is approximately x = 50, y = 50, heading to the right. It can be seen that this map contains voxels describing moving objects (2 cars) as well as static ones like numerous traffic signs or static vehicles behind the ego-car. The grid fusion allows the determination of which among these voxels belong to dynamic objects. The bottom figure exclusively shows the dynamic objects found within the range of the camera view. It provides a comparison between the bounding boxes resulting from the fusion and clustering process and the GT bounding boxes. We choose not to display the objects detected behind the ego-vehicle since no annotations are available in the GT.
The position of these objects is found according to the evaluation of the conflict in the corresponding MG. The conflict C1 allows the observation of the cells changing state from Free to Occupied, and the voxels which contain a non-zero value of C1 are grouped to define the moving objects. Considering that we only detect cars in this sequence, the parameters of DBSCAN are minPts = 4 and ε = 5 in grid coordinates. This algorithm is advantageous for this application because it discards the measures which can be considered noisy, which optimizes the number of relevant clusters. The extracted clusters are labeled with bounding boxes containing the exact number of LiDAR points belonging to the cluster. The use of 3D bounding boxes allows the evaluation of the results according to the known locations of the GT. The total AP is found to be 91.23%, with the overlap distribution illustrated in Fig. 7. It can be noticed that most detected objects overlap with true objects at a rate varying between 65% and 90%. Note that the number of false alarms is very low thanks to the clustering algorithm, which only considers dense groups of measurements; noisy or distant data do not belong to any object.

Fig. 7. Overlap of the detected objects with the ground truth showing the rate above which a detection is eligible.
V. CONCLUSION
The approach presented in this paper aims at the detection of multiple objects based on LiDAR data using an evidential 2.5D grid. The main contributions of this paper are the use of an evidential elevation map and the evaluation of the conflict for the determination of mobile objects. Another contribution is the clustering performed to obtain an object-level representation for tracking purposes. The detection of dynamic objects is evaluated against the ground truth given by a set of annotations of the KITTI dataset. The approach is shown to be efficient according to its high average precision. A first perspective for future work is the identification of the clustering algorithm parameters in order to detect several classes of objects. Extending these first results by testing the approach on more complex scenarios, including occluded objects, is a second perspective. Furthermore, a comparative study of this work with state-of-the-art results will be performed.
ACKNOWLEDGMENT
The authors gratefully acknowledge the financial support from Fondation Wallach (Mulhouse) in the context of the Project SIMPHA (Solution Innovantes pour la Mobilité individualisée et durable des séniors et Personnes présentant un Handicap).
REFERENCES
[1] A. Asvadi, P. Peixoto, and U. Nunes, "Detection and tracking of moving objects using 2.5D motion grids," in 18th International Conference on Intelligent Transportation Systems (ITSC), Las Palmas, Spain, 2015.
[2] A. Elfes, "Using occupancy grids for mobile robot perception and navigation," Computer, vol. 22, no. 6, pp. 46–57, Jun. 1989.
[3] C. Coué, C. Pradalier, C. Laugier, T. Fraichard, and P. Bessière, "Bayesian occupancy filtering for multitarget tracking: an automotive application," International Journal of Robotics Research, vol. 25, no. 1, pp. 19–30, Jan. 2006.
[4] A. Broggi, S. Cattani, M. Patander, M. Sabbatelli, and P. Zani, "A full-3D voxel-based dynamic obstacle detection for urban scenario using stereo vision," in 16th International IEEE Conference on Intelligent Transportation Systems (ITSC), Oct. 2013, pp. 71–76.
[5] K. Mekhnacha, Y. Mao, D. Raulo, and C. Laugier, "The "fast clustering-tracking" algorithm in the Bayesian occupancy filter framework," in IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, Aug. 2008, pp. 238–245.
[6] D. Pagac, E. M. Nebot, and H. F. Durrant-Whyte, "An evidential approach to map-building for autonomous vehicles," IEEE Transactions on Robotics and Automation, vol. 14, no. 4, pp. 623–629, 1998.
[7] J. Moras, V. Berge-Cherfaoui, and P. Bonnifait, "Moving objects detection by conflict analysis in evidential grids," in IEEE Intelligent Vehicles Symposium (IV), Baden-Baden, Germany, 2011, pp. 1122–1127.
[8] J. Moras, V. Cherfaoui, and P. Bonnifait, "Credibilist occupancy grids for vehicle perception in dynamic environments," in IEEE International Conference on Robotics and Automation (ICRA), May 2011, pp. 84–89.
[9] M. Kurdej, J. Moras, V. Cherfaoui, and P. Bonnifait, "Controlling remanence in evidential grids using geodata for dynamic scene perception," International Journal of Approximate Reasoning, vol. 55, no. 1, part 3, pp. 355–375, 2014.
[10] R. Danescu, F. Oniga, and S. Nedevschi, "Modeling and tracking the driving environment with a particle-based occupancy grid," IEEE Transactions on Intelligent Transportation Systems, vol. 12, no. 4, pp. 1331–1342, Dec. 2011.
[11] G. Tanzmeister and D. Wollherr, "Evidential grid-based tracking and mapping," IEEE Transactions on Intelligent Transportation Systems, vol. 18, no. 6, pp. 1454–1467, Jun. 2017.
[12] J. Honer and H. Hettmann, "Motion state classification for automotive LiDAR based on evidential grid maps and transferable belief model," in 21st International Conference on Information Fusion (FUSION), Oxford, U.K., Jul. 2018.
[13] L. Wang and Y. Zhang, "LiDAR ground filtering algorithm for urban areas using scan line based segmentation," Computing Research Repository, vol. abs/1603.00912, 2016.
[14] K. Jo, S. Cho, C. Kim, P. Resende, B. Bradai, F. Nashashibi, and M. Sunwoo, "Cloud update of tiled evidential occupancy grid maps for the multi-vehicle mapping," Sensors, vol. 18, no. 12, 2018.
[15] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu, "A density-based algorithm for discovering clusters in large spatial databases with noise," in Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD). AAAI Press, 1996, pp. 226–231.
[16] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, "Vision meets robotics: The KITTI dataset," International Journal of Robotics Research (IJRR), 2013.
[17] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, and A. Zisserman, "The PASCAL visual object classes (VOC) challenge," International Journal of Computer Vision, 2010.