Point Cloud Matching based on 3D Self-Similarity
Jing Huang
University of Southern California
Los Angeles, CA 90089
huang10@usc.edu
Suya You
University of Southern California
Los Angeles, CA 90089
suya.you@usc.edu
Abstract
The point cloud is one of the primitive representations of 3D data today. Although much work has been done on 2D image matching, matching 3D points acquired from different perspectives or at different times remains a challenging problem. This paper proposes a 3D local descriptor based on 3D self-similarities. We not only extend the concept of 2D self-similarity [1] to 3D space, but also establish a similarity measurement based on the combination of geometric and photometric information. The matching process is fully automatic, i.e., it needs no manually selected landmarks. Results on LiDAR and model datasets show that our method performs robustly on 3D data under various transformations and noise.
1. Introduction
Matching is the process of establishing precise correspondences between two or more datasets acquired, for example, at different times, from different aspects, or even from different sensors or platforms. This paper addresses the challenging problem of finding precise point-to-point correspondences between two 3D point clouds. This is a key step for many tasks, including multi-view scan registration, data fusion, 3D modeling, 3D object recognition and 3D data retrieval.
Figure 1 shows an example of point cloud data acquired by an airborne LiDAR sensor. The two point clouds represent two LiDAR scans of the same area (downtown Vancouver) acquired at different times and from different viewpoints. The goal is to find precise matches for the points in the overlapping areas of the two point clouds.
Matching of point clouds is challenging in that there is usually an enormous number of 3D points, and the coordinate system can vary in terms of translation, 3D rotation and scale. Point positions are generally not coincident; noise and occlusions are common due to incomplete scans; and objects are attached to each other and/or the ground. Furthermore, many datasets may not contain any photometric information, such as intensity, beyond the point positions.

Figure 1. Two point clouds representing two LiDAR scans of the same area captured at different times and from different aspects. The proposed method can find precise matches for the points in the overlapping area of the two point clouds.
Given the problems above, matching methods that rely solely on photometric properties will fail, and conventional techniques or simple extensions of 2D methods are no longer feasible. The unique nature of point clouds requires methods and strategies different from those for 2D images.

In real applications, most point clouds are sets of geometric points representing the external surfaces or shapes of 3D objects. We therefore treat geometry as the essential information. We need a powerful descriptor as a way to capture the geometric arrangements of points, surfaces and objects. The descriptor should be invariant to translation, scaling and rotation. In addition, the high-dimensional structure of 3D points must be collapsed into something manageable.
This paper presents a novel technique specifically designed for matching 3D point clouds. In particular, our approach is based on the concept of self-similarity. Self-similarity is an attractive image property that has recently found its way into matching in the form of local self-similarity descriptors [1]. It captures the internal geometric layout of local patterns at a level of abstraction. Locations in an image with a distinctive local self-similarity structure are distinguishable from their neighboring locations, which can greatly facilitate the matching process. Several works have demonstrated the value of self-similarity for image matching and related applications [14] [15]. From a totally new perspective, we design a descriptor that can efficiently capture distinctive geometric signatures embedded in point clouds. The resulting 3D self-similarity descriptor is compact and view/scale-independent, and hence produces a highly efficient feature representation. We apply the developed descriptor to build a complete feature-based matching system for high-performance matching between point clouds.
2. Related Work
2.1. 3D Matching
3D data matching has recently been widely addressed in both the computer vision and graphics communities. A variety of methods have been proposed, but approaches based on local feature descriptors demonstrate superior performance in terms of accuracy and robustness [3] [7]. In the local feature-based approach, the original data are transformed into a set of distinctive local features, each representing a quasi-independent salient region within the scene. The features are then characterized with robust descriptors containing local surface properties that are supposedly repeatable and distinctive for matching. Finally, registration methods such as the well-known Iterative Closest Point (ICP) [5], as well as its variants, can be employed to determine the global arrangement.
The spin image is a well-known feature descriptor for 3D surface representation and matching [3]. One key element of spin image generation is the oriented point, i.e., a 3D surface point with an associated direction. Once the oriented point is defined, the surrounding cylindrical region is compressed to generate the spin image as a 2D histogram of the number of points lying in different distance grids. By using a local object-oriented coordinate system, the spin image descriptor is view- and scale-independent. Several variations of the spin image have been suggested. For example, [18] proposed a spherical spin image for 3D object recognition, which can capture the equivalence classes of spin images derived from linear correlation coefficients.
The Heat Kernel Signature (HKS) proposed in [7] is a type of shape descriptor targeted at matching objects under non-rigid transformations. The idea of HKS is to use the heat diffusion process on the shape to generate an intrinsic local geometry descriptor. It is shown that HKS captures much of the information contained in the heat kernel and characterizes shapes up to isometry. Further, [8] improved HKS to achieve scale invariance, and developed an HKS local descriptor that can be used in the bag-of-features framework for shape retrieval in the presence of a variety of non-rigid transformations.
Many works also attempt to generalize from 2D to 3D, such as 3D SURF [6], extended from SURF [4], and 3D Shape Context [19], extended from 2D Shape Context [2]. A detailed performance evaluation and benchmark for 3D shape matching was reported in [11], simulating the feature detection, description and matching stages of feature-based matching and recognition algorithms. The benchmark tests the performance of shape feature detectors and descriptors under a wide variety of conditions. We also use this benchmark to test and evaluate our proposed approach.
Recently, the concept of self-similarity has drawn much attention and been successfully applied to image matching and object recognition. Shechtman and Irani [1] proposed the first algorithm that explicitly employs self-similarity to form a descriptor for image matching. They used the intensity correlation computed in a local region as the resemblance measure to generate the local descriptor. Later on, several extensions and variants were proposed. For example, Chatfield et al. [14] combined a local self-similarity descriptor with the bag-of-words framework for image retrieval of deformable shape classes; Maver [15] used a local self-similarity measurement for interest point detection; Huang et al. [17] proposed a 2D self-similarity descriptor for multimodal image matching, evaluating different definitions of self-similarity.
This paper extends the self-similarity framework to the matching of 3D point clouds. We develop a new 3D local feature descriptor that efficiently characterizes distinctive signatures of surfaces embedded in point clouds, and hence can produce high-performance matching. To the best of our knowledge, we are the first to introduce self-similarity to the area of point cloud matching.
The remainder of the paper describes the details of our
proposed approach and implementations. We also present
the results of our analysis and experiments.
3. Point clouds and self-similarity
Originating from fractals and topological geometry, self-similarity is the property held by parts of a dataset or object that resemble other parts of the same data. The resemblance can be in photometric properties, geometric properties or their combinations.

Photometric properties such as color, intensity or texture are useful and necessary for measuring the similarity between imagery data; however, they are no longer as reliable on point cloud data. In many situations, the data may only contain point positions without any photometric information. Therefore, geometric properties such as surface normals and curvatures are treated as the essential information for point cloud data.
In particular, we found that the surface normal is the most effective geometric property enabling human visual perception to distinguish local surfaces or shapes in point clouds. Normal similarity has proven sufficiently robust to a wide range of variations that occur within disparate object classes. Furthermore, a point and its normal vector form a simple local coordinate system that can be used to generate a view/scale-independent descriptor.

Figure 2. Illustration of self-similarities. Column (a) shows three point clouds of the same object and (b) their normal distributions. There is much noise in the 2nd point cloud, which leads to quite different normals from the other two. However, the 2nd point cloud shares a similar intensity distribution with the 1st point cloud, which ensures that their self-similarity surfaces (c), quantized bins (d) and thus descriptors (e) are similar to each other. On the other hand, while the intensity distribution of the 3rd point cloud differs from the other two, it shares similar normals with the 1st point cloud (3rd row vs. 1st row in column (b)), which again ensures that their self-similarity surfaces, quantized bins and descriptors are similar to each other.
Curvature is another important geometric property that should be considered in the similarity measurement. The curvature describes the rate of change of the tangents. Curved surfaces always have varying normals, yet many natural shapes such as spheres and cylinders preserve curvature consistency. Therefore, we incorporate the curvature in the similarity measurement to characterize the local geometry of the surface. Since there are many possible directions of curvature in 3D, we consider the direction in which the curvature is maximized, i.e., the principal curvature, to keep it unique.
We also consider photometric information in our algorithm to generalize the problem. We consider the case where both photometric and geometric information are available in the dataset. We propose to use both kinds of properties as similarity measurements and combine them under a unified framework.
4. 3D Self-similarity descriptor
Given an interest point and its local region, there are two major steps to constructing the descriptor: (1) generating the self-similarity surface using the defined similarity measurements, and (2) quantizing the self-similarity surface in a rotation-invariant manner. In this work, we consider similarity measurements on surface normals, curvatures, and photometric properties. Once the similarity measurements are defined, the local region is converted to a self-similarity surface centered at the interest point, with the multiple/united property similarity at each point. We can then construct the 3D local self-similarity descriptors to generate signatures of surfaces embedded in the point cloud.
4.1. Generating self-similarity surface
Assume there are property functions f_1, f_2, . . . , f_n defined on a point set X, which map any point x ∈ X to property vectors f_1(x), f_2(x), . . . , f_n(x). For 2D images, the properties can be intensities, colors, gradients or textures. In our 3D situation, the property set can further include normals and curvatures, besides intensities/colors.
For each property function f that is defined at two points x and y, we can induce a pointwise similarity function s(x, y, f). Then, the united similarity can be defined as a combination of the similarity functions of all the properties. Figure 2 gives an intuition of how the combined self-similarity works for different data.
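To make the framework concrete, the following minimal sketch (in Python; the class name and organization are ours, not taken from a released implementation) shows one way to pair each property function with its induced pointwise similarity. The concrete similarity functions are defined in the following subsections.

```python
class PropertySimilarity:
    """Pairs a per-point property f with its pointwise similarity s(x, y, f)."""

    def __init__(self, values, similarity_fn):
        self.values = values                # values[i] = f(x_i), scalar or vector
        self.similarity_fn = similarity_fn  # maps (f(x), f(y)) to [0, 1]

    def s(self, i, j):
        """Pointwise similarity s(x_i, x_j, f), in [0, 1]."""
        return self.similarity_fn(self.values[i], self.values[j])
```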
4.1.1 Normal similarity
The normal gives the most direct description of the shape information, especially for surface models. One of the most significant characteristics of the normal distribution is its continuity, which means the normal similarity is usually positively correlated with the distance between two points. However, any non-trivial shape disturbs the distribution of normals, which gives the normal similarity its descriptive power.

Figure 3. Self-similarity surface of normals. The brighter a point is, the more similar its normal is to the normal at the center point.
We use the method described in [13] to extract the normals. Figure 2(b) shows examples of normal distributions.
The property function of the normal is a 3D function f_normal(x) = n(x). Assuming the normals are normalized, i.e., ||n(x)|| = 1, we can define the normal similarity between two points x and y via the angle between the normals, as Equation (1) suggests:

s(x, y, f_normal) = [π − cos⁻¹(f_normal(x) · f_normal(y))] / π
                  = [π − cos⁻¹(n(x) · n(y))] / π.     (1)

It is easy to see that when the angle is 0, the function returns 1; when the angle is π, i.e., the normals are opposite to each other, the function returns 0.
Note that a locally stable normal estimation method is needed here to ensure that the directions of the normals are consistent with each other, because flipping one normal leads to the opposite value of the function.
Figure 3 shows the visualization of the self-similarity
surface of normals of one key point.
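As a concrete illustration, a direct transcription of Equation (1) in Python/NumPy follows; it is a sketch under the assumption of unit-length, consistently oriented normals, not the authors' exact implementation.

```python
import numpy as np

def normal_similarity(n_x, n_y):
    """Normal similarity of Eq. (1): 1 when the normals agree, 0 when opposite.

    n_x, n_y: unit normal vectors of shape (3,), assumed consistently
    oriented, since flipping one normal inverts the result.
    """
    cos_angle = np.clip(np.dot(n_x, n_y), -1.0, 1.0)  # guard rounding error
    angle = np.arccos(cos_angle)                      # in [0, pi]
    return (np.pi - angle) / np.pi                    # angle 0 -> 1, pi -> 0
```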
4.1.2 Curvature similarity
The curvature describes the rate of change of the tangents. Curved surfaces always have varying normals, yet many natural shapes such as spheres and cylinders preserve curvature consistency. Therefore, it is worthwhile to incorporate the curvature information in the measurement of similarity. Since there are infinitely many possible directions of curvature in 3D, we only consider the direction in which the curvature is maximized, i.e., the principal curvature.
The principal curvature direction can be approximated as the eigenvector corresponding to the largest eigenvalue of the covariance matrix C of the normals projected on the tangent plane. The property function of curvature is defined as a single-value function

f_curv(x) = (1/N) · max{λ | det(C − λI) = 0},     (2)

where N is the number of points (normals) taken into account in the neighborhood, so that values are scaled to the range from 0 to 1 (in practice this value is typically less than 0.7). We then define the curvature similarity between two points x and y as one minus the absolute difference between them:

s(x, y, f_curv) = 1 − |f_curv(x) − f_curv(y)|.     (3)

Again, the function returns 1 when the curvature values are similar, and approaches 0 when they differ.
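The following sketch (Python/NumPy, our naming) approximates the curvature property of Equation (2) and the similarity of Equation (3); the covariance normalization is one plausible reading of the text, not a confirmed implementation detail.

```python
import numpy as np

def curvature_property(normal_x, neighbor_normals):
    """f_curv of Eq. (2): largest eigenvalue of the covariance matrix C of
    the neighborhood normals projected on the tangent plane, divided by N."""
    n = normal_x / np.linalg.norm(normal_x)
    # Project each neighbor normal onto the tangent plane at x.
    proj = neighbor_normals - np.outer(neighbor_normals @ n, n)
    C = np.cov(proj, rowvar=False, bias=True)      # 3x3 covariance
    lam_max = np.linalg.eigvalsh(C)[-1]            # largest eigenvalue
    return lam_max / len(neighbor_normals)         # scaled toward [0, 1]

def curvature_similarity(c_x, c_y):
    """Curvature similarity of Eq. (3)."""
    return 1.0 - abs(c_x - c_y)
```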
4.1.3 Photometric similarity
Photometric information is an important cue for our understanding of the world. For example, when we look at a gray-scale image, we can infer 3D structure by observing changes in intensity. Some point clouds, besides point positions, also contain certain photometric information such as intensity or other reflectance values generated by the sensor. Such information is a combined result of geometric structure, material, lighting and even shadows. While not as generally reliable as geometric information for point clouds, it can be helpful in specific situations, so we also incorporate it into the similarity measurement. We use the photometric similarity to model this complicated situation, as it is invariant to lighting to some extent, given similar material properties.
In our current framework, the property function of photometry is a single-value function f_photometry(x) = I(x), where I(x) is the intensity function. With the range normalized to [0, 1], we can define the photometric similarity between two points x and y as one minus their absolute difference:

s(x, y, f_photometry) = 1 − |f_photometry(x) − f_photometry(y)|
                      = 1 − |I(x) − I(y)|.     (4)
4.1.4 United Similarity
Given a set of properties, we need to combine them to measure the united similarity:

s(x, y) = Σ_{p ∈ PropertySet} w_p · s(x, y, f_p).     (5)

The weights w_p ∈ [0, 1] can be experimentally determined according to the availability and contribution of each considered property. For example, when dealing with point clouds converted from mesh models, we set w_photometry = 0 since there is no intensity information in the data. As another example, when we know there are many noise points in the data, which makes the curvature estimation unstable, we can reduce its weight accordingly. In general, equal weights or weights of 2:1:1 (with normals dominating) are good enough without prior knowledge. Learning the best weights from different datasets, however, could be an interesting topic.
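A minimal sketch of the united similarity of Equation (5), reusing the PropertySimilarity sketch from Section 4.1; the 2:1:1 weighting below is the normal-dominant setting mentioned above.

```python
def united_similarity(i, j, properties, weights):
    """United similarity of Eq. (5): weighted sum of per-property
    similarities. `properties` maps a property name to a PropertySimilarity;
    `weights` maps the same names to w_p in [0, 1]."""
    return sum(w * properties[p].s(i, j) for p, w in weights.items())

# Example: normals dominating 2:1:1; set a weight to zero when the
# property is unavailable (e.g. no intensity in mesh-derived clouds).
weights = {"normal": 0.5, "curvature": 0.25, "photometry": 0.25}
```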
Once the similarity measurements are defined, the construction of the self-similarity surface is straightforward. First, the point cloud is converted to 3D positions with the defined properties. Then, the self-similarity surface is constructed by comparing each point's united similarity to that of the surrounding points within a local spherical volume. The radius of the sphere is k times the detected scale at which the principal curvature reaches its local maximum. The choice of the size determines whether the algorithm operates on a local region or more like a global region. We found by experiments that the performance is best when k ≈ 4.
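A sketch of this construction using a k-d tree for the spherical neighborhood query (SciPy's cKDTree is our choice here, not mandated by the paper):

```python
import numpy as np
from scipy.spatial import cKDTree

def self_similarity_surface(points, center, scale, properties, weights, k=4.0):
    """Return the indices of all points within k * scale of the key point
    `center` and their united similarity to it (the self-similarity surface)."""
    tree = cKDTree(points)   # in practice, build once per cloud, not per call
    idx = tree.query_ball_point(points[center], r=k * scale)
    sims = np.array([united_similarity(center, j, properties, weights)
                     for j in idx])
    return np.asarray(idx), sims
```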
4.2. Forming the Descriptor
Our approach tries to make full use of all kinds of geometric information in the point cloud, mainly the normal and curvature, which can be seen as first-order and second-order differential quantities. Since we are facing discrete data, certain approximations are needed for the calculation. Such approximations are provided by open-source libraries such as the Point Cloud Library (PCL).

Rotation invariance is achieved by using the local reference frame (Fig. 4) of each given key point: the origin is placed at the key point; the z-axis is the direction of the normal; the x-axis is the direction of the principal curvature; and the y-axis is the cross product of the z and x directions.

In order to reduce the dimension as well as tolerate small distortions of the data, we quantize the correlation space into bins. In our experiments we use #Bin(r) = 6 radial bins, #Bin(ϕ) = 8 bins in longitude ϕ and #Bin(θ) = 6 bins in latitude θ, and replace the values in each cell with the average similarity value of all points in the cell, resulting in a descriptor of 6 × 8 × 6 = 288 dimensions (Fig. 4).
The index of each dimension can be represented by a triple (Index(r), Index(ϕ), Index(θ)), ranging from (0, 0, 0) to (5, 7, 5). Each index component is calculated as follows:

Index(r) = ⌊#Bin(r) · r / scale⌋,
Index(ϕ) = ⌊#Bin(ϕ) · ϕ / (2π)⌋,
Index(θ) = ⌊#Bin(θ) · θ / π⌋.     (6)
In the final step, the descriptor is normalized by scaling all dimensions so that the maximum value is 1.
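The following sketch bins a self-similarity surface per Equation (6) and max-normalizes the result. We take Equation (6)'s `scale` to be the support radius of the surface so that Index(r) stays in range; that reading, and the handling of boundary values, are our assumptions.

```python
import numpy as np

N_R, N_PHI, N_THETA = 6, 8, 6     # 6 * 8 * 6 = 288 cells

def quantize_descriptor(offsets, sims, frame, radius):
    """Bin the self-similarity surface per Eq. (6).

    offsets: (N, 3) neighbor positions relative to the key point;
    sims: (N,) united similarities; frame: 3x3 matrix with rows x, y, z
    (x = principal curvature direction, z = normal, y = z cross x);
    radius: support radius of the surface.
    """
    local = offsets @ frame.T                                 # into local frame
    r = np.linalg.norm(local, axis=1)
    phi = np.arctan2(local[:, 1], local[:, 0]) % (2 * np.pi)  # longitude
    theta = np.arccos(np.clip(local[:, 2] / np.maximum(r, 1e-12), -1, 1))

    # Eq. (6); the int cast is the floor, clipped to the last bin.
    i_r = np.minimum((N_R * r / radius).astype(int), N_R - 1)
    i_phi = np.minimum((N_PHI * phi / (2 * np.pi)).astype(int), N_PHI - 1)
    i_theta = np.minimum((N_THETA * theta / np.pi).astype(int), N_THETA - 1)

    total = np.zeros((N_R, N_PHI, N_THETA))
    count = np.zeros_like(total)
    np.add.at(total, (i_r, i_phi, i_theta), sims)
    np.add.at(count, (i_r, i_phi, i_theta), 1)
    desc = np.where(count > 0, total / np.maximum(count, 1), 0.0)  # cell means
    peak = desc.max()
    return (desc / peak if peak > 0 else desc).ravel()        # 288-D vector
```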
5. Point Cloud Matching
We apply the developed descriptor to build a complete feature-based point cloud matching system. Point clouds often contain hundreds of millions of points, yielding a large high-dimensional feature space to search, index and match. Selecting the most distinctive and repeatable features for matching is therefore a necessity.
Figure 4. Illustration of the local reference frame and quantization.
5.1. Multi-scale feature detector
Feature, or key point, extraction is a necessary step before calculating the 3D descriptor because (1) 3D data usually contain too many points to calculate the descriptor at every point, and (2) distinctive and repeatable features largely enhance the accuracy of matching. Many feature detection methods are evaluated in [9]. Our approach detects salient features with a multi-scale detector, where 3D peaks are detected in both scale space and spatial space. Inspired by [10], we propose to extract key points based on the local Maxima of Principal Curvature (MoPC), which provide relatively stable interest regions compared to a range of other interest point detectors. Note that, unlike [20], where a scale-invariant curvature is measured, we make use of the variation of the curvature to extract the specific scale.
The first step sets up several layers of different scales. Assume the diameter of the input point cloud is d. We choose one tenth of d as the largest scale and one sixtieth of d as the smallest scale. The intermediate scales are interpolated so that the ratios between consecutive scales are constant.

Next, for each point p and scale s, we calculate the principal curvature using the points that lie within s units of p. The calculation process is discussed in Section 4.1.2.

Finally, if the principal curvature value of point p at scale s is larger than the value of the same point p at adjacent scales and larger than the values of all points within one third of s units of p at scale s, meaning that the principal curvature reaches a local maximum across both scale and the local neighborhood of p, then p is added to the key point set with scale s. Note that the same point can appear at multiple scales.
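A sketch of this detector loop (Python/NumPy/SciPy, our structuring; the bounding-box diagonal stands in for the cloud diameter):

```python
import numpy as np
from scipy.spatial import cKDTree

def detect_mopc(points, curvature_at, n_scales=5):
    """MoPC key points: (index, scale) pairs where the principal curvature,
    given by curvature_at(i, s), peaks across scale and neighborhood."""
    d = np.linalg.norm(points.max(axis=0) - points.min(axis=0))  # diameter proxy
    scales = np.geomspace(d / 60.0, d / 10.0, n_scales)          # constant ratios
    curv = np.array([[curvature_at(i, s) for s in scales]
                     for i in range(len(points))])               # (N, n_scales)
    tree = cKDTree(points)
    keypoints = []
    for si, s in enumerate(scales):
        for i in range(len(points)):
            c = curv[i, si]
            # Maximum across adjacent scales ...
            if si > 0 and c <= curv[i, si - 1]:
                continue
            if si < n_scales - 1 and c <= curv[i, si + 1]:
                continue
            # ... and across the spatial neighborhood of radius s / 3.
            nbrs = tree.query_ball_point(points[i], r=s / 3.0)
            if all(c >= curv[j, si] for j in nbrs):
                keypoints.append((i, s))   # a point may appear at several scales
    return keypoints
```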
Figure 5 shows the feature points detected in the model.
5.2. Matching criteria
Since there can be multiple regions (and thus descriptors) that are similar to each other, we follow the Nearest Neighbor Distance Ratio (NNDR) method, i.e., we match a key point in cloud X to a key point in cloud Y if and only if

dist(x, y_1) / dist(x, y_2) < threshold,     (7)

where y_1 is the nearest neighbor and y_2 the second-nearest neighbor of x in point cloud Y (in the feature space). Balancing between the number of true positives and the number of false positives, the threshold is typically set to 0.75 in our experiments.

Figure 5. Detected salient features (highlighted) with the proposed multi-scale detector. Different sizes/colors of balls indicate different scales at which the key points are detected. These features turn out to be distinctive, repeatable and compact for matching.
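A sketch of the NNDR test of Equation (7) over descriptor sets, using a k-d tree in feature space and the 0.75 threshold quoted above:

```python
from scipy.spatial import cKDTree

def match_nndr(desc_x, desc_y, threshold=0.75):
    """Match each descriptor in X to Y when the nearest/second-nearest
    distance ratio of Eq. (7) is below the threshold."""
    tree = cKDTree(desc_y)
    dists, idx = tree.query(desc_x, k=2)   # two nearest neighbors per query
    return [(i, int(idx[i, 0])) for i in range(len(desc_x))
            if dists[i, 0] < threshold * dists[i, 1]]
```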
5.3. Outlier Removal
After a set of local matches is selected, we can perform outlier removal using global constraints. If it is known that there are only rigid-body and scale transformations, a 3D RANSAC algorithm is applied to determine the transformation that allows the maximum number of matches to fit. Figure 10(b) shows the filtered result for Fig. 10(a). In the future, variants of the L1 norm instead of the L2 norm could be considered as the penalty function, which has proven superior in optical flow methods.
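A sketch of such a RANSAC filter under a rigid-plus-scale (similarity) model; the closed-form estimate is the standard Umeyama/Procrustes solution, and the iteration count and inlier tolerance are illustrative values of ours.

```python
import numpy as np

def estimate_similarity(P, Q):
    """Least-squares similarity transform Q ~ s * R @ p + t (Umeyama)."""
    mp, mq = P.mean(0), Q.mean(0)
    Pc, Qc = P - mp, Q - mq
    U, S, Vt = np.linalg.svd(Qc.T @ Pc / len(P))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflections
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / Pc.var(0).sum()
    return s, R, mq - s * R @ mp

def ransac_filter(src, dst, n_iter=1000, tol=1.0):
    """Keep the matches consistent with the best similarity transform."""
    best = np.array([], dtype=int)
    rng = np.random.default_rng(0)
    for _ in range(n_iter):
        i = rng.choice(len(src), 3, replace=False)   # minimal sample
        s, R, t = estimate_similarity(src[i], dst[i])
        err = np.linalg.norm(dst - (s * src @ R.T + t), axis=1)  # L2 residual
        inliers = np.flatnonzero(err < tol)
        if len(inliers) > len(best):
            best = inliers
    return best
```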
6. Experimental Results and Evaluation
The proposed approach has been extensively tested and evaluated using various datasets, including both synthetic data from standard benchmarks and our own datasets, covering a wide variety of objects and conditions. We evaluated the effectiveness of our approach in terms of distinctiveness, robustness and invariance.
6.1. SHREC data
We use the SHREC feature descriptor benchmark [11] [12] and convert the mesh models to point clouds by keeping only the vertices for our test. This benchmark includes shapes with a variety of transformations such as holes, micro holes, scale, local scale, noise, shot noise, topology, sampling and rasterization (Fig. 6).
Tables 1 and 2 show the average normalized L2 error of the SS-Normal and SS-Curvature descriptors at corresponding points detected by MoPC. Note that only the transformations with clear ground truth are compared here, i.e., the isometry shape is used as the template for comparison. Given the sets of detected features F(X) and F(Y), the descriptor quality is calculated at corresponding points x_k (with descriptor f_k, k = 1, 2, . . . , |F(Y)|) and y_j (with descriptor g_j, j = 1, 2, . . . , |F(X)|):

d_kj = ||f_k − g_j||_2 / [ (1 / (|F(X)|² − |F(X)|)) · Σ_{k, j≠k} ||f_k − g_j||_2 ],     (8)

and then summed up using (note that |F(X)| = |F(Y)| when we only consider corresponding points):

Q = (1 / |F(X)|) · Σ_{k=1}^{|F(X)|} d_kj.     (9)

Transform.     Strength:  1     2     3     4     5
Holes                   0.11  0.22  0.33  0.45  0.52
Local scale             0.40  0.58  0.67  0.77  0.83
Sampling                0.31  0.46  0.57  0.67  0.81
Noise                   0.58  0.65  0.70  0.74  0.77
Shot noise              0.13  0.23  0.28  0.32  0.35
Average                 0.34  0.45  0.53  0.60  0.67
Table 1. Robustness of the Normal Self-Similarity descriptor based on features detected by MoPC (average L2 distance between descriptors at corresponding points). Average number of points: 518.

Transform.     Strength:  1     2     3     4     5
Holes                   0.10  0.21  0.31  0.42  0.49
Local scale             0.38  0.56  0.65  0.75  0.81
Sampling                0.44  0.55  0.63  0.71  0.87
Noise                   0.55  0.62  0.67  0.72  0.75
Shot noise              0.13  0.22  0.27  0.31  0.33
Average                 0.32  0.43  0.51  0.58  0.65
Table 2. Robustness of the Curvature Self-Similarity descriptor based on features detected by MoPC (average L2 distance between descriptors at corresponding points). Average number of points: 518.
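A sketch computing this quality measure for descriptor matrices F and G whose rows correspond (we read the corresponding pairs as the diagonal entries d_kk):

```python
import numpy as np

def descriptor_quality(F, G):
    """Average normalized L2 error of Eqs. (8)-(9); lower is better.
    F, G: (n, d) descriptor matrices, row k of G corresponding to row k of F."""
    n = len(F)
    D = np.linalg.norm(F[:, None, :] - G[None, :, :], axis=2)  # ||f_k - g_j||_2
    mean_noncorr = (D.sum() - np.trace(D)) / (n * n - n)  # Eq. (8) denominator
    return np.trace(D) / (n * mean_noncorr)               # Eq. (9)
```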
The results are competitive with the state-of-the-art methods compared in [11] [12].

For synthetic data under rotational and scale transformations, the proposed feature detector and descriptor achieve nearly fully correct matches, e.g., Fig. 7(a) and (b). Figure 7(c), (d) and (e) show the matching results between human point clouds from the SHREC'10 dataset [11] under affine transformation, with holes, and after rasterization. The feature extraction and descriptor calculation take about 1-2 minutes on typical data with around 50,000 points.
Another set of test data is from TOSCA High-resolution [16]. Figure 8(a) is a typical matching example using the dense self-similarity descriptor, and Fig. 8(b) is a matching example using the MoPC feature-based self-similarity descriptor. Figure 9(a) shows the precision-recall curve for the wolf data.
Figure 6. SHREC benchmark dataset. The transformations are (from left to right): isometry, holes, micro holes, scale, local scale, noise, shot noise, topology and sampling.
Figure 7. Matching results between human point clouds with rotation, scale, affine transformation, holes and rasterization.
Figure 8. (a) is the matching result with the dense SS descriptor on different poses of a cat. (b) is the matching result with the MoPC feature-based SS descriptor on different poses of a wolf.
6.2. LiDAR point clouds
Table 3 shows the comparison of different configurations of the similarity on 50 pairs of randomly sampled data in randomly chopped 300m × 300m × 50m areas of the Richard Corridor in Vancouver. The precision is the ratio between the number of correctly identified matches (TP) and all matches (TP + FP). The running time is about 15 s per pair.
We also performed experiments on different point clouds of the same region. Figure 9(b) shows the precision-recall curve for the 3D self-similarity descriptor on data chopped out of the LiDAR data (600m × 600m × 60m, 100,000 points) of the Richard Corridor in Vancouver.
Figure 9. (a) is the precision-recall curve for the 3D self-similarity descriptor between two wolf models from the TOSCA High-resolution data. (b) is the precision-recall curve for the 3D self-similarity descriptor on the Vancouver Richard Corridor data.
Property                        Precision
Normal                          55%
Curvature                       49%
Photometry                      49%
Normal+Curvature+Photometry     51%
Table 3. Evaluation of matching results with different configurations (weights) of the united self-similarity. Pure normal similarity performs best overall, but curvature/photometry similarity can do better on specific data.
Figure 10. Matching result of aerial LiDAR from two scans of the Richard Corridor area in Vancouver. (b) is the filtered result of (a).
In real applications there are tremendous amounts of data that may spread across large areas. Our framework can also deal with large-scale data in a divide-and-conquer fashion, since we only require local information to calculate the descriptors. Figure 10 shows the matching result of aerial LiDAR from two scans of the Richard Corridor area in Vancouver.
7. Conclusion
In this paper we have extended the 2D self-similarity descriptor to 3D space. The new feature-based 3D descriptor is invariant to scale and orientation changes. It achieves competitive results on 3D point cloud data from the public TOSCA dataset as well as on aerial LiDAR data. Since meshes or surfaces can be sampled and transformed into point cloud or voxel representations, the method can be easily adapted to matching models to point clouds, or models to models. We are currently working on matching propagation and a shape retrieval scheme based on our descriptor.
References
[1] E. Shechtman and M. Irani. Matching Local Self-Similarities
across Images and Videos. In Proc. CVPR, 2007.
[2] S. Belongie, J. Malik, and J. Puzicha. Shape Matching and Object Recognition using Shape Contexts. Trans. PAMI, 24(4):509-522, 2002.
[3] A. Johnson and M. Hebert. Object Recognition by Matching Oriented Points. In Proc. CVPR, Puerto Rico, USA, pages 684-689, 1997.
[4] H. Bay, T. Tuytelaars, and L. Van Gool. SURF: Speeded Up
Robust Features. In Proc. ECCV, pp. 404-417, 2006.
[5] P. Besl and N. McKay. A Method for Registration of 3-D
Shapes. Trans. PAMI, Vol. 14, No. 2, 1992.
[6] J. Knopp, M. Prasad, G. Willems, R. Timofte, and L. Van
Gool. Hough Transform and 3D SURF for Robust Three Di-
mensional Classification. In: ECCV 2010.
[7] J. Sun, M. Ovsjanikov, and L. Guibas. A Concise and Prov-
ably Informative Multi-scale Signature based on Heat Diffu-
sion. In: SGP. 2009.
[8] M. M. Bronstein and I. Kokkinos. Scale-Invariant Heat Kernel
Signatures for Non-rigid Shape Recognition. CVPR 2010.
[9] S. Salti, F. Tombari and L. Di Stefano. A Performance Evalu-
ation of 3D Keypoint Detectors. International Conference on
3DIMPVT, 2011.
[10] A. Mian, M. Bennamoun, and R. Owens. On the Repeata-
bility and Quality of Keypoints for Local Feature-based 3D
Object Retrieval from Cluttered Scenes. IJCV 2009.
[11] A. M. Bronstein, M. M. Bronstein, B. Bustos, U. Castellani, M. Crisani, B. Falcidieno, L. J. Guibas, I. Kokkinos, V. Murino, M. Ovsjanikov, G. Patanè, I. Sipiran, M. Spagnuolo, and J. Sun. SHREC 2010: Robust Feature Detection and Description Benchmark. Proc. EUROGRAPHICS Workshop on 3D Object Retrieval (3DOR), 2010.
[12] E. Boyer, A. M. Bronstein, M. M. Bronstein, B. Bustos, T.
Darom, R. Horaud, I. Hotz, Y. Keller, J. Keustermans, A.
Kovnatsky, R. Litman, J. Reininghaus, I. Sipiran, D. Smeets,
P. Suetens, D. Vandermeulen, A. Zaharescu, and V. Zobel.
SHREC 2011: Robust Feature Detection and Description
Benchmark. ArXiv e-prints, February 2011.
[13] R. B. Rusu, Z. C. Marton, N. Blodow, M. Dolha, and M.
Beetz. Towards 3D Point Cloud Based Object Maps for
Household environments. Robotics and Autonomous Systems
Journal (Special Issue on Semantic Knowledge), 2008.
[14] K. Chatfield, J. Philbin, and A. Zisserman. Efficient Retrieval
of Deformable Shape Classes using Local Self-Similarities. In
NORDIA Workshop at ICCV 2009, 2009.
[15] J. Maver. Self-Similarity and Points of Interest. Trans.
PAMI, Vol. 32, No. 7, pp. 1211-1226, 2010.
[16] A. M. Bronstein, M. M. Bronstein, and R. Kimmel. Nu-
merical geometry of non-rigid shapes. Springer, 2008. ISBN:
978-0-387-73300-5.
[17] J. Huang, S. You, and J. Zhao. Multimodal Image Matching
using Self-Similarity. 2011 IEEE Applied Imagery Pattern
Recognition Workshop, Washington DC, 2011.
[18] S. Ruiz-Correa, L. G. Shapiro, and M. Meilă. A New Signature-based Method for Efficient 3-D Object Recognition. In Proc. CVPR, 2001.
[19] M. Körtgen, G.-J. Park, M. Novotni, and R. Klein. 3D Shape Matching with 3D Shape Contexts. In The 7th Central European Seminar on Computer Graphics, April 2003.
[20] J. Rugis and R. Klette. A Scale Invariant Surface Curvature Estimator. In Proc. PSIVT, pages 138-147, 2006.