# An Algebraic Topological Method for Feature Identification.

## Abstract

We develop a mathematical framework for describing local features of a geometric object, such as the edges of a square or the apex of a cone, in terms of algebraic topological invariants. The main tool is the construction of a tangent complex for an arbitrary geometrical object, generalising the usual tangent bundle of a manifold. This framework can be used to develop algorithms for automatic feature location. We give several examples of applying such algorithms to geometric objects represented by point-cloud data sets.
Erik Carlsson, Gunnar Carlsson and Vin de Silva
August 12, 2003
1 Introduction
In attempting to recognize geometric objects, it is often very useful to ﬁrst recognize iden-
tiﬁable features of the object in question. For example, in correctly identifying a square a
natural ﬁrst step is to locate the corners; this information is enough to determine which
square we are dealing with. Similarly, if the object in question is a convex polyhedron, then
the vertices and edges of the polyhedron are the most important features to identify. In the
case of a cone, one looks for the cone point. It is an interesting problem theoretically and
computationally to construct automatic methods for locating such features.
In order to develop such methods, it is ﬁrst necessary to make mathematical sense of the
notion of “feature”. A reasonable starting point, based on the examples above, is to deﬁne
features as singular points of geometric curves, surfaces, etc. Accordingly, in this paper we
set ourselves the task of developing automatic methods for locating singular points on a
curve, surface, or higher dimensional geometric object.
A desirable feature of such methods is that they should be robust to deformation, to a
certain degree. For example, in optical character recognition, it is important that variously
deformed versions of a given character should be identiﬁed as being equivalent and having
equivalent features. The methods we develop here ought to be able to recognize the apex,
T-junctions and leg-ends of an upper-case letter “A”, even if that letter has been sheared,
or bent, or compressed in the vertical or horizontal direction. Another situation where
robustness is important occurs when an object is viewed in two diﬀerent coordinate systems.
The locus given by $\{(r, \theta) : 1 \le r \le 2,\ 0 \le \theta \le \pi/2\}$ looks like a rectangle in $(r, \theta)$-space,
but it looks like a sector of an annulus when viewed in rectangular coordinates. We develop
methods which detect properties of this locus which are invariant under such coordinate
changes.
Footnote: Supported in part by NSF DMS-0101364.
A typical method (see [3] or [5]) for dealing with such questions is to develop templates,
equipped with parameters, with the hope that the figure in question will be very close to a
template model, for some choice of the parameter values. For example, in the case of the
letter “A” above, one might have a template consisting of a standard letter “A”, together
with two parameters describing vertical and horizontal compression of the letter. This family
of templates may be adequate for a particular class of documents, but it would not be
adequate in documents where a “sheared” letter is permitted. Of course, a new parameter
can be added which describes the shear. To cover an even larger class of documents, perhaps
containing instances of “A” where some of the line segments deﬁning it are in fact curves,
yet more parameters are necessary. Clearly this can become unwieldy quite quickly.
By contrast, our approach uses algebraic topology to locate and identify relevant features
of objects without requiring the choice of templates, or of parametrized families of defor-
mations. Our method permits us to conclude the existence of a singular point, without
having to match it with any particular model of the particular singularity. For example, a
sharp bend (“corner”) in a curve can be recognized without having to match that region of
the curve to any particular pair of lines locally. The idea is to identify algebraic topological
invariants which can recognize a singular point, and which are by their nature deformation
invariant, instead of trying to match with a larger and larger family of templates.
These invariants automatically distinguish between diﬀerent kinds of singular points. For
example, if our underlying point set is a cube, the set of singular points consists of all the
edges on the cube; this includes the vertices, which are common to multiple edges. However,
we may wish to isolate the vertices directly. This can be done by adjusting a single parameter
in the search. We use a topological invariant referred to as the first Betti number $\beta_1$ in this
case, and setting $\beta_1 > 1$ finds all singular points, while setting $\beta_1 > 3$ will find only the
vertices.
A key consideration is in what form the geometric objects are presented. For instance,
if they are presented using ﬁnite systems of algebraic equations and inequalities, then it is
typically feasible to determine the collection of singular points explicitly. In this paper, we
will instead deal with point cloud data, i.e. finite but large sets of points sampled from a
geometric object in Euclidean space. Dealing with spaces presented in this form produces
computational challenges for us, since one must determine how to “estimate” the topological
invariants from a geometric object using only a ﬁnite sample from it.
1.1 Overview of the method
We now give an informal description of our method. An initial observation is that many
singular points are topologically standard. This means that there is a continuous, but not
smooth, change of coordinates which transforms the surface locally into a smooth model.
Since topological invariants are insensitive to such coordinate changes, this means that we
cannot apply topological invariants directly to the spaces in question to detect these features.
We are instead forced to consider constructions on the surface, which are sensitive to the
local smooth structure, and which produce spaces which can be distinguished by topological
methods. In this paper, we will develop an extension of the concept of the tangent bundle
to a smooth submanifold of $\mathbb{R}^n$, which applies to more general subsets. We will refer to
this construction as the tangent complex $T(X)$ of a subset $X \subset \mathbb{R}^n$; the tangent complex
is a subset of $X \times \mathbb{R}^n$. It is closely related to the notion of tangent cone used in geometric
measure theory (see [4]).
In many examples which are topologically standard, $T(X)$ nevertheless produces a space
which is topologically distinct from $T$ applied to a smooth submanifold. This will ultimately
permit us to detect singular points by finding regions in which the tangent complex is
homotopically non-standard. Here are some examples of how the construction behaves;
we will give a formal definition in the body of the paper.
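Example 1.1 When $X$ is a smooth submanifold of $\mathbb{R}^n$, $T(X)$ is the usual tangent unit sphere
bundle of $X$.
Contrast the following two examples.
Example 1.2 (Straight line.) If $X_1 = \mathbb{R} \times \{0\}$ is the $x$-axis in $\mathbb{R}^2$, then $T(X_1)$ is the union
of two components $\mathbb{R} \times \{0\} \times \{e_1\}$ and $\mathbb{R} \times \{0\} \times \{-e_1\}$. Here $e_1$ denotes the standard basis
vector $(1, 0) \in \mathbb{R}^2$.
Example 1.3 (L-shaped line.) Let $X_2 = \mathbb{R}_+ \times \{0\} \cup \{0\} \times \mathbb{R}_+$, where $\mathbb{R}_+$ denotes the set
of nonnegative reals $\{x : x \ge 0\}$. In this case, $T(X_2)$ is a disconnected union of four rays,
given by $\mathbb{R}_+ \times \{0\} \times \{e_1\}$, $\mathbb{R}_+ \times \{0\} \times \{-e_1\}$, $\{0\} \times \mathbb{R}_+ \times \{e_2\}$, and $\{0\} \times \mathbb{R}_+ \times \{-e_2\}$.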
The sets in Examples 1.2 and 1.3 are topologically equivalent to the real line, but their
tangent complexes fall into two and four connected components respectively. Thus we
distinguish $X_1$ and $X_2$ by simple topological invariants of $T(X_1)$ and $T(X_2)$, though the spaces
themselves are topologically indistinguishable. In fact for any smooth curve $C \subset \mathbb{R}^n$, the
tangent complex $T(C)$ is topologically equivalent to $T(X_1)$. In contrast, for a piecewise
smooth curve with $k$ tangent discontinuities, the tangent complex has $2k + 2$ connected
components. Example 1.3 simply illustrates the case $k = 1$.
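The component counts in Examples 1.2 and 1.3 can be checked on sampled data. The following is a minimal sketch, not the authors' implementation: it samples the two tangent complexes as points (x, y, vx, vy) in the plane times the unit circle, and counts connected components by single-linkage clustering with a union-find structure. The function names, the sampling step and the tolerance `tol` are all assumptions made for illustration.

```python
import math

def line_tangent_samples(n=20, step=0.1):
    """Samples of T(X1) for the x-axis: base point (s, 0) with direction +e1 or -e1."""
    return [(i * step, 0.0, sign, 0.0)
            for i in range(-n, n + 1) for sign in (1.0, -1.0)]

def l_shape_tangent_samples(n=20, step=0.1):
    """Samples of T(X2) for the L-shaped set: two rays meeting at the origin."""
    pts = []
    for i in range(n):
        s = i * step
        for sign in (1.0, -1.0):
            pts.append((s, 0.0, sign, 0.0))  # ray along the x-axis, directions +/- e1
            pts.append((0.0, s, 0.0, sign))  # ray along the y-axis, directions +/- e2
    return pts

def count_components(points, tol):
    """Single-linkage clustering: join any two samples at distance <= tol."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= tol:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})

print(count_components(line_tangent_samples(), 0.15))     # straight line
print(count_components(l_shape_tangent_samples(), 0.15))  # one corner, k = 1
```

The tolerance has to exceed the sampling step but stay well below the distance between distinct unit directions, which is why samples over the same base point with different directions are never merged.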
In these simple examples the existence of a corner, or the number of corners, can be
derived from the number of connected components of the tangent complex. This may be
computed exactly or up to some tolerance using a clustering algorithm ([7], pp. 453-480).
In higher-dimensional cases it may not be enough to count connected components, as we
see next.
Example 1.4 (One wall.) Let $X_3$ be the set $\{0\} \times \mathbb{R}^2$ in $\mathbb{R}^3$. Then the tangent complex $T(X_3)$
is connected and has the homotopy type of a circle.
Example 1.5 (Two walls meeting at a corner.) Let $X_4$ be the subset of $\mathbb{R}^3$ given by
$$X_4 = X_2 \times \mathbb{R} = \mathbb{R}_+ \times \{0\} \times \mathbb{R} \,\cup\, \{0\} \times \mathbb{R}_+ \times \mathbb{R}$$
In this case the tangent complex is connected, but has the homotopy type of a bouquet of three
circles.
Here the presence of singular points in $X_4$ along the subset $\{0\} \times \{0\} \times \mathbb{R}$ can be detected
using one-dimensional homology, which detects loops.
In this paper, we use these ideas as the basis for an algorithm to locate the singular set.
To give an idea of how the algorithm works, we consider the case of a curve in the plane.
The object is to locate any singular points. We suppose that we are dealing with a bounded
part of the curve contained in a square window, as in Figure 1.
Figure 1: A curve with a singular point
Figure 2: A divide-and-conquer strategy for locating the singular point
The first step in the algorithm is to compute the homology of the tangent complex for the
part of the curve contained in the window. If the homology agrees with the standard model
of a single smooth curve, then we stop looking for singular points. In this case the tangent
complex has four connected components (as in Example 1.3), which is non-standard.
The next step is to divide the window into four smaller windows and repeat the homology
calculation in each window (Figure 2, left panel). In this case, one of the windows is empty
and two of the windows contain a standard curve, and hence have standard homology. As
indicated by the shading, we discard these three windows and apply the algorithm recursively
on the single remaining non-standard window. Two further iterations of this process are
shown in the last two panels of Figure 2. The result is a nested sequence of windows converging
on the singular point. If there are several singular points, then the process will have several
active branches converging separately to the diﬀerent singular points.
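The recursion described above can be sketched as follows for a planar curve with one corner. This is an illustrative sketch under simplifying assumptions, not the paper's implementation: the homology test is replaced by a stand-in that counts connected components of the sampled tangent complex inside a window (0 for an empty window, 2 for a single smooth arc, more than 2 otherwise), and the tangent directions are taken as known rather than estimated. All function names and parameters are hypothetical.

```python
import math

def count_components(points, tol):
    """Single-linkage component count (union-find) at distance threshold tol."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= tol:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})

def corner_samples(corner=(0.3, 0.2), step=0.01, n_x=70, n_y=80):
    """Tangent-complex samples (x, y, vx, vy) of an L-shaped curve inside [0, 1)^2."""
    cx, cy = corner
    samples = []
    for sign in (1.0, -1.0):
        samples += [(cx + i * step, cy, sign, 0.0) for i in range(n_x)]  # horizontal ray
        samples += [(cx, cy + i * step, 0.0, sign) for i in range(n_y)]  # vertical ray
    return samples

def locate_singular_windows(samples, window, depth, tol):
    """Return the non-standard windows that remain after `depth` subdivisions."""
    x0, y0, x1, y1 = window
    inside = [p for p in samples if x0 <= p[0] < x1 and y0 <= p[1] < y1]
    if count_components(inside, tol) <= 2:
        return []            # empty window or a single smooth arc: discard
    if depth == 0:
        return [window]      # smallest window still flagged as non-standard
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    quadrants = [(x0, y0, xm, ym), (xm, y0, x1, ym),
                 (x0, ym, xm, y1), (xm, ym, x1, y1)]
    found = []
    for q in quadrants:
        found.extend(locate_singular_windows(samples, q, depth - 1, tol))
    return found

print(locate_singular_windows(corner_samples(), (0.0, 0.0, 1.0, 1.0), 3, 0.015))
```

The stand-in test would also flag a window containing two separate smooth arcs, so it is only adequate for this toy input; the method in the text compares homology against the standard model instead.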
When implementing this algorithm in practice, we need to take account of the fact that
we are dealing with point cloud data. This presents two challenges:
• How do we recover homology from a space represented as point cloud data?
• How do we reconstruct a discrete tangent complex from point cloud data, when it
  depends on limiting information concerning the underlying space?
In this paper we have taken a straightforward approach to the ﬁrst question. Given
a point cloud space we build a simplicial complex approximation called the Rips complex
which depends on a choice of length scale and which has a vertex for every data point
considered. Given a simplicial complex, the homology calculation is straightforward linear
algebra. The Rips complex is simple to implement but not particularly eﬃcient; it suﬃces
for the examples given here. A more sophisticated approach is the synthetic Delaunay
triangulation developed in [1].
Figure 3: Example spaces with easily-computed homology (panels (a), (b) and (c))
We reconstruct the tangent complex by using local principal components analysis at a
small number of base points in the complex to obtain an approximation to the tangent
space at these points; then we sample the unit spheres in these tangent spaces uniformly to
obtain a point cloud in $\mathbb{R}^n \times S^{n-1}$. The resulting point cloud space is amenable to the Rips
complex construction, and the homology of the tangent complex can be recovered reliably
given sufficient data.
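As a rough illustration of the local principal components step, the sketch below handles only the simplest case, a curve in the plane, where the tangent space is a line and its unit sphere consists of two opposite directions. It is an assumption-laden sketch rather than the authors' code: the neighborhood radius, the function names, and the closed-form principal axis of the 2x2 scatter matrix are choices made here for brevity.

```python
import math

def local_tangent_direction(points, base, radius):
    """Principal axis of the neighbors of `base` within `radius` (planar PCA).

    Assumes at least two distinct neighbors lie within the radius.
    """
    nbrs = [p for p in points if math.dist(p, base) <= radius]
    n = len(nbrs)
    mx = sum(p[0] for p in nbrs) / n
    my = sum(p[1] for p in nbrs) / n
    sxx = sum((p[0] - mx) ** 2 for p in nbrs)
    syy = sum((p[1] - my) ** 2 for p in nbrs)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in nbrs)
    # Angle of the eigenvector with the largest eigenvalue of [[sxx, sxy], [sxy, syy]].
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))

def tangent_complex_samples(points, base_points, radius):
    """Attach both unit tangent directions +/- v to each base point."""
    samples = []
    for b in base_points:
        vx, vy = local_tangent_direction(points, b, radius)
        samples.append((b[0], b[1], vx, vy))
        samples.append((b[0], b[1], -vx, -vy))
    return samples

# Points on the line y = 2x: the estimated direction should be proportional to (1, 2).
cloud = [(0.1 * i, 0.2 * i) for i in range(-10, 11)]
print(local_tangent_direction(cloud, (0.0, 0.0), 0.5))
```

In higher dimensions one would instead keep the leading eigenvectors of the local covariance matrix and sample the unit sphere they span.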
2 Homological Preliminaries
In this section, we will discuss the properties of homology groups we will need. The reader
is encouraged to consult a standard text such as [6] or [8] for a more detailed exposition of
these ideas.
Homology is a technique for assigning, to every topological space $X$ and nonnegative
integer $n$, a vector space $H_n(X)$. We will deal exclusively with “mod 2 homology”, in
which case these are vector spaces over the finite field $\mathbb{F}_2 = \{0, 1\}$. The dimension of this
vector space is referred to as the $n$-th Betti number of $X$ with mod 2 coefficients, and will
be written $\beta_n(X)$. In an informal sense, the $n$-th Betti number of $X$ measures the number
of $n$-dimensional holes in the space $X$.
Example 2.1 Suppose that $X = S^1$ is the unit circle in the plane. Then $H_1(X) \cong \mathbb{F}_2$, so
$\beta_1(X) = 1$. This represents the one dimensional hole “in the middle of the circle”.
Example 2.2 Suppose that $X$ is a bouquet of two circles, as shown in Figure 3(b). In this
case, $\beta_1(X) = 2$, representing two distinct one dimensional holes.
Example 2.3 Suppose that $X = S^2$, the unit sphere in 3-space. Then $\beta_2(X) = 1$, measuring
the two dimensional hole in the sphere. More generally, we have that $\beta_i(S^n) = 0$ when $i \neq 0, n$
and $\beta_i(S^n) = 1$ for $i = 0, n$.
Example 2.4 Suppose that $X$ consists of $k$ distinct points. Then $\beta_0(X) = k$. In general,
$\beta_0$ measures the number of path components of $X$.
The homology groups have the following properties.
• $H_n$ is functorial, i.e. every continuous map $f : X \to Y$ induces a linear transformation
  $H_n(f) : H_n(X) \to H_n(Y)$ for all $n$.
• $H_n$ is homotopy invariant, i.e. if two maps $f, g : X \to Y$ are homotopic, then the induced
  linear transformations $H_n(f)$ and $H_n(g)$ are equal. This is an extremely important
  property of these linear transformations. We say two spaces $X$ and $Y$ are homotopy
  equivalent if there are maps $f : X \to Y$ and $g : Y \to X$ so that $fg$ is homotopic to $\mathrm{id}_Y$
  and $gf$ is homotopic to $\mathrm{id}_X$. The homotopy property for $H_n$ implies that if $X$ and $Y$
  are homotopy equivalent, then $H_n(X)$ and $H_n(Y)$ are isomorphic, and in particular
  $\beta_n(X) = \beta_n(Y)$.
The phrase “can be deformed into” is loosely synonymous with “is homotopy equivalent to”,
and conveys roughly the right idea.
Example 2.5 The circle in Figure 3(a) and the annulus in Figure 3(b) are homotopy equiv-
alent and so have the same Betti numbers.
When a space is broken up as the union of diﬀerent pieces, the homology can be com-
puted from the homology of the pieces and all possible overlaps of these pieces, using
Mayer–Vietoris techniques ([6], [8]).
When a space is described as a simplicial complex, the computation of homology reduces
to straightforward linear algebra over the field $\mathbb{F}_2$. A simplicial complex is a subspace
of $\mathbb{R}^n$ expressed as a union of simplices which overlap in faces, i.e. the intersection of any
pair of simplices is a face of each of the two simplices. Such a space is determined up to
homeomorphism by simple combinatorial data.
Definition 2.6 By an abstract simplicial complex, we will mean a pair $(V, \Sigma)$, where $V$ is
a finite set whose objects are referred to as vertices, and where $\Sigma$ is a collection of subsets
of $V$, so that if $\sigma \in \Sigma$, and $\sigma \supseteq \tau$, then $\tau \in \Sigma$. The elements of $\Sigma$ are referred to as faces.
If a face $\tau \in \Sigma$ consists of exactly $k + 1$ elements of $V$ then we say that $\tau = \{v_0, v_1, \ldots, v_k\}$
is a $k$-simplex of $\Sigma$ with vertices $v_0, v_1, \ldots, v_k$.
Any simplicial complex $S$ determines an abstract simplicial complex as follows. Let $V$ be
the set of vertices of $S$, and let $\Sigma$ consist of those sets of vertices $\tau = \{v_0, v_1, \ldots, v_k\}$ which
span a simplex in $S$. Conversely, we can recover the topological type of $S$ from the abstract
simplicial complex by taking a simplex for each face of $\Sigma$ and gluing these simplices together
appropriately.
The homology of a simplicial complex Sis computed from the abstract simplicial complex
associated to it. The idea is to set up a chain complex, which is a sequence of vector spaces
and linear maps between them:
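$$0 \xleftarrow{\;\partial_0\;} C_0 \xleftarrow{\;\partial_1\;} C_1 \longleftarrow \cdots \longleftarrow C_{k-1} \xleftarrow{\;\partial_k\;} C_k \longleftarrow \cdots$$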
Each $C_k$ is a vector space over the field $\mathbb{F}_2$ with a basis vector $\bar\tau$ for each $k$-simplex $\tau \in \Sigma$.
The linear map $\partial_k$ is known as the boundary operator and is defined as follows. First
choose an ordering of the vertex set $V$. Writing $\tau = \{v_0, v_1, \ldots, v_k\}$ with the vertices listed
in increasing order, we define the $j$-th face of $\tau$ to be the $(k-1)$-simplex $\tau_j$ obtained by
deleting the vertex $v_j$ from the list. Then $\partial_k$ is defined to be the linear map defined by
$$\partial_k \bar\tau = \sum_{j=0}^{k} (-1)^j \bar\tau_j$$
on basis vectors $\bar\tau$, and extended by linearity to all of $C_k$. [Note: the $(-1)^j$ terms shown here
are necessary in general, but in our case they happen to be redundant since we are working
over $\mathbb{F}_2$.]
Figure 4: Chains and cycles (panels (a), (b) and (c))
Figure 5: A boundary cycle (panels (a) and (b))
If $\sigma \subset \tau$ is a $(k-2)$-simplex, then it is a face of a $(k-1)$-face of $\tau$ in exactly two different
ways. Using this observation it can be shown that $\partial_{k-1} \partial_k = 0$ for all $k$. In other words, the
boundary of a boundary is always zero. Let $Z_k \subseteq C_k$ denote the null space of the operator $\partial_k$,
and let $B_k \subseteq C_k$ denote the image of the operator $\partial_{k+1}$. It follows that $B_k \subseteq Z_k$, and we
define the $k$-th homology group $H_k$ to be the quotient vector space $Z_k / B_k$. The structure
of $H_k$ can therefore be expressed in terms of matrix calculations over the field $\mathbb{F}_2$.
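To make the matrix calculation concrete, here is a minimal sketch of computing mod 2 Betti numbers from a list of maximal simplices. It is an illustration written for this text, not the authors' code: each row of a boundary matrix is stored as a Python integer bitmask, ranks are found by Gaussian elimination over the two-element field, and the Betti number in dimension k is obtained as dim C_k minus the ranks of the boundary operators in dimensions k and k+1. The function names are hypothetical.

```python
from itertools import combinations

def rank_f2(rows):
    """Rank over F2 of a matrix whose rows are given as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit of the pivot row
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def betti_numbers(maximal_simplices):
    """Mod 2 Betti numbers [b_0, b_1, ...] of the complex generated by the given simplices."""
    simplices = set()
    for s in maximal_simplices:
        s = tuple(sorted(s))
        for size in range(1, len(s) + 1):
            simplices.update(combinations(s, size))  # close under taking faces
    top = max(len(s) for s in simplices) - 1
    by_dim = [sorted(s for s in simplices if len(s) == k + 1) for k in range(top + 1)]

    ranks = [0] * (top + 2)  # ranks[k] = rank of the boundary map C_k -> C_{k-1}
    for k in range(1, top + 1):
        index = {s: i for i, s in enumerate(by_dim[k - 1])}
        rows = []
        for s in by_dim[k]:
            mask = 0
            for j in range(len(s)):
                mask |= 1 << index[s[:j] + s[j + 1:]]  # j-th face: delete vertex j
            rows.append(mask)
        ranks[k] = rank_f2(rows)

    # b_k = dim Z_k - dim B_k = (dim C_k - rank d_k) - rank d_{k+1}
    return [len(by_dim[k]) - ranks[k] - ranks[k + 1] for k in range(top + 1)]

print(betti_numbers([(0, 1), (1, 2), (0, 2)]))                   # hollow triangle
print(betti_numbers([(0, 1, 2)]))                                # filled triangle
print(betti_numbers([(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]))  # boundary of a tetrahedron
```

The hollow triangle is a combinatorial circle and the tetrahedron boundary a combinatorial 2-sphere, so their Betti numbers can be compared with Examples 2.1 and 2.3.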
2.1 Chains, cycles and boundaries
It may be helpful to give some examples of how the definition $H_k = Z_k / B_k$ works in practice.
We introduce the language of chains, cycles and boundaries.
A $k$-chain is an element of the $\mathbb{F}_2$ vector space $C_k$ derived from a simplicial complex $S$.
There is a coefficient, 0 or 1, for each $k$-simplex of $S$; thus we can regard a $k$-chain simply as
a set of $k$-simplices, by picking out those simplices with coefficient 1. A $k$-cycle is an element
of $Z_k$; in other words a $k$-chain whose boundary is zero (empty). Finally a $k$-boundary is a
$k$-chain which is the boundary of some $(k + 1)$-chain. Every $k$-boundary is automatically a
$k$-cycle; this is equivalent to the assertion $\partial\partial = 0$. The homology $H_k$ is defined to be the
space of $k$-cycles modulo all the uninteresting $k$-cycles that can be created cheaply by taking the
boundary of some $(k + 1)$-chain.
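As a small worked instance of the statement that every boundary is a cycle, take a single 2-simplex with vertices $v_0, v_1, v_2$ and coefficients in the two-element field. The bracket notation for basis vectors is introduced here only for this illustration.

```latex
\[
\partial_2 [v_0, v_1, v_2] = [v_1, v_2] + [v_0, v_2] + [v_0, v_1]
\]
\[
\partial_1 \partial_2 [v_0, v_1, v_2]
  = ([v_1] + [v_2]) + ([v_0] + [v_2]) + ([v_0] + [v_1])
  = 2\,([v_0] + [v_1] + [v_2]) = 0 \pmod 2
\]
```

Each vertex appears in exactly two of the three edges, which is the case $k = 2$ of the counting argument behind the identity that the boundary of a boundary vanishes.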
In Figure 4(a), a typical 1-chain is shown highlighted in red. It is not a 1-cycle, since
its boundary 0-chain is nonempty (Figure 4(b)). Figure 4(c) shows a 1-cycle. This is not
the boundary of any 2-chain, so it corresponds to a genuine non-zero element of $H_1$. On
the other hand, the cycle in Figure 5(a) is the boundary of the 2-chain shown in Figure 5(b),
and so it is zero in homology.
3 Point Cloud Data
We have seen that homology is readily computable for spaces which are equipped with a
triangulation, i.e. a homeomorphism to a simplicial complex. The geometric objects we
will deal with will rarely come equipped with such a structure. In fact, we will be trying
to recover topological information about a geometric object from point cloud data obtained
from the space, by which we mean a ﬁnite set of points sampled from the object. In order
to make calculations, this means that we must somehow construct a simplicial complex from
the point cloud data, which we believe approximates the space in question.
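The idea is as follows. Let $X$ be a topological space, and suppose we have a finite covering
$\mathcal{U} = \{U_\alpha\}_{\alpha \in A}$ of $X$ indexed by a set $A$.
Definition 3.1 The Cech complex of $\mathcal{U}$, $C(\mathcal{U})$, is the simplicial complex whose vertex set
is $A$, and where a subset $\{\alpha_0, \alpha_1, \ldots, \alpha_k\}$ is a simplex if and only if
$$U_{\alpha_0} \cap U_{\alpha_1} \cap \ldots \cap U_{\alpha_k} \neq \emptyset$$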
It is frequently the case that the Cech complex of the covering $\mathcal{U}$ is homotopy equivalent
to $X$, and therefore has homology isomorphic to that of $X$. For example, if all sets of the
form
$$U_{\alpha_0} \cap U_{\alpha_1} \cap \ldots \cap U_{\alpha_k}$$
are either empty or contractible, then $C(\mathcal{U})$ is homotopy equivalent to $X$. For any Riemannian
manifold $M$, there is an $\epsilon$ so that if $\{x_1, \ldots, x_N\}$ has the property that the balls $B_\epsilon(x_i)$
cover $M$, then the Cech complex of the covering $\{B_\epsilon(x_1), \ldots, B_\epsilon(x_N)\}$ is homotopy equivalent
to $M$.
If $S$ is a finite subset of a metric space, we write $C_\epsilon(S)$ to mean $C(\mathcal{B}_\epsilon)$, where $\mathcal{B}_\epsilon$ is the
collection of metric balls $\{B_\epsilon(s) : s \in S\}$. In the case of Euclidean data there is the following
approximation theorem.
Theorem 3.2 If $S \subset \mathbb{R}^n$ is a finite set of points in Euclidean space, then $C_\epsilon(S)$ is homotopy
equivalent to the space:
$$S_\epsilon = \bigcup_{s \in S} B_\epsilon(s)$$
When $S$ is sampled from a space $X \subset \mathbb{R}^n$, it may well be the case that the union of balls $S_\epsilon$
covers and is homotopy equivalent to $X$. If so, then this theorem implies that $C_\epsilon(S)$ has the
same homology as $X$.
There is a second complex we can construct to approximate the homotopy type of a space
which is equipped with a metric.
Definition 3.3 Suppose that $X$ is a metric space, with metric $d$. For any finite subset $S$
of $X$, and any $\epsilon > 0$, we define the $\epsilon$-Rips complex of the subset $S$ to be the abstract simplicial
complex whose vertex set is $S$, and where a subset $\{s_0, s_1, \ldots, s_k\}$ is a simplex if and only if
$d(s_i, s_j) \le \epsilon$ for all $i, j$ so that $0 \le i, j \le k$. We write $R_\epsilon(S)$ for this complex.
Suppose again that $X$ is a metric space, and that $S$ is a finite subset so that
$$\bigcup_{s \in S} B_\epsilon(s) = X$$
We have an evident inclusion $C_{\epsilon/2}(S) \subseteq R_\epsilon(S)$: the vertex sets of the two complexes are
the same, and it follows from the triangle inequality that if $B_{\epsilon/2}(s_1) \cap B_{\epsilon/2}(s_2) \neq \emptyset$, then
$d(s_1, s_2) \le \epsilon$. If we are dealing with points in $\mathbb{R}^n$, there is also an inclusion $R_\epsilon(S) \subseteq C_\epsilon(S)$,
as one can readily check. This comparability suggests that both complexes can be useful in
approximating homotopy types.
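The Rips construction of Definition 3.3 is simple enough to sketch directly. The code below is a brute-force illustration for small point sets, not an efficient or authoritative implementation: it records which pairs of points lie within the length scale and then keeps every vertex subset, up to a chosen dimension, all of whose pairs are close. The function name and the `max_dim` cutoff are assumptions made here.

```python
import math
from itertools import combinations

def rips_complex(points, eps, max_dim=2):
    """Simplices (as tuples of point indices) of the eps-Rips complex up to max_dim."""
    n = len(points)
    close = [[math.dist(points[i], points[j]) <= eps for j in range(n)]
             for i in range(n)]
    simplices = [(i,) for i in range(n)]  # every data point is a vertex
    for k in range(1, max_dim + 1):
        for combo in combinations(range(n), k + 1):
            # A k-simplex is present exactly when all of its edges are present.
            if all(close[i][j] for i, j in combinations(combo, 2)):
                simplices.append(combo)
    return simplices

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
for eps in (1.1, 1.5):
    cx = rips_complex(square, eps)
    edges = sum(1 for s in cx if len(s) == 2)
    triangles = sum(1 for s in cx if len(s) == 3)
    print(eps, edges, triangles)
```

For the four corners of a unit square, a length scale between 1 and the diagonal length gives a hollow square with one loop, while a scale above the diagonal length fills in every triangle and the loop disappears.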
Figure 6: ‘Holes’ due to uneven sampling lead to incorrect homology.
Remark. The two complexes have diﬀerent useful properties. The Cech complex is theoret-
ically amenable in that there are results (such as Theorem 3.2 above) which establish that
under certain conditions the homotopy type of the Cech complex of a covering of Xis the
same as that of X. However the Cech complex is computationally more involved, since one
needs to determine for every collection of metric balls whether they have a common intersec-
tion. This is a slightly awkward calculation even in Euclidean space. The Rips complex, on
the other hand, does not have such good theoretical properties, but is computationally more
convenient, since one only needs to identify the 1-simplices (edges), which then determine
the rest of the complex.
3.1 Uneven sampling, and persistent homology
In spite of the theorems alluded to above, in practice it is unusual for the Cech complex to
exactly recover the homotopy type of the underlying space X. The usual problem is that
our sampling from the geometric object may not be adequate.
To see how this happens, consider Figure 6. Here we suppose that we have obtained
point cloud data by sampling from an annulus, which has the homotopy type of a circle.
However, the sampling is not completely uniform. The blue shaded region in (a) represents
the cloud of sampled points, with the white holes representing subregions where there are no
sample points. Each of the holes which is entirely contained in the shaded region will create
a new generator in homology, so when we compute the homology of the Cech complex, for
a suitable small value of $\epsilon$, we find that rank $H_1(C_\epsilon) = 4$ instead of the desired rank of 1.
The simple solution, making $\epsilon$ bigger, is not always as helpful as it seems; see Figure 6(b).
Here we have thickened the data cloud (green region), to represent the effect of choosing a
larger value $\epsilon' > \epsilon$. Although we have successfully closed the three small holes in (a), a new
hole has formed and this time rank $H_1(C_{\epsilon'}) = 2$, again not the desired value.
This phenomenon suggests that instead of computing homology for a Cech or Rips complex
for a single value of $\epsilon$, we instead compute homology for several values of $\epsilon$, and consider
the image of the homomorphism
$$H_i(R_\epsilon(X)) \to H_i(R_{\epsilon'}(X))$$
for $\epsilon < \epsilon'$. This construction is known as persistent homology because it picks out those
homology classes already existing in $C_\epsilon$ which persist when we move to the larger complex $C_{\epsilon'}$.
Equivalently, we consider the $k$-cycles of $C_\epsilon$ modulo those which can be expressed as
boundaries of $(k + 1)$-chains in $C_{\epsilon'}$. In the example of Figure 6, a 1-cycle encircling any
of the three small holes in $C_\epsilon$ becomes the boundary of a 2-chain when we move to $C_{\epsilon'}$.
On the other hand, the newly-created hole in $C_{\epsilon'}$ does not correspond to any 1-cycle in $C_\epsilon$
itself. Thus the persistent homology with respect to $\epsilon, \epsilon'$ detects only a single nontrivial
1-dimensional homology class, coming from the obvious cycle which encircles the annulus.
This is the approach we adopt. We will select diﬀerent length scales for our complexes,
which we believe will be of the right scale to capture the features we are interested in, and
so that any spurious classes vanish under passage to the longer length scale.
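The two descriptions of persistent homology given above, and the count for Figure 6, can be summarised in one display. Here $\epsilon < \epsilon'$ are the two length scales, and the symbol $H_k^{\epsilon,\epsilon'}$ is shorthand introduced only for this summary.

```latex
\[
H_k^{\epsilon,\epsilon'}
  = \operatorname{im}\bigl( H_k(C_\epsilon) \to H_k(C_{\epsilon'}) \bigr)
  \cong \frac{Z_k(C_\epsilon)}{B_k(C_{\epsilon'}) \cap Z_k(C_\epsilon)}
\]
\[
\operatorname{rank} H_1(C_\epsilon) = 4, \qquad
\operatorname{rank} H_1(C_{\epsilon'}) = 2, \qquad
\operatorname{rank} H_1^{\epsilon,\epsilon'} = 4 - 3 = 1
\]
```

Of the four classes present at the smaller scale, the three around the small holes become boundaries at the larger scale, and the extra class at the larger scale is not in the image, which leaves the single class around the annulus.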
Note: The idea of considering homology for Cech complexes of varying length scales and
deﬁning persistent homology groups was introduced by H. Edelsbrunner in [2]. An eﬀective
algorithm for simultaneously computing all the persistent homology groups over an interval
range of values for $\epsilon, \epsilon'$ is given in [2].
4 The Tangent Complex
In this section, we will consider subsets $X$ of Euclidean space $\mathbb{R}^n$, which in many cases
are contractible, but which nevertheless carry features which we would intuitively regard as
qualitative. The idea of this section is that it is possible to make a construction on X, whose
homotopy type is sensitive to non-smooth features in X.
Definition 4.1 Let $X \subset \mathbb{R}^n$. We define the open tangent complex to $X$, $T^0(X)$, to be the
subset of $X \times S^{n-1}$ defined by
$$T^0(X) = \left\{ (x, v) : \lim_{t \to 0^+} \frac{d(x + tv, X)}{t} = 0 \right\}$$
where $d(\xi, X)$ denotes $\inf_{x \in X} d(\xi, x)$. We define the closed tangent complex $T(X)$ to be the
closure of $T^0(X)$ in $X \times S^{n-1}$.
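The limit condition in Definition 4.1 can be probed numerically for the L-shaped set formed by the two nonnegative coordinate axes in the plane. The sketch below is only a heuristic check under stated assumptions: the distance to the set is computed exactly, but the limit is approximated by evaluating the ratio at a few small values of t, and the function names, the list of t values and the tolerance are all choices made here.

```python
import math

def dist_to_L(p):
    """Exact distance from p = (a, b) to the union of the nonnegative x- and y-axes."""
    a, b = p
    d_x_ray = abs(b) if a >= 0 else math.hypot(a, b)  # nearest point on {(s, 0): s >= 0}
    d_y_ray = abs(a) if b >= 0 else math.hypot(a, b)  # nearest point on {(0, s): s >= 0}
    return min(d_x_ray, d_y_ray)

def in_open_tangent_complex(x, v, ts=(1e-2, 1e-3, 1e-4), tol=1e-6):
    """Heuristic test of d(x + t v, X) / t -> 0, using a few small values of t."""
    return all(dist_to_L((x[0] + t * v[0], x[1] + t * v[1])) / t <= tol for t in ts)

origin, on_x_axis = (0.0, 0.0), (1.0, 0.0)
print(in_open_tangent_complex(origin, (1.0, 0.0)))      # direction +e1 at the corner
print(in_open_tangent_complex(origin, (-1.0, 0.0)))     # direction -e1 at the corner
print(in_open_tangent_complex(on_x_axis, (-1.0, 0.0)))  # direction -e1 at a smooth point
```

In this sketch the direction pointing backwards along the x-axis fails the limit test at the corner but passes it at every nearby point of the x-axis, so it enters the fiber over the corner only after taking the closure.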
Note first that $T(X)$ comes equipped with a projection $p : T(X) \to X$. For any $x \in X$, we
will denote by $T_x(X)$ the fiber at $x$, i.e. $p^{-1}(x)$. There is also the projection $q : T(X) \to S^{n-1}$.
We have the following two useful propositions concerning this construction.
Proposition 4.2 Suppose that $x \in X$ is a smooth point of $X$, i.e. so that there is a
neighborhood $U$ of $x$ in $\mathbb{R}^n$, and a smooth function $f : U \to \mathbb{R}^m$, so that
• $U \cap X = f^{-1}(0)$
• $Df(\xi)$ has rank $m$ for every $\xi$ in $U$
Then $T_x(X) \cong S^{n-m-1}$.
Example 4.3 Let $L$ be a line in the $xy$-plane, given by the equation $\xi \cdot (x - x_0) = 0$, for
vectors $\xi$ and $x_0$. Then we have $q(T(L)) = \{\pm\eta\}$, where $\eta$ is a unit vector perpendicular
to $\xi$, and
$$T(L) \cong L \times \{\pm\eta\}$$
More generally, let $W \subset \mathbb{R}^n$ be the hyperplane determined by the equation $\xi \cdot (x - x_0) = 0$,
where $\xi$ and $x_0$ are $n$-vectors. Then $T(W) \cong W \times S(\xi)$, where $S(\xi)$ denotes the unit sphere
in the plane of vectors perpendicular to $\xi$. This result holds with $W$ replaced by any halfplane
It is typically easy to work directly with the deﬁnition of the tangent complex in the case
of one-dimensional objects in the plane.
Example 4.4 Consider the example in the introduction, with $X \subset \mathbb{R}^2$, $X = \mathbb{R}_+ \times \{0\} \cup \{0\} \times \mathbb{R}_+$.
We evaluate the fibers $T_x(X)$ directly. For any smooth point $x$, the fiber will consist
of two distinct points, i.e. a zero dimensional sphere $S^0$. For points along the $x$-axis, the
two points will be $(x, (1, 0))$ and $(x, (-1, 0))$, and along the $y$-axis, they will be $(x, (0, 1))$
and $(x, (0, -1))$. At the origin, though, the fiber $T_{(0,0)}(X)$ consists of four points, namely
$((0, 0), (\pm 1, 0))$ and $((0, 0), (0, \pm 1))$. We can easily verify that the tangent complex is actually
the union of two pieces, one from the tangent complex of $\mathbb{R}_+ \times \{0\}$ and the other from the
tangent complex of $\{0\} \times \mathbb{R}_+$:
$$T(\mathbb{R}_+ \times \{0\}) = (\mathbb{R}_+ \times \{0\}) \times \{\pm e_1\}$$
and
$$T(\{0\} \times \mathbb{R}_+) = (\{0\} \times \mathbb{R}_+) \times \{\pm e_2\}$$
Thus $T(X)$ is equal to:
$$(\mathbb{R}_+ \times \{0\}) \times \{\pm e_1\} \,\cup\, (\{0\} \times \mathbb{R}_+) \times \{\pm e_2\}$$
It is easy to see that this space is a disjoint union of four distinct half lines.
Example 4.5 Let $X \subset \mathbb{R}^3$ be the boundary of the positive octant, i.e.
$$X = \{(x, y, z) : x, y, z \ge 0, \text{ and one of } x, y, z \text{ is equal to zero}\}$$
In this case, $X$ is a union of three pieces, namely the intersections of $X$ with the three
coordinate planes. Denote the intersection of $X$ with the $xy$-plane by $X_{xy}$, and let $X_{yz}$ and $X_{xz}$
be the other intersections. Each of these intersections is a quadrant in the corresponding
coordinate plane. From the previous example, we find that $T(X_{xy}) \cong X_{xy} \times S^1_{xy}$, where $S^1_{xy}$
denotes the unit circle in the $xy$-plane. There are similar descriptions for each of the other
coordinate planes. If we now examine the fibers of the projection $T(X) \to X$, we find the
following.
• For any smooth point $v$ (i.e. any point of $X$ which does not lie on a coordinate axis),
  the fiber $T_v(X)$ is a circle.
• For any point $v$ which lies on a coordinate axis, but which is not equal to the “cone
  point” $(0, 0, 0)$, $T_v(X)$ is the union of two circles which overlap at a pair of antipodal
  points.
• For the cone point, we have $T_{(0,0,0)}(X)$ is homeomorphic to the union of three circles
  which pairwise overlap at pairs of antipodal points.
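The first Betti numbers of these three kinds of fiber can be worked out by viewing each fiber as a connected graph, with the overlap points as vertices and the circular arcs between them as edges, and using the formula $\beta_1 = E - V + 1$ for a connected graph with $V$ vertices and $E$ edges. This computation is added here as an illustration and assumes the three pairs of overlap points at the cone point are distinct.

```latex
\[
\text{one circle:} \quad \beta_1 = 1
\]
\[
\text{two circles meeting in 2 points:} \quad V = 2,\ E = 4,\ \beta_1 = 4 - 2 + 1 = 3
\]
\[
\text{three circles meeting pairwise in 2 points:} \quad V = 6,\ E = 12,\ \beta_1 = 12 - 6 + 1 = 7
\]
```

These values are consistent with the thresholds quoted in the introduction for a cube: requiring a first Betti number above 1 selects edges and vertices, while requiring it to exceed 3 selects vertices only.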
In order to analyze some higher dimensional examples, we will give a result which analyzes
the effect of taking the product of a set in $\mathbb{R}^n$ with a copy of $\mathbb{R}$. We first recall the notion of
the join.
Definition 4.6 Let $X \subseteq S^{n-1} \subset \mathbb{R}^n$. By the join of $X$ with $S^{k-1} \subset \mathbb{R}^k$, we will mean all
points $(x, v)$ in $S^{n+k-1} \subset \mathbb{R}^{n+k}$ so that $\frac{x}{\|x\|} \in X$ whenever $x \neq 0$.
The join has an intrinsic meaning in terms of $X$ without reference to the embedding. The
join of $X$ and $Y$ is denoted by $X * Y$, and is defined to be the quotient $X \times Y \times [0, 1] / \simeq$, where
$\simeq$ is the equivalence relation generated by the equivalences $(x, y, 0) \simeq (x', y, 0)$ for all $x, x'$,
and $(x, y, 1) \simeq (x, y', 1)$ for all $y, y'$. The join of any space $X$ with $S^k$ is homeomorphic to
the $(k + 1)$-fold suspension of the space. In particular, we have $S^n * S^m \cong S^{n+m+1}$.
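Proposition 4.7 Let $X \subset \mathbb{R}^n$, and let $Y = X \times \mathbb{R} \subset \mathbb{R}^{n+1}$. Then the fiber $T_{(x,t)}(Y)$ is equal
to the join of the fiber $T_x(X)$ with $S^0 \subset \mathbb{R}$. Informally we say that $T_{(x,t)}(Y)$ is the fiberwise
join of $T(X)$ with the 0-sphere.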
To illustrate the application of this idea, suppose that $X$ is obtained by folding a plane
in $\mathbb{R}^3$ along a line in the plane. For example, consider the set
$X = \{(x, y, z) \mid x \ge 0,\ y \ge 0, \text{ and } x = 0 \text{ or } y = 0\}$. This set is the product of the set $Y$ in $\mathbb{R}^2$ given by
$Y = \{(x, y) \mid x \ge 0,\ y \ge 0, \text{ and } x = 0 \text{ or } y = 0\}$ and $\mathbb{R}$. We analyzed the tangent complex of this set $Y$
in Example 4.4 above, and found that the fiber $T_{(0,0)}(Y)$ consisted of four distinct points.
Proposition 4.7 now tells us that the fiber $T_{(0,0,0)}(X)$ is the join of these four distinct points
with $S^0$.
Note that this ﬁber is homeomorphic to the union of two circles along two points, as in
In general, it is possible to give an explicit description of the fibers $T_x(X)$ in the case
when $x$ is a conelike singular point.
Definition 4.8 For any subset $L$ of $S^{n-1} \subset \mathbb{R}^n$, we define the cone on $L$, $cL$, to be the set
$cL = \{rv \mid r \in [0, 1], \text{ and } v \in L\}$. Let $X \subset \mathbb{R}^n$. We say $x \in X$ is a conelike point in $X$
if there is a neighborhood $U$ containing $x$ in $\mathbb{R}^n$, with boundary $\partial U$, so that there is a map
$f : U \to D^n$, which is smooth and has a smooth inverse, so that $f(X \cap U) = c(f(X \cap \partial U))$. In
other words, the singularity is locally diffeomorphic to the cone on the space $f(X \cap \partial U)$.
Remark: Conelike singularities are common. For instance, if $X$ is an algebraic variety,
and $x$ is an isolated singular point, then $x$ is conelike in the above described sense.
It is possible to analyze the fiber $T_x(X)$ in the case of a conelike singularity. Since the
topological type of the tangent complex is unchanged by smooth changes of coordinates, it
is enough to study the case of $cL$, where $L \subseteq S^{n-1} \subset \mathbb{R}^n$. $L$ is a subset of $\mathbb{R}^n$, and as such we
may study its tangent complex $T(L)$. For each $x \in L$, we have the fiber $T_x(L) \subseteq \mathbb{R}^n \times S^{n-1}$. If
we let $q : \mathbb{R}^n \times S^{n-1} \to S^{n-1}$ denote the projection, we obtain the subset $q(T_x(L)) \subseteq S^{n-1}$. In
order to describe $T(cL)$, we coordinatize the cone $cL$ via coordinates $(t, \lambda)$, where $0 \le t \le 1$,
and $\lambda \in L$, with all points with $t = 0$ being identified with the single cone point. Here $t$ is
the parameter describing the line segment from a point $x \in L$ to the cone point.
Proposition 4.9 $T(cL)$ is described as follows.
• For $t > 0$, $T_{(t,\lambda)}(cL)$ is the join of $T_\lambda(L)$ with $S^0$, so is homeomorphic to the suspension
  of $T_\lambda(L)$.
• Let $p$ denote the cone point, i.e. the origin. Then
$$q(T_p(cL)) = \bigcup_{\lambda \in L} S^0 * q(T_\lambda(L))$$
Example 4.10 Consider the cone singularity, which occurs at the point $(0,0,0)$ of the subset
$X$ of $\mathbb{R}^3$ defined by
$$x^2 + y^2 = z^2 \quad \text{and} \quad z \ge 0.$$
In this case the fiber $T_v(X)$ consists of a circle for all points $v$ away from the origin, since
these points are all smooth. However, the fiber at the origin is given by
$$T_{(0,0,0)}(X) = \{\xi \in S^2 \text{ such that } |\xi \cdot (0,0,1)| \le 1/\sqrt{2}\}.$$
This space is homeomorphic to an annulus $S^1 \times [0,1]$.
5 Homology detection of singular points
In this section, we will show that in many cases, homology groups can be used to detect and
distinguish between singular points. Let $X \subseteq \mathbb{R}^n$ be a subset. What we will show is that for
many choices of $X$ and $x \in X$, the Betti numbers $\beta_k$ of the fiber $T_x(X)$ will provide useful
information about the nature of the point $x$.
Example 5.1 Suppose that $x$ is a smooth point in $X$, i.e. a point for which there is a
neighborhood $U \subseteq X$ of $x$, so that $U$ is diffeomorphic to a Euclidean disc $D^k$ for some $k$.
Then $\beta_j(T_x(X)) = 0$ for $j \ne 0, k-1$, and $\beta_j(T_x(X)) = 1$ for $j = 0, k-1$.
Example 5.2 Suppose that $X$ is a union of $l$ lines in $\mathbb{R}^2$, intersecting in a single point $p$.
Then $\beta_0(T_p(X)) = 2l$, and $\beta_i(T_p(X)) = 0$ otherwise.
Example 5.3 We consider the case where $X$ is the surface of a polyhedron. There are now
three distinct possibilities for a point $P \in X$, namely
1. $P$ is in the interior of a face of $X$.
2. $P$ lies on an edge of $X$, but is not a vertex.
3. $P$ is a vertex.
It turns out that in all cases, $\beta_0(T_P(X)) = 1$, and that $\beta_i(T_P(X)) = 0$ for $i \ge 2$. We examine
the behavior of $\beta_1$.
1. $P$ is in the interior of a face of the polyhedron. In this case, $P$ is a smooth point of $X$,
so $T_P(X)$ is a circle. This tells us that $\beta_1(T_P(X)) = 1$.
2. $P$ lies on an edge but is not a vertex. In this case, a local smooth model for the space $X$ near $P$ is the product of a line with
the space $Y \subseteq \mathbb{R}^2$ which is the union of the non-negative $x$- and $y$-axes. It now follows
from Proposition 4.7 that $T_P(X)$ is the join of $S^0$ with $T_{(0,0)}(Y)$, which is the union
of two circles with intersection a pair of distinct points. It is now readily verified that
$\beta_1(T_P(X)) = 3$.
3. $P$ is a vertex. In this case, we must count the number $N$ of faces containing $P$, or equivalently the
number of edges containing $P$. $T_P(X)$ is a union of $N$ circles, with each pair of circles
intersecting in a pair of distinct points, and where all of the pairs of points are disjoint.
One finds that $\beta_1(T_P(X)) = 1 + \sum_{i=0}^{N-1} 2i = 1 + N(N-1)$. Note that in this case
$N \ge 3$.
Observe that all the different cases are distinguished by the value of $\beta_1$ on $T_P(X)$.
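
The Betti numbers quoted in Example 5.3 can be checked combinatorially, since each fiber there is a one-dimensional complex (a graph). The following Python sketch is an illustration only, not part of the authors' method: the function names and the graph models are assumptions introduced here. It uses the identity $\beta_1 = E - V + \beta_0$ for a finite graph with $V$ vertices and $E$ edges.

```python
from itertools import combinations


def betti_numbers_of_graph(vertices, edges):
    """Return (beta_0, beta_1) of a finite multigraph.

    beta_0 is the number of connected components (found by union-find) and
    beta_1 = #edges - #vertices + beta_0.
    """
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b in edges:
        parent[find(a)] = find(b)

    beta0 = len({find(v) for v in vertices})
    beta1 = len(edges) - len(vertices) + beta0
    return beta0, beta1


def join_with_s0(num_points):
    """Hypothetical graph model of the join of isolated points with S^0."""
    points = [("pt", i) for i in range(num_points)]
    poles = [("pole", 0), ("pole", 1)]
    edges = [(p, s) for p in points for s in poles]
    return points + poles, edges


def circles_meeting_pairwise(n):
    """Hypothetical graph model of n >= 2 circles as in case 3 of Example 5.3.

    Circles i and j meet in the two points (i, j, 0) and (i, j, 1), and all
    of these pairs of points are disjoint.
    """
    vertices = [(i, j, s) for i, j in combinations(range(n), 2) for s in (0, 1)]
    edges = []
    for c in range(n):
        on_circle = [v for v in vertices if c in v[:2]]
        # Consecutive intersection points on circle c are joined by an arc.
        for k in range(len(on_circle)):
            edges.append((on_circle[k], on_circle[(k + 1) % len(on_circle)]))
    return vertices, edges
```

With these models, `betti_numbers_of_graph(*join_with_s0(4))` gives $\beta_1 = 3$ for the edge case, and `betti_numbers_of_graph(*circles_meeting_pairwise(3))` gives $\beta_1 = 7$ for a vertex lying on three faces, in agreement with $1 + N(N-1)$.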
6 Locating singular points
In the last section, we have shown how to use homology to determine whether or not a given
point is a singular point, and what type it is. An important question, though, is whether
one can use homological methods to locate singular points without prior knowledge of where
they might be. The key idea is the following.
Proposition 6.1 Let $X \subseteq S^{n-1} \subseteq \mathbb{R}^n$, and as before let $CX \subseteq \mathbb{R}^n$ denote the cone on $X$.
Let $p$ denote the cone point. Then the inclusion $T_p(CX) \hookrightarrow T(CX)$ is a homotopy equivalence,
and hence induces an isomorphism on homology. More generally, let $C_R X$ denote
$\{z \in CX : \|z\| \le R\}$. Then $T_p(C_R X) \hookrightarrow T(C_R X)$ is also a homotopy equivalence.
Proof. There is a smooth deformation retraction of $CX$ into the single point $p$. It is covered
by a deformation retraction of $T(CX)$ into $T_p(CX)$.
This means that if we have found a conelike neighborhood of a conelike singular point,
we can compute the homology of the fiber over the singular point. This fact suggests the
existence of an algorithm for location of singular points in that portion of a set $X$ which is
contained in a rectangular subset $U \subseteq \mathbb{R}^n$, consisting of the following steps.
1. Compute $H_*(T(X))$. If the homology is that of a smooth subset, i.e. $H_*(T(X)) \cong
H_*(S^k)$ for some $k$, then we assume that the rectangular region in question does not
contain any singular points, and we remove this rectangular region from consideration.
2. Divide the rectangular region into a family of smaller rectangular regions $\{U_\alpha\}_{\alpha \in A}$, say
by bisecting or trisecting in each of the coordinate directions.
3. Apply step 1 to each of the smaller windows, retaining only those rectangular regions $U_\alpha$
in which $H_*(T(X \cap U_\alpha))$ is not that of a sphere.
4. Repeat step 3 until one arrives at a sufficiently good approximation to the singular set.
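
The four steps above amount to a recursive subdivision search. The following Python sketch shows only that control flow, under stated assumptions: `looks_smooth` is a hypothetical caller-supplied test standing in for "the homology of the tangent complex over this window is that of a sphere" (the homology computation itself is not implemented here), and the sketch further assumes that this test also reports windows containing no points of $X$ as discardable.

```python
from itertools import product


def subdivide(box, parts=2):
    """Split an axis-aligned box [(lo, hi), ...] into parts**n equal sub-boxes."""
    per_axis = []
    for lo, hi in box:
        step = (hi - lo) / parts
        per_axis.append([(lo + i * step, lo + (i + 1) * step) for i in range(parts)])
    return [list(cell) for cell in product(*per_axis)]


def locate_singular_regions(box, looks_smooth, depth, parts=2):
    """Return the sub-boxes that survive `depth` rounds of subdivision.

    `looks_smooth(box)` is a hypothetical oracle; boxes passing it are dropped.
    """
    if looks_smooth(box):
        return []                        # step 1: discard homologically standard windows
    if depth == 0:
        return [box]                     # fine enough: report as part of the singular set
    survivors = []
    for cell in subdivide(box, parts):   # step 2: bisect (parts=2) or trisect (parts=3)
        survivors.extend(locate_singular_regions(cell, looks_smooth, depth - 1, parts))
    return survivors                     # steps 3 and 4: recurse on retained windows


# Toy oracle (an assumption for illustration): one singular point at (0.3, 0.7).
def toy_looks_smooth(box, singular=(0.3, 0.7)):
    return not all(lo <= c <= hi for (lo, hi), c in zip(box, singular))


windows = locate_singular_regions([(0.0, 1.0), (0.0, 1.0)], toy_looks_smooth, depth=3)
```

In the toy run, three rounds of bisection of the unit square leave a single window of side 1/8 around the singular point. Passing `parts=3` trisects instead of bisecting.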
Remark. The assumption that the “homological standardness” of the intersection $X \cap U_\alpha$
implies that there are no singular points of $X$ in $U_\alpha$ is not a rigorous one. It is surely possible
to construct situations where $H_*(T(X \cap U_\alpha))$ is isomorphic to $H_*(S^k)$ for some $k$, but where
$X \cap U_\alpha$ does contain singular points. However, one generally expects that homological
complexity of $T_x(X)$ will carry into homological complexity of $T(X \cap U_\alpha)$. If one suspects
that one has missed a singular point, though, one can subdivide the region more finely, and
begin at a finer level of subdivision.
Remark. As we have described the algorithm above, it is designed to search for all possible
singular points. However, it is possible to modify it to search for singular points of a particular
type. For instance, if one is searching for the vertices of a cube, and is not interested in the
edges, one can use the calculations in Example 5.3 to see that if one’s criterion for retaining
a rectangular region is that $\beta_1(T(X \cap U_\alpha)) \ge 7$, one will locate the vertices.
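
As a small illustration of how the values from Example 5.3 could drive such a type-specific criterion, the Python sketch below maps a first Betti number to the corresponding kind of point on a polyhedral surface. The function names and the returned labels are assumptions made for this example. It inverts $\beta_1 = 1 + N(N-1)$ to recover the number $N$ of faces at a vertex.

```python
import math


def classify_polyhedron_point(beta1):
    """Map beta_1 of a fiber T_P(X) to the point type listed in Example 5.3."""
    if beta1 == 1:
        return "face"
    if beta1 == 3:
        return "edge"
    if beta1 >= 7:
        # Solve 1 + N(N - 1) = beta1 for a whole number N >= 3.
        n = (1 + math.isqrt(4 * beta1 - 3)) // 2
        if 1 + n * (n - 1) == beta1:
            return f"vertex on {n} faces"
    return "unrecognized"


def retain_for_cube_vertex_search(beta1):
    """Retention rule from the remark above: keep a window when beta_1 >= 7."""
    return beta1 >= 7
```

For example, `classify_polyhedron_point(7)` reports a vertex on 3 faces, which is the cube case, while a value such as 5 does not correspond to any of the three cases.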
7 Point cloud approximation to T X
In order to apply the ideas described above to point cloud data, an attractive option is to find
a method for associating to a set of point cloud data $D \subseteq \mathbb{R}^n$, obtained by sampling
from a geometric object $X$, a new set of point cloud data $T(D)$ which one believes is what
one might obtain by sampling a finite set of points from $T(X)$. There are many subtle and
interesting issues regarding such constructions, and many natural ways in which one might
proceed. One problem with all these methods is that they construct very large complexes.
We plan to discuss these issues in a systematic way in a future paper, but for the present we
will restrict ourselves to an ad hoc construction of a simplicial complex which is well related
to the tangent complex T(X), and for which the algorithm described above successfully
locates the singular set in a number of examples. The goal throughout the construction is
to make sure that not only is the vertex set as small as possible, but that the collections
of simplices should also be as small as possible. Therefore, in addition to choosing a small
vertex set, we use a criterion described below to “prune” edges. Our construction proceeds
as follows.
We suppose that we know the dimension of the original subset $X$, say $l$. The construction
begins by selecting a set $B = \{\beta_1, \beta_2, \ldots, \beta_N\}$ of base points from $D$. In order to maximize
coverage of the space by these points, one chooses them in a way which is biased in favor
of large interpoint distances. Specifically, a relatively large set $R$ is sampled from $D$, then
the sequence of points $\{\beta_i\}$ is chosen from $R$ in such a way that $\beta_i$ is the furthest point
in $R$ from the collection $\{\beta_1, \beta_2, \ldots, \beta_{i-1}\}$. The number $N$ of base points is set in advance.
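
The base point selection just described is a greedy furthest-point rule. A minimal Python sketch follows, under two assumptions that the text does not specify: the first base point is taken to be an arbitrary element of $R$, and the distance from a point to the collection already chosen is taken to be the minimum of its distances to the members of that collection. The function and parameter names are hypothetical.

```python
import math
import random


def maxmin_base_points(data, num_base, sample_size, seed=0):
    """Greedy furthest-point selection of base points.

    A subset R of `sample_size` points is drawn from `data`; each new base
    point is the point of R furthest from those already chosen.
    """
    rng = random.Random(seed)
    pool = rng.sample(data, min(sample_size, len(data)))  # the set R
    chosen = [pool[0]]  # assumption: arbitrary starting point
    # dist_to_chosen[i] = distance from pool[i] to its nearest chosen point
    dist_to_chosen = [math.dist(p, chosen[0]) for p in pool]
    while len(chosen) < min(num_base, len(pool)):
        idx = max(range(len(pool)), key=dist_to_chosen.__getitem__)
        chosen.append(pool[idx])
        dist_to_chosen = [min(d, math.dist(p, pool[idx]))
                          for d, p in zip(dist_to_chosen, pool)]
    return chosen
```

Keeping the running minimum distance for every point of $R$ means each new base point costs one pass over $R$, rather than a comparison against every previously chosen point.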
At each base point $\beta$, we find the $k$ nearest neighbors $\{\beta_1, \beta_2, \ldots, \beta_k\}$ to $\beta$ in the set $D$,
where $k$ is a parameter we choose beforehand. We then perform local principal component
analysis [9] to obtain the best linear subspace approximation to $D$ near $\beta$, and we write $L_\beta$
for this subspace. For us, this means that we form the $n \times k$ matrix $A$ whose columns are
the differences $\{\beta_1 - \beta, \beta_2 - \beta, \ldots, \beta_k - \beta\}$, then construct the covariance matrix $C = AA^T$.
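
The local principal component analysis step, including the diagonalization and acceptance test that the text turns to next, might be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function name and arguments are hypothetical, and the acceptance test is taken to be that the eigenvalue immediately after the $l$ largest falls below a threshold, which is one reading of the criterion.

```python
import numpy as np


def local_tangent_space(data, base_point, k, l, threshold):
    """Estimate an l-dimensional subspace near `base_point` by local PCA.

    Returns an (n, l) array whose columns span the subspace, or None when the
    assumed acceptance test fails and the base point should be omitted.
    """
    data = np.asarray(data, dtype=float)
    base_point = np.asarray(base_point, dtype=float)
    dists = np.linalg.norm(data - base_point, axis=1)
    nearest = np.argsort(dists)[:k]        # indices of the k nearest neighbors
    A = (data[nearest] - base_point).T     # n x k matrix of differences
    C = A @ A.T                            # n x n matrix C = A A^T
    eigvals, eigvecs = np.linalg.eigh(C)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]      # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Assumed criterion: the first discarded eigenvalue must be small.
    if l < len(eigvals) and eigvals[l] >= threshold:
        return None
    return eigvecs[:, :l]
```

Because $C$ is symmetric, `np.linalg.eigh` is used rather than a general eigenvalue routine. If the base point itself belongs to the data it appears among its own neighbors and contributes a zero column to $A$, which does not change $C$.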
We then diagonalize this matrix, and let $L_\beta$ be the span of the eigenvectors corresponding
to the $l$ largest eigenvalues. If the set of $l$ largest eigenvalues doesn’t “stand out”, we assume
that there is not a natural best fitting $l$-dimensional linear subspace, and we omit the base
point $\beta$. Our criterion for “standing out” is as follows. We let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_l$ denote the
$l$ largest eigenvalues of the matrix $C$, and our criterion for inclusion is that $\lambda_{l+1}$ should be
less than a fixed threshold, which is a parameter in the algorithm. We also choose parameters
$\delta$, $\rho$, and a parameter $\nu$. We next build a small simplicial complex whose vertex set is $B$ by
considering the Rips complex on $B$ for a suitable value of $\nu$, and then removing edges in a
way which is biased in favor of short edges and against 2-simplices with small angles. This
is done as follows. For each edge $e = \{\beta_1, \beta_2\}$, we let $L(e)$ denote the length of $e$. For any
other edge $e' = \{\beta_1, \beta'\}$ which contains $\beta_1$ as a vertex, we define $\sigma(e, e')$ to be the length of
the vector
$$\frac{\beta_2 - \beta_1}{\|\beta_2 - \beta_1\|} - \frac{\beta' - \beta_1}{\|\beta' - \beta_1\|},$$
and similarly for edges $e'' = \{\beta', \beta_2\}$. We let $\theta(e)$ denote the minimum value of $\sigma(e, e')$, as
$e'$ varies over all edges which share a vertex with $e$. We now assign to the edge $e$ the score
$$\frac{L(e)}{\theta(e)^{1.5}}.$$
We let $S$ denote the subcomplex of the Rips complex obtained by removing all edges
whose score is greater than a certain threshold. This threshold is also a parameter in the
algorithm. We have constructed a small complex modelling the base space, i.e. the original
data set. In order to build a complex $T$ for the tangent complex, we proceed as follows. For
each $\beta \in B$, we now sample a fixed number $t$ of points $\{v^\beta_1, v^\beta_2, \ldots, v^\beta_t\}$ uniformly from the unit
sphere in $L_\beta$. The vertex set of $T$ is the set $\{(\beta, v^\beta_i)\}_{\beta \in B,\ 1 \le i \le t}$. We define a graph structure
on this set as follows. For every pair of points $\{(\beta, v^\beta_i), (\beta, v^\beta_j)\}$, we insert this potential edge
if and only if $d(v^\beta_i, v^\beta_j) \le \delta$. For $\beta$ and $\beta'$ which are adjacent in $S$, we define a bipartite graph
structure on the set $V(\beta, \beta') = \{v^\beta_1, v^\beta_2, \ldots, v^\beta_t\} \cup \{v^{\beta'}_1, v^{\beta'}_2, \ldots, v^{\beta'}_t\}$ as the intersection of two
bipartite graph structures $\Gamma_1$ and $\Gamma_2$ on $V(\beta, \beta')$. A pair $\{(\beta, v^\beta_i), (\beta', v^{\beta'}_j)\}$ spans an edge in
$\Gamma_1$ if and only if $d(v^\beta_i, v^{\beta'}_j) \le \sqrt{\delta^2 + \rho^2}$. In $\Gamma_2$, we say $\{(\beta, v^\beta_i), (\beta', v^{\beta'}_j)\}$ spans an edge if and
only if $v^\beta_i$ is among the $m$ closest points to $v^{\beta'}_j$ in the set $\{v^\beta_1, v^\beta_2, \ldots, v^\beta_t\}$ and $v^{\beta'}_j$ is among
the $m$ closest points to $v^\beta_i$ in $\{v^{\beta'}_1, v^{\beta'}_2, \ldots, v^{\beta'}_t\}$. Here $m$ is again a parameter. We insert the edge
$\{(\beta, v^\beta_i), (\beta', v^{\beta'}_j)\}$ if and only if it is an edge both in $\Gamma_1$ and $\Gamma_2$. If $\beta, \beta' \in B$ are not adjacent
in the complex $S$, we do not insert any edges of the form $\{(\beta, v^\beta), (\beta', v^{\beta'})\}$. This completes
the construction of the complex $T$. The rationale for this complicated construction is that
it in practice succeeds in removing small loops which otherwise distort the calculation.
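
The rule for edges between the fibers over two adjacent base points can be sketched in Python as follows. This is a minimal illustration with hypothetical names: `fiber_a` and `fiber_b` stand for the two lists of sampled unit vectors, and the distance $d$ is assumed to be the Euclidean distance between those vectors. An index pair is kept only when it is an edge of both $\Gamma_1$ (the distance bound) and $\Gamma_2$ (mutual membership among the $m$ closest points).

```python
import math


def cross_edges(fiber_a, fiber_b, delta, rho, m):
    """Index pairs (i, j) joining fiber_a[i] to fiber_b[j] in both graphs."""
    radius = math.sqrt(delta ** 2 + rho ** 2)

    def m_nearest(v, candidates):
        # Indices of the m vectors in `candidates` closest to v.
        ranked = sorted(range(len(candidates)),
                        key=lambda i: math.dist(v, candidates[i]))
        return set(ranked[:m])

    near_in_a = [m_nearest(w, fiber_a) for w in fiber_b]  # for each j: close i's
    near_in_b = [m_nearest(v, fiber_b) for v in fiber_a]  # for each i: close j's

    edges = []
    for i, v in enumerate(fiber_a):
        for j, w in enumerate(fiber_b):
            in_gamma1 = math.dist(v, w) <= radius
            in_gamma2 = i in near_in_a[j] and j in near_in_b[i]
            if in_gamma1 and in_gamma2:
                edges.append((i, j))
    return edges
```

The mutual-nearest condition in $\Gamma_2$ is what limits the number of cross edges at each vertex to at most $m$, even when the distance bound in $\Gamma_1$ alone would admit many more.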
8 Sample Results
We show the results of running our algorithm on various example point sets. The reader
will notice that in some cases, the singular set we obtain is “chunky”, i.e. that we have only
obtained a neighborhood of the singular set. This performance can certainly be improved
with more sampling. The purpose of this paper is to show the validity of the concept, rather
than to demonstrate a fully optimized algorithm.
Figure 7: This ﬁgure shows the results of applying the algorithm to a curve with intersections in the plane.
Increasing redness indicates longer survival under the algorithm, and so the “reddest points” are those found
by the algorithm to be singular points. In this case, the algorithm searches for small sets for which the
tangent complex has more than two components, i.e. for which $\beta_0 > 2$. In this example, 5000 points were
used, and the algorithm had a running time of approximately 10 seconds.
References
[1] Carlsson, Gunnar and de Silva, Vin, Synthetic Delaunay triangulations, (in preparation).
[2] Edelsbrunner, Herbert, Letscher, David and Zomorodian, Afra, Topological persistence
and simpliﬁcation, Discrete Comput. Geom. 28 (2002), 511-533.
[3] Fan, Ting-Jun, Describing and Recognizing 3D Objects Using Surface Properties,
Springer Verlag, Berlin–New York, 1990.
Figure 8: These ﬁgures show the results of applying the algorithm to two curved surfaces which meet
transversely in a curve, which becomes the singular locus of the union of the two surfaces. This is obtained
by searching for small sets for which the tangent complex has $\beta_1 > 1$. In both cases, point clouds of 20,000
points were used, with a running time of approximately 2 minutes.
[4] Federer, Herbert, Geometric measure theory, Die Grundlehren der mathematischen Wissenschaften,
Band 153, Springer-Verlag New York Inc., New York 1969.
[5] Fisher, Robert B., From Surfaces to Objects: Computer Vision and Three-Dimensional
Scene Analysis, John Wiley and Sons, New York, 1989.
[6] Greenberg, Marvin J. and Harper, John R., Algebraic topology. A first course, Mathematics
Lecture Note Series, 58. Benjamin/Cummings Publishing Co., Inc., Advanced
[7] Hastie, Trevor, Tibshirani, Robert, and Friedman, Jerome. The elements of statistical
learning. Data mining, inference, and prediction, Springer Series in Statistics. Springer-
Verlag, New York, 2001.
[8] Hatcher, Allen, Algebraic topology, Cambridge University Press, Cambridge, 2002.
[9] Jolliﬀe, I.T., Principal component analysis, Springer Series in Statistics. Springer-Verlag,
New York, 1986.
Figure 9: This ﬁgure shows the results for a search for a vertex in a portion of the surface of a cube. In
this case, the search is for small sets for which the tangent complex has $\beta_1 > 3$. Sample size: 20,000 points,
running time: approximately 1.5 minutes.
Figure 10: This figure shows the results for a search for the edges in a 3-simplex. Sample size: 10,000
points, running time: approximately 2 minutes.
Figure 11: Results of a search for the edges in an icosahedron. Sample size: 20,000 points, running time:
approximately 3 minutes.
Figure 12: A two-dimensional projection of the results of searching for the vertices in a 4-simplex. Sample
size: 80,000 points, running time: approximately 5 minutes.