An Isometry Classification of Periodic Point Sets

Olga Anosova¹ and Vitaliy Kurlin¹ [0000-0001-5328-5351]
¹ University of Liverpool, Liverpool L69 3BX, UK, vitaliy.kurlin@gmail.com
http://kurlin.org
Abstract. We develop discrete geometry methods to resolve the data
ambiguity challenge for periodic point sets to accelerate materials dis-
covery. In any high-dimensional Euclidean space, a periodic point set is
obtained from a finite set (motif) of points in a parallelepiped (unit cell)
by periodic translations of the motif along basis vectors of the cell.
An important equivalence of periodic sets is a rigid motion or an isometry
that preserves interpoint distances. This equivalence is motivated by solid
crystals whose periodic structures are determined in a rigid form.
Crystals are still compared by descriptors that are either not isometry
invariants or depend on manually chosen tolerances or cut-off parameters.
All discrete invariants including symmetry groups can easily break down
under atomic vibrations, which are always present in real crystals.
We introduce a complete isometry invariant for all periodic sets of points,
which can additionally carry labels such as chemical elements. The main
classification theorem says that any two periodic sets are isometric if and
only if their proposed complete invariants (called isosets) are equal.
A potential equality between isosets can be checked by an algorithm,
whose computational complexity is polynomial in the number of motif
points. The key advantage of isosets is continuity under perturbations,
which allows us to quantify similarities between any periodic point sets.
Keywords: lattice · periodic set · isometry invariant · classification
1 Introduction: motivations and problem statement
One well-known challenge in applications is the curse of dimensionality meaning
that any dataset seems sparse in a high-dimensional space. This paper studies the
more basic ambiguity challenge in data representations meaning that equivalent
real-life objects can often be represented in infinitely many different ways.
Data ambiguity makes any comparison unreliable. For example, humans
should not be compared or identified by the average color of their clothes,
though such colors are easily accessible in photos. Justified comparisons should
use only invariant features that are independent of an object representation.
Supported by the EPSRC grant Application-driven Topological Data Analysis.
Our objects are periodic point sets, which model all solid crystalline ma-
terials (crystals). Solid crystal structures are determined in a rigid form with
well-defined atomic positions. Atoms form strong bonds only in molecules, while
all inter-molecular bonds are much weaker and have universally agreed defini-
tions. Points at atomic centers can be labeled by chemical elements or any other
properties, e.g. radii. Later we explain how to easily incorporate labels into our
invariants. We start with the most fundamental model of a periodic point set.
The simplest example is a lattice $\Lambda$, a discrete set of points that are integer
linear combinations of any linear (not necessarily orthogonal) basis in $\mathbb{R}^n$, see
Fig. 1. More generally, a periodic point set is obtained from a finite collection
(motif) of points by periodic translations along all vectors of a lattice $\Lambda$.
Fig. 1. Left: periodic sets represented by different cells are organized in isometry
classes, which form a continuous space. Right: the new isoset resolves the ambiguity.
The same periodic point set $S$ can be obtained from infinitely many different
Minkowski sums $\Lambda + M$. For example, one can change a linear basis of $\Lambda$ and
get a new motif of points with different coordinates in the new basis.
The above ambiguity with respect to a basis is compounded by infinitely
many rigid motions or isometries that preserve inter-point distances, hence pro-
duce equivalent crystal structures. Shifting all points by a fixed vector changes
all point coordinates in a fixed basis, but not the isometry class of the set.
The curse of ambiguity for periodic sets can be resolved only by a complete
isometry invariant as follows: two periodic sets given by any decompositions
$\Lambda + M$ into a lattice and a motif should be isometric if and only if their complete
invariants coincide. Such a complete invariant should have easily comparable
values from which we could explicitly reconstruct an original crystal structure.
The final requirement for a complete invariant is its continuity under perturbations,
which was largely ignored in the past although all atoms vibrate above
absolute zero temperature. All discrete invariants including symmetry groups are
discontinuous under perturbation of points. A similarity between crystals should
be quantified in a continuous way to filter out nearly identical crystals obtained
as approximations to local energy minima in Crystal Structure Prediction [22].
Problem 1 formalizes the above curse of ambiguity for crystal structures.
Problem 1 (complete isometry classification of periodic point sets). Find a
function $I$ on the space of all periodic point sets in $\mathbb{R}^n$ such that
(1a) invariance: if any periodic sets $S, Q$ are isometric, then $I(S) = I(Q)$;
(1b) continuity: $I(S)$ changes continuously under perturbations of points;
(1c) computability: $I(S) = I(Q)$ can be checked in time polynomial in the motif size;
(1d) completeness: if $I(S) = I(Q)$, then the periodic sets $S, Q$ are isometric.
The main contribution is the new invariant isoset in Definition 9 whose completeness
is proved in Theorem 10. Conditions (1bc) are proved in the recent
work [3] introducing the new research area of Periodic Geometry and Topology.
2 A review of the relevant work on periodic crystals
Although any lattice can be defined by infinitely many primitive cells, there is
a unique Niggli’s reduced cell, which can be theoretically used for comparing
periodic sets [13, section 9.2]. Niggli’s and other reduced cells are discontinuous
under perturbations in the sense that a reduced cell of a perturbed lattice can
have a basis that substantially differs from that of a non-perturbed lattice [2].
Continuity condition (1b) fails not only for Niggli’s reduced cell, but also for
all discretely-valued invariants including symmetry groups. The 230 crystallographic
groups in $\mathbb{R}^3$ cut the continuous space of isometry classes into disjoint
pieces. This stratification shows many nearly identical crystals as distant.
The first step towards a complete isometry classification of crystals has recently
been made in [19] by introducing two proper distances between arbitrary
lattices that satisfy the metric axioms and are also continuous under perturba-
tions. Also [19, section 3] reviews many past tools to compare crystals.
The world’s largest Cambridge Structural Database (CSD) has more than
1M crystals. Each crystal is represented by one of infinitely many choices of
a unit cell and a motif $M$ in the form of a Crystallographic Information File
(CIF). The CSD is a super-long list of CIFs with limited search tools, mainly by
chemical compositions, and without any organization by geometric similarity.
Quantifying crystal similarities is even more important for Crystal Structure
Prediction (CSP). Typical CSP software starts from a given chemical composition
and outputs thousands of predicted crystals as approximations to local
minima of a complicated energy function. Any iterative optimization produces
many approximations to the same local minimum. These nearly identical crystals
are currently impossible to automatically identify in a reliable way [22].
Crystals are often compared by the Radial Distribution Function (RDF) that
measures the probability of finding one atom at a distance $r$ from a reference
atom, which is computed up to a manually chosen cut-off radius.
The new concept of a stable radius in Definition 8 gives exact conditions
for a required radius depending on a complexity of a periodic set. The crystals
indistinguishable by their RDF or diffraction patterns are known as homometric
[20]. The most recent survey [21, Fig. S4] has highlighted pairs of finite atomic
arrangements that cannot be distinguished by any known crystal descriptors.
On the positive side, the mathematical approach in [12] has solved the already
non-trivial 1-dimensional case for sets whose points have only integer (or
rational) coordinates. Briefly, any given points $c_0, \dots, c_{m-1}$ on the unit circle
$S^1 \subset \mathbb{C}$ are converted into the Fourier coefficients
$d(k) = \sum_{j=0}^{m-1} c_j \exp\left(\frac{2\pi i jk}{m}\right)$, $k = 0, \dots, m-1$.
Then all point sets in the unit circle can be distinguished up
to circular rotations by the $n$-th order invariants up to $n = 6$, which are all
products of the form $d(k_1) \cdots d(k_n)$ with $k_1 + \cdots + k_n \equiv 0 \pmod{m}$.
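The following short derivation sketches why such products do not change under rotations. It assumes (this is our reading, not a statement taken from [12]) that a circular rotation acts as a cyclic shift $c_j \mapsto c_{j+s}$ of indices modulo $m$; the symbols $d'$ and $s$ are introduced only for this illustration.

```latex
d'(k) = \sum_{j=0}^{m-1} c_{j+s} \exp\left(\frac{2\pi i jk}{m}\right)
      = \exp\left(-\frac{2\pi i sk}{m}\right) d(k),
\qquad
\prod_{l=1}^{n} d'(k_l)
      = \exp\left(-\frac{2\pi i s (k_1 + \cdots + k_n)}{m}\right) \prod_{l=1}^{n} d(k_l)
      = \prod_{l=1}^{n} d(k_l)
\quad \text{if } k_1 + \cdots + k_n \equiv 0 \pmod{m}.
```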
The more recent advances in Problem 1 are Density Functions [11] and Average
Minimum Distances [25]. The $k$-density function $\psi_k[S]$ of a periodic point set
$S \subset \mathbb{R}^n$ measures the fractional area of the region within a unit cell $U$ covered
by exactly $k$ closed balls with centers $a \in S$ and a radius $t \ge 0$. The density
functions satisfy conditions (1abc) and completeness (1d) in general position.
However, the density functions do not distinguish the following 1-dimensional
sets $S_{15} = \{0,1,3,4,5,7,9,10,12\} + 15\mathbb{Z}$ and $Q_{15} = \{0,1,3,4,6,8,9,12,14\} + 15\mathbb{Z}$
with period 15, see [3, Example 11]. The sets $S_{15}, Q_{15}$ were introduced at the
beginning of section 5 in [11] as $U \pm V + 15\mathbb{Z}$ for $U = \{0,4,9\}$ and $V = \{0,1,3\}$.
The above sets $S_{15}, Q_{15}$ are distinguished by the faster Average Minimum
Distances (AMD), see [3, Example 6]. For any integer $k \ge 1$, $\mathrm{AMD}_k(S)$ is the
distance from a point $p \in S$ to its $k$-th nearest neighbor, averaged over all points
$p$ in a motif of $S$. For $k \to +\infty$, $\mathrm{AMD}_k(S)$ behaves as $\sqrt[n]{k}$, see [25, Theorem 14].
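As an illustration of the AMD definition above, here is a minimal Python sketch for 1-dimensional periodic sets only. The function name `amd` and the brute-force neighbor search are our own assumptions for this example, not the implementation used in [25]. The sketch rebuilds $S_{15}, Q_{15}$ from $U \pm V$ modulo 15 and prints the first three averages with exact fractions.

```python
from fractions import Fraction

def amd(motif, period, k):
    """Average distance to the k-th nearest neighbor in the 1D set motif + period*Z."""
    m = len(motif)
    copies = k // m + 2  # enough periodic copies on both sides to contain k neighbors
    points = [p + period * i for i in range(-copies, copies + 1) for p in motif]
    total = 0
    for p in motif:
        dists = sorted(abs(q - p) for q in points if q != p)
        total += dists[k - 1]  # distance to the k-th nearest neighbor of p
    return Fraction(total, m)

# Hypothetical variable names; the sets follow the construction U +/- V + 15Z.
U, V, period = [0, 4, 9], [0, 1, 3], 15
S15 = sorted({(u + v) % period for u in U for v in V})
Q15 = sorted({(u - v) % period for u in U for v in V})

for k in range(1, 4):
    print(k, amd(S15, period, k), amd(Q15, period, k))
```

In this sketch the averages of the two sets coincide for $k = 1, 2$ and first differ at $k = 3$.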
3 Necessary concepts from computational geometry
In the Euclidean space $\mathbb{R}^n$, any point $p \in \mathbb{R}^n$ is represented by the vector $\vec{p}$
from the origin of $\mathbb{R}^n$ to $p$. The Euclidean distance between points $p, q \in \mathbb{R}^n$
is denoted by $|p - q| = |\vec{p} - \vec{q}|$. For a standard orthonormal basis $\vec{e}_1, \dots, \vec{e}_n$, the
integer lattice $\mathbb{Z}^n \subset \mathbb{R}^n$ consists of all points with integer coordinates.
Definition 2 (a lattice $\Lambda$, a unit cell $U$, a motif $M$, a periodic set $S = \Lambda + M$).
For any linear basis $\vec{v}_1, \dots, \vec{v}_n$ in $\mathbb{R}^n$, a lattice is
$\Lambda = \{\sum_{i=1}^{n} \lambda_i \vec{v}_i : \lambda_i \in \mathbb{Z}\}$. The unit cell
$U(\vec{v}_1, \dots, \vec{v}_n) = \{\sum_{i=1}^{n} \lambda_i \vec{v}_i : \lambda_i \in [0,1)\}$ is the parallelepiped spanned
by the basis. A motif $M$ is any finite set of points $p_1, \dots, p_m \in U$. A periodic
point set is the Minkowski sum $S = \Lambda + M = \{\vec{u} + \vec{v} : \vec{u} \in \Lambda, \vec{v} \in M\}$. A unit
cell $U$ of a periodic set $S = \Lambda + M$ is primitive if any vector $\vec{v}$ that translates $S$
to itself is an integer linear combination of the basis of the cell $U$, i.e. $\vec{v} \in \Lambda$.
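The ambiguity of a decomposition $\Lambda + M$ can be seen in a small Python sketch. The helper `periodic_points`, the window size and the `reach` parameter are hypothetical choices for this illustration. Three different pairs (basis, motif) are expanded inside the same square window, and the resulting finite point sets are compared.

```python
from itertools import product

def periodic_points(basis, motif, window, reach=12):
    """Points of Lambda + M (integer coordinates) inside the square [-window, window]^2."""
    (ax, ay), (bx, by) = basis
    pts = set()
    for i, j in product(range(-reach, reach + 1), repeat=2):
        for mx, my in motif:
            # lattice vector i*v1 + j*v2 plus a motif point
            x, y = i * ax + j * bx + mx, i * ay + j * by + my
            if abs(x) <= window and abs(y) <= window:
                pts.add((x, y))
    return pts

# Three representations of the square lattice Z^2:
square = periodic_points([(1, 0), (0, 1)], [(0, 0)], window=3)           # primitive cell
sheared = periodic_points([(1, 0), (1, 1)], [(0, 0)], window=3)          # another primitive cell
doubled = periodic_points([(2, 0), (0, 1)], [(0, 0), (1, 0)], window=3)  # non-primitive cell

print(len(square), square == sheared == doubled)
```

The comparison works only because the window is small relative to `reach`; an invariant-based comparison avoids such ad hoc windows.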
Fig. 2. Left: three primitive cells $U, U', U''$ of the square lattice $S$. Other pictures
show different periodic sets $\Lambda + M$, which are all isometric to the square lattice $S$.
A primitive unit cell $U$ of any lattice has a motif of one point (the origin). If
$U$ is defined as the closed parallelepiped in $\mathbb{R}^n$, hence includes $2^n$ vertices, one
could count every vertex with weight $2^{-n}$ so that the sum is 1. All closed unit
cells in Fig. 2 are primitive, because four corners are counted as one point in $U$.
The first picture in Fig. 3 shows a small perturbation of a square lattice. The
new periodic set has a twice larger primitive unit cell with two points in a motif
instead of one. All invariants based on a fixed primitive unit cell such as Niggli’s
reduced cell [13, section 9.2] fail continuity condition (1b) in Problem 1.
Fig. 3. A continuous invariant should take close values on these nearly identical peri-
odic sets, though their symmetry groups and primitive cells substantially differ.
The auxiliary concepts in Definitions 3, 4, 5 follow Dolbilin’s papers [9], [5].
Definition 3 (bridge distance $\beta(S)$). For a periodic point set $S \subset \mathbb{R}^n$, the
bridge distance is a minimum $\beta(S) > 0$ such that any two points $a, b \in S$ can
be connected by a finite sequence $a_0 = a, a_1, \dots, a_m = b$ such that any two
successive points $a_{i-1}, a_i$ are close, i.e. the Euclidean distance $|\vec{a}_{i-1} - \vec{a}_i| \le \beta(S)$
for $i = 1, \dots, m$. Fig. 4 shows periodic sets with different bridge distances.
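In dimension 1 the bridge distance has a simple form: any chain between two points has to cross every gap between successive points, so the bridge distance equals the largest gap within one period. The sketch below uses this observation; the function name `bridge_distance_1d` is our own, and the two test sets are the 1D sets from Fig. 5 and 6.

```python
def bridge_distance_1d(motif, period):
    """Largest gap between successive points of the 1D periodic set motif + period*Z."""
    pts = sorted(p % period for p in motif)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(pts[0] + period - pts[-1])  # wrap-around gap to the next periodic copy
    return max(gaps)

print(bridge_distance_1d([0, 1, 3, 4], 8))  # set S from Fig. 5
print(bridge_distance_1d([0, 3, 4, 5], 8))  # set Q from Fig. 6
```

In higher dimensions the largest-gap shortcut no longer applies, and one would need a connectivity test on the periodic graph of close points.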
Definition 4 ($m$-regularity of a periodic set). For any point $a$ in a periodic set
$S \subset \mathbb{R}^n$, the global cluster $C(S, a)$ is the infinite set of vectors $\vec{b} - \vec{a}$ for all points
$b \in S$. Points $a, b \in S$ are called isometrically equivalent if there is an isometry
$f : C(S, a) \to C(S, b)$ such that $f(a) = b$. A periodic set $S \subset \mathbb{R}^n$ is called regular
if all points $a, b \in S$ are isometrically equivalent. A periodic set $S$ is $m$-regular
if all global clusters of $S$ form exactly $m \ge 1$ isometry classes.
For any point $a \in S$, its global cluster is a view of $S$ from the position of $a$,
e.g. how we view all astronomical stars in the universe $S$ from our planet Earth.
Any lattice is 1-regular, because all its global clusters are related by translations.
Though the global clusters $C(S, a)$ and $C(S, b)$ at any different points $a, b \in S$
seem to contain the same set $S$, they can be different even modulo translations.
Fig. 4. Left: the periodic point set $Q_1$ has the four points $(\pm 2, \pm 2)$ in the square unit
cell $[0,10]^2$, so $Q_1$ isn’t a lattice, but is 1-regular by Definition 4, also $\beta(Q_1) = 6$. All
local $\alpha$-clusters are isometric, shown by red arrows for radii $\alpha = 5, 6, 8$, see Definition 5.
Right: the periodic point set $Q_2$ has the extra point $(5,5)$ in the center of $[0,10]^2$ and
is 2-regular with $\beta(Q_2) = 3\sqrt{2}$. Local clusters have two isometry types.
The first picture in Fig. 4 shows the 1-regular set $Q_1 \subset \mathbb{R}^2$, where all points
have isometric global clusters related by translations and rotations through
$\frac{\pi}{2}, \pi, \frac{3\pi}{2}$, so $Q_1$ is not a lattice. The global clusters are infinite, hence
distinguishing them up to isometry is not easier than distinguishing the original sets.
However, $m$-regularity can be checked in terms of local clusters defined below.
Definition 5 (local $\alpha$-clusters $C(S, a; \alpha)$ and symmetry groups $\mathrm{Sym}(S, a; \alpha)$).
For a point $a$ in a crystal $S \subset \mathbb{R}^n$ and any radius $\alpha \ge 0$, the local cluster
$C(S, a; \alpha)$ is the set of vectors $\vec{b} - \vec{a}$ of lengths $|\vec{b} - \vec{a}| \le \alpha$ for $b \in S$. An isometry
$f \in \mathrm{Iso}(\mathbb{R}^n)$ between clusters should match their centers. The symmetry group
$\mathrm{Sym}(S, a; \alpha)$ consists of self-isometries of $C(S, a; \alpha)$ that fix the center $a$.
If $\alpha > 0$ is smaller than the minimum distance between any points, then
every cluster $C(S, a; \alpha)$ is the single-point set $\{a\}$ and its symmetry group $\mathrm{O}(\mathbb{R}^n)$
consists of all isometries fixing the center $a$. When the radius $\alpha$ is increasing,
the $\alpha$-clusters $C(S, a; \alpha)$ become larger and can have fewer self-isometries, so the
symmetry group $\mathrm{Sym}(S, a; \alpha)$ becomes smaller and eventually stabilizes.
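The 1-regular set $Q_1$ in Fig. 4 for any point $a \in Q_1$ has the symmetry group
$\mathrm{Sym}(Q_1, a; \alpha) = \mathrm{O}(\mathbb{R}^2)$ for $\alpha \in [0,4)$. The group $\mathrm{Sym}(Q_1, a; \alpha)$ stabilizes as $\mathbb{Z}_2$
for $\alpha \ge 4$ as soon as the local $\alpha$-cluster $C(Q_1, a; \alpha)$ includes points other than its center.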
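The behavior of local clusters of $Q_1$ can be checked by a small Python sketch. The names `local_cluster`, `square_symmetries` and `D4` are hypothetical. As a simplifying assumption, the sketch tests only the 8 symmetries of a square (signed permutation matrices) instead of the full orthogonal group, which is enough to see the drop from a large group to two elements in this example.

```python
from itertools import product

MOTIF_Q1 = [(2, 2), (-2, 2), (2, -2), (-2, -2)]  # the points (±2, ±2) of Q1
CELL = 10                                        # side of the square unit cell

def local_cluster(center, alpha, motif=MOTIF_Q1, cell=CELL):
    """Vectors b - a of length at most alpha from the center a to points b of the set."""
    reach = int(alpha // cell) + 2
    cx, cy = center
    cluster = set()
    for i, j in product(range(-reach, reach + 1), repeat=2):
        for mx, my in motif:
            dx, dy = mx + cell * i - cx, my + cell * j - cy
            if dx * dx + dy * dy <= alpha * alpha:
                cluster.add((dx, dy))
    return cluster

# The 8 signed permutation matrices [[a, b], [c, d]] stored as tuples (a, b, c, d).
D4 = [(s, 0, 0, t) for s in (1, -1) for t in (1, -1)] + \
     [(0, s, t, 0) for s in (1, -1) for t in (1, -1)]

def square_symmetries(cluster):
    """Elements of D4 that map the cluster to itself (a finite stand-in for Sym)."""
    return [g for g in D4
            if {(g[0] * x + g[1] * y, g[2] * x + g[3] * y) for x, y in cluster} == cluster]

for alpha in (3, 4, 6):
    c = local_cluster((2, 2), alpha)
    print(alpha, len(c), len(square_symmetries(c)))
```

For radii below 4 the cluster is a single point fixed by all 8 tested maps, while from radius 4 only the identity and one diagonal reflection survive, matching the group $\mathbb{Z}_2$ in the text.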
4 The isotree of isometry classes and a stable radius
This section introduces the isotree and a stable radius in Definitions 6 and 8 by
comparing local clusters at radii $\alpha - \beta$ and $\alpha$, where $\beta$ is the bridge distance.
Any isometry $A \to B$ between local clusters should map the center of $A$
to the center of $B$. The isotree in Definition 6 is inspired by a dendrogram
of hierarchical clustering, though points are partitioned according to isometry
classes of local $\alpha$-clusters at different radii $\alpha$, not by a distance threshold.
Definition 6 (isotree $\mathrm{IT}(S)$ of $\alpha$-partitions). Fix a periodic set $S \subset \mathbb{R}^n$ and
$\alpha \ge 0$. Points $a, b \in S$ are called $\alpha$-equivalent if their $\alpha$-clusters $C(S, a; \alpha)$ and
$C(S, b; \alpha)$ are isometric. The $\alpha$-equivalence class $[C(S, a; \alpha)]$ consists of all
$\alpha$-clusters isometric to $C(S, a; \alpha)$. The $\alpha$-partition $P(S; \alpha)$ is the splitting of $S$ into
$\alpha$-equivalence classes of points. The number of $\alpha$-equivalence classes of $\alpha$-clusters
is the cluster count $|P(S; \alpha)|$. When the radius $\alpha$ is increasing, the $\alpha$-partition
can be refined by subdividing $\alpha$-equivalence classes of points of $S$ into subclasses.
If we represent each $\alpha$-equivalence class by an abstract point, the resulting points
form the isotree $\mathrm{IT}(S)$ of all $\alpha$-partitions, see Fig. 5, 6.
The α-equivalence and isoset in Definition 9 can be refined by labels of points
such as chemical elements. Theorem 10 will remain valid for labelled points.
Recall that isometries include reflections, however an orientation sign can be
easily added to α-clusters, hence we focus on the basic case of all isometries.
When a radius $\alpha$ is increasing, $\alpha$-clusters $C(S, a; \alpha)$ include more points,
hence are less likely to be isometric, so $|P(S; \alpha)|$ is a non-decreasing function of
$\alpha$. Fig. 5, 6 show $\alpha$-clusters and isotrees of non-isometric 1D periodic sets $S, Q$
[20, p. 197, Fig. 2], which have identical 1D analogs of diffraction patterns.
Any $\alpha$-equivalence class from $P(S; \alpha)$ may split into two or more classes,
which will not merge at any larger radius $\alpha'$. Lemma 7 justifies that the isotree
$\mathrm{IT}(S)$ can be visualized as a merge tree of $\alpha$-equivalence classes of clusters.
Lemma 7 (isotree properties). The isotree $\mathrm{IT}(S)$ has the following properties:
(7a) for $\alpha = 0$, the $\alpha$-partition $P(S; 0)$ consists of one class;
(7b) if $\alpha < \alpha'$, then $\mathrm{Sym}(S, a; \alpha') \subseteq \mathrm{Sym}(S, a; \alpha)$ for $a \in S$;
(7c) if $\alpha < \alpha'$, the $\alpha'$-partition $P(S; \alpha')$ refines $P(S; \alpha)$, i.e. any set from the
$\alpha'$-partition $P(S; \alpha')$ is included into a set from the $\alpha$-partition $P(S; \alpha)$.
Proof. (7a) If $\alpha \ge 0$ is smaller than the minimum distance $r$ between points of $S$,
every cluster $C(S, a; \alpha)$ is the single-point set $\{a\}$. All these single-point clusters
are isometric to each other. So $|P(S; \alpha)| = 1$ for all small radii $\alpha < r$.
Fig. 5. Left: $S = \{0,1,3,4\} + 8\mathbb{Z}$ has $t = 4$ and is 2-regular by Definition 4. Right:
local clusters with radii $\alpha = 0, 1, 2, 3$ represent vertices of the isotree $\mathrm{IT}(S)$ in Definition 6.
All $\alpha$-clusters are isometric for $\alpha < 2$, form two isometry classes for $\alpha \ge 2$.
Fig. 6. Left: $Q = \{0,3,4,5\} + 8\mathbb{Z}$ has $t = 3$ and is 3-regular by Definition 4. Right:
local clusters with radii $\alpha = 0, 1, 2, 3$ represent vertices of the isotree $\mathrm{IT}(Q)$ in Definition 6.
All $\alpha$-clusters are isometric for $\alpha < 1$, form three isometry classes for $\alpha \ge 1$.
(7b) For any point $a \in S$, the inclusion of clusters $C(S, a; \alpha) \subseteq C(S, a; \alpha')$ implies
that any self-isometry of the larger cluster $C(S, a; \alpha')$ can be restricted to a
self-isometry of the smaller cluster $C(S, a; \alpha)$. So $\mathrm{Sym}(S, a; \alpha') \subseteq \mathrm{Sym}(S, a; \alpha)$.
(7c) If points $a, b \in S$ are $\alpha'$-equivalent at the larger radius $\alpha'$, i.e. the clusters
$C(S, a; \alpha')$ and $C(S, b; \alpha')$ are isometric, then $a, b$ are $\alpha$-equivalent at the smaller
radius $\alpha$. Hence any $\alpha'$-equivalence class is a subset of an $\alpha$-equivalence class.
Property (7c) can be illustrated by the examples in Fig. 5 and 6. For $\alpha = 1$,
all points of the periodic set $S = \{0,1,3,4\} + 8\mathbb{Z}$ are in the same $\alpha$-equivalence
class with 1-cluster $\{0,1\}$. For $\alpha' = 2$, $S$ splits in two $\alpha'$-equivalence classes: one
containing the points from $0 + 8\mathbb{Z}$ and $4 + 8\mathbb{Z}$ with the 2-clusters $\{0,1\}$ and another
one containing $1 + 8\mathbb{Z}$ and $3 + 8\mathbb{Z}$ with 2-clusters $\{-1,0,2\}$.
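The splitting described above can be reproduced by a short Python sketch for 1-dimensional periodic sets. The helper names `cluster`, `isometry_class` and `partition` are our own. The sketch relies on the fact that in dimension 1 the only isometries fixing a center are the identity and the reflection $x \mapsto -x$, so a cluster and its mirror image are given one canonical representative.

```python
def cluster(motif, period, a, alpha):
    """Local alpha-cluster of the point a: sorted offsets b - a with |b - a| <= alpha."""
    reach = int(alpha // period) + 2
    pts = (p + period * i for i in range(-reach, reach + 1) for p in motif)
    return tuple(sorted(q - a for q in pts if abs(q - a) <= alpha))

def isometry_class(c):
    """Canonical representative of a 1D cluster up to the reflection x -> -x."""
    mirrored = tuple(sorted(-x for x in c))
    return min(c, mirrored)

def partition(motif, period, alpha):
    """alpha-partition of the motif points into alpha-equivalence classes."""
    classes = {}
    for a in motif:
        key = isometry_class(cluster(motif, period, a, alpha))
        classes.setdefault(key, []).append(a)
    return sorted(classes.values())

S = [0, 1, 3, 4]  # motif of S = {0,1,3,4} + 8Z from Fig. 5
for alpha in (1, 2, 3):
    print(alpha, partition(S, 8, alpha))
```

The same function applied to the motif of $Q$ from Fig. 6 gives three classes already at radius 1.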
If a point set $S$ is periodic, the $\alpha$-partitions of $S$ stabilize in the sense below.
Definition 8 (a stable radius). Let $S \subset \mathbb{R}^n$ be a periodic point set and $\beta$ be an
upper bound of its bridge distance $\beta(S)$ from Definition 3. A radius $\alpha \ge \beta$ is
called stable if both conditions below hold:
(8a) the $\alpha$-partition $P(S; \alpha)$ coincides with the $(\alpha - \beta)$-partition $P(S; \alpha - \beta)$;
(8b) the symmetry groups stabilize: $\mathrm{Sym}(S, a; \alpha) = \mathrm{Sym}(S, a; \alpha - \beta)$ for all points
$a \in S$, which is enough to check for points only from a finite motif of $S$.
A minimum radius $\alpha$ satisfying the above conditions for the bridge distance
$\beta(S)$ from Definition 3 can be called the minimum stable radius and denoted by
$\alpha(S)$. Upper bounds of $\alpha(S)$ and $\beta(S)$ will be enough for all results below.
Due to Lemma (7bc), conditions (8ab) imply that the $\alpha'$-partitions $P(S; \alpha')$
and the symmetry groups $\mathrm{Sym}(S, a; \alpha')$ remain the same for all $\alpha' \in [\alpha - \beta, \alpha]$.
Condition (8b) doesn’t follow from condition (8a) due to the following example.
Let $\Lambda$ be the 2D lattice with the basis $(1,0)$ and $(0, \beta)$ for $\beta > 1$. Then
$\beta$ is the bridge distance of $\Lambda$. Condition (8a) is satisfied for any $\alpha \ge 0$, because
all points of any lattice are equivalent up to translations. However, condition
(8b) fails for any $\alpha < \beta + 1$. Indeed, the $\alpha$-cluster of the origin $(0,0)$ contains
five points $(0,0), (\pm 1, 0), (0, \pm\beta)$, whose symmetries are generated by the two
reflections in the axes $x, y$, but the $(\alpha - \beta)$-cluster of the origin consists of only
$(0,0)$ and has the symmetry group $\mathrm{O}(2)$.
Condition (8b) might imply condition (8a), but in practice it makes sense to
verify (8b) only after checking much simpler condition (8a). Both conditions are
essentially used in the proofs of Isometry Classification Theorem 10.
For the set $S = \{0,1,3,4\} + 8\mathbb{Z}$ in Fig. 5 with the bridge distance $\beta(S) = 4$,
any $\alpha \ge 6$ is a stable radius, because the partition $P(S; \alpha - 4)$ splits $S$ into the
same two classes for any $\alpha \ge 6$. For the periodic set $Q = \{0,3,4,5\} + 8\mathbb{Z}$ in
Fig. 6 with the bridge distance $\beta(Q) = 3$, any $\alpha \ge 4$ is a stable radius.
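Conditions (8a) and (8b) can be tested directly for 1-dimensional sets. The sketch below is a minimal illustration under the assumption that a 1D cluster has a symmetry group of order 2 if it coincides with its mirror image and of order 1 otherwise; all function names are hypothetical.

```python
def cluster(motif, period, a, alpha):
    """Sorted offsets b - a with |b - a| <= alpha in the 1D set motif + period*Z."""
    reach = int(alpha // period) + 2
    pts = (p + period * i for i in range(-reach, reach + 1) for p in motif)
    return tuple(sorted(q - a for q in pts if abs(q - a) <= alpha))

def mirrored(c):
    return tuple(sorted(-x for x in c))

def partition(motif, period, alpha):
    """alpha-partition of motif points, as a set of sets, up to reflections of clusters."""
    classes = {}
    for a in motif:
        c = cluster(motif, period, a, alpha)
        classes.setdefault(min(c, mirrored(c)), set()).add(a)
    return {frozenset(cls) for cls in classes.values()}

def sym_order(c):
    """Order of the symmetry group of a 1D cluster: 2 if mirror-symmetric, else 1."""
    return 2 if c == mirrored(c) else 1

def is_stable(motif, period, alpha, beta):
    """Check conditions (8a) and (8b) at radius alpha for a bridge distance bound beta."""
    if alpha < beta:
        return False
    same_partition = partition(motif, period, alpha) == partition(motif, period, alpha - beta)
    same_symmetry = all(
        sym_order(cluster(motif, period, a, alpha)) ==
        sym_order(cluster(motif, period, a, alpha - beta))
        for a in motif)
    return same_partition and same_symmetry

print(is_stable([0, 1, 3, 4], 8, 6, 4), is_stable([0, 1, 3, 4], 8, 5, 4))
print(is_stable([0, 3, 4, 5], 8, 4, 3), is_stable([0, 3, 4, 5], 8, 3.5, 3))
```

Comparing group orders suffices here because of the inclusion in (7b); in higher dimensions the groups themselves would have to be compared.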
Any periodic set $S \subset \mathbb{R}^n$ with $m$ motif points has at most $m$ $\alpha$-equivalence
classes, because any point of $S$ can be translated to a motif point. Hence it suffices
to check condition (8a) about $\alpha$-partitions only for the $m$ motif points. Condition
(8b) can be practically checked by testing if the inclusion $\mathrm{Sym}(S, a; \alpha') \subseteq
\mathrm{Sym}(S, a; \alpha)$ from (7b) is surjective, which is needed only for one representative
cluster from at most $m$ isometry classes (exactly $m$ if $S$ is $m$-regular).
A stable radius in [5] was defined by using the notations $\rho$ and $\rho + t$. This pair
changed to $\alpha - \beta$ and $\alpha$, because subsequent Theorem 10 is more conveniently
stated for the larger radius $\alpha$. Any 1-regular set in $\mathbb{R}^3$ with a bridge distance $\beta$
has a stable radius $\alpha = 7\beta$ or $\rho = 6t$ in the past notations of [9].
5 Isosets completely classify periodic sets up to isometry
A criterion of m-regular sets [10, Theorem 1.3] has inspired us to introduce
the new invariant isoset in Definition 9, whose completeness (injectivity) in the
isometry classification of periodic sets will be proved in main Theorem 10.
Definition 9 (isoset $I(S; \alpha)$ of a periodic point set $S$ at a radius $\alpha$). Let a
periodic point set $S \subset \mathbb{R}^n$ have a motif $M$ of $m$ points. Split all points $a \in M$
into $\alpha$-equivalence classes. Then each $\alpha$-equivalence class consisting of (say) $k$
points in $M$ can be associated with the isometry class $\sigma = [C(S, a; \alpha)]$ of an
$\alpha$-cluster centered at one of these $k$ points $a \in M$. The weight of the class $\sigma$ is
defined as $w = k/m$. Then the isoset $I(S; \alpha)$ is defined as the unordered set of
all isometry classes with weights $(\sigma; w)$ over all points $a \in M$.
All points $a$ of a lattice $\Lambda \subset \mathbb{R}^n$ are $\alpha$-equivalent for any $\alpha \ge 0$, because all
$\alpha$-clusters $C(\Lambda, a; \alpha)$ are isometrically equivalent to each other by translations.
Hence the isoset $I(\Lambda; \alpha)$ is one isometry class of weight 1 for any $\alpha$.
All isometry classes $\sigma \in I(S; \alpha)$ are in a 1-1 correspondence with all
$\alpha$-equivalence classes in the $\alpha$-partition $P(S; \alpha)$ from Definition 6. So $I(S; \alpha)$ without
weights is a set of points in the isotree $\mathrm{IT}(S)$ at the radius $\alpha$. The size of
the isoset $I(S; \alpha)$ equals the cluster count $|P(S; \alpha)|$. Formally, $I(S; \alpha)$ depends
on $\alpha$, because $\alpha$-clusters grow in $\alpha$. To distinguish periodic point sets $S, Q$ up
to isometry, we will compare their isosets at a common stable radius $\alpha$.
An equality $\sigma = \xi$ between isometry classes of clusters means that there is
an isometry $f$ from a cluster $C(S, a; \alpha)$ representing $\sigma$ to a cluster $C(Q, b; \alpha)$
representing $\xi$ such that $f(a) = b$, i.e. $f$ respects the centers of the clusters.
The set $S = \{0,1,3,4\} + 8\mathbb{Z}$ in Fig. 5 has the isoset $I(S; 6)$ of two isometry
classes of 6-clusters represented by $\{-4,-3,-1,0,4,5\}$ and $\{-3,-2,0,1,5,6\}$
centered at 0. The set $Q = \{0,3,4,5\} + 8\mathbb{Z}$ in Fig. 6 has the isoset $I(Q; 4)$ of
three isometry classes of 4-clusters represented by $\{-4,-3,0,3,4\}$, $\{-3,0,1,2\}$,
$\{-4,-1,0,1,4\}$. To conclude that $S, Q$ are not isometric, Theorem 10 will require
us to compare their isosets at a common stable radius $\alpha \ge 6$. In the above case
it suffices to say that the stabilized cluster counts differ: $2 \ne 3$.
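The isosets in the examples above can be computed by a minimal Python sketch for 1-dimensional sets. It stores an isoset as a dictionary from a canonical cluster (up to the reflection $x \mapsto -x$) to its weight $k/m$; this data layout and the function names are our own choices for illustration.

```python
from collections import Counter
from fractions import Fraction

def cluster(motif, period, a, alpha):
    """Sorted offsets b - a with |b - a| <= alpha in the 1D set motif + period*Z."""
    reach = int(alpha // period) + 2
    pts = (p + period * i for i in range(-reach, reach + 1) for p in motif)
    return tuple(sorted(q - a for q in pts if abs(q - a) <= alpha))

def isometry_class(c):
    """Canonical representative of a 1D cluster up to the reflection x -> -x."""
    return min(c, tuple(sorted(-x for x in c)))

def isoset(motif, period, alpha):
    """Isometry classes of alpha-clusters with weights k/m over the m motif points."""
    counts = Counter(isometry_class(cluster(motif, period, a, alpha)) for a in motif)
    return {cls: Fraction(k, len(motif)) for cls, k in counts.items()}

I_S = isoset([0, 1, 3, 4], 8, 6)  # S from Fig. 5 at radius 6
I_Q = isoset([0, 3, 4, 5], 8, 4)  # Q from Fig. 6 at radius 4

for name, iso in (("S", I_S), ("Q", I_Q)):
    for cls, weight in sorted(iso.items()):
        print(name, cls, weight)
```

The canonical representative may be the mirror image of a cluster listed in the text, since both belong to the same isometry class.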
An equality $\sigma = \xi$ between isometry classes means that there is an isometry
$f$ from a cluster in $\sigma$ to a cluster in $\xi$ so that $f$ respects the centers of the
clusters. This equality is checked in time $O(k^{n-2} \log k)$ for any dimension $n \ge 3$
by [1, Theorem 1(a)], where $k$ is the maximum number of points in the clusters.
Theorem 10 (complete isometry classification of periodic point sets). For any
periodic point sets $S, Q \subset \mathbb{R}^n$, let $\alpha$ be a common stable radius satisfying
Definition 8 for an upper bound $\beta$ of $\beta(S), \beta(Q)$. Then $S, Q$ are isometric if and only
if there is a bijection between their isosets respecting weights: $I(S; \alpha) = I(Q; \alpha)$
means that any isometry class $(\sigma; w) \in I(S; \alpha)$ of a weight $w$ coincides with a
class $(\xi; w) \in I(Q; \alpha)$ of the same weight $w$ and vice versa.
Theoretically a complete invariant of $S$ should include isosets $I(S; \alpha)$ for all
sufficiently large radii $\alpha$. However, when comparing two sets $S, Q$ up to isometry,
it suffices to build their isosets only at a common stable radius $\alpha$.
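The statement of Theorem 10 can be illustrated on 1-dimensional sets by comparing isosets at the common radius $\alpha = 6$, which is stable for both sets from Fig. 5 and 6 with the bound $\beta = 4$. The sketch below is a hypothetical helper, not the algorithm of [1] or [3]. It checks that different representations of the same set give equal isosets, while $S$ and $Q$ give different ones.

```python
from collections import Counter
from fractions import Fraction

def isoset(motif, period, alpha):
    """1D isoset: canonical alpha-clusters (up to reflection) with weights k/m."""
    reach = int(alpha // period) + 2
    points = [p + period * i for i in range(-reach, reach + 1) for p in motif]
    counts = Counter()
    for a in motif:
        c = tuple(sorted(q - a for q in points if abs(q - a) <= alpha))
        counts[min(c, tuple(sorted(-x for x in c)))] += 1
    return {cls: Fraction(k, len(motif)) for cls, k in counts.items()}

alpha = 6
S = isoset([0, 1, 3, 4], 8, alpha)

# The same set S after a shift by 2 and the reflection x -> -x, taken modulo 8.
S_moved = isoset(sorted((-(p + 2)) % 8 for p in [0, 1, 3, 4]), 8, alpha)

# The same set S written with a doubled (non-primitive) cell of length 16.
S_doubled = isoset([0, 1, 3, 4, 8, 9, 11, 12], 16, alpha)

Q = isoset([0, 3, 4, 5], 8, alpha)

print(S == S_moved, S == S_doubled, S == Q)
```

The weights $k/m$ make the comparison independent of the cell size, which is why the doubled cell gives the same dictionary.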
The proposed complete invariant for classification Problem 1 is the function
$S \mapsto I(S; \alpha)$ from any periodic point set $S$ to its isoset at a stable radius $\alpha$, which
doesn’t need to be minimal.
Lemmas 11 and 12 help to extend an isometry between local clusters to full
periodic sets to prove the complete isometry classification in Theorem 10.
Lemma 11 (local extension). Let periodic sets $S, Q \subset \mathbb{R}^n$ have bridge distances
at most $\beta$ and a common stable radius $\alpha$ such that $\alpha$-clusters $C(S, a; \alpha)$ and
$C(Q, b; \alpha)$ are isometric for some $a \in S$, $b \in Q$. Then any isometry
$f : C(S, a; \alpha - \beta) \to C(Q, b; \alpha - \beta)$ extends to an isometry $C(S, a; \alpha) \to C(Q, b; \alpha)$.
Proof. Let $g : C(S, a; \alpha) \to C(Q, b; \alpha)$ be any isometry, which may not coincide
with $f$ on the $(\alpha - \beta)$-subcluster $C(S, a; \alpha - \beta)$. The composition $f^{-1} \circ g$
isometrically maps $C(S, a; \alpha - \beta)$ to itself. Hence $f^{-1} \circ g = h \in \mathrm{Sym}(S, a; \alpha - \beta)$
is a self-isometry. Since the symmetry groups stabilize by condition (8b), the
isometry $h$ maps the larger cluster $C(S, a; \alpha)$ to itself. Then the initial isometry
$f$ extends to the isometry $g \circ h^{-1} : C(S, a; \alpha) \to C(Q, b; \alpha)$ as required.
Lemma 12 (global extension). For any periodic point sets $S, Q \subset \mathbb{R}^n$, let $\alpha$ be
a common stable radius satisfying Definition 8 for an upper bound $\beta$ of both
$\beta(S), \beta(Q)$. Assume that $I(S; \alpha) = I(Q; \alpha)$. Fix a point $a \in S$. Then any local
isometry $f : C(S, a; \alpha) \to C(Q, f(a); \alpha)$ extends to a global isometry $S \to Q$.
Proof. We shall prove that the image $f(a')$ of any point $a' \in S$ belongs to $Q$,
hence $f(S) \subseteq Q$. Swapping the roles of $S$ and $Q$ will prove that $f^{-1}(Q) \subseteq S$,
i.e. $f$ is a global isometry $S \to Q$. By Definition 3 the above points $a, a' \in S$
are connected by a sequence of points $a = a_0, a_1, \dots, a_m = a' \in S$ such that
$|\vec{a}_{i-1} - \vec{a}_i| \le \beta$, $i = 1, \dots, m$, where $\beta$ is an upper bound of both $\beta(S), \beta(Q)$.
The cluster $C(S, a; \alpha)$ is the intersection $S \cap B(a; \alpha)$. The ball $B(a; \alpha)$ contains
the smaller ball $B(a_1; \alpha - \beta)$ around the closely located center $a_1$. Indeed, since
$|\vec{a} - \vec{a}_1| \le \beta$, the triangle inequality for the Euclidean distance implies that any
$c \in B(a_1; \alpha - \beta)$ with $|\vec{a}_1 - \vec{c}| \le \alpha - \beta$ satisfies $|\vec{a} - \vec{c}| \le |\vec{a} - \vec{a}_1| + |\vec{a}_1 - \vec{c}| \le \alpha$.
Due to $I(S; \alpha) = I(Q; \alpha)$ the isometry class of $C(S, a_1; \alpha)$ coincides with
an isometry class of $C(Q, b; \alpha)$ for some $b \in Q$, i.e. $C(S, a_1; \alpha)$ is isometric to
$C(Q, b; \alpha)$. Then the clusters $C(S, a_1; \alpha - \beta)$ and $C(Q, b; \alpha - \beta)$ are isometric.
By condition (8a), the splitting of $Q$ into $\alpha$-equivalence classes coincides
with the splitting into $(\alpha - \beta)$-equivalence classes. Take the $(\alpha - \beta)$-equivalence
class $[C(Q, b; \alpha - \beta)]$ containing $b$. This class includes the point $f(a_1) \in Q$,
because $f$ restricts to the isometry $f : C(S, a_1; \alpha - \beta) \to C(Q, f(a_1); \alpha - \beta)$ and
$C(S, a_1; \alpha - \beta)$ was shown to be isometric to $C(Q, b; \alpha - \beta)$.
The $\alpha$-equivalence class $[C(Q, b; \alpha)]$ includes both $b$ and $f(a_1)$. The isometry
class $[C(Q, b; \alpha)] = [C(S, a_1; \alpha)]$ can be represented by the cluster $C(Q, f(a_1); \alpha)$,
which is now proved to be isometric to $C(S, a_1; \alpha)$.
We apply Lemma 11 for $f$ restricted to $C(S, a_1; \alpha - \beta) \to C(Q, f(a_1); \alpha - \beta)$
and conclude that $f$ extends to an isometry $C(S, a_1; \alpha) \to C(Q, f(a_1); \alpha)$.
Continue applying Lemma 11 to the clusters around the next center $a_2$ and
so on until we conclude that the initial isometry $f$ maps the $\alpha$-cluster centered
at $a_m = a' \in S$ to an isometric cluster within $Q$, so $f(a') \in Q$ as required.
Lemma 13 (all stable radii of a periodic set). If $\alpha$ is a stable radius of a periodic
point set $S \subset \mathbb{R}^n$, then so is any larger radius $\alpha' > \alpha$. Then all stable radii form
the interval $[\alpha(S), +\infty)$, where $\alpha(S)$ is the minimum stable radius of $S$.
Proof. Due to Lemma (7bc), conditions (8ab) imply that the $\alpha'$-partition $P(S; \alpha')$
and the symmetry groups $\mathrm{Sym}(S, a; \alpha')$ remain the same for all $\alpha' \in [\alpha - \beta, \alpha]$.
We need to show that they remain the same for any larger $\alpha' > \alpha$.
Below we will apply Lemma 12 for the same set $S = Q$ and $\beta = \beta(S)$.
Let points $a, b \in S$ be $\alpha$-equivalent, i.e. there is an isometry $f : C(S, a; \alpha) \to
C(S, b; \alpha)$. By Lemma 12 the local isometry $f$ extends to a global self-isometry
$S \to S$ such that $f(a) = b$. Then all larger $\alpha'$-clusters of $a, b$ are isometric,
i.e. $a, b$ are $\alpha'$-equivalent and $P(S; \alpha) = P(S; \alpha')$. Similarly, any self-isometry of
$C(S, a; \alpha)$ extends to a global self-isometry, i.e. the symmetry group $\mathrm{Sym}(S, a; \alpha')$
for any $\alpha' > \alpha$ is isomorphic to $\mathrm{Sym}(S, a; \alpha)$.
Proof of Theorem 10. The part only if follows by restricting any given global
isometry $f : S \to Q$ between the infinite sets of points to the local $\alpha$-clusters
$C(S, a; \alpha) \to C(Q, f(a); \alpha)$ for any point $a$ in a motif $M$ of $S$.
Hence the isometry class $[C(S, a; \alpha)]$ is considered equivalent to the class
$[C(Q, f(a); \alpha)]$, which can be represented by the $\alpha$-cluster $C(Q, b; \alpha)$ centered at
a point $b$ in a motif of $Q$. Since $f$ is a bijection and the point $a \in M$ was arbitrary,
we get a bijection between isometry classes with weights in $I(S; \alpha) = I(Q; \alpha)$.
The part if. Fix a point $a \in S$. The $\alpha$-cluster $C(S, a; \alpha)$ represents a class
with a weight $(\sigma, w) \in I(S; \alpha)$. Due to $I(S; \alpha) = I(Q; \alpha)$, there is an isometry
$f : C(S, a; \alpha) \to C(Q, f(a); \alpha)$ to a cluster from an equal class $(\sigma, w) \in I(Q; \alpha)$.
By Lemma 12 the local isometry $f$ extends to a global isometry $S \to Q$.
6 A discussion of further properties of isosets
This paper has resolved the ambiguity challenge for crystal representations,
which is common for many data objects [3]. Crystal descriptors [15] are often
based on ambiguous unit cells or computed up to a manually chosen cut-off radius.
Representations of 2-periodic textiles [6] should be similarly studied up to periodic
isotopies [3, section 10] without fixing a unit cell. Definition 8 gives conditions
for a stable radius so that larger clusters will not bring any new information.
The recent survey of atomic structure representations [21] confirmed that
there was no complete invariant that distinguishes all crystals up to isometry.
Theorem 10 provides a complete invariant for the first time. The follow-up pa-
per [3] discusses computations and continuity of the new invariant isoset. Isosets
consisting of different numbers of isometry classes will be compared by the Earth
Mover’s Distance [14]. We thank all reviewers for their time and suggestions.
The recent developments in Periodic Geometry include complete classification
of periodic sequences [4,17], continuous maps of Lattice Isometry Spaces
in dimension two [18,7] and three [16,8], and applications to materials science
[23,26]. The latest ultra-fast and generically complete Pointwise Distance
Distributions [24] justified the Crystal Isometry Principle (CRISP) saying that all
real periodic crystals live in a common space of isometry classes of periodic point
sets continuously parameterised by their complete invariants such as isosets.
References
1. Alt, H., Mehlhorn, K., Wagener, H., Welzl, E.: Congruence, similarity, and symmetries
of geometric objects. Discrete & Computational Geometry 3, 237–256 (1988)
2. Andrews, L., Bernstein, H., Pelletier, G.: A perturbation stable cell comparison
technique. Acta Crystallographica A 36(2), 248–252 (1980)
3. Anosova, O., Kurlin, V.: Introduction to periodic geometry and topology.
arXiv:2103.02749 (2021)
4. Anosova, O., Kurlin, V.: Density functions of periodic sequences (2022),
https://arxiv.org/abs/22
5. Bouniaev, M., Dolbilin, N.: Regular and multi-regular t-bonded systems. J. Information
Processing 25, 735–740 (2017)
6. Bright, M., Kurlin, V.: Encoding and topological computation on textile structures.
Computers & Graphics 90, 51–61 (2020)
7. Bright, M., Cooper, A.I., Kurlin, V.: Geographic-style maps for 2-dimensional
lattices. arXiv:2109.10885 (early draft) (2021),
http://kurlin.org/projects/periodic-geometry-topology/lattices2Dmap.pdf
8. Bright, M., Cooper, A.I., Kurlin, V.: Welcome to a continuous world of 3-dimensional
lattices. arXiv:2109.11538 (early draft) (2021),
http://kurlin.org/projects/periodic-geometry-topology/lattices3Dmap.pdf
9. Dolbilin, N., Bouniaev, M.: Regular t-bonded systems in R^3. European Journal of
Combinatorics 80, 89–101 (2019)
10. Dolbilin, N., Lagarias, J., Senechal, M.: Multiregular point systems. Discrete &
Computational Geometry 20(4), 477–498 (1998)
11. Edelsbrunner, H., Heiss, T., Kurlin, V., Smith, P., Wintraecken, M.: The density
fingerprint of a periodic point set. In: Proceedings of SoCG. pp. 32:1–32:16 (2021)
12. Grünbaum, F., Moore, C.: The use of higher-order invariants in the determination
of generalized Patterson cyclotomic sets. Acta Crystallographica A 51, 310–323 (1995)
13. Hahn, T., Shmueli, U., Arthur, J.: Intern. tables for crystallography, vol. 1 (1983)
14. Hargreaves, C.J., Dyer, M.S., Gaultois, M.W., Kurlin, V.A., Rosseinsky, M.J.:
The earth mover’s distance as a metric for the space of inorganic compositions.
Chemistry of Materials (2020)
15. Himanen, L., Jäger, M., Morooka, E., Canova, F., Ranawat, Y., Gao, D., Rinke,
P., Foster, A.: DScribe: Library of descriptors for machine learning in materials
science. Computer Physics Communications 247, 106949 (2020)
16. Kurlin, V.: A complete isometry classification of 3-dimensional lattices.
arXiv:2201.10543 (early draft) (2022),
http://kurlin.org/projects/periodic-geometry-topology/lattices3Dmaths.pdf
17. Kurlin, V.: A computable and continuous metric on isometry classes of
high-dimensional periodic sequences (2022), https://arxiv.org/abs/22
18. Kurlin, V.: Mathematics of 2-dimensional lattices.
https://arxiv.org/abs/2201.05150 (early draft) (2022),
http://kurlin.org/projects/periodic-geometry-topology/lattices2Dmaths.pdf
19. Mosca, M., Kurlin, V.: Voronoi-based similarity distances between arbitrary crystal
lattices. Crystal Research and Technology 55(5), 1900197 (2020)
20. Patterson, A.: Ambiguities in the x-ray analysis of crystal structures. Physical
Review 65, 195 (1944)
21. Pozdnyakov, S., Willatt, M., Bartók, A., Ortner, C., Csányi, G., Ceriotti, M.:
Incompleteness of atomic structure representations. Phys. Rev. Lett. 125, 166001
(2020), arXiv:2001.11696
22. Pulido, A., Chen, L., Kaczorowski, T., Holden, D., Little, M., Chong, S., Slater,
B., McMahon, D., Bonillo, B., Stackhouse, C., Stephenson, A., Kane, C., Clowes,
R., Hasell, T., Cooper, A., Day, G.: Functional materials discovery using
energy–structure–function maps. Nature 543, 657–664 (2017)
23. Ropers, J., Mosca, M.M., Anosova, O., Kurlin, V., Cooper, A.I.: Fast predictions of
lattice energies by continuous isometry invariants of crystal structures. In:
Proceedings of DACOMSIN: Data and Computation for Materials Science and Innovation.
https://arxiv.org/abs/2108.07233
24. Widdowson, D., Kurlin, V.: Pointwise distance distributions of periodic sets (2021),
http://kurlin.org/projects/periodic-geometry-topology/PDD.pdf
25. Widdowson, D., Mosca, M., Pulido, A., Kurlin, V., Cooper, A.: Average
minimum distances of periodic point sets - fundamental invariants for mapping
all periodic crystals. MATCH Communications in Mathematical and
in Computer Chemistry 87, 529–559 (2022),
http://kurlin.org/projects/periodic-geometry-topology/AMD.pdf
26. Zhu, Q., Johal, J., Widdowson, D., Pang, Z., Li, B., Kane, C., Kurlin, V., Day,
G., Little, M., Cooper, A.: Analogy powered by prediction and structural invariants:
Computationally-led discovery of a mesoporous hydrogen-bonded organic
cage crystal. Journal of the American Chemical Society (to appear)