International Journal for Numerical Methods in Engineering manuscript No.
(will be inserted by the editor)

Molecular dynamics inferred neural network models for finite-strain hyperelasticity of monoclinic crystals: Sobolev training and validations against physical constraints

Nikolaos N. Vlassis · Puhan Zhao · Ran Ma · Tommy Sewell · WaiChing Sun

January 30, 2022

Abstract

We present a machine learning framework to train and validate neural networks to predict the anisotropic elastic response of a monoclinic organic molecular crystal known as Octogen (β-HMX) in the geometrically nonlinear regime. A filtered database of molecular dynamics (MD) simulations is used to train the neural networks with a Sobolev norm that uses the stress measure and a reference configuration to deduce the elastic stored energy functional. To improve the accuracy of the elasticity tangent predictions originating from the learned stored energy, a transfer learning technique is used to introduce additional tangential constraints from the data, while necessary conditions (e.g. strong ellipticity, crystallographic symmetry) for the correctness of the model are either introduced as additional physical constraints or incorporated in the validation tests. Assessment of the neural networks is based on (1) the accuracy with which they reproduce the bottom-line constitutive responses predicted by MD, (2) the robustness of the models measured by detailed examination of their stability and uniqueness, and (3) the admissibility of the predicted responses with respect to mechanics principles in the finite-deformation regime. We compare the neural networks' training efficiency under different Sobolev constraints and assess the models' accuracy and robustness against MD benchmarks for β-HMX.

Keywords HMX, molecular dynamics, Sobolev training, hyperelasticity, deep learning

Nikolaos N. Vlassis, Ran Ma, WaiChing Sun (corresponding author)
Department of Civil Engineering and Engineering Mechanics, Columbia University, New York, New York
Puhan Zhao, Tommy Sewell
Department of Chemistry, University of Missouri, Columbia, Missouri

1 Introduction

Plastic-bonded explosives (PBXs) are highly filled polymer composites in which crystallites of one or more energetic constituents are held together by a continuous polymeric binder phase. Detonation initiation in PBXs is often achieved by transmitting a mechanical shock wave into the explosive charge. Shock passage leads to an abrupt increase in stress, strain, and temperature in the material. In thermodynamic terms, the magnitude of the increase of these properties is given by the Hugoniot jump relations, which yield the locus of thermodynamic states immediately behind the shock discontinuity as a function of the input shock strength (with a parametric dependence on the initial thermodynamic state of the material). The elastic properties of the constituents in a PBX play an important role in determining the states on the Hugoniot locus. The most obvious connection is their appearance in the reactant equation of state (EOS). For a useful summary, see Hooks et al. [1]. The isotropic EOS can be built around the isothermal compression curve, typically by fitting $V = V(P)$ to the 3rd-order Birch-Murnaghan (B-M) equation of state or some other convenient functional form at room temperature or zero kelvin. For the B-M EOS, the fitting variables are the bulk modulus $K$ and its initial pressure derivative $K'$. More comprehensive models may account for crystal elastic anisotropy by incorporating the full elasticity tensor. The advantage of incorporating the full elasticity tensor is the higher-fidelity description of the elastic response. However, identifying the necessary material parameters may require solving inverse problems under shock conditions with precise measurement of the pressure and temperature dependence of the elastic coefficients. Hence, the material parameters are often inferred from the results of molecular dynamics simulations instead of experiments [2]. Furthermore, the possible coupling between the volumetric and deviatoric responses may make it even more difficult to formulate the proper inverse problem and determine the optimal set of parameters [3,4,5].
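
As a point of reference for the isotropic EOS mentioned above, the following minimal sketch evaluates the standard third-order Birch-Murnaghan pressure-volume relation. The function name and the numerical values of the reference volume, $K$, and $K'$ are hypothetical placeholders chosen only for illustration; they are not fitted to any HMX data and are not taken from this work.

```python
def birch_murnaghan_pressure(volume, v0, k0, k0_prime):
    """Third-order Birch-Murnaghan isothermal pressure P(V).

    v0       : reference (zero-pressure) volume
    k0       : bulk modulus K at zero pressure
    k0_prime : initial pressure derivative K' of the bulk modulus
    """
    x = (v0 / volume) ** (1.0 / 3.0)  # inverse linear stretch
    return 1.5 * k0 * (x**7 - x**5) * (1.0 + 0.75 * (k0_prime - 4.0) * (x**2 - 1.0))


# Hypothetical illustrative parameters (NOT fitted HMX values).
V0, K0, K0_PRIME = 1.0, 15.0, 9.0

# Pressure at the reference volume and at 5 % and 10 % volumetric compression.
pressures = [birch_murnaghan_pressure(r * V0, V0, K0, K0_PRIME) for r in (1.0, 0.95, 0.90)]
```

In a fit, $K$ and $K'$ would be adjusted so that this curve matches the isothermal compression data; the relation by itself carries no information about elastic anisotropy, which is the limitation the full elasticity tensor addresses.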

The substance octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine (HMX, also called octogen due to the symmetry of the molecular structure) is the energetic constituent in many PBXs. HMX exhibits several crystal polymorphs [6]. The thermodynamically stable form on the 300 K isotherm, for pressures between 0 and approximately 30 GPa, is known as β-HMX, for which the crystal structure is monoclinic with a unit cell containing two molecules [7]. Numerous theoretical studies of HMX physical properties and thermo-mechanical response to shocks have been reported; we do not discuss them here, but Das et al. [8] provide a recent entry point into that literature. All MD simulations discussed below were performed for β-HMX in the $P2_1/n$ space group setting.

Previous work, such as Pereverzev and Sewell [2] for the case of β-HMX, has obtained pressure- and temperature-dependent elastic coefficients by applying small strain increments to a sample at thermal equilibrium at the desired thermodynamic state and determining the corresponding stress and elasticity tangential tensor at that state. Here we assume that the finite strain elasticity is that of a Green elastic material or hyperelastic material [9,10]. As such, we postulate that (1) the state of the stress in the current configuration can be solely determined by the state of the deformation of the current configuration relative to one choice of a reference configuration, such as the crystal lattice vectors at (300 K, 1 atm), and (2) there exists an elastic stored energy functional whose derivative with respect to the strain measure is the energy-conjugate stress measure. Compared with the former approach, which tabulates the elasticity tensor at prescribed states for a given pressure and temperature, the hyperelasticity approach has several distinct advantages. First, the predictions of the elastic strain energy, stress measure, and elastic tangent are all bundled together into one scalar-valued tensor function, instead of separate calculations for stress and elastic tangent that might not be consistent with each other. Second, unlike the more widely used tabular approach, the hyperelasticity model does not require pressure as an input to predict elastic constitutive responses, which makes consistency easier to maintain. Finally, by assuming the existence of such an elastic stored energy, the stability and uniqueness of the constitutive responses, as well as other attributes such as convexity, material frame indifference, and symmetry, can be more easily analyzed mathematically [10,3].

Nevertheless, with a few exceptions, such as Holzapfel and Ogden [11], Holzapfel et al. [12], and Latorre and Montáns [13], the majority of hyperelasticity models are limited to isotropic materials or materials of simple symmetry, such as transversely isotropic and orthotropic materials. Hyperelastic models for materials of lower symmetry, such as monoclinic or triclinic, are less common [14]. This can be attributed to the fact that the strain and the stress for anisotropic materials are not necessarily co-axial; handcrafting a mathematical expression for the energy functional that leads to accurate predictions of stress and tangent therefore becomes a challenging task. Alternatively, predictions of elastic responses can also be established via a Gaussian process to generate constitutive laws (e.g. Frankel et al. [15], Wang et al. [16], and Fuhg and Bouklas [17]). This family of non-parametric approaches is out of the scope of this study but will be considered in the future.

To overcome this technical barrier, we introduce a transfer learning approach that generates a neural network model for the hyperelastic response of β-HMX from molecular dynamics (MD) simulations. Our new contributions, to the best knowledge of the authors, are listed below:

1. Traditional supervised learning approaches often employ objective/loss functions that match the stress-strain responses [18,19,20,21], the elastic stored energy [22,23], or the energy, stress, or elastic tangent fields [24,25], with the raw data considered as the ground truth. Many of these supervised learning models are obtained via training deep neural networks. This direct approach, however, is not suitable for MD data, where the change from one state to another leads to fluctuations that make direct Sobolev training unproductive [26]. To overcome this problem, we introduce a pre-training step in which the data are pre-processed through a filter and the underlying non-fluctuating patterns are extracted to train the neural network models.

2. We introduce a transfer learning approach where additional desirable attributes (e.g. frame invariance) and necessary conditions for the correctness of the constitutive laws (e.g. material symmetry) can be enforced with a simple re-training.

3. We also introduce a post-training validation procedure where the focus is not only on predicting stress-strain responses but also on the desirable properties of the elastic tangential operator. To compare with the previous literature, which employs measures in the geometrically linear regime to quantify anisotropy, we introduce a reverse mapping [27] that generates the infinitesimal-strain tangent from its finite-strain counterpart. With these metrics available, we can examine the convexity and strong ellipticity of the learned function and also evaluate whether the predicted constitutive responses exhibit the same evolution of anisotropy as the MD benchmark, while ensuring that the filtering process does not lead to non-physical responses at the continuum level. The accuracy of the model is assessed by comparing MD-simulated and learned stresses as functions of strain, and by comparing the pressure-dependent tangent stiffness from the learned model against explicit predictions of the elastic tensor reported recently [2] for β-HMX states on the 300 K hydrostatic isothermal compression curve. The latter comparison, in particular, provides an incisive test of the accuracy of the learned functional, as this information was not used explicitly as part of the training set.

The rest of the paper is organized as follows. We first provide a brief account of the database generation procedure, including pertinent details of the MD simulations, the procedure to generate stress-strain data from the MD predictions, and the procedure to filter out the high-frequency responses (Section 2). We briefly review the setup of our hyperelastic model (Section 3) and then outline the major ingredients for the supervised learning of the hyperelastic energy functional, including the Sobolev training, the Hessian sampling techniques for controlling the higher-order derivatives, and the way to incorporate the physical constraints in the training procedure (Section 4). This section is followed by the validation procedure that tests the attributes of the learned hyperelasticity models with physical constraints not included in the training problems (Section 5). The results of the numerical experiments are reported in Section 6, followed by concluding remarks in Section 7.

As for notations and symbols, bold-faced and blackboard bold-faced letters denote tensors (including vectors, which are rank-one tensors); the symbol '·' denotes a single contraction of adjacent indices of two tensors (e.g., $\boldsymbol{a}\cdot\boldsymbol{b} = a_i b_i$ or $\boldsymbol{c}\cdot\boldsymbol{d} = c_{ij} d_{jk}$); the symbol ':' denotes a double contraction of adjacent indices of tensors of rank two or higher (e.g., $\mathbb{C}:\boldsymbol{\varepsilon} = C_{ijkl}\,\varepsilon_{kl}$); the symbol '⊗' denotes a juxtaposition of two vectors (e.g., $\boldsymbol{a}\otimes\boldsymbol{b} = a_i b_j$) or two symmetric second-order tensors [e.g., $(\boldsymbol{\alpha}\otimes\boldsymbol{\beta})_{ijkl} = \alpha_{ij}\beta_{kl}$]. We also define identity tensors: $\boldsymbol{I} = \delta_{ij}$, $\mathbb{I} = \delta_{ik}\delta_{jl}$, and $\bar{\mathbb{I}} = \delta_{il}\delta_{jk}$, where $\delta_{ij}$ is the Kronecker delta. We denote the Eulerian coordinates as $\{x_1, x_2, x_3\}$ and the corresponding three orthogonal basis vectors as $\boldsymbol{e}_1$, $\boldsymbol{e}_2$, and $\boldsymbol{e}_3$. As for sign conventions, unless specified otherwise, the directions of the tensile stress and dilative pressure are considered positive.
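
The index conventions above can be checked numerically. The sketch below builds a fourth-order tensor from the three identity tensors and applies the double contraction to a symmetric second-order tensor. The isotropic form and the constants `lam` and `mu` are assumptions introduced purely to exercise the notation; β-HMX is monoclinic and is not described by such a tensor.

```python
def delta(i, j):
    """Kronecker delta."""
    return 1.0 if i == j else 0.0


R = range(3)
lam, mu = 2.0, 1.0  # hypothetical constants, for illustration only

# C_ijkl = lam * (I ⊗ I)_ijkl + mu * (II_ijkl + IIbar_ijkl), using
# (I ⊗ I)_ijkl = d_ij d_kl,  II_ijkl = d_ik d_jl,  IIbar_ijkl = d_il d_jk.
C = [[[[lam * delta(i, j) * delta(k, l)
        + mu * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
        for l in R] for k in R] for j in R] for i in R]

# A symmetric second-order tensor (arbitrary small strains).
eps = [[0.010, 0.002, 0.000],
       [0.002, -0.003, 0.001],
       [0.000, 0.001, 0.005]]

# Double contraction: (C : eps)_ij = C_ijkl eps_kl.
sigma = [[sum(C[i][j][k][l] * eps[k][l] for k in R for l in R) for j in R] for i in R]

# For symmetric eps the same result in closed form is lam * tr(eps) * I + 2 * mu * eps.
trace = sum(eps[i][i] for i in R)
closed_form = [[lam * trace * delta(i, j) + 2.0 * mu * eps[i][j] for j in R] for i in R]
```

The agreement between `sigma` and `closed_form` relies on `eps` being symmetric, since only then do $\mathbb{I}$ and $\bar{\mathbb{I}}$ act identically on it.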

2 Database generation via molecular dynamics simulations

In this section, we discuss the specifics of the MD simulation setup used to generate the database used for the hyperelastic energy functional discovery. We provide a theoretical background for the simulations as well as details on the system setup. We demonstrate the output results for the simulations and describe the post-processing procedure to render them suitable for our machine learning algorithms.

Training data for the neural networks are obtained by computing the Cauchy stress tensor for isothermal samples as functions of imposed tensorial strains. The strains used correspond variously to uniaxial compression or tension, pure shear, and combination strains. The imposed strains are restricted to states below the threshold for mechanical failure of β-HMX as predicted by the MD. By learning the underlying free-energy functional, we can extract the hyperelastic response from second-order and higher-order strain derivatives. Note that whereas the MD reflects the underlying free energy, it does not yield the energy functional in a simply computable way.

2.1 Force field

The MD simulations were performed using LAMMPS [28] in conjunction with a modified version of the all-atom, fully flexible, non-reactive force field originally developed for HMX by Smith and Bharadwaj (S-B) [29,30,31,32,33]. Intramolecular interactions are modeled using harmonic functions for covalent bonds, three-center angles, and improper dihedral ("wag") angles, and truncated cosine expansions for proper dihedrals. Intermolecular non-bonded interactions between atoms separated by three or more covalent bonds are modeled using Buckingham-plus-charge (exponential-6-1) pair terms. Here and in Refs. [34,35,8], a steep repulsive pair potential was incorporated between non-bonded atom pairs to prevent 'overtopping' of the exponential-6-1 potential at short non-bonded separations $R$, which can occur under shock-wave loading due to the global maximum in the potential at distances of approximately 1 Å with a divergence to negative infinity as $R \to 0$. Dispersion and Coulomb pair terms were evaluated using the particle-particle particle-mesh (PPPM) k-space method [36] with a cutoff value of 11 Å and with the PPPM precision set to $10^{-6}$.

2.2 MD Simulation cell setup

Three-dimensionally periodic (3-D) primary simulation cells were generated starting from the unit-cell lattice parameters for β-HMX ($P2_1/n$ space group setting) predicted by the force field (at 300 K and 1 atm), by simple replication of the unit cell in 3-D space. This results in a monoclinic-shaped primary simulation cell. The mapping of the crystal frame to the Cartesian lab frame is $\boldsymbol{a} \parallel \hat{x}$, $\boldsymbol{b} \parallel \hat{y}$, and $\boldsymbol{c}$ in the $+z$ space. Starting primary cell sizes for the uniaxial compressive and uniaxial tensile deformation cases were approximately 30 nm parallel to the strain direction and approximately 10 nm transverse to it; those for pure shear deformation were approximately 10 nm × 10 nm × 10 nm; and those for biaxial compression were approximately 30 nm × 30 nm × 30 nm. Figure 1 depicts a unit cell of β-HMX and snapshots of representative simulation cells prior to the beginning of deformation. Table 1 contains details of the system sizes used.

Fig. 1: Unit cell of β-HMX (panel (a)) and snapshots of representative simulation cells for (b) uniaxial compressive and tensile deformation, (c) shear deformation, and (d) biaxial compression. Cyan for carbon, navy for nitrogen, red for oxygen, and white for hydrogen.

Table 1: System sizes for uniaxial compressive and tensile deformation, pure shear deformation, and biaxial compression production simulations.

Simulation                     Lx (nm)   Ly (nm)   Lz (nm)   Number of molecules
Compression/Tension along x̂    30.3      10.5      10.6      12,880
Compression/Tension along ŷ    10.5      30.3      10.6      12,992
Compression/Tension along ẑ    10.5      10.5      30.4      12,800
Shear deformation              10.5      10.5      10.6      4,480
Biaxial compression            30.3      30.3      30.4      106,720

2.3 Simulation details

MD trajectories were propagated using the velocity Verlet integrator in LAMMPS [37,38]. Primary cells constructed as described in the preceding subsection were thermally equilibrated in the isochoric-isothermal (NVT) ensemble at 300 K by initially selecting atomic velocities from the 300 K Maxwell distribution, followed by 20 ps of trajectory integration. Temperature control was achieved using the Nosé-Hoover thermostat [39,40] as implemented in LAMMPS, with the damping parameter set to 50.0 fs. A 0.2 fs time step was used for the thermal equilibration.

Fifteen isothermal MD production simulations, comprising three apiece for uniaxial compression, uniaxial tension, and biaxial compression, and six for pure shear (i.e., positive and negative shear directions for three distinct shear cases), were performed at $T = 300$ K using NVT integration in conjunction with the LAMMPS fix deform command. The integration time step was 0.20 fs and the thermostat damping parameter was set to 20.0 fs. The system potential energy, temperature, pressure, Cauchy stress-tensor components, and primary cell lattice vectors were recorded at 10 fs intervals for subsequent analysis.

For the uniaxial compressive and tensile deformation simulations, the prescribed strain was applied parallel to the long direction of the primary cell while holding both the transverse cell lengths and the tilt factors constant. The strain rate was set to the constant value ±0.1/100 ps, applied uniformly at each time step. The uniaxial deformation simulations were performed for 300 ps, resulting in a total strain of 0.3 for those cases.

For the shear simulations, the system was deformed along one of the three tilt factors (i.e., xy, xz, and yz) while the cell edge lengths were maintained at constant values. A constant strain rate of ±0.1/100 ps was applied for 300 ps, resulting in total positive or negative shear strains of 0.3.

For the biaxial compression simulations, the primary cell was compressed along two axes simultaneously in the lab frame (i.e., $x$ and $y$, $y$ and $z$, or $x$ and $z$) while holding the third cell length and the tilt factors constant. The strain rate was set to ±0.05/100 ps along both directions. Trajectory integration was performed for 300 ps, resulting in a strain of 0.15 along each of the two affected directions.
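
The strain totals and trajectory sizes implied by these protocols follow from simple arithmetic, sketched below. The helper name `total_strain` is introduced here for illustration; the rates, durations, time step, and output interval are the values quoted in this section.

```python
PS_TO_FS = 1000.0  # femtoseconds per picosecond


def total_strain(rate_per_100ps, duration_ps):
    """Accumulated strain magnitude for a constant strain rate given per 100 ps."""
    return rate_per_100ps * duration_ps / 100.0


DURATION_PS = 300.0   # length of every production run
TIME_STEP_FS = 0.20   # integration time step
OUTPUT_EVERY_FS = 10.0  # interval at which stresses and cell vectors are recorded

uniaxial_strain = total_strain(0.1, DURATION_PS)   # uniaxial compression or tension
shear_strain = total_strain(0.1, DURATION_PS)      # each tilt-factor shear case
biaxial_strain = total_strain(0.05, DURATION_PS)   # per direction, biaxial compression

# Number of integration steps and of recorded observations per production run.
n_steps = round(DURATION_PS * PS_TO_FS / TIME_STEP_FS)
n_records = round(DURATION_PS * PS_TO_FS / OUTPUT_EVERY_FS)
```

Each production run therefore spans 1,500,000 integration steps and yields about 30,000 recorded stress states, of which only those preceding failure are retained for training.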

2.3.1 MD results

Figure 2 contains the system potential energy, pressure, Cauchy stress-tensor components, and lattice vectors vs. time for the case of uniaxial compressive deformation along $\hat{y}$. The effects of deformation are evident in the potential energy and stress-tensor components (panels (a) and (c)), where it can be seen that the sample yields at $t \approx 190$ ps. Data collected from the beginning of the simulations up to approximately 10 ps before failure were used to train the energy functional.

The Cauchy stress is obtained from the standard LAMMPS command, whose documentation gives the corresponding expression (cf. [41]).

2.4 Filtering MD simulation data

The raw data from the MD simulations are not expected to be smooth, due to thermal fluctuations. These fluctuations may depend on the thermostat employed and the size of the system. They are, however, not supposed to be captured by the hyperelasticity energy functional, which is only designed to capture the macroscopic constitutive responses.

Fig. 2: From MD, system (a) potential energy, (b) pressure, (c) Cauchy stress-tensor components, and (d) lattice vectors vs. time for uniaxial compressive deformation along $\hat{y}$.

To deal with the MD data, we can either introduce a regularization process during the machine learning training or we can simply filter out the Gaussian noise that might otherwise affect the convexity and therefore the stability of the hyperelasticity model.

While one can filter the Cauchy stress tensor on a component-by-component basis, such a strategy may lead to a filtered Cauchy stress that depends on the coordinate system. Thus, this strategy should be avoided. While there are potentially more sophisticated techniques for filtering tensorial and multi-dimensional data (e.g. Muti and Bourennane [42]), here we introduce a spectral decomposition on the Cauchy stress such that

$$\boldsymbol{\sigma} = \sum_{a=1}^{3} \sigma_a \, \boldsymbol{n}_a \otimes \boldsymbol{n}_a. \tag{1}$$

Following this step, a 1D moving average filter is applied to each of the eigenvalues of the Cauchy stress and to the Euler angles that represent the orthogonal basis vectors $\boldsymbol{n}_a$. To remove the noise, we used a 1D uniform filter on the data series that works similarly to a rolling-average window. The temporal length of the filter window is 3 ps (300 MD observations). This length is selected by manual trial and error such that we may suppress the noise of the tensorial time series without greatly distorting the global recorded constitutive response. Note that highly fluctuating stress data may not only increase the difficulty of Sobolev training of the hyperelasticity energy functional but also affect the stability of the constitutive responses at the continuum scale. Hence, this preliminary step is necessary.
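
The smoothing step can be illustrated on a single scalar series, such as one principal stress or one Euler angle. The sketch below is a plain-Python centered moving average applied to synthetic data; the function name, the synthetic linear trend, the noise amplitude, and the treatment of the series ends are all assumptions made for illustration and do not reproduce the exact filter implementation used for the MD database.

```python
import math
import random


def uniform_filter_1d(series, window):
    """Centered moving average; the window shrinks symmetrically near both ends."""
    half = window // 2
    n = len(series)
    smoothed = []
    for i in range(n):
        h = min(half, i, n - 1 - i)  # keep the window symmetric about sample i
        chunk = series[i - h:i + h + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def rms_error(a, b):
    """Root-mean-square difference between two equally long series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


OUTPUT_EVERY_FS = 10.0                             # recording interval of the MD output
WINDOW = round(3.0 * 1000.0 / OUTPUT_EVERY_FS)     # 3 ps window -> 300 observations

# Synthetic stand-in for one principal stress: linear trend plus Gaussian noise.
random.seed(0)
trend = [-0.0004 * i for i in range(3000)]
noisy = [t + random.gauss(0.0, 0.1) for t in trend]

filtered = uniform_filter_1d(noisy, WINDOW)
rms_raw = rms_error(noisy, trend)
rms_filtered = rms_error(filtered, trend)
```

Because the window is kept symmetric, a purely linear trend passes through unchanged, so on this synthetic series the filter removes fluctuations without biasing the underlying response. Filtering the principal values and the orientation of the principal axes, rather than individual Cartesian components, is what keeps the result independent of the coordinate system.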

To examine whether the filter introduces significant bias to the filtered data, we apply our filtering procedure to two MD simulations with the same strain path but initiated from different initial conditions and using different values for the thermostat coupling parameter. The filtered and unfiltered constitutive responses are compared for both cases, as shown in Fig. 3. The two MD simulations demonstrate different fluctuation patterns, but the filtered responses are very close. The uniform filter used to process the data appears to capture almost identical behaviors for both simulations.

[Figure 3 panels: S11 (GPa) and S12 (GPa) versus E11; curves: MD Simulation A, Filtered Data A, MD Simulation B, Filtered Data B.]

Fig. 3: Filtering of MD simulation data with a uniform filter for a compression test along the $x_1$ axis. The filtering is performed for two MD simulations with different thermostat coupling parameters and thus different RMS fluctuations about the local mean value of the stress along the trajectories.

3 Finite strain hyperelastic neural network functional for β-HMX

In this work, we will approximate a finite strain hyperelastic energy functional for β-HMX using a feed-forward neural network architecture trained with a modified Sobolev training loss function that incorporates additional physical constraints via a transfer learning technique.

The following assumptions and setup have been made:

1. There exists one reference configuration for the β-HMX for which the stored elastic energy is zero. This configuration constitutes the reference configuration for the deformation mapping.
2. We assume that all the data used in the training are purely elastic with no path dependence.
3. Thermo-mechanical and rate-dependence effects on the elasticity are neglected.
4. A filter is used to reduce the high-frequency responses.

The stored energy functional $\bar{\psi}$ can be written as a function of the deformation gradient $\boldsymbol{F}$. The first Piola-Kirchhoff stress $\boldsymbol{P}$ is conjugate to the deformation gradient $\boldsymbol{F}$ and can be obtained from the following relation,

$$\boldsymbol{P}(\boldsymbol{F}) = \frac{\partial \bar{\psi}(\boldsymbol{F})}{\partial \boldsymbol{F}}. \tag{2}$$

Notice that a necessary condition for this energy functional to be correct is material-frame indifference. The deformation gradient is already insensitive to rigid-body translation. However, to ensure invariance under rotations in $SO(3)$, the machine-learning-generated energy functional must satisfy the following constraint,

$$\bar{\psi}(\boldsymbol{F}) = \bar{\psi}(\boldsymbol{Q}\boldsymbol{F}), \quad \forall\, \boldsymbol{Q} \in SO(3). \tag{3}$$
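
A possible way to bypass the need to introduce additional constraints in the loss function is to derive the energy functional as a function of the Green strain tensor $\boldsymbol{E}$, for which, with $\boldsymbol{F}' = \boldsymbol{Q}\cdot\boldsymbol{F}$:

$$\boldsymbol{E}' = \frac{1}{2}(\boldsymbol{C}' - \boldsymbol{I}) = \frac{1}{2}(\boldsymbol{F}'^{T}\cdot\boldsymbol{F}' - \boldsymbol{I}) = \frac{1}{2}(\boldsymbol{F}^{T}\cdot\boldsymbol{Q}^{T}\cdot\boldsymbol{Q}\cdot\boldsymbol{F} - \boldsymbol{I}) = \frac{1}{2}(\boldsymbol{F}^{T}\cdot\boldsymbol{F} - \boldsymbol{I}) = \frac{1}{2}(\boldsymbol{C} - \boldsymbol{I}) = \boldsymbol{E}, \tag{4}$$

so we then acquire an equivalent expression:

$$\bar{\psi}(\boldsymbol{F}) = \psi(\boldsymbol{E}). \tag{5}$$

The second Piola-Kirchhoff stress $\boldsymbol{S}$ is conjugate to the Green strain $\boldsymbol{E}$ and is derived as:

$$\boldsymbol{S}(\boldsymbol{E}) = \frac{\partial \psi}{\partial \boldsymbol{E}}. \tag{6}$$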
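
The invariance expressed by Eq. (4) can be verified numerically. The sketch below, which uses only small hand-written 3×3 helpers, computes the Green strain for a deformation gradient before and after superposing a rotation. The particular values of the deformation gradient and of the rotation angles are arbitrary choices made for illustration.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def green_strain(f):
    """E = (F^T F - I) / 2."""
    c = matmul(transpose(f), f)
    return [[0.5 * (c[i][j] - (1.0 if i == j else 0.0)) for j in range(3)] for i in range(3)]


def rotation_about_z(theta):
    cs, sn = math.cos(theta), math.sin(theta)
    return [[cs, -sn, 0.0], [sn, cs, 0.0], [0.0, 0.0, 1.0]]


def rotation_about_x(theta):
    cs, sn = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, cs, -sn], [0.0, sn, cs]]


# Arbitrary deformation gradient and proper rotation (illustrative values).
F = [[0.97, 0.03, 0.00],
     [0.00, 1.02, 0.01],
     [0.00, 0.00, 0.95]]
Q = matmul(rotation_about_z(0.7), rotation_about_x(-1.2))
F_rotated = matmul(Q, F)

E = green_strain(F)
E_rotated = green_strain(F_rotated)

# Largest component-wise change in the strain and in the deformation gradient.
max_strain_change = max(abs(E[i][j] - E_rotated[i][j]) for i in range(3) for j in range(3))
max_f_change = max(abs(F[i][j] - F_rotated[i][j]) for i in range(3) for j in range(3))
```

The deformation gradient changes substantially under the rotation while the Green strain is unchanged to round-off, which is why a network taking $\boldsymbol{E}$ as input is frame-indifferent by construction, whereas one taking $\boldsymbol{F}$ must be trained to respect Eq. (3).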
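
The transformations between the two stress measures $\boldsymbol{P}$ and $\boldsymbol{S}$ and the Cauchy stress tensor $\boldsymbol{\sigma}$ as recorded by the MD simulations are defined as:

$$\boldsymbol{P} = J\boldsymbol{\sigma}\boldsymbol{F}^{-T}, \quad \boldsymbol{S} = J\boldsymbol{F}^{-1}\boldsymbol{\sigma}\boldsymbol{F}^{-T}, \quad \text{and} \quad \boldsymbol{S} = \boldsymbol{F}^{-1}\boldsymbol{P}, \tag{7}$$

where $J$ is the determinant of the deformation gradient $\boldsymbol{F}$.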
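
A minimal sketch of the conversions in Eq. (7) is given below, again with hand-written 3×3 helpers so that it has no dependencies. The deformation gradient and the Cauchy stress values are made-up inputs used only to exercise the formulas; in practice they would come from the recorded cell vectors and the filtered MD stresses.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def inv3(m):
    """Inverse via the cofactor matrix (cyclic-index form), inverse = cof^T / det."""
    d = det3(m)
    cof = [[m[(i + 1) % 3][(j + 1) % 3] * m[(i + 2) % 3][(j + 2) % 3]
            - m[(i + 1) % 3][(j + 2) % 3] * m[(i + 2) % 3][(j + 1) % 3]
            for j in range(3)] for i in range(3)]
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]


def scale(s, m):
    return [[s * m[i][j] for j in range(3)] for i in range(3)]


def piola_kirchhoff_stresses(sigma, f):
    """Return (P, S) with P = J sigma F^-T and S = J F^-1 sigma F^-T."""
    j = det3(f)
    f_inv = inv3(f)
    p = scale(j, matmul(sigma, transpose(f_inv)))
    s = matmul(f_inv, p)  # equivalent to S = F^-1 P
    return p, s


# Illustrative inputs (not MD data): deformation gradient and symmetric Cauchy stress in GPa.
F = [[0.95, 0.02, 0.00],
     [0.00, 1.00, 0.01],
     [0.00, 0.00, 1.03]]
sigma = [[-1.20, 0.10, 0.00],
         [0.10, -0.40, 0.05],
         [0.00, 0.05, -0.30]]

P, S = piola_kirchhoff_stresses(sigma, F)

# Push S forward again to recover the Cauchy stress: sigma = (1/J) F S F^T.
sigma_back = scale(1.0 / det3(F), matmul(matmul(F, S), transpose(F)))
```

Here $\boldsymbol{S}$ comes out symmetric whenever $\boldsymbol{\sigma}$ is, while $\boldsymbol{P}$ generally does not, which is one reason the $\boldsymbol{S}$-$\boldsymbol{E}$ pair needs fewer independent labels.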

In addition to the frame invariance, another major benefit of expressing the energy functional in terms of the Green strain tensor is that the resultant stress measure is symmetric and the elastic tangential operator possesses both major and minor symmetries. These symmetries reduce the dimension of the input parametric space from 9 to 6 and hence simplify the training. Furthermore, while $\boldsymbol{C}$ and $\boldsymbol{E}$ can both be used as the input for the inherently frame-indifferent energy functional that yields $\boldsymbol{S}$ as the first derivative, $\boldsymbol{E} = \boldsymbol{0}$ implies that the energy functional becomes zero. Meanwhile, training $\bar{\psi}(\boldsymbol{F})$ as the learned function can be more convenient for implicit total Lagrangian solvers, where the tangent corresponding to the $\boldsymbol{P}$-$\boldsymbol{F}$ pair is required to solve the linearized system of equations.

As such, we will train two hyperelasticity functionals, $\bar{\psi}(\boldsymbol{F})$ and $\psi(\boldsymbol{E})$, which take the deformation gradient and the Green strain tensor as inputs, respectively. We will then compare the results obtained from numerical experiments. The relationships among elasticity tangential tensors corresponding to different stress-strain conjugate pairs will also be discussed in Section 5.

Remark 1 It should be noted that there are other feasible choices of input, such as the cofactors of the deformation gradient or strain invariants, that may ensure material symmetry [12], guarantee polyconvexity [43], and ensure material frame indifference. Our choices of directly using the deformation gradient and the Green strain tensor are mainly for convenience and ease of implementation. In the case of $\bar{\psi}(\boldsymbol{F})$, the training procedure is more complicated due to the necessity of enforcing frame invariance. However, direct access to the tangential stiffness of the $\boldsymbol{P}$-$\boldsymbol{F}$ conjugate pair may simplify the implementation of total Lagrangian code.

4 Stress-based Sobolev training for stored-energy function

We introduce a neural network training technique that constructs the hyperelasticity energy functional using solely the stress data and a single reference configuration where $\boldsymbol{F} = \boldsymbol{I}$. Recall that a feed-forward neural network can be trained to approximate an energy functional $\psi$ that takes the Green-Lagrange strain tensor $\boldsymbol{E}$ as input. This energy function is parametrized by weights $\boldsymbol{W}$ and biases $\boldsymbol{b}$. The supervised learning that minimizes the squared $L_2$ norm of the difference between the true $\psi$ and the approximated $\hat{\psi}$ for $N$ data samples can be written as

$$\boldsymbol{W}', \boldsymbol{b}' = \underset{\boldsymbol{W},\boldsymbol{b}}{\operatorname{argmin}} \left( \frac{1}{N} \sum_{i=1}^{N} \left\| \psi_i - \hat{\psi}_i \right\|_2^2 \right), \tag{8}$$

where $\psi_i = \psi(\boldsymbol{E}_i)$ and $\hat{\psi}_i = \hat{\psi}(\boldsymbol{E}_i)$. While this approach could reduce the discrepancy between the predicted and true free energy values if the energy data are available, minimizing the energy discrepancy does not guarantee that the stress predictions are accurate.

In principle, calculating the Helmholtz free energy from the detailed atomistic configurations is possible [44]. However, in this work, our focus is on the cases where we have no direct access to sufficient Helmholtz free-energy data. In the following numerical experiments on β-HMX, we instead only use a reference configuration as well as the stress data collected from multiple deformed configurations to reconstruct the elastic stored energy. Consequently, we introduce two trained neural networks that take a proper strain measure as input and output the elastic stored energy. The Sobolev training then attempts to adjust the weights and biases of the neurons such that the derivatives of the stored energy match the stress measures conjugate to the input strain measure. In other words, we show that it is possible to have labels used for the training that are different from the input and output of the neural network. This flexibility proves useful for tasks in which not all data are necessarily available or of sufficient fidelity.

4.1 Sobolev constraints for the hyperelastic energy functional

To introduce a hyperelasticity model suitable to incorporate into numerical solvers for boundary value problems, the accuracy, stability, robustness, smoothness, and uniqueness of the hyperelasticity responses are all important. Unlike neural networks that directly generate stress predictions, a hyperelasticity model is required to be sufficiently smooth and differentiable to avoid discontinuity in the predicted stress and elastic tangent [24,25,22].
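
4.1.1 Hyperelastic energy functional $\hat{\psi}(\boldsymbol{E})$

Consider the stored-energy functional solely constructed with the following data.

1. A reference configuration where the Green strain tensor equals $\boldsymbol{E}_{\mathrm{ref}}$, with the corresponding second Piola-Kirchhoff stress $\boldsymbol{S}_{\mathrm{ref}}$ and reference energy $\psi_{\mathrm{ref}}$;
2. A set of second Piola-Kirchhoff stresses $\boldsymbol{S}_i$, $i = 1, 2, \ldots, N$, calculated from the Cauchy stress measured at $N$ deformed configurations inferred from MD simulations.

The corresponding loss function reads,

$$\boldsymbol{W}', \boldsymbol{b}' = \underset{\boldsymbol{W},\boldsymbol{b}}{\operatorname{argmin}} \left( w_{\psi_{\mathrm{ref}}} \left\| \psi_{\mathrm{ref}} - \hat{\psi}_{\mathrm{ref}} \right\|_2^2 + w_{S_{\mathrm{ref}}} \left\| \boldsymbol{S}_{\mathrm{ref}} - \left. \frac{\partial \hat{\psi}}{\partial \boldsymbol{E}} \right|_{\boldsymbol{E} = \boldsymbol{E}_{\mathrm{ref}}} \right\|_2^2 + \frac{w_S}{N} \sum_{i=1}^{N} \left\| \boldsymbol{S}_i - \left. \frac{\partial \hat{\psi}}{\partial \boldsymbol{E}} \right|_{\boldsymbol{E} = \boldsymbol{E}_i} \right\|_2^2 \right), \tag{9}$$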
where
ψref =ψ(Eref)
and
ˆ
ψref =ψ(Eref)
are the true and approximated values of the energy functional at
296
strain
E0
,
N
is the number of non-trivial stress data points, and
wψref
,
wSref
and
wS
are the weighting factors
297
for the multi-objective optimization. In this work, we use the configuration at (
300 K
,
1 atm
) as the reference
298
configuration and we assume that this configuration is undeformed such that Eref =0.299
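The structure of Eq. (9) can be illustrated with a small sketch in which the only labels away from the reference state are stresses, while the model itself outputs an energy. This is a toy under stated assumptions, not the training code used in this work: the strain is a flat 3-component vector, the candidate energy is a hypothetical quadratic form rather than a neural network, the gradient is taken by central differences instead of automatic differentiation, and all names (`sobolev_loss`, `grad_fd`, `make_energy`) are illustrative.

```python
def grad_fd(psi, E, h=1e-6):
    """Central-difference gradient of a scalar energy w.r.t. a flat strain vector."""
    grad = []
    for k in range(len(E)):
        E_plus, E_minus = list(E), list(E)
        E_plus[k] += h
        E_minus[k] -= h
        grad.append((psi(E_plus) - psi(E_minus)) / (2.0 * h))
    return grad


def sq_dist(a, b):
    """Squared Euclidean distance between two flat vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sobolev_loss(psi_hat, E_ref, psi_ref, S_ref, strains, stresses,
                 w_psi=1.0, w_Sref=1.0, w_S=1.0):
    """Three-term loss in the spirit of Eq. (9): energy and stress at the
    reference state, plus the mean stress mismatch at the deformed states."""
    loss = w_psi * (psi_ref - psi_hat(E_ref)) ** 2
    loss += w_Sref * sq_dist(S_ref, grad_fd(psi_hat, E_ref))
    loss += w_S / len(strains) * sum(
        sq_dist(S, grad_fd(psi_hat, E)) for E, S in zip(strains, stresses))
    return loss


def make_energy(c):
    """Hypothetical quadratic energy with diagonal stiffness c."""
    return lambda E: 0.5 * sum(ck * ek ** 2 for ck, ek in zip(c, E))


# Hypothetical "ground truth" used to generate stress labels.
c_true = [3.0, 2.0, 1.0]
strains = [[0.01, 0.0, 0.0], [0.0, -0.02, 0.01], [0.02, 0.01, -0.01]]
stresses = [[c * e for c, e in zip(c_true, E)] for E in strains]
E_ref, psi_ref, S_ref = [0.0, 0.0, 0.0], 0.0, [0.0, 0.0, 0.0]

loss_exact = sobolev_loss(make_energy(c_true), E_ref, psi_ref, S_ref,
                          strains, stresses)
loss_wrong = sobolev_loss(make_energy([1.0, 1.0, 1.0]), E_ref, psi_ref, S_ref,
                          strains, stresses)
```

The energy with the correct stiffness drives the loss to (numerically) zero even though no energy label is supplied away from the reference state, which is the point of the derivative-based constraint.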
4.1.2 Hyperelastic energy functional $\bar{\psi}(\mathbf{F})$
Another feasible option is to directly train the energy functional related to the $\mathbf{P}$-$\mathbf{F}$ pair. The drawback of this option is the more complex training, owing to the fact that the deformation gradient $\mathbf{F}$ is a two-point tensor that is not necessarily symmetric. Hence, the dimensions of both the input and the labels for the supervised training increase. Furthermore, an additional training step is necessary to ensure material frame indifference, which could be avoided if invariants or $\mathbf{E}$ were used as the input [43]. However, if such training is successful, the Hessian of this energy functional gives the tangential stiffness tensor corresponding to the $\mathbf{P}$-$\mathbf{F}$ pair. The bases of this tangential stiffness tensor make it easy to incorporate into the linearized system of equations for a total Lagrangian finite element solver, without requiring any additional algebraic operations to pull back or push forward between configurations. As such, this option is provided here. We assume that the data described in Section 4.1.1 are provided and that the identical reference configuration is used. The corresponding loss function for the energy conjugate pair $\mathbf{P}$-$\mathbf{F}$ hyperelastic model is
$$
\mathbf{W}', \mathbf{b}' = \underset{\mathbf{W}, \mathbf{b}}{\operatorname{argmin}} \left( w_{\bar{\psi}_{\text{ref}}} \left\| \psi_{\text{ref}} - \bar{\psi}_{\text{ref}} \right\|_2^2 + w_{P_{\text{ref}}} \left\| \mathbf{P}_{\text{ref}} - \left. \frac{\partial \bar{\psi}}{\partial \mathbf{F}} \right|_{\mathbf{F} = \mathbf{F}_{\text{ref}}} \right\|_2^2 + \frac{w_P}{N} \sum_{i=1}^{N} \left\| \mathbf{P}_i - \left. \frac{\partial \bar{\psi}}{\partial \mathbf{F}} \right|_{\mathbf{F} = \mathbf{F}_i} \right\|_2^2 \right), \tag{10}
$$
where $\psi_{\text{ref}} = \psi(\mathbf{F}_{\text{ref}})$ and $\bar{\psi}_{\text{ref}} = \bar{\psi}(\mathbf{F}_{\text{ref}})$ are the true and approximated values of the energy functional at the reference configuration, $N$ is the number of non-trivial stress data points, and $w_{\bar{\psi}_{\text{ref}}}$, $w_{P_{\text{ref}}}$ and $w_P$ are the weighting factors for the multi-objective optimization.
10 Nikolaos N. Vlassis et al.
4.1.3 Transfer learning to enforce frame indifference for $\bar{\psi}(\mathbf{F})$
A hyperelastic model described by the conjugate pair $\mathbf{P}$-$\mathbf{F}$ is expected to satisfy the frame invariance conditions described in Eq. (3). To ensure that frame invariance is preserved during training, we re-use a previously trained neural network but modify the loss function by introducing a number $L$ of random rotations $\mathbf{Q}_l$, $l = 1, 2, \ldots, L$, and penalizing the violation of objectivity for a randomly selected sub-sample of size $L$ from the initial training sample pool by adding the following weighted objectives:
$$
w_{\psi} \frac{1}{L} \sum_{l=1}^{L} \left\| \hat{\psi}(\mathbf{Q}_l \mathbf{F}_l) - \hat{\psi}(\mathbf{F}_l) \right\| + w_P \frac{1}{L} \sum_{l=1}^{L} \left\| \hat{\mathbf{P}}(\mathbf{Q}_l \mathbf{F}_l) - \mathbf{Q}_l \hat{\mathbf{P}}(\mathbf{F}_l) \right\|_2^2 + w_C \frac{1}{L} \sum_{l=1}^{L} \left\| \hat{\mathbb{A}}(\mathbf{Q}_l \mathbf{F}_l) - \mathbf{Q}_l \mathbf{Q}_l \hat{\mathbb{A}}(\mathbf{F}_l) \right\|_F^2, \tag{11}
$$
where $\hat{\mathbf{P}}$ and $\hat{\mathbb{A}}$ are the neural-network-approximated stress and elastic stiffness tensors, respectively, and $w_C$ is a weight for the multi-objective minimization. Note that this additional step is not necessary for the energy functional $\hat{\psi}(\mathbf{E})$, since $(\mathbf{Q}\mathbf{F})^T \mathbf{Q}\mathbf{F} = \mathbf{F}^T \mathbf{Q}^T \mathbf{Q} \mathbf{F} = \mathbf{C}$ for any rotation tensor $\mathbf{Q} \in SO(3)$.
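The energy term of the penalty in Eq. (11) can be sketched as a simple residual that compares the energy before and after a superposed rotation. The sketch below is illustrative only: the two energies are hypothetical closed-form stand-ins for a trained network, the rotations are restricted to the $\mathbf{e}_3$ axis for brevity (sampling all of $SO(3)$ would be needed in practice), and the function names are assumptions.

```python
import math


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def rotation_about_e3(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def energy_objectivity_residual(psi, F_samples, rotations):
    """Mean |psi(Q F) - psi(F)| over paired (rotation, sample) data."""
    total = sum(abs(psi(matmul(Q, F)) - psi(F))
                for Q, F in zip(rotations, F_samples))
    return total / len(F_samples)


def psi_objective(F):
    # Depends on F only through C = F^T F, so it is frame indifferent.
    C = matmul(transpose(F), F)
    return 0.5 * (C[0][0] + C[1][1] + C[2][2] - 3.0)


def psi_non_objective(F):
    # Depends on a single component of F, so it changes under rotation.
    return 0.5 * (F[0][0] - 1.0) ** 2


F_samples = [
    [[1.1, 0.2, 0.0], [0.0, 0.9, 0.0], [0.0, 0.0, 1.0]],
    [[1.0, 0.0, 0.1], [0.05, 1.2, 0.0], [0.0, 0.0, 0.95]],
]
rotations = [rotation_about_e3(0.7), rotation_about_e3(2.1)]

res_objective = energy_objectivity_residual(psi_objective, F_samples, rotations)
res_non_objective = energy_objectivity_residual(psi_non_objective, F_samples, rotations)
```

An energy written in terms of $\mathbf{C}$ yields a residual at round-off level, whereas an energy that reads components of $\mathbf{F}$ directly does not, which is why the penalty is only needed for the $\mathbf{F}$-based model.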
4.2 Transfer learning to enforce crystal symmetries
The monoclinic unit cell of the single crystal β-HMX in the $P2_1/n$ space group setting is shown in Figure 4. The covariant crystal basis vectors $\mathbf{M}_1$, $\mathbf{M}_2$, and $\mathbf{M}_3$ represent the crystal axes in the crystal configuration.
[Figure 4 labels: crystal axes $\mathbf{M}_1 = [100]$, $\mathbf{M}_2 = [010]$, $\mathbf{M}_3 = [001]$; lattice vectors $a$, $b$, $c$ and $c^*$; angles $\alpha$, $\beta$, $\gamma$; Cartesian basis $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$.]
Fig. 4: Monoclinic unit cell of β-HMX in the $P2_1/n$ space group setting. The lattice constants are $a = 6.53$ Å, $b = 11.03$ Å, $c = 7.35$ Å, $\alpha = \gamma = 90^\circ$, and $\beta = 102.689^\circ$ (at 295 K) [45]. The Miller indices are associated with the monoclinic crystal directions, while the vectors $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ denote the basis vectors of the global Cartesian coordinate system.
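Note that, under the monoclinic material symmetry shown in Figure 4, the crystal structure has a 2-fold rotational symmetry: it remains unchanged when the unit cell is rotated by $180^\circ$ about [010]. Therefore, the symmetry group of this monoclinic unit cell reads
$$
V_Q = \left\{ \mathbf{Q} \;\middle|\; \mathbf{Q} = \exp\left[ k\pi \frac{\operatorname{spn}(\mathbf{M}_2)}{\|\mathbf{M}_2\|} \right], \; k \in \mathbb{Z} \right\}. \tag{12}
$$
Here, the infinitesimal rotation map and the finite rotation map are defined as [46]
$$
\operatorname{spn}(\boldsymbol{\theta}) = \boldsymbol{\varepsilon} \cdot \boldsymbol{\theta}, \qquad \exp\left[\operatorname{spn}(\boldsymbol{\theta})\right] = \mathbf{I} + \frac{\sin(\theta)}{\theta} \operatorname{spn}(\boldsymbol{\theta}) + \frac{1 - \cos(\theta)}{\theta^2} \operatorname{spn}(\boldsymbol{\theta})^2,
$$
where $\boldsymbol{\varepsilon}$ is the third-order permutation tensor and $\theta = \|\boldsymbol{\theta}\|$ is the rotation angle.
Consider two elastic deformations of the crystal, $\mathbf{F}$ and $\mathbf{F}^+$, where $\mathbf{F}$ is an arbitrary deformation and $\mathbf{F}^+ = \mathbf{F}\mathbf{Q}$, $\mathbf{Q} \in V_Q$. The material symmetry of the β-HMX crystal requires that
$$
\psi(\mathbf{F}^+) = \psi(\mathbf{F}), \qquad \mathbf{P}^+ = \left. \frac{\partial \psi}{\partial \mathbf{F}} \right|_{\mathbf{F}^+} = \mathbf{P}\mathbf{Q}, \qquad \mathbb{A}^+_{aBcD} = \mathbb{A}_{aMcN} Q_{MB} Q_{ND}, \qquad \forall \mathbf{Q} \in V_Q, \tag{13}
$$
where $\psi$ is the elastic free energy, $\mathbf{P}^+$ and $\mathbf{P}$ are the first Piola-Kirchhoff stress tensors evaluated at $\mathbf{F}^+$ and $\mathbf{F}$, and $\mathbb{A}^+$ and $\mathbb{A}$ are the elastic stiffness tensors, such that $\dot{\mathbf{P}} = \mathbb{A} : \dot{\mathbf{F}}$ and $\dot{\mathbf{P}}^+ = \mathbb{A}^+ : \dot{\mathbf{F}}^+$.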
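To ensure that the crystal symmetry is preserved, we re-use the networks previously trained with the loss functions in Eqs. (10) and (11), and modify the loss function by introducing $M$ rotations $\mathbf{Q}_m \in V_Q$, $m = 1, 2, \ldots, M$, based on the material symmetry type. The added terms penalize the violation of the material symmetry for $N$ samples:
$$
\sum_{m=1}^{M} \left( w_{\psi} \frac{1}{N} \sum_{i=1}^{N} \left\| \hat{\psi}(\mathbf{F}_i \mathbf{Q}_m) - \hat{\psi}(\mathbf{F}_i) \right\| + w_{\hat{P}} \frac{1}{N} \sum_{i=1}^{N} \left\| \hat{\mathbf{P}}(\mathbf{F}_i \mathbf{Q}_m) - \hat{\mathbf{P}}(\mathbf{F}_i) \mathbf{Q}_m \right\|_2^2 + w_C \frac{1}{N} \sum_{i=1}^{N} \left\| \hat{\mathbb{A}}(\mathbf{F}_i \mathbf{Q}_m) - \hat{\mathbb{A}}(\mathbf{F}_i) \mathbf{Q}_m \mathbf{Q}_m \right\|_F^2 \right). \tag{14}
$$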
5 Post-training validation of the predicted elastic tangential operators
In this section, we introduce numerical tests to determine whether the predicted constitutive responses are thermodynamically admissible, preserve the symmetry, and lead to unique and stable elastic responses. A subset of these criteria is required to constitute a correct constitutive law (e.g. material frame invariance), while others, such as convexity and strong ellipticity, are not necessary conditions but are desirable properties for the stability and uniqueness of the boundary value problem. While in principle many of these physics constraints/laws can be incorporated into the loss function in the supervised learning process, putting all the constraints explicitly into the loss function is not always ideal, as the multiple constraints may alter the landscape of the loss function and thus complicate the search for the optimal energy functional [47].
As such, our goal is to introduce a suite of necessary conditions that the learned hyperelasticity constitutive law must fulfill. These conditions, along with the requirement that the hyperelasticity constitutive law generates predictions within a threshold error, are necessary but not sufficient to guarantee the safety of using the machine learning model for high-consequence, high-risk predictions (such as those for explosives).
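5.1 Mapping between finite and infinitesimal kinematics
To examine the admissibility of the hyperelasticity model and compare the finite strain model with other published results based on the infinitesimal strain assumption, the connections among the tangents of different energy-conjugate pairs are provided below for completeness. Our first goal is to obtain the small-strain tangent underlying its finite-strain counterpart by using the logarithmic and exponential mappings, such that the elasticity tensors predicted here and those from the literature can be compared. Recall that the logarithmic elastic strain $\mathbf{e}$ can be defined as [27]
$$
\mathbf{e} = \ln \mathbf{U} = \frac{1}{2} \ln \mathbf{C}, \tag{15}
$$
where $\mathbf{U}$ is the right-stretch tensor and $\mathbf{C}$ is the right Cauchy-Green strain tensor. The small-strain elastic tensor $\mathbb{C}^{\sigma e}$ can be obtained from the chain rule,
$$
\mathbb{C}^{\sigma e} = \frac{\partial \boldsymbol{\sigma}}{\partial \mathbf{e}} = \frac{\partial \boldsymbol{\sigma}}{\partial \mathbf{S}} : \frac{\partial \mathbf{S}}{\partial \mathbf{E}} : \frac{\partial \mathbf{E}}{\partial \mathbf{e}} = \frac{1}{2} \frac{\partial \boldsymbol{\sigma}}{\partial \mathbf{S}} : \frac{\partial \mathbf{S}}{\partial \mathbf{E}} : \frac{\partial \mathbf{C}}{\partial \mathbf{e}} = \frac{1}{2} \frac{\partial \boldsymbol{\sigma}}{\partial \mathbf{S}} : \frac{\partial \mathbf{S}}{\partial \mathbf{E}} : \frac{\partial \exp 2\mathbf{e}}{\partial \mathbf{e}}, \tag{16}
$$
where $\boldsymbol{\sigma} = J^{-1} \mathbf{F} \cdot \mathbf{S} \cdot \mathbf{F}^T$ is the Cauchy stress. To compute the small-strain elasticity tensor, one first rewrites Eq. (15) in an infinite series representation,
$$
\mathbf{C} = \exp 2\mathbf{e} = \sum_{n=0}^{\infty} \frac{1}{n!} (2\mathbf{e})^n. \tag{17}
$$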
As such, the Cartesian components of the derivative $\partial \mathbf{C} / \partial \mathbf{e}$ read [48]
$$
\left[ \frac{\partial \mathbf{C}}{\partial \mathbf{e}} \right]_{ijkl} = \sum_{n=1}^{\infty} \frac{2^n}{n!} \sum_{m=1}^{n} \left[ \mathbf{e}^{m-1} \right]_{ik} \left[ \mathbf{e}^{n-m} \right]_{lj}. \tag{18}
$$
Notice that Eq. (18) is an infinite series. In practice, we include only a sufficient but finite number of terms in Eq. (18) to approximate the partial derivative $\partial \mathbf{C} / \partial \mathbf{e}$. Convergence studies and benchmark data on the infinite series representation can be found in Ortiz et al. [49]. An alternative representation based on spectral decomposition is also possible (cf. Miehe [50]) but is outside the scope of this paper.
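To make the truncation concrete, the following sketch evaluates the series of Eqs. (17) and (18) with a fixed number of terms and checks the resulting fourth-order tensor against a central finite difference of $\exp 2\mathbf{e}$. The number of terms (25), the sample strain and the perturbation direction are arbitrary choices made for illustration, and the helper names are assumptions.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def powers(e, n_max):
    """Return [e^0, e^1, ..., e^n_max]."""
    out = [[[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]]
    for _ in range(n_max):
        out.append(matmul(out[-1], e))
    return out


def exp_2e(e, n_terms=25):
    """Truncated series of Eq. (17): C = sum_n (2e)^n / n!."""
    P = powers(e, n_terms)
    C = [[0.0] * 3 for _ in range(3)]
    coeff = 1.0  # holds 2^n / n!
    for n in range(n_terms + 1):
        if n > 0:
            coeff *= 2.0 / n
        for i in range(3):
            for j in range(3):
                C[i][j] += coeff * P[n][i][j]
    return C


def dC_de(e, n_terms=25):
    """Truncated series of Eq. (18), returned as D[i][j][k][l]."""
    P = powers(e, n_terms)
    D = [[[[0.0] * 3 for _ in range(3)] for _ in range(3)] for _ in range(3)]
    coeff = 1.0  # holds 2^n / n!
    for n in range(1, n_terms + 1):
        coeff *= 2.0 / n
        for m in range(1, n + 1):
            A, B = P[m - 1], P[n - m]
            for i in range(3):
                for j in range(3):
                    for k in range(3):
                        for l in range(3):
                            D[i][j][k][l] += coeff * A[i][k] * B[l][j]
    return D


e = [[0.05, 0.02, 0.0], [0.02, -0.03, 0.01], [0.0, 0.01, 0.04]]
H = [[1.0, 0.5, 0.0], [0.5, 0.0, 0.0], [0.0, 0.0, -1.0]]  # perturbation direction
h = 1e-5


def shifted(s):
    return [[e[i][j] + s * H[i][j] for j in range(3)] for i in range(3)]


C_plus, C_minus = exp_2e(shifted(h)), exp_2e(shifted(-h))
D = dC_de(e)
max_err = max(
    abs(sum(D[i][j][k][l] * H[k][l] for k in range(3) for l in range(3))
        - (C_plus[i][j] - C_minus[i][j]) / (2.0 * h))
    for i in range(3) for j in range(3))
```

For strains of the magnitude considered here the series converges quickly, so the agreement with the finite difference is limited by the finite-difference step rather than by the truncation.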
The first tangential tensor $\mathbb{C}^{PF}$ can be related to the second derivative of the hyperelastic energy functional $\psi(\mathbf{E})$,
$$
\mathbb{C}^{PF} = \frac{\partial \mathbf{P}}{\partial \mathbf{F}} = \frac{\partial \mathbf{S}}{\partial \mathbf{E}} \cdot \mathbf{F} \cdot \mathbf{F} \cdot \mathbf{g} + \mathbf{S} \otimes \boldsymbol{\delta}, \tag{19}
$$
where $\mathbf{g}$ is the metric tensor. For the Cartesian coordinate system used in our training loss function, the index notation of the metric tensor is simply $g_{ij} = \delta_{ij}$. This expression is derived from Marsden and Hughes [9] (see page 215), where we simply use the chain rule to link the tangent $\partial \mathbf{S} / \partial \mathbf{E}$ with $\partial \mathbf{S} / \partial \mathbf{C}$. Note that this tensor corresponds to the first Piola-Kirchhoff stress and the deformation gradient, and does not possess minor symmetry.
In both Eq. (16) and Eq. (19), the derivative $\partial \mathbf{S} / \partial \mathbf{E}$ is obtained from the neural elastic stored energy, while the remaining terms can be obtained either analytically or via automatic differentiation.
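A component-level reading of Eq. (19) can be checked numerically. The sketch below assumes Cartesian components, so that the tangent takes the form $\mathbb{C}^{PF}_{iJkL} = \delta_{ik} S_{JL} + F_{iM} F_{kQ} [\partial \mathbf{S}/\partial \mathbf{E}]_{MJLQ}$, which is our interpretation of the compact notation, and it uses a St. Venant-Kirchhoff energy with hypothetical Lamé constants as a stand-in for the learned energy. The tangent is compared with a central finite difference of $\mathbf{P}(\mathbf{F})$.

```python
R = range(3)


def delta(i, j):
    return 1.0 if i == j else 0.0


def svk_tangent(lam, mu):
    """Constant dS/dE of a St. Venant-Kirchhoff material (stand-in for the learned tangent)."""
    return [[[[lam * delta(i, j) * delta(k, l)
               + mu * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
               for l in R] for k in R] for j in R] for i in R]


def second_pk(F, C4):
    """S = C4 : E with the Green strain E = (F^T F - I) / 2."""
    E = [[0.5 * (sum(F[m][i] * F[m][j] for m in R) - delta(i, j)) for j in R]
         for i in R]
    return [[sum(C4[i][j][k][l] * E[k][l] for k in R for l in R) for j in R]
            for i in R]


def first_pk(F, C4):
    """P = F S."""
    S = second_pk(F, C4)
    return [[sum(F[i][m] * S[m][j] for m in R) for j in R] for i in R]


def tangent_PF(F, C4):
    """A[i][J][k][L] = dP_iJ / dF_kL assembled from S and dS/dE."""
    S = second_pk(F, C4)
    return [[[[delta(i, k) * S[J][L]
               + sum(F[i][M] * F[k][Q] * C4[M][J][L][Q] for M in R for Q in R)
               for L in R] for k in R] for J in R] for i in R]


F = [[1.1, 0.05, 0.0], [0.02, 0.95, 0.03], [0.0, 0.01, 1.05]]
C4 = svk_tangent(5.0, 3.0)
A = tangent_PF(F, C4)

h = 1e-5
max_err = 0.0
for k in R:
    for L in R:
        F_plus = [row[:] for row in F]
        F_minus = [row[:] for row in F]
        F_plus[k][L] += h
        F_minus[k][L] -= h
        P_plus, P_minus = first_pk(F_plus, C4), first_pk(F_minus, C4)
        for i in R:
            for J in R:
                fd = (P_plus[i][J] - P_minus[i][J]) / (2.0 * h)
                max_err = max(max_err, abs(A[i][J][k][L] - fd))
```

With a neural network energy, the constant tensor `C4` would be replaced by the Hessian of the learned energy evaluated at the current strain.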
5.2 Strong ellipticity
While many works are dedicated to training neural networks to predict elastic responses of solids [51,52,22,53,54,55,24,25], surprisingly few among these analyze the stability and uniqueness of the learned neural network constitutive laws or provide any evidence of the well-posedness of the trained model. Recent work by Dominik et al. [43] addresses this issue by enforcing the polyconvexity of the hyperelastic energy functional via invariants (cf. Hartmann and Neff [56]). Note that the onset of the loss of strong ellipticity does not necessarily indicate that the learned elastic energy functional is erroneous. Rather, it is considered an indicator of the onset of material instability or failure [9,57,58]. Physically, the loss of strong ellipticity may also lead to a vanishing wave propagation speed [9]. As such, our focus here is not necessarily on preventing the loss of strong ellipticity but rather on searching for the points at which the onset of loss of strong ellipticity may occur, to provide more interpretable physical insight into the stability of the β-HMX crystal.
Consider $\mathbf{A}$ to be the acoustic tensor corresponding to $\mathbb{C}^{PF}$, where $\mathbb{C}^{PF}$ is the elastic tangential operator for the energy conjugate pair $(\mathbf{P}, \mathbf{F})$, that is,
$$
\mathbf{A}(\mathbf{N}) = \mathbf{N} \cdot \mathbb{C}^{PF} \cdot \mathbf{N}. \tag{20}
$$
The Legendre-Hadamard condition requires that for any pair of vectors $\mathbf{N}$ and $\mathbf{m}$, the following condition holds:
$$
\mathbf{m} \cdot \mathbf{A} \cdot \mathbf{m} \geq 0, \tag{21}
$$
where $\mathbf{N}$ is a Lagrangian unit vector and $\mathbf{m}$ is an Eulerian vector. Because we assume that β-HMX is a Green-elastic material, the necessary and sufficient conditions for strong ellipticity are (cf. Ogden [10] page 392)
$$
A_{ii}(\mathbf{N}) > 0, \quad i \in \{1, 2, 3\}, \tag{22}
$$
$$
A_{ii}(\mathbf{N}) A_{jj}(\mathbf{N}) - A_{ij}(\mathbf{N})^2 > 0, \quad j \neq i \in \{1, 2, 3\}, \tag{23}
$$
$$
\det \mathbf{A}(\mathbf{N}) > 0 \tag{24}
$$
for any $\mathbf{N} \in \mathbb{R}^3$. Notice that the material response is nonlinear and the acoustic tensor may vary according to the Eulerian vector $\mathbf{m}$. A simple way to ensure that the conditions (22)-(24) are satisfied is to create the worst-case scenario, that is, to find the infimum and the unit vectors $\mathbf{N}$ that minimize $A_{ii}(\mathbf{N})$, $A_{ii}(\mathbf{N}) A_{jj}(\mathbf{N}) - A_{ij}(\mathbf{N})^2$, and $\det \mathbf{A}(\mathbf{N})$, and to check whether the three terms remain positive. Depending on the parameterization, the corresponding minimization problems can be written as
$$
f(\mathbf{q}) = A_{ii}(\mathbf{N}(\mathbf{q})), \quad \underset{\mathbf{q}}{\operatorname{argmin}} \, f(\mathbf{q}), \quad \mathbf{N}(\mathbf{q}) \in S^2, \tag{25}
$$
$$
g(\mathbf{q}) = A_{ii}(\mathbf{N}(\mathbf{q})) A_{jj}(\mathbf{N}(\mathbf{q})) - A_{ij}(\mathbf{N}(\mathbf{q}))^2, \quad \underset{\mathbf{q}}{\operatorname{argmin}} \, g(\mathbf{q}), \quad \mathbf{N}(\mathbf{q}) \in S^2, \tag{26}
$$
$$
d(\mathbf{q}) = \det \mathbf{A}(\mathbf{N}(\mathbf{q})), \quad \underset{\mathbf{q}}{\operatorname{argmin}} \, d(\mathbf{q}), \quad \mathbf{N}(\mathbf{q}) \in S^2, \tag{27}
$$
where $\mathbf{q}$ represents a parametrization of the unit vector $\mathbf{N}(\mathbf{q})$. Mota et al. [59] provide a comprehensive review of how different parameterizations, namely the spherical, stereographic, projective and tangent parameterizations, may lead to different local minimizers of the acoustic tensor in the parametric space. For the spherical parameterization, a unit vector $\mathbf{N}$ is an element of the unit sphere $S^2$, which can be parameterized by the spherical coordinates, that is, the polar angle $\phi \in [0, \pi]$ and the azimuthal angle $\theta \in [0, \pi]$:
$$
\mathbf{N}(\phi, \theta) = \sin\phi \cos\theta \, \mathbf{e}_1 + \sin\phi \sin\theta \, \mathbf{e}_2 + \cos\phi \, \mathbf{e}_3, \tag{28}
$$
where $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ is the orthogonal basis for $\mathbb{R}^3$.
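To ensure stability for any given admissible deformation, we must ensure that Eqs. (22)-(24) are valid for any $\mathbf{F}$. While this can, in principle, be determined analytically for hand-crafted energy functionals, the expression of the neural network energy functional would likely be too complicated to analyze. As such, we again resort to constructing a test to check the hypothesis that the material exhibits strong ellipticity, via an attempt to find the minima, that is,
$$
f'(\mathbf{q}, \mathbf{F}) = A_{ii}(\mathbf{N}(\mathbf{q}), \mathbf{F}), \quad \underset{\mathbf{q}, \mathbf{F}}{\operatorname{argmin}} \, f'(\mathbf{q}, \mathbf{F}), \quad \mathbf{N}(\mathbf{q}) \in S^2, \; \mathbf{F} \in GL^+(3), \tag{29}
$$
$$
g'(\mathbf{q}, \mathbf{F}) = A_{ii}(\mathbf{N}(\mathbf{q}), \mathbf{F}) A_{jj}(\mathbf{N}(\mathbf{q}), \mathbf{F}) - A_{ij}(\mathbf{N}(\mathbf{q}), \mathbf{F})^2, \quad \underset{\mathbf{q}, \mathbf{F}}{\operatorname{argmin}} \, g'(\mathbf{q}, \mathbf{F}), \quad \mathbf{N}(\mathbf{q}) \in S^2, \; \mathbf{F} \in GL^+(3), \tag{30}
$$
$$
d'(\mathbf{q}, \mathbf{F}) = \det \mathbf{A}(\mathbf{N}(\mathbf{q}), \mathbf{F}), \quad \underset{\mathbf{q}, \mathbf{F}}{\operatorname{argmin}} \, d'(\mathbf{q}, \mathbf{F}), \quad \mathbf{N}(\mathbf{q}) \in S^2, \; \mathbf{F} \in GL^+(3). \tag{31}
$$
It is impossible to test all the possible deformation gradients in the MD simulations while maintaining the path independence of the constitutive responses, so we instead construct a test where we only consider a range of possible deformation gradients and search for the minima within this range.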
The numerical strong ellipticity test is conducted via the following three steps.
1. We create two sets of point clouds in the parametric space with uniform spacing, $V_q = \{\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_3, \ldots\}$ and $V_F = \{\mathbf{F}_1, \mathbf{F}_2, \mathbf{F}_3, \ldots\}$, and select the combination of $(\mathbf{q}, \mathbf{F})$ that minimizes $f'$, $g'$, $d'$. If there exist other $(\mathbf{q}, \mathbf{F})$ combinations that yield a value sufficiently close to the minimum (say within 5% difference), then the additional coordinates are stored as candidate position(s) for the gradient-free search. This treatment ensures that more local optimal points can be identified and compared, and avoids the issues exhibited in Mota et al. [59].
2. We then use the candidate position(s) determined from the previous step as the starting point and apply a gradient-free optimizer via the third-party gradient-free optimizer library (cf. Blanke [60]) to examine whether we can find new coordinates for which the functions $f'(\mathbf{q}, \mathbf{F})$, $g'(\mathbf{q}, \mathbf{F})$, and $d'(\mathbf{q}, \mathbf{F})$ are smaller than at the candidate position(s) identified in Step 1.
3. If Eqs. (22)-(24) are not violated in the worst case obtained from Step 2, then we consider the neural network functional to have passed the strong ellipticity test.
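Step 1 of this procedure can be sketched for a single, fixed tangent as a uniform grid search over the spherical angles of Eq. (28). The example is a simplified illustration: the tangent is a hypothetical isotropic tensor standing in for the network prediction at one deformation gradient, the loop over $V_F$ and the gradient-free refinement of Step 2 are omitted, and all function names are assumptions.

```python
import math

R = range(3)


def delta(i, j):
    return 1.0 if i == j else 0.0


def isotropic_tangent(lam, mu):
    """Hypothetical stand-in for the predicted tangent at one deformation gradient."""
    return [[[[lam * delta(i, j) * delta(k, l)
               + mu * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
               for l in R] for k in R] for j in R] for i in R]


def unit_vector(phi, theta):
    """Spherical parameterization of Eq. (28)."""
    return [math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi)]


def acoustic_tensor(C4, N):
    """A_ik = N_J C_iJkL N_L, as in Eq. (20)."""
    return [[sum(N[J] * C4[i][J][k][L] * N[L] for J in R for L in R)
             for k in R] for i in R]


def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))


def ellipticity_measures(A):
    """Smallest values of the three quantities in Eqs. (22)-(24)."""
    diag = min(A[i][i] for i in R)
    minor = min(A[i][i] * A[j][j] - A[i][j] ** 2 for i in R for j in R if i != j)
    return diag, minor, det3(A)


def grid_minima(C4, n=20):
    """Worst case of each measure over a uniform (phi, theta) grid on [0, pi]^2."""
    best = [math.inf, math.inf, math.inf]
    for a in range(n + 1):
        for b in range(n + 1):
            N = unit_vector(math.pi * a / n, math.pi * b / n)
            vals = ellipticity_measures(acoustic_tensor(C4, N))
            best = [min(x, y) for x, y in zip(best, vals)]
    return best


lam, mu = 5.0, 3.0
min_diag, min_minor, min_det = grid_minima(isotropic_tangent(lam, mu))
passes = min_diag > 0.0 and min_minor > 0.0 and min_det > 0.0
```

For this isotropic stand-in the acoustic tensor is $(\lambda + \mu)\mathbf{N} \otimes \mathbf{N} + \mu \mathbf{I}$, so the three minima are $\mu$, $\mu^2$ and $\mu^2(\lambda + 2\mu)$, which gives a convenient sanity check before applying the same search to a learned tangent.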
5.3 Convexity and growth conditions
In nonlinear elasticity in the finite strain regime, convexity is not necessary and can be over-restrictive for physical phenomena that involve instability or buckling [14]. Nevertheless, the convexity condition has to be satisfied to predict stable elastic responses under large deformation. The convexity condition can be stated as (cf. [10]),
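$$
\psi(\mathbf{F}') - \psi(\mathbf{F}) - \operatorname{tr}\left( \mathbf{P} \cdot (\mathbf{F} - \mathbf{F}') \right) \geq 0. \tag{32}
$$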
Because convexity is not a requirement for realistic simulations (although it might be expected for HMX), we do not incorporate this criterion in the training of the neural network. However, the uniqueness and stability of the elasticity model are not only important for predicting realistic elastic responses but also crucial if the model is to be deployed as the underlying elasticity model for crystal plasticity and damage models.
Another important condition to prevent degenerate elastic behavior is due to Rosakis and Simpson [61], which requires
$$
\psi(\mathbf{F}) \to \infty \quad \text{as} \quad \det \mathbf{F} \to 0^+. \tag{33}
$$
Recall that $\det \mathbf{F} \to 0$ only happens if the distance between two material points that was non-zero in the reference configuration vanishes in the current configuration. Note that it is unlikely that a material would remain elastic if the volumetric deformation is extremely large. Furthermore, enforcing this constraint explicitly in the loss function is difficult because the target value is infinite. Nevertheless, the constraint may provide a helpful indicator of the admissibility of the machine learning extrapolated predictions. As a result, we suggest a post-training validation test where we generate the response for deformation gradients with $\det \mathbf{F}$ approaching zero and observe whether the resultant energy is monotonically increasing.
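A minimal version of this post-training test is sketched below for purely volumetric compression, $\mathbf{F} = s\mathbf{I}$ with $s \to 0^+$. Both energies are hypothetical closed-form stand-ins for a trained network: a compressible neo-Hookean form that satisfies the growth condition, and an artificial energy that does not increase monotonically. The sequence of stretches is an arbitrary choice.

```python
import math


def det3(F):
    return (F[0][0] * (F[1][1] * F[2][2] - F[1][2] * F[2][1])
            - F[0][1] * (F[1][0] * F[2][2] - F[1][2] * F[2][0])
            + F[0][2] * (F[1][0] * F[2][1] - F[1][1] * F[2][0]))


def neo_hookean(F, lam=5.0, mu=3.0):
    """Compressible neo-Hookean energy; diverges as det F -> 0+."""
    J = det3(F)
    tr_C = sum(F[i][j] ** 2 for i in range(3) for j in range(3))
    return 0.5 * mu * (tr_C - 3.0) - mu * math.log(J) + 0.5 * lam * math.log(J) ** 2


def artificial_energy(F):
    """Energy with a spurious minimum at det F = 0.3 (fails the test)."""
    return (det3(F) - 0.3) ** 2


def energy_grows_as_det_vanishes(psi, scales):
    """True if psi(s * I) increases strictly along a decreasing sequence of stretches s."""
    energies = [psi([[s if i == j else 0.0 for j in range(3)] for i in range(3)])
                for s in scales]
    return all(later > earlier for earlier, later in zip(energies, energies[1:]))


scales = [0.8, 0.5, 0.2, 0.1, 0.01, 0.001]  # det F = s^3 -> 0+
ok_neo_hookean = energy_grows_as_det_vanishes(neo_hookean, scales)
ok_artificial = energy_grows_as_det_vanishes(artificial_energy, scales)
```

Monotonic growth along such a path is only an indicator: an energy can increase monotonically and still remain bounded, so passing this check does not by itself establish Eq. (33).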
5.4 Material Anisotropy
A predictive elasticity model must preserve the overall crystal symmetry while capturing how the anisotropy of the elasticity tensor evolves under arbitrary deformation. The degree of anisotropy of the elastic response can be measured by various metrics available in the literature (cf. [62,63,64]). Many of these anisotropy metrics (or indices) are intended for components of the elasticity tensor. Typically, the distinction between the secant and tangential elastic tensors is not taken into account. This can be confusing for materials undergoing finite deformation, where both material and geometrical nonlinearities play important roles in the degree of anisotropy of the constitutive response. More importantly, the impacts of the former and latter types of nonlinearity should be distinguished properly such that a meaningful evaluation can be conducted.
5.4.1 Ledbetter and Migliori general anisotropy index
Here, we use the idea from previous work by Ledbetter and Migliori [65], where the ratio between the maximum and minimum shear-wave speeds is used to define a measure of the degree of anisotropy. Interestingly, this method can also be used to detect instability, as the vanishing of the slowest wave speed is accompanied by divergence of the Ledbetter-Migliori index.
This measure can be easily extended to the finite strain regime by replacing the infinitesimal elasticity tangent with the elasticity tensor corresponding to the first Piola-Kirchhoff stress and the deformation gradient [10]. This idea can be summarized in the following steps.
1. Generate as many unit vectors $\mathbf{N}$ as possible.
2. Solve the Christoffel equation for each unit vector $\mathbf{N}$, that is,
$$
\det\left( \mathbf{N} \cdot \mathbb{C}(\mathbf{F}) \cdot \mathbf{N} - \rho v^2 \mathbf{I} \right) = 0. \tag{34}
$$
3. Pick the largest solution $v_2$ and the smallest solution $v_1$. Then, the anisotropy index is simply
$$
A_I = v_2^2 / v_1^2. \tag{35}
$$
Here, instead of a Monte Carlo search, we can leverage the search formulated in Section 5.2 to obtain the smallest eigenvalue $v_1$ and the largest eigenvalue $v_2$ of the acoustic tensor. Again, the optimization is conducted by using a uniformly spaced point cloud to search for the initial guess, and then a gradient-free optimizer is used to find the normal vectors that maximize and minimize $v$.
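The three steps can be sketched with a uniform grid of directions and a closed-form eigenvalue solver for the symmetric acoustic tensor. Several assumptions are made for illustration: the tangent is a hypothetical cubic elasticity tensor (not β-HMX data), the two lowest eigenvalues in each direction are taken as the quasi-shear branches, following the shear-wave definition quoted above, the density cancels in the ratio, and the gradient-free refinement is omitted.

```python
import math

R = range(3)


def cubic_tangent(c11, c12, c44):
    """Hypothetical cubic elasticity tensor used only as a test case."""
    d = c11 - c12 - 2.0 * c44
    return [[[[c12 * (i == j) * (k == l)
               + c44 * ((i == k) * (j == l) + (i == l) * (j == k))
               + d * (i == j == k == l)
               for l in R] for k in R] for j in R] for i in R]


def unit_vector(phi, theta):
    return [math.sin(phi) * math.cos(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(phi)]


def acoustic_tensor(C4, N):
    return [[sum(N[J] * C4[i][J][k][L] * N[L] for J in R for L in R)
             for k in R] for i in R]


def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))


def sym_eigvals(A):
    """Eigenvalues (descending) of a symmetric 3x3 matrix, trigonometric closed form."""
    p1 = A[0][1] ** 2 + A[0][2] ** 2 + A[1][2] ** 2
    q = (A[0][0] + A[1][1] + A[2][2]) / 3.0
    p2 = sum((A[i][i] - q) ** 2 for i in R) + 2.0 * p1
    if p2 < 1e-30:
        return [q, q, q]
    p = math.sqrt(p2 / 6.0)
    B = [[(A[i][j] - (q if i == j else 0.0)) / p for j in R] for i in R]
    r = max(-1.0, min(1.0, 0.5 * det3(B)))
    angle = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(angle)
    e3 = q + 2.0 * p * math.cos(angle + 2.0 * math.pi / 3.0)
    return [e1, 3.0 * q - e1 - e3, e3]


def anisotropy_index(C4, n=36):
    """Ratio of the largest to the smallest quasi-shear eigenvalue (rho v^2) over a grid."""
    shear_max, shear_min = -math.inf, math.inf
    for a in range(n + 1):
        for b in range(n + 1):
            N = unit_vector(math.pi * a / n, math.pi * b / n)
            _, mid, low = sym_eigvals(acoustic_tensor(C4, N))
            shear_max = max(shear_max, mid)
            shear_min = min(shear_min, low)
    return shear_max / shear_min


index_cubic = anisotropy_index(cubic_tangent(3.0, 1.0, 2.0))
```

For this cubic test case the index coincides with the Zener ratio $2C_{44}/(C_{11} - C_{12})$, and for an isotropic tensor it reduces to one; both are useful checks before the search is applied to a learned tangent.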
6 Results
In this section, we discuss the performance of neural network models for discovering the hyperelastic energy functional from the β-HMX MD simulation data. We describe the training setup of the networks and compare the performance of the architectures. We then demonstrate the predictive capabilities of the models against the present MD simulation data and elastic constants taken from the literature for the same MD force field used here. Finally, we investigate the energy functional models in terms of how well they satisfy desired properties from the hyperelasticity literature.
6.1 Training performance and learning capacity
In this section, we discuss the performance of the neural network architectures for the Sobolev constraints described in Section 4. We first demonstrate how we trained the neural networks to generate a hyperelastic energy functional from the MD simulation data. We use two different architectures to discover the hyperelastic energy functional for β-HMX. The first architecture is based on the energy conjugate pair $\mathbf{S}$-$\mathbf{E}$ (Model $\mathcal{M}_1$). The input and output variables are symmetric tensors and, thus, can be described by six components. The second architecture is based on the energy conjugate pair $\mathbf{P}$-$\mathbf{F}$ (Model $\mathcal{M}_2$). In addition, we also re-train Model $\mathcal{M}_2$ with an additional material frame indifference constraint (Eq. (11)) in the loss function (Model $\mathcal{M}_3$). As the difference in the predictions obtained from Models $\mathcal{M}_2$ and $\mathcal{M}_3$ is minor, we did not enforce Eq. (11) explicitly in the last model we trained (Model $\mathcal{M}_4$). Instead, only monoclinic symmetry is enforced as an additional term in the weighted loss function in the re-training step to ensure that the material symmetry is preserved.
Table 2: Summary of the trained models.
Model   Description
M1      Energy conjugate pair S-E model trained via the loss function described in Eq. (9).
M2      Energy conjugate pair P-F model trained via the loss function described in Eq. (10).
M3      Energy conjugate pair P-F model trained from the pre-trained model M2 with the additional loss function Eq. (11) to enforce material frame indifference.
M4      Energy conjugate pair P-F model trained from the pre-trained model M2 with the additional loss function Eq. (14) to enforce monoclinic symmetry.
The energy functional neural networks have a feed-forward architecture consisting of a hidden dense layer (100 neurons / ReLU), followed by two multiply layers (cf. Vlassis and Sun [25]), then another hidden dense layer (100 neurons / ReLU), and finally an output dense layer (Linear). The training and validation procedures of the neural network are implemented in Python with the machine learning libraries Keras [66] and Tensorflow [67]. The kernel weight matrix of the layers was initialized with a Glorot uniform distribution and the bias vector with zeros.
In total, 233,430 data points are generated from 15 MD simulations. As described in Section 2.3, this data set includes multiple loading scenarios such as uniaxial compressive, tensile, shear and biaxial compressive cases. This set of MD data is partitioned randomly into two mutually exclusive subsets. 70% of the data (163,400 data points) are used to train the neural network energy functional, while 30% of the data (70,030 data points), which we refer to as unseen data herein, are used to cross-validate the results. All the models were trained for 1000 epochs with a batch size of 512, using the Nadam optimizer [68] initialized with the default values in the Keras library.
[Figure 5: loss versus epoch. Panel titles: (a) Stress Training Loss, (b) $\psi_0$ Training Loss, (c) $S_0$/$P_0$ Training Loss. Legend: Model $\mathcal{M}_1$ training loss, Model $\mathcal{M}_1$ validation loss, Model $\mathcal{M}_2$ training loss, Model $\mathcal{M}_2$ validation loss.]
Fig. 5: Comparison of the training loss curves for the energy conjugate pair $\mathbf{S}$-$\mathbf{E}$ model ($\mathcal{M}_1$) and the energy conjugate pair $\mathbf{P}$-$\mathbf{F}$ model ($\mathcal{M}_2$) for (a) the stress, (b) the energy, and (c) the stress value at the state of zero strain.
The loss function training curves for the architectures $\mathcal{M}_1$ and $\mathcal{M}_2$ are shown in Fig. 5. The two architectures appear to have similar accuracy, so they will be used interchangeably below. The predictive capabilities of $\mathcal{M}_1$ and $\mathcal{M}_2$ are further demonstrated in Section 6.2.1.
[Figure 6: panels (a) and (b). Legend: Model $\mathcal{M}_2$ training loss, Model $\mathcal{M}_2$ validation loss, Model $\mathcal{M}_3$ training loss, Model $\mathcal{M}_3$ validation loss.]
Fig. 6: Comparison of the training loss curves for (a) the energy and (b) stress frame invariance constraints for the energy conjugate pair $\mathbf{P}$-$\mathbf{F}$ model ($\mathcal{M}_2$) without any additional constraints in the loss function and the energy conjugate pair $\mathbf{P}$-$\mathbf{F}$ model ($\mathcal{M}_3$) trained with the additional frame invariance constraint loss function Eq. (11).
To check and, if necessary, enforce the frame invariance of the neural network hyperelastic models as described in Section 4.1.3, we conduct a transfer learning experiment by retraining the neural network model $\mathcal{M}_2$. We first train the energy conjugate pair $\mathbf{P}$-$\mathbf{F}$ model ($\mathcal{M}_2$) for 1000 epochs without any frame invariance constraints in the loss function (i.e., Eq. (10)). We record the frame invariance metrics during training by applying random rotation tensors $\mathbf{Q}$ to the input deformation gradient tensors and examining whether the material response is frame invariant; that is, whether the predicted energy remains the same before and after rotation and whether the predicted stress tensor rotates accordingly. The trained model $\mathcal{M}_2$ is then retrained with the additional frame invariance constraints in Eq. (11) for another 1000 epochs (model $\mathcal{M}_3$). The comparison of the training curves for $\mathcal{M}_2$ and $\mathcal{M}_3$ is shown in Fig. 6. Model $\mathcal{M}_2$ appears to already satisfy the frame invariance properties well, with the additional constraints of model $\mathcal{M}_3$ mostly improving the frame invariance energy constraints.
[Figure 7: panels (a) and (b). Legend: Model $\mathcal{M}_2$ training loss, Model $\mathcal{M}_2$ validation loss, Model $\mathcal{M}_4$ training loss, Model $\mathcal{M}_4$ validation loss.]
Fig. 7: Comparison of the training loss curves for (a) the energy and (b) stress symmetry constraints for the energy conjugate pair $\mathbf{P}$-$\mathbf{F}$ model ($\mathcal{M}_2$) without any symmetry constraints in the loss function and the energy conjugate pair $\mathbf{P}$-$\mathbf{F}$ model ($\mathcal{M}_4$) trained with the additional symmetry-constraint loss function Eq. (14).
We also perform a transfer learning experiment by retraining the neural network model $\mathcal{M}_2$ to ensure it retains the observed β-HMX crystal symmetries as described in Section 4.2. We first train the energy conjugate pair $\mathbf{P}$-$\mathbf{F}$ model ($\mathcal{M}_2$) for 1000 epochs without any symmetry constraints in the loss function and record the symmetry metrics during training. By applying a rotation $\mathbf{Q}_{\text{sym}}$ to the input deformation gradient tensors, we check whether the material response retains the expected monoclinic symmetry behavior. The check includes the constraints up to the first-order derivatives of the network. The trained model $\mathcal{M}_2$ is then retrained with the additional symmetry constraints in Eq. (14) for another 1000 epochs (model $\mathcal{M}_4$). The results for the two training experiments are shown in Fig. 7, where the additional symmetry constraints appear to improve both the energy and the stress symmetry constraints.
Remark 1. Rescaling of the training data. As a pre-processing step, we have normalized all data to avoid the vanishing or exploding gradient problem that may occur during the back-propagation process [69]. The sample $X_i$ of a measure $X$ is scaled to a unit interval via
$$
\bar{X}_i := \frac{X_i - X_{\min}}{X_{\max} - X_{\min}}, \tag{36}
$$
where $\bar{X}_i$ is the normalized sample point. $X_{\min}$ and $X_{\max}$ are the minimum and maximum values of the measure $X$ in the training data set, such that all the different types of data used in this paper (e.g. strain, stress, etc.) are normalized within the range $[0, 1]$. After scaling all the measures involved in the training of the neural networks to the unit interval, no further fine-tuning of the multi-objective weight parameters present in the loss functions was necessary for convergence.
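The rescaling in Eq. (36) amounts to a min-max normalization fitted on the training set. A minimal sketch is given below; the sample values are made up for illustration, and the inverse map (used to return predictions to physical units) is our addition rather than something stated in the text.

```python
def minmax_fit(samples):
    """Return (X_min, X_max) of a measure over the training set."""
    return min(samples), max(samples)


def minmax_scale(x, x_min, x_max):
    """Eq. (36): map a sample to the unit interval."""
    return (x - x_min) / (x_max - x_min)


def minmax_unscale(x_bar, x_min, x_max):
    """Inverse of Eq. (36): map a normalized value back to physical units."""
    return x_bar * (x_max - x_min) + x_min


# Hypothetical stress component samples in GPa.
stress_samples = [-7.0, -3.5, 0.0, 1.0]
s_min, s_max = minmax_fit(stress_samples)
scaled = [minmax_scale(s, s_min, s_max) for s in stress_samples]
recovered = [minmax_unscale(s, s_min, s_max) for s in scaled]
```

Since the bounds are taken from the training data only, unseen samples may fall slightly outside $[0, 1]$ after scaling.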
6.2 Validation of the constitutive responses
In this section, we validate the neural network predicted constitutive response against MD simulation data as well as β-HMX elastic coefficients from the literature. We also monitor the learned physical properties of the trained models, such as the strong ellipticity, the energy growth, and the anisotropy index.
6.2.1 Validation against unseen MD simulations
We validate the predictive performance of the learned models against unseen MD simulation loading paths. The neural network architectures considered in this section are the energy conjugate pair $\mathbf{S}$-$\mathbf{E}$ model ($\mathcal{M}_1$) and the energy conjugate pair $\mathbf{P}$-$\mathbf{F}$ model ($\mathcal{M}_2$).
[Figure 8: three panels (a)-(c). Horizontal axes: $E_{11}$, $E_{22}$, $E_{33}$, respectively. Vertical axis: $S_{ij}$ ($\partial\psi / \partial E_{ij}$) (GPa). Legend: $S_{11}$, $S_{22}$, $S_{33}$, $S_{12}$, $S_{23}$, $S_{13}$.]
Fig. 8: Comparison of the predicted 2nd Piola-Kirchhoff stress response against three uniaxial deformation MD simulations for the conjugate pair $\mathbf{S}$-$\mathbf{E}$ model ($\mathcal{M}_1$). (a) Uniaxial compressive and tensile deformation along the $x_1$ axis. (b) Uniaxial compressive and tensile deformation along the $x_2$ axis. (c) Uniaxial compressive and tensile deformation along the $x_3$ axis.
[Figure 9: three panels (a)-(c). Horizontal axes: $E_{12}$, $E_{23}$, $E_{13}$, respectively. Vertical axis: $S_{ij}$ ($\partial\psi / \partial E_{ij}$) (GPa). Legend: $S_{11}$, $S_{22}$, $S_{33}$, $S_{12}$, $S_{23}$, $S_{13}$.]
Fig. 9: Comparison of the predicted 2nd Piola-Kirchhoff stress response against three shear MD simulations for the conjugate pair $\mathbf{S}$-$\mathbf{E}$ model ($\mathcal{M}_1$). (a) Shear tests for positive and negative directions along the $\mathbf{e}_1$-$\mathbf{e}_2$ direction. (b) Shear tests for positive and negative directions along the $\mathbf{e}_2$-$\mathbf{e}_3$ direction. (c) Shear tests for positive and negative directions along the $\mathbf{e}_1$-$\mathbf{e}_3$ direction.
The stress predictions of the networks against three uniaxial strains along the axes $x_1$, $x_2$, and $x_3$ are shown in Fig. 8 and Fig. 11. All the symmetric stress tensor components are plotted against the main loading direction of the MD simulation experiment. The predictions are compared against the raw MD simulation data before the filtering pre-processing described in Section 2.4. The stress predictions for three pure shear MD experiments in the positive and negative $\mathbf{e}_1$-$\mathbf{e}_2$, $\mathbf{e}_2$-$\mathbf{e}_3$, and $\mathbf{e}_1$-$\mathbf{e}_3$ directions are shown in Fig. 9 and Fig. 12. Finally, the stress predictions for three biaxial compression tests along the $x_1$ and $x_2$, $x_2$ and $x_3$, and $x_1$ and $x_3$ axes are shown in Fig. 10 and Fig. 13. It is noted that the stress fluctuations in the MD shear data appear to have a larger magnitude than those of the axial simulations. However, the magnitude of the fluctuations of the stress components is similar across all simulations; it appears to be larger in the shear simulations due to the smaller scale of the stress response.
[Figure 10: three panels (a)-(c). Horizontal axes: $E_{11}$, $E_{22}$, $E_{11}$, respectively. Vertical axis: $S_{ij}$ ($\partial\psi / \partial E_{ij}$) (GPa). Legend: $S_{11}$, $S_{22}$, $S_{33}$, $S_{12}$, $S_{23}$, $S_{13}$.]
Fig. 10: Comparison of the predicted stress response against three biaxial MD simulations for the energy conjugate pair $\mathbf{S}$-$\mathbf{E}$ model ($\mathcal{M}_1$). (a) Biaxial compression along the $x_1$ and $x_2$ axes. (b) Biaxial compression along the $x_2$ and $x_3$ axes. (c) Biaxial compression along the $x_1$ and $x_3$ axes.
Both models are able to accurately capture the shear behavior of β-HMX, which differs greatly in the positive vs. negative directions as seen in Fig. 9 and Fig. 12. The shear stress response of the material appears to be highly non-linear and exhibits directional dependence. This behavior is not expected to be captured by a material model with an invariant formulation, as it requires specific treatment of the shear response along different directions to replicate the direction-dependent behavior even qualitatively. Here, however, a more general representation of the material using the full second-order stress and strain tensors allows the neural network to recover this behavior automatically and rather precisely.
Remark 2. As seen in Fig. 11, the predictions of the models M2 and M4 are very close. We have also examined the other predictions and the discrepancies of the stress predictions inferred from M2 and M4 are also very minor. Hence, we do not include those comparisons in the paper for brevity. In the following sections, the validation tests of the energy conjugate pair $P$-$F$ models' properties will be performed on the model M2, as the behavior of the models M2, M3, and M4 was observed to be similar.
6.2.2 Validation of strong ellipticity

In this section, we perform the strong ellipticity tests as described in Section 5.2 on the trained neural network. The neural network architecture used in this comparison is model M2, which uses the energy conjugate pair $P$-$F$. This model was chosen for the convenience of obtaining the fourth-order elasticity tensor needed for the acoustic tensor checks.
This check is performed by initially predicting the fourth-order elasticity tensor at a specific deformation gradient level. We sample 1000 unit vectors $N$ on the unit sphere $S^2$ in spherical coordinates by sampling the polar angle $\phi \in [0, \pi]$ and the azimuthal angle $\theta \in [0, \pi]$ in a uniform grid, following Eq. (28). These vectors can be used to construct 1000 initial acoustic tensors following Eq. (20). The acoustic tensors will be used as the initial grid landscape for a gradient-free optimizer set to discover the minimum values of the three strong ellipticity tests described in Eqs. (29), (30), and (31). We use a Hill Climbing gradient-free optimizer search, using the library implemented by Blanke [60], to find the pair of $(\phi, \theta)$ that minimizes the strong ellipticity check values. The Hill Climbing algorithm performs 10000 iterations of the search per test to discover the minimum value of the check, which in most tests was obtained within the first 5000 iterations of the search.
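The sampling and search described above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the implementation used in the paper: the acoustic tensor is assumed to take the common form $A_{ik} = N_J C_{iJkL} N_L$ (Eq. (20) is not reproduced here), only the determinant is used as a check value, an isotropic tangent stands in for the network prediction, and a hand-written stochastic hill climber replaces the library of Blanke [60]. All function names are hypothetical.

```python
import numpy as np


def unit_vector(phi, theta):
    """Unit vector N from polar angle phi and azimuthal angle theta."""
    return np.array([np.sin(phi) * np.cos(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(phi)])


def acoustic_tensor(C, N):
    """Assumed form A_ik = N_j C_ijkl N_l for a (3, 3, 3, 3) tangent C."""
    return np.einsum("j,ijkl,l->ik", N, C, N)


def det_check(C, phi, theta):
    """Check value: determinant of the acoustic tensor for N(phi, theta)."""
    return np.linalg.det(acoustic_tensor(C, unit_vector(phi, theta)))


def hill_climb_min(f, x0, step=0.05, n_iter=2000, seed=0):
    """Stochastic hill climbing: keep a random neighbour only if it lowers f."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    fx = f(x[0], x[1])
    for _ in range(n_iter):
        cand = np.clip(x + rng.normal(scale=step, size=2), 0.0, np.pi)
        fc = f(cand[0], cand[1])
        if fc < fx:
            x, fx = cand, fc
    return x, fx


def min_acoustic_det(C, n_phi=25, n_theta=40):
    """Evaluate a uniform 25 x 40 grid (1000 directions), then refine the best one."""
    phis = np.linspace(0.0, np.pi, n_phi)
    thetas = np.linspace(0.0, np.pi, n_theta)
    grid = [(p, t) for p in phis for t in thetas]
    values = [det_check(C, p, t) for p, t in grid]
    start = grid[int(np.argmin(values))]
    return hill_climb_min(lambda p, t: det_check(C, p, t), start)


def isotropic_tangent(lam, mu):
    """Isotropic linear-elastic tangent, a stand-in for the predicted tangent."""
    I = np.eye(3)
    return (lam * np.einsum("ij,kl->ijkl", I, I)
            + mu * (np.einsum("ik,jl->ijkl", I, I)
                    + np.einsum("il,jk->ijkl", I, I)))


angles, min_det = min_acoustic_det(isotropic_tangent(lam=1.0, mu=1.0))
print("check passed:", min_det > 0.0)
```

In practice the stand-in tangent would be replaced by the tangent predicted by the trained network at the deformation gradient of interest, and the same search would be repeated for each of the three check values.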
The predicted strong ellipticity test and the corresponding optimizer search for the minimum values are demonstrated in Fig. 14 and Fig. 15 for two different elasticity tensors. In Fig. 14, we show the ellipticity test results for the elasticity tensor close to the relaxed reference state, that is, when the deformation gradient is the identity tensor. The neural network passes all three ellipticity tests, discovering the minimum of all tests to be greater than zero in the unit vector search space. In Fig. 15, we show the first strain state of a biaxial compression simulation along the $x_1$ and $x_2$ axes where the strong ellipticity test fails: the acoustic tensor determinant for Eq. (29) is found to be less than zero for the first time (compression of approximately 8% along the $x_1$ and $x_2$ axes).
Fig. 11: Comparison of the predicted 1st Piola-Kirchhoff stress response against three uniaxial deformation MD simulations for the energy conjugate pair $P$-$F$ models M2 and M4. (a,d) Uniaxial compressive and tensile deformation along the $x_1$ axis for models M2 and M4, respectively. (b,e) Uniaxial compressive and tensile deformation along the $x_2$ axis for models M2 and M4, respectively. (c,f) Uniaxial compressive and tensile deformation along the $x_3$ axis for models M2 and M4, respectively.
[Plots: stress components $P_{11}$, $P_{22}$, $P_{33}$, $P_{12}$, $P_{23}$, $P_{13}$, $P_{21}$, $P_{32}$, $P_{31}$, with $P_{iJ}$ $(\partial\psi/\partial F_{iJ})$ in GPa, against (a,d) $F_{11}$, (b,e) $F_{22}$, (c,f) $F_{33}$.]

Fig. 12: Comparison of the predicted 1st Piola-Kirchhoff stress response against three shear MD simulations for the energy conjugate pair $P$-$F$ model (M2). (a) Shear tests along the asymmetric positive and negative $e_1 e_2$ direction. (b) Shear tests along the asymmetric positive and negative $e_2 e_3$ direction. (c) Shear tests for the asymmetric positive and negative $e_1 e_3$ direction.
[Plots: the nine components $P_{iJ}$ $(\partial\psi/\partial F_{iJ})$ in GPa against (a) $F_{12}$, (b) $F_{23}$, (c) $F_{13}$.]
Fig. 13: Comparison of the predicted 1st Piola-Kirchhoff stress response against three biaxial MD simulations for the energy conjugate pair $P$-$F$ model (M2). (a) Biaxial compression along the $x_1$ and $x_2$ axes. (b) Biaxial compression along the $x_2$ and $x_3$ axes. (c) Biaxial compression along the $x_1$ and $x_3$ axes.
[Plots: the nine components $P_{iJ}$ $(\partial\psi/\partial F_{iJ})$ in GPa against (a) $F_{11}$, (b) $F_{22}$, (c) $F_{11}$.]

Given that the machine learning generated constitutive responses match very well with the filtered MD simulations (as shown in Figs. 8-13), the acoustic tensor losing positive definiteness is an indication of unstable elastic responses corresponding to the shear mode along the $N$ direction, which could be potentially physical (C. Picu, personal communication, 2021).
6.2.3 Validation of energy growth for extrapolated predictions

In this section, we perform the validation check described in Section 5.3 to monitor if the behavior of the predicted energy functional degenerates for very large deformations. To test that, we impose deformation gradients on the neural network model spanning several orders of magnitude with $\det F$ decreasing towards zero. The test is performed on the neural network architecture for the energy conjugate pair $P$-$F$ (model M2). As the Jacobian decreases, the energy functional values are expected to increase monotonically (Eq. (33)). Therefore, we apply a sequence of volumetric compression deformation gradients with the Jacobian approaching zero and plot the energy against the increasing pure volumetric deformation. The results are shown in Fig. 16.
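The growth check lends itself to a short numerical sketch. The example below is illustrative only and rests on several assumptions: the volumetric deformation gradients are taken as $F = J^{1/3} I$, the Jacobian is swept on a logarithmic scale from 1 down to $10^{-6}$, and a compressible neo-Hookean energy stands in for the trained network. The function names are hypothetical, and the check only tests monotonic growth over the sampled range; it cannot by itself establish that the energy diverges as $\det F \to 0$.

```python
import numpy as np


def volumetric_F(J):
    """Pure volumetric deformation gradient with det F = J."""
    return J ** (1.0 / 3.0) * np.eye(3)


def energy_grows_as_J_vanishes(W, J_min=1e-6, n=61):
    """Return (flag, Js, energies); flag is True if W(F) increases strictly
    while det F decreases from 1 to J_min on a log-spaced sweep."""
    Js = np.logspace(0.0, np.log10(J_min), n)
    energies = np.array([W(volumetric_F(J)) for J in Js])
    return bool(np.all(np.diff(energies) > 0.0)), Js, energies


def neo_hookean(F, mu=1.0, lam=1.0):
    """Compressible neo-Hookean energy, a stand-in for the learned functional."""
    J = np.linalg.det(F)
    I1 = np.trace(F.T @ F)
    return 0.5 * mu * (I1 - 3.0) - mu * np.log(J) + 0.5 * lam * np.log(J) ** 2


ok, Js, energies = energy_grows_as_J_vanishes(neo_hookean)
print("monotonic growth over the sweep:", ok)
```

Replacing `neo_hookean` with a call to the learned energy functional would reproduce the kind of sweep plotted in Fig. 16.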
Note that β-HMX may exhibit plastic yielding or damage under high pressure. When this occurs, a purely elastic response is no longer physically feasible. Yet the continuum mechanics theory validation does require that the stored energy approach infinity as a finite volume of HMX crystal collapses into a point [61]. This does not happen in our trained neural network model even though the growth rate within the training data interval seems reasonable.
A similar extrapolation issue has been investigated previously by Versino et al. [70], in which the symbolic regression requires an additional artificial data point in order to prevent an incorrect prediction of softening. Presumably, a similar treatment can also be applied here, either by adding an artificial data point with a very large energy at the supposedly singular point or by rigorously enforcing the singularity in the learned energy functional. At this point, robust ways to introduce singular data into the neural network and the formulation of the loss function are not clear, but we intend to examine them in future studies. Nevertheless, the results do reveal that the energy functional trained by the neural network may only be valid within the interval of the data and that any extrapolated results outside of the data interval must be used with caution, even if a significant number of physical constraints (e.g. material symmetry) have already been applied as auxiliary objectives for the supervised learning.
Fig. 14: Validation of strong ellipticity conditions (a), (b), (c) with the criteria in Eq. (25), Eq. (26), and Eq. (27), respectively, for an elasticity tensor close to the reference strain state. The unit vectors were sampled from the surface of a unit sphere to perform the validation via a Hill Climbing gradient-free optimizer search (d, e, f). The minimum value of the condition value discovered by the optimizer is marked.

6.2.4 Stress-dependent anisotropy of HMX crystal

In this section, we recover the predicted material response anisotropy index as described in Section 5.4.1 to monitor the evolution of the material's degree of anisotropy. The anisotropy index is acquired for the neural network architecture of the energy conjugate pair $P$-$F$ (model M2). To obtain the index, the fourth-order elasticity tensor is predicted at different deformation gradients along a prescribed loading path. We then sample 1000 unit vectors $N$ per deformation gradient in a uniform grid, following Eq. (28), by sampling the polar angle $\phi \in [0, \pi]$ and the azimuthal angle $\theta \in [0, \pi]$. For each elasticity tensor, 1000 initial acoustic tensors are constructed according to Eq. (20). The Hill Climbing algorithm performs 10000 iterations of the search per deformation gradient sample to discover the minimum $v_1^2$ and the maximum $v_2^2$ values of each acoustic tensor. The anisotropy index $AI$ is then calculated using Eq. (35). The anisotropy index calculated for three loading paths is demonstrated in Fig. 17.
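A simplified version of this calculation can be sketched as below. It is not the procedure of the paper: it assumes the acoustic tensor form $A_{ik} = N_J C_{iJkL} N_L$, assumes that the two smallest eigenvalues of the acoustic tensor correspond to the shear modes (so that the density cancels in the ratio), takes the index as the ratio of the largest to the smallest of these eigenvalues, and scans a fixed 25 x 41 direction grid instead of refining with Hill Climbing. A cubic tangent with hypothetical constants stands in for the network prediction.

```python
import numpy as np


def unit_vector(phi, theta):
    """Unit vector N from polar angle phi and azimuthal angle theta."""
    return np.array([np.sin(phi) * np.cos(theta),
                     np.sin(phi) * np.sin(theta),
                     np.cos(phi)])


def shear_eigenvalues(C, N):
    """Two smallest acoustic-tensor eigenvalues, assumed to be the shear modes."""
    A = np.einsum("j,ijkl,l->ik", N, C, N)
    eig = np.linalg.eigvalsh(0.5 * (A + A.T))  # ascending order
    return eig[0], eig[1]


def anisotropy_index(C, n_phi=25, n_theta=41):
    """Largest over smallest shear eigenvalue across a uniform direction grid."""
    v_min, v_max = np.inf, -np.inf
    for phi in np.linspace(0.0, np.pi, n_phi):
        for theta in np.linspace(0.0, np.pi, n_theta):
            lo, hi = shear_eigenvalues(C, unit_vector(phi, theta))
            v_min, v_max = min(v_min, lo), max(v_max, hi)
    return v_max / v_min


def cubic_tangent(c11, c12, c44):
    """Cubic elasticity tensor (illustrative constants, not HMX data)."""
    I = np.eye(3)
    C = (c12 * np.einsum("ij,kl->ijkl", I, I)
         + c44 * (np.einsum("ik,jl->ijkl", I, I)
                  + np.einsum("il,jk->ijkl", I, I)))
    for a in range(3):
        C[a, a, a, a] += c11 - c12 - 2.0 * c44
    return C


print(anisotropy_index(cubic_tangent(3.0, 1.0, 2.0)))
```

For an isotropic tangent the index equals one, and it grows as the fastest and slowest shear modes separate, which is the trend tracked along each loading path.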
Interestingly, for all three uniaxial compressive deformation cases, the Ledbetter-Migliori anisotropy index, which is the ratio of the fastest and slowest shear wave speeds of the β-HMX crystal, tends to increase significantly. The most significant changes occur when the uniaxial deformation is more than 8%. In all three cases, the elastic degree of anisotropy is not very pronounced when the deformation is small. However, as the deformation grows, the anisotropy index jumps from less than 10 to more than 40 for the deformation along the $x_1$ direction, and by more than two orders of magnitude in the $x_2$ and $x_3$ directions. These results signify the importance of capturing the evolving anisotropy of the HMX materials.
6.2.5 Comparisons with literature calculations on elasticity

We now provide the coefficients of the elastic tangents for the $S$-$E$ and $P$-$F$ conjugate pairs obtained from the trained neural network energy functionals, models M1 and M2, and compare them with the elasticity tangent for the $\sigma$-$e$ conjugate pair previously reported by Pereverzev and Sewell [2].
In the present work, the strain measure is obtained differently, in the sense that the models in this paper introduce only one reference configuration such that $F = I$ when the Cauchy pressure is at $10^{-4}$ GPa. Meanwhile, the strain measure for β-HMX in [2] is reset at each reference pressure where the MD simulation begins. This difference is minor for the atmospheric pressure case, for which the geometrical nonlinearity is insignificant, but may lead to significant differences in the values of the elastic tangent coefficients for high-pressure cases. Note that, due to the anisotropic nature of the elastic responses, the imposed Cauchy pressure may also lead to isochoric deformation due to volumetric-deviatoric coupling. As such, the coordinates of the reference and current configurations, $X_I$ and $x_i$, are not necessarily co-axial. Hence, a direct comparison of the values of the coefficients is not productive.

Fig. 15: (a) Loss of the strong ellipticity condition in the prediction of a biaxial compression test along the $x_1$ and $x_2$ axes. The unit vectors were sampled from the surface of a unit sphere to perform the validation via a Hill Climbing gradient-free optimizer search (b). The minimum value of the condition value discovered by the optimizer is marked.

Fig. 16: Results for the growth condition check, Eq. (33). The predicted energy is monotonically increasing as $\det F$ approaches 0. The minimum $\det F$ in the training data set is also marked.
[Plot: $W(F)$ against $\det F$ on a logarithmic axis.]
Furthermore, discrepancies may also be caused by the different data de-noising processes employed in Pereverzev and Sewell [2]. In this paper, we employ a de-noising algorithm to filter out the high-frequency oscillations in the constitutive responses before the supervised learning is conducted, whereas Pereverzev and Sewell [2] employ a finite-difference approximation with a sufficiently large strain increment to calculate the elasticity tensor.

Fig. 17: Anisotropy index $AI$ calculated for the energy conjugate pair $P$-$F$ model (M2) for (a) a uniaxial compressive deformation test along the $x_1$ axis, (b) a uniaxial compressive deformation test along the $x_2$ axis, and (c) a biaxial compression test along the $x_1$ and $x_2$ axes.
[Plots: anisotropy index $AI$ against (a) $F_{11}$, (b) $F_{22}$, (c) $F_{11}$.]
Nevertheless, a comparison of elasticity tangent operators from previous MD simulations, as well as those obtained for different conjugate stress-strain pairs, does indicate the significance of geometric nonlinearity in the material responses and the importance of taking it into consideration in numerical simulations.
In the MD simulation, the crystal cell of β-HMX is first equilibrated at a target temperature and pressure through an isochoric-isothermal (NVT) simulation and then strains are imposed in different directions to obtain the stress information for the differentiation. The comparison between the predicted and literature-reported elastic coefficients at a temperature of 300 K and pressures of $10^{-4}$ GPa and 5 GPa is demonstrated in Tables 3 and 4, respectively, for the neural network models M1 and M2 as described in Section 6.1. It is noted that for the energy conjugate pair $P$-$F$ (model M2), the full tangent requires a $(9 \times 9)$ matrix to represent it in the Voigt notation.
Table 3: Comparison of the predicted β-HMX elastic coefficients (GPa) at pressure $10^{-4}$ GPa for the energy conjugate pair $S$-$E$ (model M1) and the energy conjugate pair $P$-$F$ (model M2) to the ones reported by Pereverzev and Sewell [2] for (300 K, $10^{-4}$ GPa). Note that $C^{PF}$ is not symmetric but the additional terms are not shown for brevity.

Dij   Cijkl   Model M1 (C^SE)   Model M2 (C^PF)   Pereverzev and Sewell [2] (C^σe)
D11   C1111    21.354            25.861            22.97
D22   C2222    22.149            18.092            22.62
D33   C3333    21.314            21.627            21.67
D44   C1212     8.616             5.8335            8.645
D55   C2323    10.982             8.225            10.407
D66   C1313     9.497            10.078             9.527
D12   C1122     8.789             6.898             9.2
D13   C1133    12.348            12.7828           12.32
D23   C2233    15.913            13.375            12.37
D15   C1123    -0.998            -0.584            -0.43
D25   C2223     4.247            -0.877             4.47
D35   C3323     2.192            -0.792             1.84
D46   C1213     2.484             1.571             2.248
Table 3 shows the results of different elastic tangents obtained from the neural network calculation and those obtained from Pereverzev and Sewell [2]. While there are differences among the three tangents, they are relatively minor. This is expected as the geometrical nonlinearity is not significant.
Table 4: Comparison of the predicted β-HMX elastic coefficients (GPa) at pressure 5 GPa for the energy conjugate pair $S$-$E$ (model M1) and the energy conjugate pair $P$-$F$ (model M2) with the ones reported in Pereverzev and Sewell [2]. Note that $C^{PF}$ is not symmetric but the additional terms are not shown for brevity.

Dij   Cijkl   Model M1 (C^SE)   Model M2 (C^PF)   Pereverzev and Sewell [2] (C^σe)
D11   C1111    80.157            87.556            87.71
D22   C2222    68.666            53.441            67.08
D33   C3333    71.453            72.011            62.11
D44   C1212     0.358             0.033            19.461
D55   C2323     3.71              2.813            34.08
D66   C1313    -2.999             1.736            19.662
D12   C1122    44.048            26.828            36.93
D13   C1133    46.187            32.423            52.95
D23   C2233    55.267            52.603            46.49
D15   C1123    -0.939             0.639           -11.32
D25   C2223    10.358            -6.108            11.1
D35   C3323    -0.421            -6.120             2.48
D46   C1213     5.546            -4.082             6.06
Table 4, on the other hand, shows a more significant difference in the numerical values of the coefficients for different energy-conjugated pairs. This is consistent with the derivation in Section 5.1 where the deformation gradient at this point is no longer infinitesimal and the incorporation of the geometrical nonlinearity is necessary to capture the elastic constitutive responses properly.
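To illustrate why the two tangents agree near the reference state but diverge under finite compression, the sketch below uses the standard chain-rule relation $A_{iJkL} = \delta_{ik} S_{JL} + F_{iM} C_{MJNL} F_{kN}$ between the $P$-$F$ tangent $A$ and the $S$-$E$ tangent $C$. This is a textbook identity that assumes $C$ has the usual minor symmetries; whether it matches the exact form written in Section 5.1 is an assumption. A St. Venant-Kirchhoff material with hypothetical constants stands in for the learned energy, and all function names are illustrative.

```python
import numpy as np


def green_lagrange(F):
    """Green-Lagrange strain E = (F^T F - I) / 2."""
    return 0.5 * (F.T @ F - np.eye(3))


def pf_tangent(F, S, C_SE):
    """A_iJkL = delta_ik S_JL + F_iM C_MJNL F_kN (stress term plus pushed tangent)."""
    geometric = np.einsum("ik,jl->ijkl", np.eye(3), S)
    material = np.einsum("im,mjnl,kn->ijkl", F, C_SE, F)
    return geometric + material


def svk_tangent(lam, mu):
    """Constant S-E tangent of a St. Venant-Kirchhoff solid (illustration only)."""
    I = np.eye(3)
    return (lam * np.einsum("ij,kl->ijkl", I, I)
            + mu * (np.einsum("ik,jl->ijkl", I, I)
                    + np.einsum("il,jk->ijkl", I, I)))


def svk_pk1(F, C_SE):
    """First Piola-Kirchhoff stress P = F S with S = C_SE : E."""
    S = np.einsum("ijkl,kl->ij", C_SE, green_lagrange(F))
    return F @ S


C_SE = svk_tangent(lam=12.0, mu=9.0)   # hypothetical constants in GPa
F = np.diag([0.92, 0.92, 1.0])         # roughly 8% biaxial compression
S = np.einsum("ijkl,kl->ij", C_SE, green_lagrange(F))
A = pf_tangent(F, S, C_SE)
print("C_1111 (S-E):", C_SE[0, 0, 0, 0], " A_1111 (P-F):", A[0, 0, 0, 0])
```

At $F = I$ with vanishing stress the two tangents coincide, which is in line with the small differences in Table 3; once the stress term and the deformation gradient become significant, the coefficients separate, as in Table 4.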
7 Conclusions

This paper introduces a mechanistic machine learning framework to infer an anisotropic hyperelastic energy functional from molecular dynamics simulations for β-HMX. Conventionally, machine learning constitutive laws are often formulated to match experimental data. As such, the discrepancy between experimental data and the predictions is often the only term in the loss function for training and validation. Here we formulate the training of the hyperelastic model not only to minimize the discrepancy with the data but also to introduce additional objectives that ensure the learned hyperelastic model obeys the physical constraints. To ensure the robustness of the predictions, we also introduce a set of validation tests to examine the admissibility (e.g. preserving material symmetry, obeying growth conditions) and stability (convexity, strong ellipticity) of the constitutive responses generated from the trained neural networks. With the usage of Sobolev training and automatic differentiation to facilitate the training of constitutive laws, the resultant models exhibit highly accurate predictions within the training data range. These treatments are shown to be effective in improving the accuracy and robustness of the predictions, while the theoretical validation may provide the much-needed post hoc interpretability of the neural network constitutive laws to understand the properties of the machine learning models. More importantly, the validation exercise may provide a reliable way to reveal the weaknesses of the models and safeguard against cherry-picking interpretation, which could be a key ingredient to make black-box neural network predictions more trustworthy.
8 Data availability statements

The code used to conduct the validation tests will be available in a GitHub repository upon the publication of this manuscript. The datasets generated and/or analyzed during the current study are available from the authors upon reasonable request.
9 Acknowledgements

The authors are grateful for the insightful feedback and constructive suggestions given by the three reviewers. Fruitful discussions with Andrey Pereverzev and Bahador Bahmani are gratefully acknowledged. The efforts and labor hours are primarily supported by the Air Force Office of Scientific Research under grant contract FA9550-19-1-0318, with additional support provided to WCS and NNV from the NSF CAREER grant at the National Science Foundation under grant contracts CMMI-1846875 and OAC-1940203, and high-performance computing resources provided by the Air Force Office of Scientific Research under grant contract FA9550-21-1-0027.
References

1. Hooks DE, Ramos KJ, Bolme CA, Cawkwell MJ. Elasticity of crystalline molecular explosives. Propellants, Explosives, Pyrotechnics 2015; 40(3): 333–350.
2. Pereverzev A, Sewell T. Elastic Coefficients of β-HMX as Functions of Pressure and Temperature from Molecular Dynamics. Crystals 2020; 10(12): 1123.
3. Borja RI. Plasticity: modeling & computation. Springer Science & Business Media. 2013.
4. Bryant EC, Sun W. A mixed-mode phase field fracture model in anisotropic rocks with consistent kinematics. Computer Methods in Applied Mechanics and Engineering 2018; 342: 561–584.
5. Ma R, Sun W, Picu CR. Atomistic-model informed pressure-sensitive crystal plasticity for crystalline HMX. International Journal of Solids and Structures 2021; 232: 111170.
6. Cady HH, Smith L. Studies on the Polymorphs of HMX. 2652. Los Alamos Scientific Laboratory of the University of California. 1962.
7. Cady HH, Larson AC, Cromer DT. The crystal structure of α-HMX and a refinement of the structure of β-HMX. Acta crystallographica 1963; 16(7): 617–623.
8. Das P, Zhao P, Perera D, Sewell T, Udaykumar HS. Molecular dynamics-guided material model for the simulation of shock-induced pore collapse in β-octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine (β-HMX). J. Appl. Phys. 2021; 130(8): 085901.
9. Marsden JE, Hughes TJ. Mathematical foundations of elasticity. Courier Corporation. 1994.
10. Ogden RW. Non-linear elastic deformations. Courier Corporation. 1997.
11. Holzapfel GA, Ogden RW. On planar biaxial tests for anisotropic nonlinearly elastic solids. A continuum mechanical framework. Mathematics and mechanics of solids 2009; 14(5): 474–489.
12. Holzapfel GA, Sommer G, Regitnig P. Anisotropic mechanical properties of tissue components in human atherosclerotic plaques. J. Biomech. Eng. 2004; 126(5): 657–665.
13. Latorre M, Montáns FJ. Anisotropic finite strain viscoelasticity based on the Sidoroff multiplicative decomposition and logarithmic strains. Computational Mechanics 2015; 56(3): 503–531.
14. Clayton JD. Nonlinear mechanics of crystals. 177. Springer Science & Business Media. 2010.
15. Frankel AL, Jones RE, Swiler LP. Tensor basis gaussian process models of hyperelastic materials. Journal of Machine Learning for Modeling and Computing 2020; 1(1).
16. Wang J, Li T, Cui F, Hui CY, Yeo J, Zehnder AT. Metamodeling of constitutive model using Gaussian process machine learning. Journal of the Mechanics and Physics of Solids 2021: 104532.
17. Fuhg JN, Bouklas N. On physics-informed data-driven isotropic and anisotropic constitutive models through probabilistic machine learning and space-filling sampling. arXiv preprint arXiv:2109.11028 2021.
18. Ghaboussi J, Garrett Jr J, Wu X. Knowledge-based modeling of material behavior with neural networks. Journal of engineering mechanics 1991; 117(1): 132–153.
19. Lefik M, Schrefler BA. Artificial neural network as an incremental non-linear constitutive model for a finite element code. Computer methods in applied mechanics and engineering 2003; 192(28-30): 3265–3283.
20. Heider Y, Wang K, Sun W. SO(3)-invariance of informed-graph-based deep neural network for anisotropic elastoplastic materials. Computer Methods in Applied Mechanics and Engineering 2020; 363: 112875.
21. Frankel AL, Jones RE, Alleman C, Templeton JA. Predicting the mechanical response of oligocrystals with deep learning. Computational Materials Science 2019; 169: 109099.
22. Le B, Yvonnet J, He QC. Computational homogenization of nonlinear elastic materials using neural networks. International Journal for Numerical Methods in Engineering 2015; 104(12): 1061–1084.
23. Teichert GH, Natarajan A, Van der Ven A, Garikipati K. Machine learning materials physics: Integrable deep neural networks enable scale bridging by learning free energy functions. Computer Methods in Applied Mechanics and Engineering 2019; 353: 201–216.
24. Vlassis NN, Ma R, Sun W. Geometric deep learning for computational mechanics Part I: Anisotropic Hyperelasticity. Computer Methods in Applied Mechanics and Engineering 2020; 371: 113299.
25. Vlassis NN, Sun W. Sobolev training of thermodynamic-informed neural networks for interpretable elasto-plasticity models with level set hardening. Computer Methods in Applied Mechanics and Engineering 2021; 377: 113695.
26. Czarnecki WM, Osindero S, Jaderberg M, Świrszcz G, Pascanu R. Sobolev Training for Neural Networks. In: ; 2017; Long Beach, CA, USA.
27. Cuitino A, Ortiz M. A material-independent method for extending stress update algorithms from small-strain plasticity to finite plasticity with multiplicative kinematics. Engineering computations 1992.
28. Plimpton S. Fast Parallel Algorithms for Short-Range Molecular Dynamics. J. Comp. Phys. 1995; 117: 1.
29. Smith GD, Bharadwaj RK. Quantum Chemistry Based Force Field for Simulations of HMX. J. Phys. Chem. B 1999; 103: 3570.
30. Bedrov D, Smith GD, Sewell TD. Thermal Conductivity of Liquid Octahydro-1,3,5,7-Tetranitro-1,3,5,7-Tetrazocine (HMX) from Molecular Dynamics Simulations. Chem. Phys. Lett. 2000; 324: 64.
31. Kroonblawd MP, Mathew N, Jiang S, Sewell TD. A Generalized Crystal-Cutting Method for Modeling Arbitrarily Oriented Crystals in 3D Periodic Simulation Cells with Applications to Crystal-Crystal Interfaces. Comput. Phys. Commun. 2016; 207: 232.
32. Mathew N, Sewell T. Pressure-Dependent Elastic Coefficients of β-HMX from Molecular Simulations. Prop., Explos., Pyrotech. 2018; 43: 233.
33. Chitsazi R, Kroonblawd MP, Pereverzev A, Sewell TD. A Molecular Dynamics Simulation Study of Thermal Conductivity Anisotropy in β-Octahydro-1,3,5,7-Tetranitro-1,3,5,7-Tetrazocine (β-HMX). Model. Simul. Mater. Sc. 2020; 28: 025008.
34. Zhao P, Lee S, Sewell T, Udaykumar HS. Tandem Molecular Dynamics and Continuum Studies of Shock-Induced Pore Collapse in TATB. Propellants, Explos. Pyrotech. 2020; 45: 1.
35. Kroonblawd MP, Fried LE. High Explosive Ignition through Chemically Activated Nanoscale Shear Bands. Phys. Rev. Lett. 2020; 124: 206002.
36. Hockney RW, Eastwood JW. Computer Simulation Using Particles. New York, NY: Hilger. 1988.
37. Verlet L. Computer "Experiments" on Classical Fluids. I. Thermodynamical Properties of Lennard-Jones Molecules. Phys. Rev. 1967; 159: 98.
38. Swope WC, Andersen HC, Berens PH, Wilson KR. A Computer Simulation Method for the Calculation of Equilibrium Constants for the Formation of Physical Clusters of Molecules: Application to Small Water Clusters. J. Chem. Phys. 1982; 76: 637.
39. Nosé S. A Unified Formulation of the Constant-Temperature Molecular-Dynamics Methods. J. Chem. Phys. 1984; 81: 511.
40. Hoover WG. Canonical Dynamics: Equilibrium Phase-Space Distributions. Phys. Rev. A 1985; 31: 1695.
41. LAMMPS Molecular Dynamic Simulator. LAMMPS is available at http://lammps.sandia.gov.
42. Muti D, Bourennane S. Multidimensional filtering based on a tensor approach. Signal Processing 2005; 85(12): 2338–2353.
43. Klein D, Fernández M, Martin RJ, Neff P, Weeger O. Polyconvex anisotropic hyperelasticity with neural networks. 2021.
44. Vogiatzis GG, van Breemen LC, Theodorou DN, Hütter M. Free energy calculations by molecular simulations of deformed polymer glasses. Computer Physics Communications 2020; 249: 107008.
45. Eiland PF, Pepinsky R. The crystal structure of cyclotetramethylene tetranitramine. Zeitschrift für Kristallographie-Crystalline Materials 1954; 106(1-6): 273–298.
46. Simo J, Fox D, Rifai M. On a stress resultant geometrically exact shell model. Part II: The linear theory; computational aspects. Computer Methods in Applied Mechanics and Engineering 1989; 73(1): 53–92.
47. Mavrotas G. Effective implementation of the ε-constraint method in multi-objective mathematical programming problems. Applied mathematics and computation 2009; 213(2): 455–465.
48. Abraham R, Marsden JE, Ratiu T. Manifolds, tensor analysis, and applications. 75. Springer Science & Business Media. 2012.
49. Ortiz M, Radovitzky R, Repetto E. The computation of the exponential and logarithmic mappings and their first and second linearizations. International Journal for Numerical Methods in Engineering 2001; 52(12): 1431–1441.
50. Miehe C. Comparison of two algorithms for the computation of fourth-order isotropic tensor functions. Computers & structures 1998; 66(1): 37–43.
51. Ghaboussi J, Pecknold DA, Zhang M, Haj-Ali RM. Autoprogressive training of neural network constitutive models. International Journal for Numerical Methods in Engineering 1998; 42(1): 105–126.
52. Pernot S, Lamarque CH. Application of neural networks to the modelling of some constitutive laws. Neural Networks 1999; 12(2): 371–392.
53. Hoerig C, Ghaboussi J, Insana MF. Data-driven elasticity imaging using cartesian neural network constitutive models and the autoprogressive method. IEEE transactions on medical imaging 2018; 38(5): 1150–1160.
54. Fuhg JN, Marino M, Bouklas N. Local approximate Gaussian process regression for data-driven constitutive laws: Development and comparison with neural networks. arXiv preprint arXiv:2105.04554 2021.
55. Huang DZ, Xu K, Farhat C, Darve E. Learning constitutive relations from indirect observations using deep neural networks. Journal of Computational Physics 2020; 416: 109491.
56. Hartmann S, Neff P. Polyconvexity of generalized polynomial-type hyperelastic strain energy functions for near-incompressibility. International journal of solids and structures 2003; 40(11): 2767–2791.
57. Merodio J, Neff P. A note on tensile instabilities and loss of ellipticity for a fiber-reinforced nonlinearly elastic solid. Archives of Mechanics 2006; 58(3): 293–303.
58. Miehe C, Schröder J, Becker M. Computational homogenization analysis in finite elasticity: material and structural instabilities on the micro-and macro-scales of periodic composites and their interaction. Computer Methods in Applied Mechanics and Engineering 2002; 191(44): 4971–5005.
59. Mota A, Chen Q, Foulk III JW, Ostien JT, Lai Z. A Cartesian parametrization for the numerical analysis of material instability. International Journal for Numerical Methods in Engineering 2016; 108(2): 156–180.
60. Simon Blanke. Gradient-Free-Optimizers: Simple and reliable optimization with local, global, population-based and sequential techniques in numerical search spaces. https://github.com/SimonBlanke; since 2020.
61. Rosakis P, Simpson HC. On the relation between polyconvexity and rank-one convexity in nonlinear elasticity. Journal of elasticity 1994; 37(2): 113–137.
62. Li Z, Bradt RC. The single-crystal elastic constants of cubic (3C) SiC to 1000 C. Journal of materials science 1987; 22(7): 2557–2559.
63. Kube CM. Elastic anisotropy of crystals. AIP Advances 2016; 6(9): 095209.
64. Ranganathan SI, Ostoja-Starzewski M. Universal elastic anisotropy index. Physical Review Letters 2008; 101(5): 055504.
65. Ledbetter H, Migliori A. A general elastic-anisotropy measure. Journal of applied physics 2006; 100(6): 063516.
66. Chollet F, others. Keras. https://keras.io; 2015.
67. Abadi M, Agarwal A, Barham P, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 2015. Software available from tensorflow.org.
68. Dozat T. Incorporating nesterov momentum into adam. 2016.
69. Bishop CM, others. Neural networks for pattern recognition. Oxford university press. 1995.
70. Versino D, Tonda A, Bronkhorst CA. Data driven modeling of plastic deformation. Computer Methods in Applied Mechanics and Engineering 2017; 318: 981–1004.