Component-based machine learning paradigm
for discovering rate-dependent and
pressure-sensitive level-set plasticity models
Nikolaos N. Vlassis
Postdoctoral Research Scientist
Department of Civil Engineering and Engineering Mechanics
Columbia University
New York, New York 10027
Email: nnv2102@columbia.edu
WaiChing Sun ∗
Associate Professor
Department of Civil Engineering and Engineering Mechanics
Columbia University
New York, New York 10027
Email: wsun@columbia.edu
Conventionally, neural network constitutive laws for path-dependent elasto-plastic solids are trained via supervised learning performed on recurrent neural networks, with the time history of strain as input and the stress as output. However, training a neural network to replicate path-dependent constitutive responses requires significantly more data due to the path dependence. This demand for diverse and abundant accurate data, as well as the lack of interpretability to guide the data generation process, could become major roadblocks for engineering applications. In this work, we attempt to simplify these training processes and improve the interpretability of the trained models by breaking down the training of material models into multiple supervised machine learning programs for elasticity, initial yielding, and hardening laws that can be conducted sequentially. To predict the pressure sensitivity and rate dependence of the plastic responses, we reformulate the Hamilton-Jacobi equation such that the yield function is parametrized in a product space spanned by the principal stress, the accumulated plastic strain, and time. To test the versatility of the neural network meta-modeling framework, we conduct multiple numerical experiments where neural networks are trained and validated against (1) data generated from known benchmark models, (2) data obtained from physical experiments, and (3) data inferred from homogenizing sub-scale direct numerical simulations of microstructures. The neural network model is also incorporated into an offline FFT-FEM model to improve the efficiency of the multiscale calculations.
∗Corresponding author
1 Introduction
One of the century-old challenges for mechanics re-
searchers is to formulate plasticity theory that predicts the
relationship among strain history, plastic deformation, and
stress for materials governed by different deformation mech-
anisms. As plastic deformation accumulates, the dissipation
and plastic work may lead the yielding criteria to evolve, and
cause a variety of hardening/softening mechanisms to mani-
fest as the evolution of microstructures, such as twinning [1],
dislocation [2], pore collapse [3], void nucleation [4], and re-
arranging of particles [5]. Generations of scholars including
Coulomb [6], von Mises [7], Drucker and Prager [8] spent
decades to create new plasticity theories to incorporate new
causality relations and hypotheses for path-dependent ma-
terials. In stress-based plasticity theories, the yield function is expressed as a function of stress, internal variables, and the hardening laws (e.g., isotropic, kinematic, rotational, or mixed-mode hardening) as deduced from experimental observations and sub-scale micro-mechanical simulations (e.g., dislocation dynamics, molecular simulations).
In the past decades, new plasticity models have often been generated by modifying existing models with different expressions for the yield functions or the hardening laws. For instance, a search in Google Scholar for "modified Johnson-Cook model" and "modified Gurson model" reveals more than 632,000 and 8,350 results, respectively¹. A vast majority of these published works are dedicated to manually modifying the original model with new evolution laws or shapes of the yield surface that accompany new physics, new mate-
¹As of 7/8/2021.
Copyright © by ASME
rials, or new insights for more precise predictions. While this conventional workflow has led to numerous improvements in modeling, the more sophisticated models are often inherently harder to tune due to the expansion of the parametric space. This expansion not only makes it less feasible to determine the optimal mathematical expressions through a manual trial-and-error effort (even after the causality of the yielding and hardening is known [9]), but also requires solving more complicated inverse problems to identify the material parameters [10, 11, 12].
The recent success of deep neural networks has inspired a new trend where one may simply build a forecast engine by training a network with pairs of strain and stress histories [13, 14]. To replicate the history dependence of plastic deformation, earlier neural network approaches would employ strain and stress from multiple previous time steps to predict new stress states [15], whereas more recent works such as [16] and [17] employ recurrent neural networks, such as Long Short-Term Memory (LSTM) and Gated Recurrent neural networks, to introduce memory effects. The promise and expectation are that the continuous advancement of neural networks or other machine learning techniques might one day replace the modeling paradigm currently employed in the engineering industry with superior accuracy, efficiency, and robustness [18, 16]. However, the early success in the 90s and the recent resurrection of optimism about neural network predictions of constitutive responses have so far had a limited impact on industrial applications. This reluctance is not entirely unjustified. In fact, recent studies and workshops conducted by the US Department of Energy have cited the lack of domain awareness, interpretability, and robustness as some of the major technical barriers to the revolution of AI for scientific machine learning [19]. To facilitate changes in the industry, the trustworthiness of the predictions is necessary, and interpretability is a necessary condition to overcome these obstacles [20].
As such, our focus in this work is to explore the possibility of building an interpretable machine learning paradigm capable of serving as the interface between plasticity modelers and artificial intelligence. Our focus has thus shifted from solely using AI to make predictions to building AI to create plasticity theories that are compatible with domain knowledge, easily interpreted, and capable of improving not only the accuracy but also the robustness of existing models. We propose training elastoplasticity models through multiple supervised learning problems to generate the model components of knowledge separately (i.e., elastic stored energy, yield function, and hardening laws). The resultant model is the composition of these machine-generated knowledge components and is fully interpretable by modelers.
To achieve this goal, we have recast both rate-
independent and rate-dependent plasticity as a Hamilton-
Jacobi problem in a parametric space that is spanned by the
principal stress, the accumulated plastic strain, the plastic
strain rate and the real time. Meanwhile, the anisotropy of
plastic yielding is achieved by mapping the yield function
level sets of the same material under different orientations
through a supervised neural network training on a chosen
yield function projection basis. These treatments enable us to
create a general mathematical description for a large number
of existing plasticity models, including von Mises plasticity,
Drucker-Prager, and Cam-clay models combined with any
possible hardening law as merely special cases of the level
set plasticity model. Instead of solving the level set extension problem in the parametric space, we formulate a supervised learning problem to generate the constitutive updates from neural computation and to speed up the calculation compared to classical hierarchical multiscale computation [21, 22, 23, 24].
More importantly, this new AI-enabled framework rep-
resents a new paradigm shift where the goal of machine
learning has shifted from merely generating forecast engines
for mechanistic predictions to creating interpretable math-
ematical models that inherently obey physical laws with the
assistance of machine learning. The resultant model does not require recurrent neural networks, is easier to train, and provides more robust results for blind predictions.
2 Level set plasticity
The goal of this section is to extend the previously published work [25] to incorporate pressure sensitivity and rate dependence. The mathematical framework is very similar, except that the introduction of pressure dependence and rate dependence may lead to a higher-dimensional space for the Hamilton-Jacobi problems and therefore a higher demand on data. These new implications are highlighted in this section. Details of the initial implementation of the level set plasticity model can be found in [25]. The algorithm used to generate the yield function level sets and train the plasticity model neural networks is summarized in Algorithm 1.
Here we formulate the machine learning plasticity problem not as a single supervised learning task but by splitting the task into multiple smaller ones (predicting a stored elastic energy functional, predicting a yield surface, introducing a mapping for anisotropy), each constituting one neural network trained for a sub-goal. Complex behaviors can then be predicted by integrating these networks in a level set plasticity framework. This treatment not only improves the predictions but, more importantly, introduces a learning structure where the causal relations of the individual components are clearly defined without losing the generality of individual model predictions. As pointed out by recent work on interpretable machine learning [20] and [26], this component design helps promote both simulatability (the ability to internally simulate and reason about the overall predictions) and modularity (the ability to interpret portions of the predictions independently) of the AI-generated models.
We introduce a new concept of treating the yield surfaces in the parametric space composed of stress, accumulated plastic strain, and strain rate as a level set. We also discuss the importance of leveraging material symmetries to reduce the data demand for the supervised machine learning problem. Previously, [27] and [25] introduced NURBS and machine-learning-based interpolations, respectively, to generate yield surfaces with isotropic rate-independent plasticity. The key departure here is the new
Fig. 1. Universal training process for level set yield functions: 1) gather yield surface data points, 2) generate level set through the initialization
process, and 3) train neural network on the level set data (the zeroth level of the predicted level set is the approximated yield surface).
capacity to generalize the learning algorithm for anisotropic
rate-dependent/independent plasticity.
An important factor that dictates whether the training of the machine learning model with limited data can be successful is how material symmetry is leveraged. For example, the data collection can be significantly reduced for isotropic plasticity, as the principal strain and stress are co-axial. Another important aspect to consider is how to leverage material symmetry to select the coordinate system that represents the same data in the parametric space. For instance, a Euclidean space spanned by the values of the three principal stresses could be sufficient for an isotropic yield function and hence leads to a simpler supervised learning problem than those that use all six stress components. Furthermore, the choice of the coordinate system may affect how one plans to collect the data and vice versa. For instance, while it is possible to formulate the level set problem with the principal stresses as the Cartesian basis, i.e., $(\sigma_1, \sigma_2, \sigma_3)$, it might be even more efficient to consider the usage of the $(q, p)$ stress invariants for experimental data obtained from conventional triaxial tests, where only two distinct principal stresses can be controlled. In this latter case, the anisotropy and the dependence on all three invariants of the constitutive responses could not be sufficiently captured from the data gathered by this set of experiments alone; hence, increasing the dimensions of the parametric space for the elastic energy and the yield function would not be beneficial. In the numerical experiments we conducted, we adopt the cylindrical coordinates (see Eq. (1)) for the $\pi$-plane orthogonal to the hydrostatic axis where $\sigma_1 = \sigma_2 = \sigma_3$ (cf. [28]). This treatment enables us to detect any symmetry on the $\pi$-plane that might allow us to reduce the dimensions of the data and potentially simplify the training of the neural network with less data.
$$
\begin{bmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \end{bmatrix}
= R' R'' \begin{bmatrix} \sigma''_1 \\ \sigma''_2 \\ \sigma''_3 \end{bmatrix}
= \begin{bmatrix} \sqrt{2}/2 & 0 & \sqrt{2}/2 \\ 0 & 1 & 0 \\ -\sqrt{2}/2 & 0 & \sqrt{2}/2 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \sqrt{2/3} & 1/\sqrt{3} \\ 0 & -1/\sqrt{3} & \sqrt{2/3} \end{bmatrix}
\begin{bmatrix} \sigma''_1 \\ \sigma''_2 \\ \sigma''_3 \end{bmatrix}. \quad (1)
$$
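The change of basis in Eq. (1) can be sketched numerically as follows; this is a minimal illustration in which the matrix and function names are ours, not from the paper:

```python
import numpy as np

# Sketch of the change of basis in Eq. (1), mapping pi-plane coordinates
# (sigma''_1, sigma''_2, sigma''_3) to principal stresses and back.
R1 = np.array([[np.sqrt(2) / 2, 0.0, np.sqrt(2) / 2],
               [0.0, 1.0, 0.0],
               [-np.sqrt(2) / 2, 0.0, np.sqrt(2) / 2]])
R2 = np.array([[1.0, 0.0, 0.0],
               [0.0, np.sqrt(2 / 3), 1.0 / np.sqrt(3)],
               [0.0, -1.0 / np.sqrt(3), np.sqrt(2 / 3)]])
R = R1 @ R2  # Eq. (1): sigma = R' R'' sigma''

def to_pi_plane(principal_stress):
    """Invert Eq. (1): the rotations are orthogonal, so the inverse is R^T."""
    return R.T @ np.asarray(principal_stress)

# The hydrostatic axis sigma_1 = sigma_2 = sigma_3 maps onto the third
# pi-plane coordinate only, i.e., the axis the cylindrical (rho, theta, p)
# parametrization revolves around.
hydro = to_pi_plane([1.0, 1.0, 1.0])
```

A quick sanity check of the rotation is that a hydrostatic state has zero radius on the $\pi$-plane and lands entirely on the hydrostatic axis.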
Before translating the yield surface $f_\Gamma$ data points into a yield function level set, we reduce the dimensionality of the stress point $\mathbf{x}$ representation. In the case of isotropic pressure-dependent plasticity, we can reduce the stress representation from the six dimensions of $\mathbf{x}(\sigma_{11}, \sigma_{22}, \sigma_{33}, \sigma_{12}, \sigma_{23}, \sigma_{13})$ (already reduced from nine due to the balance of angular momentum) to an equivalent three-stress-invariant representation $\hat{\mathbf{x}}(p, \rho, \theta)$. In this representation, $p$ is the mean pressure, and $\rho$ and $\theta$ are the Lode's radius and angle, respectively.
The yield function is then postulated to be a signed distance function defined as:

$$
\phi(\hat{\mathbf{x}}, \xi, \dot{\xi}, t) =
\begin{cases}
d(\hat{\mathbf{x}}) & \text{outside } f_\Gamma \text{ (inadmissible stress)} \\
0 & \text{on } f_\Gamma \text{ (yielding)} \\
-d(\hat{\mathbf{x}}) & \text{inside } f_\Gamma \text{ (elastic region)}
\end{cases}, \quad (2)
$$
where $d(\hat{\mathbf{x}})$ is the minimum Euclidean distance between any point $\hat{\mathbf{x}}$ of the solution domain $\Omega$ of the stress space where the signed distance function is defined and the yield surface $f_\Gamma = \{\hat{\mathbf{x}} \in \mathbb{R}^3 \,|\, f(\hat{\mathbf{x}}) = 0\}$, defined as:

$$
d(\hat{\mathbf{x}}) = \min(|\hat{\mathbf{x}} - \hat{\mathbf{x}}_\Gamma|), \quad (3)
$$

where $\hat{\mathbf{x}}_\Gamma$ is the yielding stress for a given value of accumulated plastic strain $\xi$ and its rate $\dot{\xi}$ at time $t$. The plastic internal variable $\xi$ is monotonically increasing and represents
the history-dependent behavior of the material. The time $t$ signifies a snapshot of the current state of the level set $\phi$ for the current value of the plastic internal variable $\xi$ and its rate $\dot{\xi}$.
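The signed distance construction of Eqs. (2)-(3) can be sketched for a yield surface given as a point cloud; the circular toy surface and the inside/outside test below are illustrative assumptions of ours, not the paper's data:

```python
import numpy as np

# Minimal sketch of the signed distance d(x) in Eqs. (2)-(3), assuming the
# yield surface is available as a point cloud and that a separate
# inside/outside classifier (here: a radius comparison for a circular
# surface) labels the query point.

def signed_distance(x, surface_points, inside):
    """Return -d in the elastic region, +d outside, ~0 on the surface."""
    d = np.min(np.linalg.norm(surface_points - x, axis=1))  # Eq. (3)
    return -d if inside(x) else d

# Toy circular "yield surface" of radius 2 on the pi-plane:
theta = np.linspace(0.0, 2 * np.pi, 720, endpoint=False)
surface = np.stack([2 * np.cos(theta), 2 * np.sin(theta)], axis=1)
is_elastic = lambda x: np.linalg.norm(x) < 2.0

phi_in = signed_distance(np.array([0.0, 0.0]), surface, is_elastic)   # elastic
phi_out = signed_distance(np.array([3.0, 0.0]), surface, is_elastic)  # inadmissible
```

The stress origin sits a distance 2 inside the surface ($\phi = -2$), while a stress of radius 3 is a distance 1 outside it ($\phi = +1$), matching the sign convention of Eq. (2).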
2.1 Data augmentation through signed distance function generation
We can now pre-process the stress point cloud of the yield surface for a given $\xi$ and $\dot{\xi}$ by solving the Eikonal equation $\|\nabla_{\hat{\mathbf{x}}} \phi\| = 1$ while prescribing the signed distance function to be 0 at $\hat{\mathbf{x}} \in f_\Gamma$. For every stress point in the yield surface data set, we generate a discrete number of auxiliary points that construct a signed distance function. In the context of level set theory, this can be seen as solving the level set initialization problem. In the context of machine learning, the signed distance function construction can be interpreted as a method of data augmentation: a large number of auxiliary data samples where $f_\Gamma \neq 0$ are introduced to improve the training performance as well as the accuracy and robustness of both the learned function $f_\Gamma$ and, equally importantly, its stress gradient $\partial f_\Gamma / \partial \sigma_{ij}$. A schematic of the pre-processing of the yield surface data into a signed distance function is demonstrated in Fig. 1. The color is the value of the signed distance yield function: it is negative in the elastic region and positive in the inadmissible stress region. The material yields if the current stress is at a location where the value of the yield function equals zero. It is noted that the signed distance function has been selected as the preferred level set function due to the simplicity of the implementation; the yield function can be formulated on other level set function bases, the benefits of which will be considered in future work.
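For a single yield point at radius $\rho_i$, the radius-scaling augmentation (cf. step 2 of Algorithm 1) can be sketched as follows; the parameter values and function name are illustrative:

```python
import numpy as np

# Sketch of the radius-scaling data augmentation that turns one yield point
# into a one-dimensional signed distance profile (cf. Algorithm 1, step 2).
# zeta > 1 and n_levels are the augmentation parameters.

def augment_levels(rho_yield, zeta=2.0, n_levels=14):
    """Auxiliary (radius, signed distance) samples for one yield point."""
    j = np.arange(n_levels + 1)
    rho = zeta * j / n_levels * rho_yield  # radii in [0, zeta * rho_yield]
    f = rho - rho_yield                    # values in [-rho_yield, (zeta-1)*rho_yield]
    return rho, f

rho, f = augment_levels(rho_yield=1.5, zeta=2.0, n_levels=14)
```

The augmented samples span the elastic region ($f < 0$), the yield surface itself ($f = 0$ at $\rho = \rho_i$), and the inadmissible region ($f > 0$).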
2.2 Hardening as a level set extension problem
After pre-processing the yield surface $f_\Gamma$ data points for a sequence of internal variable values $\xi$ and rates $\dot{\xi}$ into a level set by solving the level set initialization problem, we will recover the velocity function of a Hamilton-Jacobi equation of a level set extension problem to describe the temporal evolution of the level set. A general Hamilton-Jacobi equation reads:

$$
\frac{\partial \phi}{\partial t} + \mathbf{v} \cdot \nabla_{\hat{\mathbf{x}}} \phi = 0, \quad (4)
$$
where $\mathbf{v}$ is the normal velocity field that describes the geometric evolution of the boundary (yield surface $f_\Gamma$). In the context of plasticity, the velocity field corresponds to the observed hardening mechanism. The velocity vector field can be described by a magnitude scalar function $F$ and a direction vector field $\mathbf{n} = \nabla_{\hat{\mathbf{x}}} \phi / \|\nabla_{\hat{\mathbf{x}}} \phi\|$ such that:

$$
\mathbf{v} = F \cdot \mathbf{n}. \quad (5)
$$

Substituting into Eq. (4):

$$
\frac{\partial \phi(\xi, \dot{\xi})}{\partial t} + F(\xi, \dot{\xi}) \, |\nabla_{\hat{\mathbf{x}}} \phi(\xi, \dot{\xi})| = 0,
\quad \text{where} \quad
F_i \approx \frac{\phi_{i+1}(\xi_{i+1}, \dot{\xi}_{i+1}) - \phi_i(\xi_i, \dot{\xi}_i)}{\Delta t}. \quad (6)
$$
In the above equation, $F_i(p, \rho, \theta, \xi, \dot{\xi}) = F(p, \rho, \theta, \xi, \dot{\xi}, t_i)$ for $i = 0, 1, 2, \dots, n+1$ is the finite-difference-approximated scalar velocity (hardening) function that corresponds to the pre-processed collection of signed distance functions $\{\phi_0, \phi_1, \dots, \phi_{n+1}\}$ at times $\{t_0, t_1, \dots, t_{n+1}\}$. Thus, we have recast a yield function $f$ into a signed distance function $\phi$, such that $f(p, \rho, \theta, \xi, \dot{\xi}) = \phi(p, \rho, \theta, \xi, \dot{\xi})$. We can now formulate a machine learning problem to approximate the level set yield function $f$ with its neural network yield function counterpart $\hat{f} = \hat{f}(p, \rho, \theta, \xi, \dot{\xi} \,|\, W, b)$, parametrized by weights $W$ and biases $b$ to be optimized during training.
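The finite-difference recovery of $F_i$ in Eq. (6) amounts to differencing consecutive signed distance snapshots; a minimal sketch, in which the dilating-circle test data and all names are our illustrative assumptions:

```python
import numpy as np

# Sketch of the finite-difference recovery of the hardening velocity F_i in
# Eq. (6) from a sequence of signed distance snapshots phi_0, ..., phi_{n+1}.

def hardening_velocity(phi_snapshots, dt):
    """Discrete rate of change (phi_{i+1} - phi_i) / dt, per Eq. (6)."""
    phi = np.asarray(phi_snapshots)
    return (phi[1:] - phi[:-1]) / dt

# Signed distance snapshots of a circular "yield surface" whose radius grows
# as r(t) = 1 + t, sampled on a one-dimensional slice of the stress space:
x = np.linspace(-3.0, 3.0, 61)
snapshots = [np.abs(x) - (1.0 + t) for t in (0.0, 0.1, 0.2)]
F = hardening_velocity(snapshots, dt=0.1)  # uniform over the slice
```

For this uniformly dilating surface the recovered velocity field is constant over the slice, as expected for isotropic-hardening-like dilation.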
The training objective for the neural network optimization is to minimize the following loss function at training samples $(\hat{\mathbf{x}}_i, \xi_i, \dot{\xi}_i, t_i)$ for $i \in [1, \dots, N]$:

$$
W', b' = \operatorname*{argmin}_{W, b} \frac{1}{N} \sum_{i=1}^{N} \left[ \left\| f_i - \hat{f}_i \right\|_2^2 + w_p \, \mathrm{sign}\!\left( - \sum_{A=1}^{3} \sigma_{A,i} \frac{\partial \hat{f}_i}{\partial \sigma_{A,i}} \right) \right], \quad (7)
$$
where we have added a penalty term, weighted by a factor $w_p$, that activates when the yield function does not obey convexity during training.

It is noted that the Hamilton-Jacobi equation described in this section will not be solved numerically, although this is theoretically possible (e.g., with a fast marching solver). Its solution will be directly predicted by a neural network. The zeroth level of the neural-network-predicted level set is the yield surface. The neural network approximated velocity field is the data-driven hardening mechanism.
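The structure of the loss in Eq. (7) can be sketched with plain arrays; clipping the sign term at zero is one plausible reading of the penalty "activating" only on violating samples, and in practice $\partial \hat{f} / \partial \sigma_A$ would come from automatic differentiation of the network. All names are ours:

```python
import numpy as np

# Sketch of the loss in Eq. (7): an L2 misfit on the level set values plus a
# penalty, weighted by w_p, on samples where sum_A sigma_A df/dsigma_A < 0.

def level_set_loss(f_true, f_pred, sigma, df_dsigma, w_p=1.0):
    mse = np.mean((f_true - f_pred) ** 2)
    check = np.sum(sigma * df_dsigma, axis=1)  # sum_A sigma_A * d f / d sigma_A
    # sign(-check) is +1 where the convexity check fails; clip at zero so
    # well-behaved samples contribute no penalty.
    penalty = np.mean(np.maximum(np.sign(-check), 0.0))
    return mse + w_p * penalty
```

With perfect level set predictions, a sample whose gradient points outward with the stress contributes zero loss, while a violating sample contributes exactly $w_p$.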
Remark 1 Rescaling of the training data. In every loss function in this work, we have introduced scaling coefficients $\gamma_\alpha$ to remind the readers that it is possible to change the weighting to adjust the relative importance of different terms in the loss function. These scaling coefficients may also be viewed as the weighting functions in a multi-objective optimization problem. In practice, we have normalized all data to avoid the vanishing or exploding gradient problem that may occur during the back-propagation process [29]. As such, normalization is performed before the training as a pre-processing step. A sample $X_i$ of a measure $X$ is scaled to the unit interval via
Algorithm 1 Training of a pressure- and rate-dependent isotropic yield function level set neural network.
Require: A data set of $N$ samples: stress measures $\sigma$ at yielding, accumulated plastic strain $\varepsilon_p$, and accumulated plastic strain rate $\dot{\varepsilon}_p$; a number of levels $L_{\text{levels}}$ (isocontours) for the constructed signed distance function level set (data augmentation); and a parameter $\zeta > 1$ for the radius range of the constructed signed distance function.
1. Project the stress onto the $\pi$-plane.
   Initialize an empty set of $\pi$-plane projection training samples $(\rho_i, \theta_i, p_i)$ for $i$ in $[0, \dots, N]$.
   for $i$ in $[0, \dots, N]$ do
      Spectrally decompose $\sigma_i = \sum_{A=1}^{3} \sigma_{A,i} \, \mathbf{n}_i^{(A)} \otimes \mathbf{n}_i^{(A)}$.
      Transform $(\sigma_{1,i}, \sigma_{2,i}, \sigma_{3,i})$ into $(\sigma''_{1,i}, \sigma''_{2,i}, \sigma''_{3,i})$ via Eq. (1).
      $\rho_i \leftarrow \sqrt{\sigma''^2_{1,i} + \sigma''^2_{2,i}}$
      $\theta_i \leftarrow \tan^{-1}\!\left(\sigma''_{2,i} / \sigma''_{1,i}\right)$
      $p_i \leftarrow \frac{\sqrt{3}}{3} \sigma''_{3,i}$
   end for
2. Construct the yield function level set (data augmentation).
   Initialize an empty set of augmented training samples $(\rho_m, \theta_m, p_m, \varepsilon_{p,m}, \dot{\varepsilon}_{p,m}, f_m)$ for $m$ in $[0, \dots, N \times L_{\text{levels}}]$.
   $m \leftarrow 0$
   for $i$ in $[0, \dots, N]$ do
      for $j$ in $[0, \dots, L_{\text{levels}}]$ do
         $\rho_m \leftarrow \frac{\zeta j}{L_{\text{levels}}} \rho_i$  ▷ the signed distance function is constructed for a radius range of $[0, \zeta \rho_i]$
         $\theta_m \leftarrow \theta_i$,  $p_m \leftarrow p_i$,  $\varepsilon_{p,m} \leftarrow \varepsilon_{p,i}$,  $\dot{\varepsilon}_{p,m} \leftarrow \dot{\varepsilon}_{p,i}$
         $f_m \leftarrow \frac{\zeta j}{L_{\text{levels}}} \rho_i - \rho_i$  ▷ the signed distance function value range is $[-\rho_i, (\zeta - 1)\rho_i]$
         Rescale $(\rho_m, \theta_m, p_m, \varepsilon_{p,m}, \dot{\varepsilon}_{p,m}, f_m)$ via Eq. (8).
         $m \leftarrow m + 1$
      end for
   end for
3. Train the neural network $\hat{f}(\rho_m, \theta_m, p_m, \varepsilon_{p,m}, \dot{\varepsilon}_{p,m})$ with the loss function of Eq. (7).
4. Output the trained yield function neural network $\hat{f}$ and exit.
$$
X_i := \frac{X_i - X_{\min}}{X_{\max} - X_{\min}}, \quad (8)
$$

where $X_i$ is the normalized sample point, and $X_{\min}$ and $X_{\max}$ are the minimum and maximum values of the measure $X$ in the training data set, such that all the different types of data used in this paper (e.g., energy, stress, stress gradient, stiffness) are normalized within the range $[0, 1]$.
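The per-measure min-max rescaling of Eq. (8) is straightforward; a small sketch in which the function names are ours, and where the stored extrema would also be reused to undo the scaling at prediction time:

```python
import numpy as np

# Minimal sketch of the min-max rescaling in Eq. (8), applied per measure
# (energy, stress, stress gradient, stiffness, ...).

def fit_minmax(X):
    """Record the extrema of a measure over the training set."""
    return X.min(), X.max()

def rescale(X, x_min, x_max):
    """Eq. (8): map the measure onto the unit interval [0, 1]."""
    return (X - x_min) / (x_max - x_min)

X = np.array([-2.0, 0.0, 2.0])
Xs = rescale(X, *fit_minmax(X))
```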
2.3 High-order Sobolev training
In this work, we distinguish between the material's elastic and plastic behaviors by training two different neural network model components: a hyperelastic energy functional and a yield function level set that evolves according to the accumulated plastic strain. These components are then combined in a specific form of return mapping algorithm (Algorithm 1) that may take an arbitrary elasticity model and a yield function with a generic hardening law to generate the constitutive update for the class of inelastic materials that have a distinct elastic region defined in a parametric space. The hyperelastic network counterpart is expected to have interpretable derivatives: the first derivative of the energy functional with respect to the strain should be a valid stress tensor, and the second derivative a valid stiffness tensor. We adopt a Sobolev training objective, first introduced in [30], and extend it to higher-order constraints, to train the energy functional approximator $\hat{\psi}^e(\varepsilon^e \,|\, W, b)$ using the following loss function:
$$
W', b' = \operatorname*{argmin}_{W, b} \frac{1}{N} \sum_{i=1}^{N} \left[ \left\| \psi^e_i - \hat{\psi}^e_i \right\|_2^2 + \left\| \frac{\partial \psi^e_i}{\partial \varepsilon^e_i} - \frac{\partial \hat{\psi}^e_i}{\partial \varepsilon^e_i} \right\|_2^2 + \left\| \frac{\partial^2 \psi^e_i}{\partial \varepsilon^e_i \otimes \partial \varepsilon^e_i} - \frac{\partial^2 \hat{\psi}^e_i}{\partial \varepsilon^e_i \otimes \partial \varepsilon^e_i} \right\|_2^2 \right]. \quad (9)
$$
A benefit of using Sobolev training is its notable data efficiency. Sobolev training has been shown to produce more accurate and smooth predictions for the energy, stress, and stiffness fields for the same amount of data compared to classical $L_2$-norm approaches that would solely constrain the predicted energy values [25].
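Once the energy, stress, and stiffness arrays are available (the derivative arrays would come from automatic differentiation of the network in practice), the loss of Eq. (9) is a sum of three squared-error terms per sample; a minimal sketch with illustrative names:

```python
import numpy as np

# Sketch of the higher-order Sobolev loss in Eq. (9): squared errors on the
# energy, its first derivative (stress), and its second derivative
# (stiffness) are summed per sample and averaged over the batch.

def sobolev_loss(psi, psi_hat, stress, stress_hat, stiff, stiff_hat):
    n = len(psi)
    # Squared L2 norm of each misfit, flattened per sample.
    sq = lambda a, b: ((np.asarray(a) - np.asarray(b)) ** 2).reshape(n, -1).sum(axis=1)
    return (sq(psi, psi_hat) + sq(stress, stress_hat) + sq(stiff, stiff_hat)).mean()
```

The zeroth-, first-, and second-order terms all share the same weight here; per Remark 1, scaling coefficients $\gamma_\alpha$ could reweight them as a multi-objective trade-off.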
3 Numerical experiments
In this section, we demonstrate the AI’s capacity to re-
discover plasticity models from the literature, we explore
the model’s ability to capture highly complex new harden-
ing modes, and, finally, showcase how the AI can discover
the yield surface for a new polycrystal material and replace
the plasticity model in a finite element simulation. To test
whether the machine learning approach can be generalized,
we purposely test the AI against a wide range of material
data sets for soil, rock, poly-crystal, and steel. In particu-
lar, we employ three types of data sets, (1) data generated
from known literature models, (2) data obtained from exper-
iments, and (3) data obtained from sub-scale direct numeri-
cal simulations of microstructures. The first type of data is
used as a benchmark to verify whether the neural network
can correctly deduce the correct plastic deformation mech-
anisms (yield surface and hardening) when given the corre-
sponding data. The second and third types of data are used to
validate and examine the AI’s ability to discover new plastic
deformation mechanisms with a geometrical interpretation in
the stress space.
3.1 Verification examples
The purpose of this example is to showcase our algorithm's capacity to reproduce the modeling capabilities of classical plasticity theory. We first demonstrate our algorithm's ability to recover yield surfaces and hardening mechanisms from the classical plasticity literature. We then demonstrate the framework's capacity to make predictions calibrated on experimental data for pressure-dependent and rate-dependent plasticity.
3.1.1 Verification on classical plasticity theories
The proposed AI can readily reproduce numerous yield
function models from the plasticity literature, following the
same universal data pre-processing and neural network train-
ing algorithm. For this benchmark experiment, we gener-
ate synthetic data sets for four initial yield surfaces of in-
creasing shape complexity: the J2 [7] (cylinder), Drucker-
Prager [8] (cone), Modified Cam-Clay [31] (oval), and Ar-
gyris [32] (ovoid with triangular cross-section) yield sur-
faces. We simultaneously study four common hardening
mechanisms that transform and/or translate these surfaces
in the 3D stress space: isotropic hardening (cylinder dila-
tion), rotational hardening (cone rotation), kinematic hard-
ening (translation along the hydrostatic axis), and softening
(shrinking).
The data sets for these yield surfaces are populated by
sampling from the above-mentioned literature yield func-
tions. The sampling was performed as a uniform grid of the
stress invariants and the accumulated plastic strain. We sam-
ple 50 data points along the mean pressure axis, 100 data
points along the angle axis, and 10 data points along the ac-
cumulated plastic strain axis (a total of 50000 data samples
per yield function data set). The yield surface data points
are pre-processed into a signed distance function level set
database through the level set initialization procedure. For
each yield surface, 15 levels are constructed: the yielding
level, 7 in the elastic region, and 7 in the region of inadmis-
sible stress. After data augmentation, the training data set
consists of 750000 level set sample points.
For each level set database, we train a feed-forward neu-
ral network to approximate the initial yield function and its
evolution. The yield function neural networks consist of a
hidden Dense layer (100 neurons / ReLU), followed by two
Multiply layers, then another hidden Dense layer (100 neu-
rons / ReLU) and an output Dense layer (Linear). The use
of Multiply layers was first introduced in [25] to increase the
continuity of the activation functions of neural network func-
tional approximators. They were shown to allow for greater
control over the network’s higher-order derivatives and the
application of higher-order Sobolev constraints in the loss
function. The layers’ kernel weight matrix was initialized
with a Glorot uniform distribution and the bias vector with
a zero distribution. All the models were trained for 2000
epochs with a batch size of 128 using the NAdam optimizer,
set with default values of the Keras library [33].
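The forward pass of this architecture can be sketched in plain NumPy; we assume here that each Multiply layer performs an elementwise self-multiplication of the preceding activation (our reading of the construction in [25]), and the initialization follows the Glorot uniform kernel and zero bias described above:

```python
import numpy as np

# NumPy sketch of the yield-function network's forward pass. The Multiply
# layers are interpreted as elementwise self-products, which raises the
# polynomial order (and hence smoothness control) of the activations.

rng = np.random.default_rng(0)

def glorot_dense(n_in, n_out):
    """Glorot-uniform kernel and zero bias, as described in the text."""
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, (n_in, n_out)), np.zeros(n_out)

relu = lambda z: np.maximum(z, 0.0)

# Inputs (rho, theta, p, eps_p, eps_p_rate) -> scalar level set value f_hat.
W1, b1 = glorot_dense(5, 100)
W2, b2 = glorot_dense(100, 100)
W3, b3 = glorot_dense(100, 1)

def f_hat(x):
    h = relu(x @ W1 + b1)  # hidden Dense layer (100 neurons / ReLU)
    h = h * h              # first Multiply layer (elementwise self-product)
    h = h * h              # second Multiply layer
    h = relu(h @ W2 + b2)  # hidden Dense layer (100 neurons / ReLU)
    return h @ W3 + b3     # output Dense layer (Linear)

out = f_hat(np.zeros((4, 5)))  # batch of 4 dummy inputs
```

In the paper's actual pipeline this network is built with Keras layers and trained with the NAdam optimizer; the sketch only illustrates the layer ordering.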
The neural network predicted yield surfaces are demon-
strated in Fig. 2. For each model, three surfaces are shown
for three different levels of accumulated plastic strain. It is
highlighted that, given an accumulated plastic strain value,
we can recover the entire yield locus.
3.1.2 Level set plasticity model discovery for rate-dependent and anisotropic materials
In this section, we test the framework's capacity to make predictions on rate-dependent and anisotropic data.
To test the trained neural network's prediction of rate-dependent responses, we incorporate data from the published work [34] for steel that exhibits different yielding stresses under different strain rates. In the numerical experiments, we use experimental data collected at strain rates ranging from 0 to 0.02 s⁻¹ as the training data, sampled in a uniform grid of 10 strain rate increments. The yield surface is sampled at 25 points along the mean pressure axis, at 100 points along the angle axis, and at 10 points along the accumulated plastic strain axis (a total of 250,000 sample points). The data are pre-processed into signed distance functions of 15 levels, generating 3,750,000 training sample points. The neural network used for this viscoplastic model training follows the same architecture as the yield function neural networks described in the previous section.
We use the experimental data collected at strain rates of 10⁻⁴, 5×10⁻¹, and 0.02 s⁻¹ to validate the ability of the model to make blind predictions for unseen events. Figure 3(a) shows the results of the six predictions that the AI generated for unseen data. The left figure shows the stress-strain predictions for the uniaxial tensile tests at three different loading rates, while the right figure shows the stress-strain predictions for the simple shear test counterpart. In both cases, the predictions match well with the unseen benchmark data that were excluded from the training data set.
As for the anisotropic predictions, Figure 4 shows the machine-learning-generated mapping that predicts how the yield surface in the principal stress space evolves for different material orientations. The data we employed in this second experiment are generated from an FFT solver that simulates the polycrystal plasticity of a specimen composed of FCC crystal grains. We sample the material constitutive behavior at 10 microstructure orientations at 150 Lode angle sampling directions and pre-process the data into signed distance functions of 15 levels, generating 22,500 training sample points for the projection mapping neural network. Working in the pressure-independent stress space, the network takes as input the true stress invariants and the microstructural orientation information that describes the anisotropy (in the case of the polycrystals studied in this work, the polycrystal orientations as three Euler angles) and outputs the reference stress space invariants. The network has the following layer structure: Dense layer (200 neurons / ReLU), Multiply layer, Dense layer (200 neurons / ReLU), Multiply layer, followed by three more Dense layers (200 neurons / ReLU) and an output Dense layer (Linear). The layers' kernel weight matrix was initialized with a Glorot uniform distribution and the bias vector with a zero distribution. The model was trained for 2000 epochs with a batch size of 256 using the NAdam
Fig. 2. AI can rediscover classical plasticity models: J2 plasticity model with isotropic hardening (top left), Drucker-Prager model with
rotational hardening (top right), MCC model with kinematic hardening (Bauschinger effect) (bottom left), and Argyris model with softening
(bottom right). The corresponding benchmark and predicted strain-stress curves are also demonstrated. The stress measure is in kPa.
Fig. 3. Neural network predicted viscoplastic response for increas-
ing loading strain rates for a tension (left) and shear (right) load-
ing test performed on mild-steel beams (experimental data obtained
from [34]).
optimizer. The predictions of the mapping suggest that it is possible to generate a single mapping function that maps all yield surfaces obtained from different polycrystal specimens of different orientations onto a reference stress domain, denoted as $(\sigma''_1, \sigma''_2, \sigma''_3)$.
3.2 Demonstration of model discovery capacity
Yield surface discovery in the literature has been limited by the difficulty of deriving mathematical expressions for the higher-complexity geometrical shapes that represent them. Additional obstacles arise when there is a need to describe a smooth transition from the shape of the initial yield surface to that of a state with more accumulated plasticity. The algorithm's capability to discover new yield surfaces and hardening mechanisms directly from the data overcomes these impediments.
To test this, we construct a fictitious yield surface
Fig. 4. The framework can capture anisotropic responses by projecting anisotropic yield surfaces onto a master projection basis curve using a neural network stress space mapping $\phi^{NN}$.
database that is based on the Argyris model [32] and com-
bines the Modified Cam-Clay [31] hardening mechanism
along with a transformation of the elastic region’s cross-
section from a triangular shape to a circle. The yield surface
is sampled at a total of 50000 points and pre-processed to
generate 750000 level set sample points. The predictions for
the yield surface and underlying level set for increasing accumulated plastic strain are demonstrated in Fig. 5.

Fig. 5. AI-discovered yield surfaces and hardening mechanisms as evolving level sets from synthetic data. The yield surface and corresponding yield function level set evolve with the increasing accumulated plastic strain.

Deriving a mathematical expression for this data set is not straightforward. Even if a derivation were successful, the resultant mathematical expression might require additional material parameters that lack a physical underpinning. The capability of neural networks to approximate arbitrary functions therefore offers a flexible and simple treatment of the evolution of the yield function.
To analyze the sensitivity with respect to the random initialization of the neural network weights, we repeated the supervised learning for the synthetic yield function problem showcased in Fig. 5 five times, each with a different random seed. The results, shown in Fig. 6, indicate that the resultant losses for both the training and testing cases are close across seeds, suggesting that the training is not very sensitive to the random seed. Furthermore, the small difference between the training and testing losses of the yield function also suggests that there is no significant overfitting.
Our proposed algorithm also automates the discovery
of yield surfaces for new materials. We generate a yield
surface database for a randomly generated polycrystal mi-
crostructure through efficient data sampling of the invariant stress space with elastoplastic simulations performed by an FFT solver. To gather the yield surface data points for the polycrystal material, we subdivide the π-plane uniformly into 140 Lode angle directions and sample the stress space with monotonic loading simulations along each direction. The yield surface data points
are gathered as soon as yielding is first detected, recording
the stress response and the accumulated plastic strain. The
FFT simulations provide 157500 sample points that are pre-
processed into 2362500 level set sample points. It is noted
that the material was observed to be pressure-independent.
Thus, sampling on the π-plane at a constant mean pressure
was enough to capture the entire stress response.
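The π-plane sampling described above can be sketched as follows. The explicit parametrization of the deviatoric stress directions by the Lode angle is a standard choice and an assumption here, not taken verbatim from the paper:

```python
import numpy as np

# 140 uniformly spaced Lode angles define unit deviatoric stress
# directions on the pi-plane, along which monotonic loading is applied.
n_angles = 140
theta = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)

# Principal deviatoric stress components for each Lode angle.
s1 = np.sqrt(2.0 / 3.0) * np.cos(theta)
s2 = np.sqrt(2.0 / 3.0) * np.cos(theta - 2.0 * np.pi / 3.0)
s3 = np.sqrt(2.0 / 3.0) * np.cos(theta + 2.0 * np.pi / 3.0)
directions = np.stack([s1, s2, s3], axis=1)   # shape (140, 3)

# Each row is trace-free (constant mean pressure, consistent with the
# pressure-independent material) and has unit norm.
```

Because the rows are trace-free, every loading direction stays on the π-plane at the chosen constant mean pressure.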
The yield surface data points are pre-processed into a level set database, and the results of the trained polycrystal neural network yield function are demonstrated in Fig. 7. The neural network parameters for the model trained in this section remain identical to those previously described. Investing the modeling effort to describe the complex yielding behavior of a material could prove futile, especially if the material is highly heterogeneous. Conceiving a new yield function for every new material studied can become rather impractical, and automation of yield surface generation can accelerate the plasticity study of novel materials.

Fig. 6. Loss vs. epoch for the synthetic yield function shown in Fig. 5 for the training data set (top) and the testing data set for cross-validation (bottom). The test data set is mutually exclusive with the training data set.

Fig. 7. Yield function level set of a new polycrystal microstructure for increasing accumulated plastic strain.
3.3 Offline multiscale FFT-FEM numerical experiments
In engineering practice, a constitutive law is seldom
used as a standalone forecast engine but is often incorporated
into a solver that provides a discretized numerical solution.
Here, we test whether the AI-generated models can be deployed into an existing finite element solution. The yield
surface neural networks combined with a hyperelastic en-
ergy functional neural network can be readily plugged into
a strain space return mapping algorithm to make strain-stress
predictions. In this work, we utilize a linear elasticity energy
functional as the neural network that will provide the elastic
response in the algorithm. We train a two-layer feed-forward neural network that takes as input the elastic volumetric (ε_v^e) and deviatoric (ε_s^e) strain invariants to approximate the hyperelastic energy functional ψ^e. The network is trained on 2500 data points sampled from a uniform grid of (ε_v^e, ε_s^e) pairs. The architecture consists of a hidden Dense layer (100 neurons / ReLU), followed by two Multiply layers, then another hidden Dense layer (100 neurons / ReLU), and an output Dense layer (linear). The models were trained for 1000 epochs with a batch size of 32 using the NAdam optimizer [35], set with default values in the Keras library. Using a Sobolev training framework, the model was optimized with a higher-order H² training objective – the loss function constrains the predicted energy, stress, and stiffness similar to (9). The resulting stress predictions for the literature yield surfaces for random cyclic loading and unloading strain paths are demonstrated in Fig. 2 for each approximated yield surface model.
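The H² objective can be sketched as a weighted sum of energy, stress, and stiffness mismatches. The composition and equal default weights below are an assumption for illustration; the exact form is given by (9) in the paper:

```python
import numpy as np

def sobolev_h2_loss(psi_pred, psi_true, stress_pred, stress_true,
                    stiff_pred, stiff_true, weights=(1.0, 1.0, 1.0)):
    """H2-style Sobolev loss: penalize mismatches in the energy
    (order 0), the stress (first derivative), and the stiffness
    (second derivative) simultaneously."""
    w0, w1, w2 = weights
    return (w0 * np.mean((psi_pred - psi_true) ** 2)
            + w1 * np.mean((stress_pred - stress_true) ** 2)
            + w2 * np.mean((stiff_pred - stiff_true) ** 2))
```

Constraining the first and second derivatives in the loss is what gives the trained functional smooth, accurate stress and tangent predictions downstream.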
We have also successfully incorporated the trained neu-
ral network plasticity model into a finite element solver to de-
liver an excellent match with the higher-cost FFT-FEM pre-
dictions for unseen loading paths not included in the training
data set. The discovered yield function for a randomly gen-
erated polycrystal microstructure is demonstrated in Fig. 7.
In Fig. 8, the polycrystal plasticity model trained by a neu-
ral network is used to replace the FFT solver that provides
the constitutive updates from DNS simulations at the sub-
scale level. The simulation is performed on a square plate
with a circular hole supported on frictionless rollers on the
top and bottom surfaces. Results shown in Fig. 8 indicate that the NN-FEM model is capable of replacing the computationally heavy FFT-FEM simulations (cf. [36]) at a fraction of the cost. In this offline multiscale problem, the finite element mesh contains 960 elements with 2880 integration points. An FFT-FEM framework may take an average of 11110 seconds (approximately 3.85 seconds per integration point) to complete the incremental constitutive updates for all integration points, whereas the neural network counterpart requires an average of 230 seconds (approximately 0.08 seconds per integration point) to finish the same task on a MacBook Pro with an 8-core CPU. As for the overhead cost to generate the
training data from the FFT polycrystal simulations, the time
to generate the training data set for the polycrystal yield func-
tion (157500 yield function sample points) is approximately
5 hours.
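For reference, the incremental constitutive update performed at each integration point can be illustrated with a minimal one-dimensional return mapping. The linear elastic law, linear hardening rule, and the values of E, H, and σ_y below are hypothetical stand-ins for the trained energy-functional and yield-function networks:

```python
import numpy as np

E, H, sigma_y = 200e3, 10e3, 250.0   # hypothetical stiffness, hardening modulus, yield stress (kPa)

def return_map(eps, eps_p_n, alpha_n):
    """One constitutive update: elastic trial step, then plastic corrector."""
    sig_tr = E * (eps - eps_p_n)                    # trial stress from the elastic law
    f_tr = abs(sig_tr) - (sigma_y + H * alpha_n)    # trial yield function value
    if f_tr <= 0.0:                                 # elastic step: accept the trial state
        return sig_tr, eps_p_n, alpha_n
    dgam = f_tr / (E + H)                           # plastic multiplier increment
    sgn = np.sign(sig_tr)
    return (sig_tr - E * dgam * sgn,                # stress returned to the yield surface
            eps_p_n + dgam * sgn,                   # updated plastic strain
            alpha_n + dgam)                         # updated accumulated plastic strain

sig, eps_p, alpha = return_map(0.002, 0.0, 0.0)     # a plastic step
```

In the offline multiscale setting, the neural network yield function and energy functional supply f and the stress in place of the analytic closures above, at each of the 2880 integration points.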
4 Discussion
The proposed algorithm provides a general approach to discovering complex yield surface shapes and their evolution directly from data. In the results section of this work, all yield functions and hardening mechanisms are predicted by neural networks without any modeler intervention through hand-crafted derivations. All models in this work, be they models from the plasticity literature or models designed for new materials, followed identical data pre-processing, neural network training, and return mapping implementation procedures.

Fig. 8. The discovered yield function can be readily implemented in FEM simulations, replacing the FFT solver. The accumulated plastic strain profile for an FEM simulation and the predicted stress responses at different points of the domain against the FFT benchmark simulations are also shown. The stress measure is in kPa.
Our neural network yield functions provide a unique ad-
vantage in crafting interpretable data-driven plasticity mod-
els. The capacity to predict and visualize the entire yield
locus at every time step of an elastoplastic simulation al-
lows for the anticipation of elastic or plastic responses and
the inspection of thermodynamic consistency (e.g., convexity). In particular, by adopting a lower-dimensional stress representation (Lode coordinates), not only is the model complexity reduced, but a transparent yield surface data sampling scheme also becomes possible. The alternative of randomly sampling strain paths carries the uncertainty of whether the yield surface has been sufficiently visited in the entire stress space.
4.1 Physics underpinning for the partition of elastic and
plastic strain
Decomposing the elastoplastic behavior prediction into
two simple feed-forward neural networks – a hyperelastic en-
ergy functional and a yield function – is central to the al-
gorithm’s interpretability and allows for a clear-cut distinc-
tion of elastic and plastic behavior. This is not necessarily
true with the classical recurrent network approach, such as
the common LSTM or GRU [16,37]. When training neu-
ral networks with these architectures, the elastic and plastic
constitutive responses are often indistinguishable. This treatment not only causes issues with interpretability but also renders the black-box predictions vulnerable to erroneous causality or correlation structures. For instance, experimental data on the nonlinear elastic response may spuriously affect the learned yielding response, as there is no explicit mechanism to distinguish the two. By contrast, our models trained on monotonic loading data can readily predict non-monotonic constitutive responses due to the explicitly defined elastic range, whereas the black-box alternative cannot (see Fig. 2).
Furthermore, the recurrent network's dependency on the input strain rate, the importance of the sampling frequencies in the time domain, and the more difficult training due to vanishing or exploding gradients [38] are rarely addressed in the machine learning plasticity literature.
Note that the machine learning algorithm proposed here does not exhibit better interpretability than the hand-crafted counterpart, but it is easier to interpret than the RNN and multi-step ANN approaches, which do not provide a definite distinction between the elastic region and yielding. An exception is the recent work by Xu et al. [39], in which the neural network partitioned the total strain into elastic and plastic components via a partition-of-unity function. Nevertheless, when a continuous weighting function (such as a sigmoid) is used to partition the elastic and plastic strain, it may introduce a transition zone where the material is considered both path-independent and path-dependent.
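The transition-zone concern can be made concrete with a small sketch. The sigmoid steepness k and the threshold below are hypothetical illustration values, not taken from [39]:

```python
import numpy as np

# A sigmoid weight w(x) blending the elastic (w close to 0) and plastic
# (w close to 1) treatment of the strain. k and threshold are
# hypothetical illustration values.
def sigmoid_weight(x, threshold=0.0, k=50.0):
    return 1.0 / (1.0 + np.exp(-k * (x - threshold)))

x = np.linspace(-0.2, 0.2, 401)
w = sigmoid_weight(x)

# Samples with 0.01 < w < 0.99 lie in a finite band where the response
# is treated as partly path-independent and partly path-dependent.
n_transition = int(np.sum((w > 0.01) & (w < 0.99)))
```

However steep the sigmoid, the blended band has nonzero width, which is exactly the ambiguity the level-set yield function avoids by defining a sharp elastic boundary.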
4.2 Representation of parametric space and geometrical interpretation of elastoplasticity models
Another advantage of the interpretable machine learning
approach is that the geometrical interpretation is helpful for
determining optimal data exploration strategies. Given that both real experiments and direct numerical simulations are often costly to conduct, a Monte Carlo approach that randomly samples the parametric space of a path-dependent material is too costly to be feasible [40]. By introducing a level set to define the yielding criterion, however, we can conceptualize the elastic range as a multi-dimensional object in Euclidean space. This feature may help us visualize the abstract concept of yielding in a Euclidean space, estimate the sufficiency of the data by defining a proper metric on the parametric vector space, and decide on a distribution of data that better captures important features, such as replicating sharp gradients, determining convexity by checking the Hessian, and ensuring connectivity of the learned models. These tasks are not necessarily impossible but are difficult to achieve with a black-box model.
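A convexity check of the kind mentioned above can be sketched with finite differences; the quadratic test function, step size, and tolerance below are illustrative choices, not part of the proposed framework:

```python
import numpy as np

def hessian_fd(f, x, h=1e-4):
    """Estimate the Hessian of a scalar function f at x by central
    finite differences."""
    n = x.size
    H = np.zeros((n, n))
    I = np.eye(n)
    for i in range(n):
        for j in range(n):
            H[i, j] = (f(x + h*I[i] + h*I[j]) - f(x + h*I[i] - h*I[j])
                       - f(x - h*I[i] + h*I[j]) + f(x - h*I[i] - h*I[j])) / (4*h*h)
    return H

def is_locally_convex(f, x, tol=-1e-8):
    """Test positive semi-definiteness of the Hessian at a stress point."""
    eigvals = np.linalg.eigvalsh(hessian_fd(f, x))
    return bool(np.all(eigvals >= tol))

# Example: a J2-like quadratic yield function is convex everywhere.
convex_ok = is_locally_convex(lambda s: float(s @ s), np.array([1.0, 2.0, 3.0]))
```

Since the trained yield function is differentiable, the same check can also be run with automatic differentiation instead of finite differences.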
4.3 Smoothness of the machine learning plasticity
model
Training the neural networks of this work with activation functions of a higher degree of continuity and with higher-order Sobolev loss constraints allows one to control the prediction accuracy of the derivatives of the approximated functionals. This control of the stress gradient of the yield function is crucial, as the automatic differentiation used in back-propagation can then generate sufficiently smooth elastoplastic tangent operators suitable for PDE solvers. In contrast, classical black-box neural network elastoplasticity approaches usually do not control the quality of the derivatives of the trained functions. While finite difference methods can be used to approximate the tangent tensor obtained from a neural network without Sobolev training if necessary [41], the smoothness and accuracy of the approximated tangent cannot be guaranteed. Furthermore, the Sobolev training and higher-order activation functions allow control of the smoothness and continuity of the yield surface. This can be a more efficient alternative to the current practice, in which a plasticity model with a non-smooth yield surface either requires specialized algorithms to generate incremental constitutive updates [42] or must be modified manually into a smoothed version to bypass the numerical barrier [43, 44].
In principle, the approach may generate a sufficiently smooth yield surface in parametric spaces of different dimensions (e.g., principal stress space, strain space, porosity-stress space). However, if the yield surface is non-smooth for physical reasons, then (1) specific supervised learning algorithms
that detect the singular point and (2) the corresponding spe-
cific treatment to handle the bifurcated stress gradient of the
yield surface are both necessary. Furthermore, unlike the
classical hand-crafted model or models generated from ge-
ometric learning (see [45]) that are designed for an entire
class of materials of similar but distinctive microstructures,
the proposed algorithm is designed to generate a surrogate
model specifically tailored for one RVE or specimen.
4.4 Comparison with parameter identification of predetermined models
Note that, while both parameter identification and su-
pervised machine learning involve solving inverse problems
and, in many cases, multi-objective optimization, the pro-
posed approach does not assume specific forms of equations
a priori for the hyperelasticity energy functional and yield
function. With a sufficiently expressive neural network architecture, the neural network approach may offer more flexibility in finding the optimal forms of equations (see the universal approximation theorem [46]). However, this flexibility comes at the
expense of having to deal with the Banach space (cf. Parhi
and Nowak [47] and Weinan and Wojtowytsch [48]) of much
higher dimensions (of the neural network learned function)
than the Euclidean space for a typical parameter identifica-
tion problem.
A similar analogy can be drawn between nonparametric/symbolic regression and polynomial regression, where the lack of a predetermined form in the former approach offers greater flexibility but also increases the difficulty of the inverse problem. As demonstrated in previous work (cf. Wang et al. [49]), even in the case where the inverse problem is merely used to determine the optimal set of choices among a handful of pre-determined components of the elasto-plasticity model, the additional effort and cost of solving the combinatorial optimization, on top of the CPU time required for the parameter identification, can be enormous.
This complexity motivates us to propose this alternative paradigm that enables us to learn the elasto-plasticity problem in a divide-and-conquer manner, i.e., (1) learning the elasticity first, (2) then the initial yield function, and (3) then the hardening/softening rules that evolve the yield function, all with multilayer perceptrons. In the cases demonstrated here, there is no need to use recurrent neural networks, which are more difficult to train and regularize than the simpler multilayer perceptrons [50]. In the future, we may explore proper ways to generate more complex rules for the yield function evolution with recurrent neural networks, but this is out of the scope of this study.
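The sequential training order can be sketched as follows, with np.polyfit standing in for the multilayer-perceptron fits of the paper; all data sets, functional forms, and parameter values here are synthetic and purely illustrative:

```python
import numpy as np

# (1) Learn elasticity from elastic-range stress-strain data.
eps_e = np.linspace(0.0, 0.001, 50)
sigma = 200e3 * eps_e                        # synthetic linear-elastic data
elastic_fit = np.polyfit(eps_e, sigma, 1)    # recovers the elastic stiffness

# (2) Learn the initial yield surface from first-yield data points.
theta = np.linspace(0.0, 2*np.pi, 36, endpoint=False)
radius = 250.0 + 20.0 * np.cos(3*theta)      # synthetic Lode-angle dependence
yield_fit = np.polyfit(theta, radius, 9)

# (3) Learn the hardening rule from post-yield data.
alpha = np.linspace(0.0, 0.05, 50)
sigma_y = 250.0 + 10e3 * alpha               # synthetic linear hardening
hardening_fit = np.polyfit(alpha, sigma_y, 1)
```

Each stage is an ordinary supervised fit on its own data set, which is what lets the components be trained one after another rather than jointly.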
5 Conclusion
We propose a generalized machine learning paradigm capable of generating pressure-sensitive and rate-dependent plasticity models consisting of interpretable components. The component-based approach enables geometrical interpretation of the hyperelastic energy and yield function in the corresponding stress and strain spaces. This treatment allows us to examine thermodynamic constraints through geometrical interpretation (e.g., convexity) and provides a higher degree of modularity and simulatability required to interpret mechanisms of plastic deformation. In the numerical experiments presented in this paper, we first verify the capacity of the paradigm to recover existing plasticity models from the corresponding data. Then we provide additional examples to show that the revised Hamilton-Jacobi solution formulated for rate-dependent plasticity may generate models from experimental data for steel. Finally, the machine learning paradigm is used to generate a macroscopic elasto-plasticity surrogate model from FFT simulations of a polycrystal consisting of FCC grains. The resultant macroscopic surrogate model is tested against FFT direct numerical simulations at the Gauss points. The results of the numerical experiments show that the generated models are able to recover classical plasticity laws but are also capable of deducing new ones, with a reasonable level of predictive and descriptive accuracy for the given amount of data. This interpretability is necessary for ensuring trustworthiness in engineering applications.
Acknowledgements
The authors would like to thank Dr. Ran Ma for pro-
viding the implementation of the polycrystal microstructure
generation and the FFT solver. The authors are supported by
the NSF CAREER grant from Mechanics of Materials and
Structures program at National Science Foundation under
grant contracts CMMI-1846875 and OAC-1940203, the Dy-
namic Materials and Interactions Program from the Air Force
Office of Scientific Research under grant contracts FA9550-
17-1-0169 and FA9550-19-1-0318. These supports are grate-
fully acknowledged. The views and conclusions contained
in this document are those of the authors, and should not
be interpreted as representing the official policies, either ex-
pressed or implied, of the sponsors, including the Army Re-
search Laboratory or the U.S. Government. The U.S. Gov-
ernment is authorized to reproduce and distribute reprints for
Government purposes notwithstanding any copyright nota-
tion herein.
References
[1] Cheng, Z., Zhou, H., Lu, Q., Gao, H., and Lu, L., 2018.
“Extra strengthening and work hardening in gradient
nanotwinned metals”. Science, 362(6414).
[2] Van der Giessen, E., and Needleman, A., 1995. “Dis-
crete dislocation plasticity: a simple planar model”.
Modelling and Simulation in Materials Science and En-
gineering, 3(5), p. 689.
[3] Aydin, A., Borja, R. I., and Eichhubl, P., 2006. “Geo-
logical and mathematical framework for failure modes
in granular rock”. Journal of Structural Geology, 28(1),
pp. 83–98.
[4] Gurson, A. L., 1977. “Continuum Theory of Duc-
tile Rupture by Void Nucleation and Growth: Part I—
Yield Criteria and Flow Rules for Porous Ductile Me-
dia”. Journal of Engineering Materials and Technol-
ogy, 99(1), 01, pp. 2–15.
[5] Martin, C., Bouvard, D., and Shima, S., 2003. “Study
of particle rearrangement during powder compaction
by the discrete element method”. Journal of the Me-
chanics and Physics of Solids, 51(4), pp. 667–693.
[6] Coulomb, C. A., 1773. Essai sur une application des règles de maximis et minimis à quelques problèmes de statique relatifs à l'architecture. Tech. rep., Mém. Div. Sav. Acad.
[7] Mises, R. v., 1913. “Mechanik der festen Körper im plastisch-deformablen Zustand”. Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse, 1913, pp. 582–592.
[8] Drucker, D. C., and Prager, W., 1952. “Soil mechan-
ics and plastic analysis or limit design”. Quarterly of
applied mathematics, 10(2), pp. 157–165.
[9] Sun, X., Bahmani, B., Vlassis, N. N., Sun, W., and Xu, Y., 2021. “Data-driven discovery of interpretable causal relations for deep learning material laws with uncertainty propagation”. Granular Matter, doi: 10.1007/s10035-021-01137-y.
[10] Ehlers, W., and Scholz, B., 2007. “An inverse algorithm
for the identification and the sensitivity analysis of the
parameters governing micropolar elasto-plastic granular material”. Archive of Applied Mechanics, 77(12),
pp. 911–931.
[11] Wang, K., Sun, W., Salager, S., Na, S., and Khaddour,
G., 2016. “Identifying material parameters for a micro-
polar plasticity model via x-ray micro-computed tomo-
graphic (ct) images: lessons learned from the curve-
fitting exercises”. International Journal for Multiscale
Computational Engineering, 14(4).
[12] Jang, J., and Smyth, A. W., 2017. “Model updating of a
full-scale fe model with nonlinear constraint equations
and sensitivity-based cluster analysis for updating pa-
rameters”. Mechanical Systems and Signal Processing,
83, pp. 337–355.
[13] Ghaboussi, J., Pecknold, D. A., Zhang, M., and Haj-
Ali, R. M., 1998. “Autoprogressive training of neu-
ral network constitutive models”. International Journal
for Numerical Methods in Engineering, 42(1), pp. 105–
126.
[14] Ghaboussi, J., Garrett Jr, J., and Wu, X., 1991.
“Knowledge-based modeling of material behavior with
neural networks”. Journal of engineering mechanics,
117(1), pp. 132–153.
[15] Lefik, M., and Schrefler, B. A., 2003. “Artificial neu-
ral network as an incremental non-linear constitutive
model for a finite element code”. Computer meth-
ods in applied mechanics and engineering, 192(28-30),
pp. 3265–3283.
[16] Mozaffar, M., Bostanabad, R., Chen, W., Ehmann, K.,
Cao, J., and Bessa, M., 2019. “Deep learning predicts
path-dependent plasticity”. Proceedings of the National
Academy of Sciences, 116(52), pp. 26414–26420.
[17] Wang, K., and Sun, W., 2018. “A multiscale multi-
permeability poroplasticity model linked by recur-
sive homogenizations and deep learning”. Computer
Methods in Applied Mechanics and Engineering, 334,
pp. 337–380.
[18] Chi, H., Zhang, Y., Tang, T. L. E., Mirabella, L., Dal-
loro, L., Song, L., and Paulino, G. H., 2021. “Univer-
sal machine learning for topology optimization”. Com-
puter Methods in Applied Mechanics and Engineering,
375, p. 112739.
[19] Baker, N., Alexander, F., Bremer, T., Hagberg, A.,
Kevrekidis, Y., Najm, H., Parashar, M., Patra, A.,
Sethian, J., Wild, S., et al., 2019. Workshop report
on basic research needs for scientific machine learn-
ing: Core technologies for artificial intelligence. Tech.
rep., USDOE Office of Science (SC), Washington, DC
(United States).
[20] Murdoch, W. J., Singh, C., Kumbier, K., Abbasi-Asl,
R., and Yu, B., 2019. “Definitions, methods, and appli-
cations in interpretable machine learning”. Proceed-
ings of the National Academy of Sciences, 116(44),
pp. 22071–22080.
[21] Liu, Y., Sun, W., Yuan, Z., and Fish, J., 2016. “A nonlo-
cal multiscale discrete-continuum model for predicting
mechanical behavior of granular materials”. Interna-
tional Journal for Numerical Methods in Engineering,
106(2), pp. 129–160.
[22] Wang, K., and Sun, W., 2016. “A semi-implicit
discrete-continuum coupling method for porous media
based on the effective stress principle at finite strain”.
Computer Methods in Applied Mechanics and Engi-
neering, 304, pp. 546–583.
[23] Feyel, F., and Chaboche, J.-L., 2000. “Fe2 multi-
scale approach for modelling the elastoviscoplastic be-
haviour of long fibre sic/ti composite materials”. Com-
puter methods in applied mechanics and engineering,
183(3-4), pp. 309–330.
[24] Hartmaier, A., Buehler, M. J., and Gao, H., 2005.
“Multiscale modeling of deformation in polycrystalline
thin metal films on substrates”. Advanced Engineering
Materials, 7(3), pp. 165–169.
[25] Vlassis, N. N., and Sun, W., 2021. “Sobolev training
of thermodynamic-informed neural networks for inter-
pretable elasto-plasticity models with level set harden-
ing”. Computer Methods in Applied Mechanics and En-
gineering, 377, p. 113695.
[26] Molnar, C., Casalicchio, G., and Bischl, B., 2018. “iml:
An r package for interpretable machine learning”. Jour-
nal of Open Source Software, 3(26), p. 786.
[27] Coombs, W. M., and Motlagh, Y. G., 2017. “Nurbs
plasticity: yield surface evolution and implicit stress in-
tegration for isotropic hardening”. Computer Methods
in Applied Mechanics and Engineering, 324, pp. 204–
220.
[28] Borja, R. I., 2013. Plasticity: modeling & computation.
Springer Science & Business Media.
[29] Bishop, C. M., et al., 1995. Neural networks for pattern
recognition. Oxford university press.
[30] Czarnecki, W. M., Osindero, S., Jaderberg, M., Świrszcz, G., and Pascanu, R., 2017. “Sobolev training for neural networks”. In Advances in Neural Information Processing Systems.
[31] Roscoe, K., and Burland, J., 1968. “On the generalized stress-strain behaviour of ‘wet’ clay”. In Engineering Plasticity, Cambridge University Press, pp. 535–609.
[32] Argyris, J., Faust, G., Szimmat, J., Warnke, E., and
Willam, K., 1974. “Recent developments in the finite
element analysis of prestressed concrete reactor ves-
sels”. Nuclear Engineering and Design, 28(1), pp. 42–
75.
[33] Chollet, F., et al., 2015. Keras. https://keras.io.
[34] Cowper, G. R., and Symonds, P. S., 1957. Strain-
hardening and strain-rate effects in the impact loading
of cantilever beams. Tech. rep., Brown Univ Provi-
dence Ri.
[35] Dozat, T., 2016. “Incorporating Nesterov momentum into Adam”.
[36] Kochmann, J., Wulfinghoff, S., Reese, S., Mianroodi,
J. R., and Svendsen, B., 2016. “Two-scale fe-fft-and
phase-field-based computational modeling of bulk mi-
crostructural evolution and macroscopic material be-
havior”. Computer Methods in Applied Mechanics and
Engineering, 305, pp. 89–110.
[37] Fuchs, A., Heider, Y., Wang, K., Sun, W., and Kaliske,
M., 2021. “Dnn2: A hyper-parameter reinforcement
learning game for self-design of neural network based
elasto-plastic constitutive descriptions”. Computers &
Structures, 249, p. 106505.
[38] Pascanu, R., Mikolov, T., and Bengio, Y., 2013. “On
the difficulty of training recurrent neural networks”. In
International conference on machine learning, PMLR,
pp. 1310–1318.
[39] Xu, K., Huang, D. Z., and Darve, E., 2021. “Learning
constitutive relations using symmetric positive definite
neural networks”. Journal of Computational Physics,
428, p. 110072.
[40] Giunta, A., Wojtkiewicz, S., and Eldred, M., 2003.
“Overview of modern design of experiments methods
for computational simulations”. In 41st Aerospace Sci-
ences Meeting and Exhibit, p. 649.
[41] Hashash, Y., Jung, S., and Ghaboussi, J., 2004. “Nu-
merical implementation of a neural network based ma-
terial model in finite element analysis”. International
Journal for numerical methods in engineering, 59(7),
pp. 989–1005.
[42] de Souza Neto, E. A., Peric, D., and Owen, D. R., 2011.
Computational methods for plasticity: theory and ap-
plications. John Wiley & Sons.
[43] Abbo, A., and Sloan, S., 1995. “A smooth hyperbolic
approximation to the mohr-coulomb yield criterion”.
Computers & structures, 54(3), pp. 427–441.
[44] Matsuoka, H., and Nakai, T., 1974. “Stress-
deformation and strength characteristics of soil under
three different principal stresses”. In Proceedings of
the Japan Society of Civil Engineers, no. 232, Japan
Society of Civil Engineers, pp. 59–70.
[45] Vlassis, N. N., Ma, R., and Sun, W., 2020. “Geomet-
ric deep learning for computational mechanics part i:
Anisotropic hyperelasticity”. Computer Methods in Ap-
plied Mechanics and Engineering, 371, p. 113299.
[46] Scarselli, F., and Tsoi, A. C., 1998. “Universal approxi-
mation using feedforward neural networks: A survey of
some existing methods, and some new results”. Neural
networks, 11(1), pp. 15–37.
[47] Parhi, R., and Nowak, R. D., 2020. “Banach space
representer theorems for neural networks and ridge
splines”. arXiv preprint arXiv:2006.05626.
[48] Weinan, E., and Wojtowytsch, S., 2020. “On the
banach spaces associated with multi-layer relu net-
works: Function representation, approximation the-
ory and gradient descent dynamics”. arXiv preprint
arXiv:2007.15623.
[49] Wang, K., Sun, W., and Du, Q., 2019. “A cooperative
game for automated learning of elasto-plasticity knowl-
edge graphs and models with ai-guided experimenta-
tion”. Computational Mechanics, 64(2), pp. 467–499.
[50] Zaremba, W., Sutskever, I., and Vinyals, O., 2014. “Re-
current neural network regularization”. arXiv preprint
arXiv:1409.2329.