Content uploaded by Waiching Sun on Oct 06, 2021

Component-based machine learning paradigm for discovering rate-dependent and pressure-sensitive level-set plasticity models

Nikolaos N. Vlassis

Postdoctoral Research Scientist

Department of Civil Engineering and Engineering Mechanics

Columbia University

New York, New York 10027

Email: nnv2102@columbia.edu

WaiChing Sun ∗

Associate Professor

Department of Civil Engineering and Engineering Mechanics

Columbia University

New York, New York 10027

Email: wsun@columbia.edu

Conventionally, neural network constitutive laws for path-dependent elasto-plastic solids are trained via supervised learning performed on recurrent neural networks, with the time history of strain as input and the stress as output. However, training a neural network to replicate path-dependent constitutive responses requires significantly more data due to the path dependence. This demand for diverse and abundant accurate data, as well as the lack of interpretability to guide the data generation process, could become major roadblocks for engineering applications. In this work, we attempt to simplify these training processes and improve the interpretability of the trained models by breaking down the training of material models into multiple supervised machine learning programs for elasticity, initial yielding, and hardening laws that can be conducted sequentially. To predict the pressure-sensitivity and rate dependence of the plastic responses, we reformulate the Hamilton-Jacobi equation such that the yield function is parametrized in a product space spanned by the principal stresses, the accumulated plastic strain, and time. To test the versatility of the neural network meta-modeling framework, we conduct multiple numerical experiments where neural networks are trained and validated against (1) data generated from known benchmark models, (2) data obtained from physical experiments, and (3) data inferred from homogenizing sub-scale direct numerical simulations of microstructures. The neural network model is also incorporated into an offline FFT-FEM model to improve the efficiency of the multiscale calculations.

∗Corresponding author

1 Introduction

One of the century-old challenges for mechanics researchers is to formulate a plasticity theory that predicts the relationship among strain history, plastic deformation, and stress for materials governed by different deformation mechanisms. As plastic deformation accumulates, the dissipation and plastic work may lead the yielding criteria to evolve, and cause a variety of hardening/softening mechanisms to manifest as the evolution of microstructures, such as twinning [1], dislocation [2], pore collapse [3], void nucleation [4], and rearranging of particles [5]. Generations of scholars, including Coulomb [6], von Mises [7], and Drucker and Prager [8], spent decades creating new plasticity theories to incorporate new causality relations and hypotheses for path-dependent materials. In stress-based plasticity theories, the yield function is expressed as a function of stress and internal variables, and the hardening laws (e.g. isotropic, kinematic, rotational, mixed-mode hardening) are deduced from experimental observations and sub-scale micro-mechanical simulations (e.g. dislocation dynamics, molecular simulations).

In the past decades, new plasticity models have often been generated by modifying existing models with different expressions of the yield functions or the hardening laws. For instance, a search in Google Scholar for "modified Johnson-Cook model" and "modified Gurson model" reveals more than 632,000 and 8,350 results, respectively¹. A vast majority of these published works are dedicated to manually modifying the original model with new evolution laws or shapes of the yield surface that accompany new physics, new materials, or new insights for more precise predictions. While this conventional workflow has led to numerous improvements in modeling, the more sophisticated models are often inherently harder to tune due to the expansion of the parametric space. This expansion does not only make it less feasible to determine the optimal mathematical expressions through a manual trial-and-error effort (even after the causality of the yielding and hardening is known [9]), but also requires solving a more complicated inverse problem to identify the material parameters [10,11,12].

¹As of 7/8/2021.

Copyright © by ASME

The recent success of deep neural networks has inspired a new trend where one may simply build a forecast engine by training a network with pairs of strain and stress histories [13,14]. To replicate the history dependence of the plastic deformation, earlier neural network approaches would employ strain and stress from multiple previous time steps to predict new stress states [15], whereas more recent works such as [16] and [17] employ recurrent neural networks, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit networks, to introduce memory effects. The promise and expectation are that the continuous advancement of neural networks or other machine learning techniques might one day replace the modeling paradigm currently employed in the engineering industry with superior accuracy, efficiency, and robustness [18,16]. However, the early success in the 90s and the recent resurrection of optimism about neural network predictions of constitutive responses have so far had a limited impact on industry applications. This reluctance is not entirely unjustified. In fact, recent studies and workshops conducted by the US Department of Energy have cited the lack of domain awareness, interpretability, and robustness as some of the major technical barriers for the revolution of AI for scientific machine learning [19]. To facilitate changes in the industry, the trustworthiness of the predictions is necessary, and interpretability is a necessary condition to overcome these obstacles [20].

As such, our focus in this work is to explore the possibility of building an interpretable machine learning paradigm capable of serving as the interface between plasticity modelers and artificial intelligence. Our focus has thus shifted from solely using AI to make predictions to building AI to create plasticity theories that are compatible with domain knowledge, easily interpreted, and capable of improving not only the accuracy but also the robustness of existing models. We propose the training of elastoplasticity models through multiple supervised learning tasks to generate the model components of knowledge separately (i.e. elastic stored energy, yield function, and hardening laws). The resultant model is the composition of these machine-generated knowledge components and is fully interpretable by modelers.

To achieve this goal, we have recast both rate-independent and rate-dependent plasticity as a Hamilton-Jacobi problem in a parametric space that is spanned by the principal stresses, the accumulated plastic strain, the plastic strain rate, and the real time. Meanwhile, the anisotropy of plastic yielding is achieved by mapping the yield function level sets of the same material under different orientations through a supervised neural network trained on a chosen yield function projection basis. These treatments enable us to create a general mathematical description for a large number of existing plasticity models, including the von Mises, Drucker-Prager, and Cam-clay models combined with any possible hardening law, as merely special cases of the level set plasticity model. Instead of solving the level set extension problem in the parametric space, we formulate a supervised learning problem to generate the constitutive updates from neural computation and speed up calculations compared to classical hierarchical multiscale computation [21,22,23,24].

More importantly, this new AI-enabled framework represents a paradigm shift where the goal of machine learning moves from merely generating forecast engines for mechanistic predictions to creating interpretable mathematical models that inherently obey physical laws with the assistance of machine learning. The resultant model does not require recurrent neural networks, is easier to train, and provides more robust results for blind predictions.

2 Level set plasticity

The goal of this section is to extend the previously published work [25] to incorporate pressure-sensitivity and rate-dependence. The mathematical framework is very similar, except that the introduction of the pressure-dependence and the rate-dependence may lead to a higher-dimensional space for the Hamilton-Jacobi problems and therefore a higher demand on data. These new implications are highlighted in this section. Details of the initial implementation of the level set plasticity model can be found in [25]. The algorithm used to generate the yield function level sets and train the plasticity model neural networks is summarized in Algorithm 1.

Here we formulate the machine learning plasticity problem not as a single supervised learning task, but by splitting the task into multiple smaller ones (predicting a stored elastic energy functional, predicting a yield surface, introducing a mapping for anisotropy), each constituting one neural network trained for a sub-goal. Complex behaviors can then be predicted by integrating these networks in a level set plasticity framework. This treatment does not only improve the predictions but, more importantly, introduces a learning structure where the causal relations of the individual components are clearly defined without losing the generality of individual model predictions. As pointed out by recent work on interpretable machine learning [20] and [26], this component design helps promote both simulatability (the ability to internally simulate and reason about the overall predictions) and modularity (the ability to interpret portions of the predictions independently) of the AI-generated models.

We introduce a new concept of treating the yield surfaces in the parametric space composed of the stress, the accumulated plastic strain, and the strain rate as a level set. We also discuss the importance of leveraging material symmetries to reduce the data demand for the supervised machine learning problem. Previously, [27] and [25] introduced NURBS- and machine learning-based interpolations, respectively, to generate yield surfaces for isotropic rate-independent plasticity. The key departure here is the new capacity to generalize the learning algorithm for anisotropic rate-dependent/independent plasticity.

Fig. 1. Universal training process for level set yield functions: 1) gather yield surface data points, 2) generate the level set through the initialization process, and 3) train a neural network on the level set data (the zeroth level of the predicted level set is the approximated yield surface).

An important factor that dictates whether the training of the machine learning model with limited data can be successful is how material symmetry is leveraged. For example, the data collection can be significantly reduced for isotropic plasticity, as the principal strain and stress are co-axial. Another important aspect to consider is how to leverage material symmetry to select the coordinate system that represents the same data in the parametric space. For instance, a Euclidean space spanned by the values of the three principal stresses could be sufficient for an isotropic yield function and hence leads to a simpler supervised learning problem than those that use all 6 stress components. Furthermore, the choice of the coordinate system may affect how one plans to collect the data and vice versa. For instance, while it is possible to formulate the level set problem with the principal stresses as the Cartesian basis, i.e., (σ1, σ2, σ3), it might be even more efficient to consider the usage of the (q, p) stress invariants for experimental data obtained from conventional triaxial tests, where only two distinct principal stresses can be controlled. In this latter case, the anisotropy and the dependence of the constitutive responses on all three invariants could not be sufficiently captured from the data gathered by this set of experiments alone; hence, increasing the dimensions of the parametric space for the elastic energy and the yield function would not be beneficial. In the numerical experiments we conducted, we adopt the cylindrical coordinates (see Eq. (1)) for the π-plane orthogonal to the hydrostatic axis where σ1 = σ2 = σ3 (cf. [28]). This treatment enables us to detect any symmetry on the π-plane that might allow us to reduce the dimensions of the data and potentially simplify the training of the neural network with less data.

\[
\begin{bmatrix} \sigma_1 \\ \sigma_2 \\ \sigma_3 \end{bmatrix}
= \mathbf{R}' \mathbf{R}''
\begin{bmatrix} \sigma_1'' \\ \sigma_2'' \\ \sigma_3'' \end{bmatrix}
= \begin{bmatrix} \sqrt{2}/2 & 0 & \sqrt{2}/2 \\ 0 & 1 & 0 \\ -\sqrt{2}/2 & 0 & \sqrt{2}/2 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \sqrt{2/3} & 1/\sqrt{3} \\ 0 & -1/\sqrt{3} & \sqrt{2/3} \end{bmatrix}
\begin{bmatrix} \sigma_1'' \\ \sigma_2'' \\ \sigma_3'' \end{bmatrix}.
\tag{1}
\]
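As a concrete illustration (a sketch, not the authors' code), the transformation in Eq. (1) can be inverted to compute the cylindrical π-plane coordinates from a principal stress state: since R′ and R′′ are rotations, σ′′ = R′′ᵀ R′ᵀ σ, with the mean pressure recovered as p = σ″₃/√3.

```python
import math

# Sketch: project principal stresses onto the pi-plane coordinates
# (rho, theta, p) by inverting Eq. (1): sigma'' = R''^T R'^T sigma.
R1 = [[math.sqrt(2) / 2, 0.0, math.sqrt(2) / 2],
      [0.0, 1.0, 0.0],
      [-math.sqrt(2) / 2, 0.0, math.sqrt(2) / 2]]
R2 = [[1.0, 0.0, 0.0],
      [0.0, math.sqrt(2 / 3), 1 / math.sqrt(3)],
      [0.0, -1 / math.sqrt(3), math.sqrt(2 / 3)]]

def _matvec_T(M, v):
    """Multiply the transpose of a 3x3 matrix M with a 3-vector v."""
    return [sum(M[r][c] * v[r] for r in range(3)) for c in range(3)]

def project_to_pi_plane(s1, s2, s3):
    """Return (rho, theta, p) for a principal stress state."""
    spp = _matvec_T(R2, _matvec_T(R1, [s1, s2, s3]))
    rho = math.hypot(spp[0], spp[1])      # Lode radius
    theta = math.atan2(spp[1], spp[0])    # Lode angle
    p = spp[2] / math.sqrt(3)             # mean pressure
    return rho, theta, p
```

A hydrostatic state maps to the hydrostatic axis (ρ = 0), and a purely deviatoric state maps to p = 0, which is a quick sanity check on the rotation matrices.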

Before translating the yield surface f_Γ data points into a yield function level set, we reduce the dimensionality of the stress point x representation. In the case of isotropic pressure-dependent plasticity, we can reduce the stress representation from 6 dimensions x(σ11, σ22, σ33, σ12, σ23, σ13) (already reduced from 9 due to the balance of angular momentum) to an equivalent three-stress-invariant representation x̂(p, ρ, θ). In this representation, p is the mean pressure, and ρ and θ are the Lode's radius and angle, respectively.

The yield function is then postulated to be a signed distance function defined as:

\[
\phi(\hat{\mathbf{x}}, \xi, \dot{\xi}, t) =
\begin{cases}
d(\hat{\mathbf{x}}) & \text{outside } f_\Gamma \text{ (inadmissible stress)} \\
0 & \text{on } f_\Gamma \text{ (yielding)} \\
-d(\hat{\mathbf{x}}) & \text{inside } f_\Gamma \text{ (elastic region)},
\end{cases}
\tag{2}
\]

where d(x̂) is the minimum Euclidean distance between any point x̂ of the solution domain Ω of the stress space where the signed distance function is defined and the yield surface f_Γ = {x̂ ∈ R³ | f(x̂) = 0}, defined as:

\[
d(\hat{\mathbf{x}}) = \min\left(\left| \hat{\mathbf{x}} - \hat{\mathbf{x}}_\Gamma \right|\right),
\tag{3}
\]

where x̂_Γ is the yielding stress for a given value of the accumulated plastic strain ξ and its rate ξ̇ at time t. The plastic internal variable ξ is monotonically increasing and represents the history-dependent behavior of the material. The time t signifies a snapshot of the current state of the level set φ for the current value of the plastic internal variable ξ and its rate ξ̇.
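For intuition, the three branches of Eq. (2) can be sketched with an invented spherical yield surface of radius R in the stress space (a toy stand-in for f_Γ, not a model from the paper):

```python
import math

def classify(sigma, R=1.0, tol=1e-9):
    """Classify a stress state by the sign of the signed distance to a
    toy spherical yield surface of radius R (an invented example)."""
    d = math.sqrt(sum(s * s for s in sigma)) - R
    if abs(d) < tol:
        return "yielding"
    return "inadmissible stress" if d > 0 else "elastic"
```

The sign convention mirrors Eq. (2): negative inside the elastic region, zero on the surface, positive in the inadmissible region.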

2.1 Data augmentation through signed distance function generation

We can now pre-process the stress point cloud of the yield surface for a given ξ and ξ̇ by solving the Eikonal equation |∇_x̂ φ| = 1 while prescribing the signed distance function to 0 at x̂ ∈ f_Γ. For every stress point in the yield surface data set, we generate a discrete number of auxiliary points that construct a signed distance function. In the context of level set theory, this can be seen as solving the level set initialization problem. In the context of machine learning, the signed distance function construction can be interpreted as a method of data augmentation: a large number of auxiliary data samples where f ≠ 0 are introduced to improve the training performance as well as the accuracy and robustness of both the learned function f and, equally importantly, its stress gradient ∂f/∂σij. A schematic of the pre-processing of the yield surface data into a signed distance function is demonstrated in Fig. 1. The color is the value of the signed distance yield function: it is negative in the elastic region and positive in the inadmissible stress region. The material yields if the current stress is at a location where the value of the yield function equals zero. It is noted that the signed distance function has been selected as the preferred level set function due to the simplicity of the implementation; the yield function can be formulated on other level set function bases, the benefits of which will be considered in future work.
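The defining property of the signed distance level set produced by this initialization step is that it solves the Eikonal equation |∇x̂ φ| = 1. A small sketch with an assumed toy circular surface verifies the property numerically:

```python
import math

def phi(x1, x2, R=1.0):
    """Signed distance to a toy circular yield surface of radius R
    in a 2D pi-plane slice: phi = |x| - R."""
    return math.hypot(x1, x2) - R

def grad_norm(x1, x2, h=1e-6):
    """Magnitude of the gradient of phi via central finite differences;
    for a signed distance function this should be 1 away from the center."""
    g1 = (phi(x1 + h, x2) - phi(x1 - h, x2)) / (2 * h)
    g2 = (phi(x1, x2 + h) - phi(x1, x2 - h)) / (2 * h)
    return math.hypot(g1, g2)
```

The unit-gradient property is what makes the auxiliary level set samples consistent labels for both f and its stress gradient.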

2.2 Hardening as a level set extension problem

After pre-processing the yield surface f_Γ data points for a sequence of internal variable values ξ and rates ξ̇ into a level set by solving the level set initialization problem, we will recover the velocity function of a Hamilton-Jacobi equation of a level set extension problem to describe the temporal evolution of the level set. A general Hamilton-Jacobi equation reads:

\[
\frac{\partial \phi}{\partial t} + \mathbf{v} \cdot \nabla_{\hat{\mathbf{x}}} \phi = 0,
\tag{4}
\]

where v is the normal velocity field that describes the geometric evolution of the boundary (the yield surface f_Γ). In the context of plasticity, the velocity field corresponds to the observed hardening mechanism. The velocity vector field can be described by a magnitude scalar function F and a direction vector field n = ∇_x̂ φ / |∇_x̂ φ| such that:

\[
\mathbf{v} = F \cdot \mathbf{n}.
\tag{5}
\]

Substituting into Eq. (4):

\[
\frac{\partial \phi(\xi, \dot{\xi})}{\partial t} + F(\xi, \dot{\xi}) \left| \nabla_{\hat{\mathbf{x}}} \phi(\xi, \dot{\xi}) \right| = 0,
\quad \text{where } F_i \approx \frac{\phi_{i+1}(\xi_{i+1}, \dot{\xi}_{i+1}) - \phi_i(\xi_i, \dot{\xi}_i)}{\Delta t}.
\tag{6}
\]
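The finite-difference recovery of the hardening velocity in Eq. (6) can be sketched with two consecutive signed distance snapshots of a circular surface (the radii below are invented for illustration); because |∇φ| = 1 for a signed distance function, the recovered F is uniform in space:

```python
# Sketch: recover the discrete hardening "velocity" F_i of Eq. (6) by
# finite differences between consecutive signed distance snapshots.
def velocity_snapshot(phi_curr, phi_next, dt, points):
    """Approximate F at each sample point as (phi_{i+1} - phi_i) / dt."""
    return [(phi_next(x) - phi_curr(x)) / dt for x in points]

phi0 = lambda rho: rho - 1.0   # toy yield radius R = 1.0
phi1 = lambda rho: rho - 1.2   # toy hardened radius R = 1.2
F = velocity_snapshot(phi0, phi1, dt=0.1, points=[0.5, 1.0, 1.5])
# F is spatially uniform here: (phi1 - phi0)/dt = -0.2/0.1 at every point
```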

In the above equation, F_i(p, ρ, θ, ξ, ξ̇) = F(p, ρ, θ, ξ, ξ̇, t_i) for i = 0, 1, 2, ..., n+1 is the finite-difference-approximated scalar velocity (hardening) function that corresponds to the pre-processed collection of signed distance functions {φ0, φ1, ..., φn+1} at times {t0, t1, ..., tn+1}. Thus, we have recast a yield function f into a signed distance function φ, such that f(p, ρ, θ, ξ, ξ̇) = φ(p, ρ, θ, ξ, ξ̇). We can now formulate a machine learning problem to approximate the level set yield function f with its neural network yield function counterpart f̂ = f̂(p, ρ, θ, ξ, ξ̇ | W, b), parametrized by the weights W and biases b to be optimized during training.

The training objective for the neural network optimization is to minimize the following loss function at the training samples (x̂, ξ, ξ̇, t)_i for i ∈ [1, ..., N]:

\[
W', b' = \underset{W,\, b}{\operatorname{argmin}} \; \frac{1}{N} \sum_{i=1}^{N} \left[ \left\| f_i - \hat{f}_i \right\|_2^2 + w_p \operatorname{sign}\!\left( -\sum_{A=1}^{3} \sigma_{A,i} \frac{\partial \hat{f}_i}{\partial \sigma_{A,i}} \right) \right],
\tag{7}
\]

where we have added a penalty term, weighted by a factor w_p, that activates when the yield function does not obey convexity during training.

It is noted that the Hamilton-Jacobi equation described in this section will not be solved numerically, although that is theoretically possible (e.g. with a fast marching solver). Its solution will be directly predicted by a neural network. The zeroth level of the neural-network-predicted level set is the yield surface. The neural-network-approximated velocity field is the data-driven hardening mechanism.
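To make the loss of Eq. (7) concrete, here is a pure-Python sketch with finite-difference stress gradients. The sign(·) penalty is clipped at zero here — an assumption on our part, so that the term only activates when the radial gradient Σ_A σ_A ∂f̂/∂σ_A turns negative — and both test functions below are invented:

```python
import math

def loss_eq7(f_hat, samples, w_p=1.0, h=1e-5):
    """Hedged sketch of Eq. (7): mean squared error on the signed
    distance values plus a clipped sign-based convexity penalty.
    Gradients are approximated by central finite differences."""
    total = 0.0
    for sigma, f_true in samples:
        total += (f_true - f_hat(sigma)) ** 2
        # radial = sum_A sigma_A * d f_hat / d sigma_A
        radial = sum(
            s * (f_hat([v + (h if k == a else 0.0) for k, v in enumerate(sigma)])
                 - f_hat([v - (h if k == a else 0.0) for k, v in enumerate(sigma)])) / (2 * h)
            for a, s in enumerate(sigma))
        # assumed clipping: penalize only when sign(-radial) is positive
        total += w_p * max(0.0, math.copysign(1.0, -radial))
    return total / len(samples)
```

A radially increasing f̂ (gradient pointing outward) incurs no penalty, while an f̂ that decreases away from the origin triggers the full w_p penalty.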

Remark 1 Rescaling of the training data. In every loss function in this work, we have introduced scaling coefficients γ_α to remind the readers that it is possible to change the weighting to adjust the relative importance of different terms in the loss function. These scaling coefficients may also be viewed as the weighting functions in a multi-objective optimization problem. In practice, we have normalized all data to avoid the vanishing or exploding gradient problem that may occur during the back-propagation process [29]. As such, normalization is performed before the training as a pre-processing step. A sample Xi of a measure X is scaled to the unit interval via Eq. (8).

Algorithm 1 Training of a pressure- and rate-dependent isotropic yield function level set neural network.

Require: A data set of N samples: stress measures σ at yielding, accumulated plastic strain εp, and accumulated plastic strain rate ε̇p; a number of levels L_levels (isocontours) for the constructed signed distance function level set (data augmentation); and a parameter ζ > 1 for the radius range of the constructed signed distance function.

1. Project the stress onto the π-plane.
   Initialize an empty set of π-plane projection training samples (ρi, θi, pi) for i in [0, ..., N].
   for i in [0, ..., N] do
      Spectrally decompose σi = Σ_{A=1}^{3} σ_{A,i} n_i^{(A)} ⊗ n_i^{(A)}.
      Transform (σ_{1,i}, σ_{2,i}, σ_{3,i}) into (σ″_{1,i}, σ″_{2,i}, σ″_{3,i}) via Eq. (1).
      ρi ← sqrt(σ″²_{1,i} + σ″²_{2,i})
      θi ← tan⁻¹(σ″_{2,i} / σ″_{1,i})
      pi ← (√3/3) σ″_{3,i}
   end for
2. Construct the yield function level set (data augmentation).
   Initialize an empty set of augmented training samples (ρm, θm, pm, ε_{p,m}, ε̇_{p,m}, f_m) for m in [0, ..., N × L_levels].
   m ← 0
   for i in [0, ..., N] do
      for j in [0, ..., L_levels] do
         ρm ← (ζ j / L_levels) ρi   ▷ the signed distance function is constructed for a radius range of [0, ζρi]
         θm ← θi
         pm ← pi
         ε_{p,m} ← ε_{p,i}
         ε̇_{p,m} ← ε̇_{p,i}
         f_m ← (ζ j / L_levels) ρi − ρi   ▷ the signed distance function value range is [−ρi, (ζ−1)ρi]
         Rescale (ρm, θm, pm, ε_{p,m}, ε̇_{p,m}, f_m) via Eq. (8).
         m ← m + 1
      end for
   end for
3. Train the neural network f̂(ρm, θm, pm, ε_{p,m}, ε̇_{p,m}) with the loss function of Eq. (7).
4. Output the trained yield function neural network f̂ and exit.
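The augmentation loop (step 2 of Algorithm 1) can be sketched in a few lines of Python; the toy yield points below are invented, and the rescaling via Eq. (8) is omitted for brevity:

```python
# Sketch of step 2 of Algorithm 1: each yield point (rho, theta, p,
# eps_p, eps_p_rate) spawns L_levels augmented samples labeled with
# the signed distance f_m = (zeta*j/L_levels)*rho - rho.
def build_level_set_dataset(yield_points, L_levels=15, zeta=2.0):
    dataset = []
    for (rho, theta, p, eps_p, deps_p) in yield_points:
        for j in range(L_levels):
            rho_m = zeta * j / L_levels * rho
            f_m = rho_m - rho
            dataset.append((rho_m, theta, p, eps_p, deps_p, f_m))
    return dataset

# Two invented yield points, each expanded into 15 levels
toy = [(1.0, 0.3, -0.5, 0.0, 0.0), (2.0, 1.1, -0.5, 0.01, 0.0)]
data = build_level_set_dataset(toy)
```

The signed distance targets start at −ρ (the hydrostatic axis, deep in the elastic region) and grow toward (ζ−1)ρ in the inadmissible region, matching the value range annotated in the algorithm.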

\[
X_i := \frac{X_i - X_{\min}}{X_{\max} - X_{\min}},
\tag{8}
\]

where X_i on the left-hand side is the normalized sample point, and X_min and X_max are the minimum and maximum values of the measure X in the training data set, such that all the different types of data used in this paper (e.g. energy, stress, stress gradient, stiffness) are normalized within the range [0, 1].
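Eq. (8) is plain min-max scaling, applied per measure over the training set; a short sketch:

```python
def rescale(values):
    """Min-max normalization of Eq. (8): map a list of samples of one
    measure onto the unit interval [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
```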

2.3 High-order Sobolev training

In this work, we distinguish between the material's elastic and plastic behaviors by training two different neural network model components: a hyperelastic energy functional and a yield function level set that evolves according to the accumulated plastic strain. These components are then combined in a specific form of return mapping algorithm (Algorithm 1) that may take an arbitrary elasticity model and a yield function with a generic hardening law to generate the constitutive update for the class of inelastic materials that have a distinct elastic region defined in a parametric space. The hyperelastic network counterpart is expected to have interpretable derivatives: the first derivative of the energy functional with respect to the strain should be a valid stress tensor, and the second derivative a valid stiffness tensor. We adopt a Sobolev training objective, first introduced in [30], and extend it to higher-order constraints, to train the energy functional approximator ψ̂e(εe | W, b) using the following loss function:

\[
W', b' = \underset{W,\, b}{\operatorname{argmin}} \; \frac{1}{N} \sum_{i=1}^{N} \left[ \left\| \psi^e_i - \hat{\psi}^e_i \right\|_2^2 + \left\| \frac{\partial \psi^e_i}{\partial \boldsymbol{\varepsilon}^e_i} - \frac{\partial \hat{\psi}^e_i}{\partial \boldsymbol{\varepsilon}^e_i} \right\|_2^2 + \left\| \frac{\partial^2 \psi^e_i}{\partial \boldsymbol{\varepsilon}^e_i \otimes \partial \boldsymbol{\varepsilon}^e_i} - \frac{\partial^2 \hat{\psi}^e_i}{\partial \boldsymbol{\varepsilon}^e_i \otimes \partial \boldsymbol{\varepsilon}^e_i} \right\|_2^2 \right].
\tag{9}
\]

A benefit of using Sobolev training is the notable data efficiency. Sobolev training has been shown to produce more accurate and smooth predictions for the energy, stress, and stiffness fields for the same amount of data compared to classical L2-norm approaches that would solely constrain the predicted energy values [25].
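A one-dimensional sketch of the Sobolev objective in Eq. (9), using finite differences in place of automatic differentiation (the quadratic energy and modulus below are invented for illustration):

```python
def sobolev_loss(psi_true, psi_hat, strains, h=1e-4):
    """Sketch of Eq. (9) in 1D: penalize mismatches in the energy,
    its first derivative (stress), and its second derivative
    (stiffness), all approximated by central finite differences."""
    def d1(f, x):
        return (f(x + h) - f(x - h)) / (2 * h)
    def d2(f, x):
        return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)
    total = 0.0
    for e in strains:
        total += (psi_true(e) - psi_hat(e)) ** 2      # energy term
        total += (d1(psi_true, e) - d1(psi_hat, e)) ** 2   # stress term
        total += (d2(psi_true, e) - d2(psi_hat, e)) ** 2   # stiffness term
    return total / len(strains)

E = 200.0                          # toy Young's modulus
psi = lambda e: 0.5 * E * e * e    # toy quadratic stored energy
```

Only an approximator whose derivatives also match the data drives all three terms to zero, which is the interpretability constraint the section describes.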

3 Numerical experiments

In this section, we demonstrate the AI's capacity to rediscover plasticity models from the literature, we explore the model's ability to capture highly complex new hardening modes, and, finally, we showcase how the AI can discover the yield surface for a new polycrystal material and replace the plasticity model in a finite element simulation. To test whether the machine learning approach can be generalized, we purposely test the AI against a wide range of material data sets for soil, rock, polycrystal, and steel. In particular, we employ three types of data sets: (1) data generated from known literature models, (2) data obtained from experiments, and (3) data obtained from sub-scale direct numerical simulations of microstructures. The first type of data is used as a benchmark to verify whether the neural network can correctly deduce the plastic deformation mechanisms (yield surface and hardening) when given the corresponding data. The second and third types of data are used to validate and examine the AI's ability to discover new plastic deformation mechanisms with a geometrical interpretation in the stress space.

3.1 Verification examples

The purpose of this example is to showcase our algorithm's capacity to reproduce the modeling capacity of classical plasticity theory. We first demonstrate our algorithm's ability to recover yield surfaces and hardening mechanisms from the classical plasticity literature. We then demonstrate the framework's capacity to make predictions calibrated on experimental data for pressure-dependent and rate-dependent plasticity.

3.1.1 Verification on classical plasticity theories

The proposed AI can readily reproduce numerous yield function models from the plasticity literature, following the same universal data pre-processing and neural network training algorithm. For this benchmark experiment, we generate synthetic data sets for four initial yield surfaces of increasing shape complexity: the J2 [7] (cylinder), Drucker-Prager [8] (cone), Modified Cam-Clay [31] (oval), and Argyris [32] (ovoid with triangular cross-section) yield surfaces. We simultaneously study four common hardening mechanisms that transform and/or translate these surfaces in the 3D stress space: isotropic hardening (cylinder dilation), rotational hardening (cone rotation), kinematic hardening (translation along the hydrostatic axis), and softening (shrinking).

The data sets for these yield surfaces are populated by sampling from the above-mentioned literature yield functions. The sampling was performed on a uniform grid of the stress invariants and the accumulated plastic strain. We sample 50 data points along the mean pressure axis, 100 data points along the angle axis, and 10 data points along the accumulated plastic strain axis (a total of 50000 data samples per yield function data set). The yield surface data points are pre-processed into a signed distance function level set database through the level set initialization procedure. For each yield surface, 15 levels are constructed: the yielding level, 7 in the elastic region, and 7 in the region of inadmissible stress. After data augmentation, the training data set consists of 750000 level set sample points.
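The sample counts above follow directly from the grid dimensions; a quick arithmetic check mirroring the stated 50 × 100 × 10 grid and 15 levels:

```python
# Grid sampling counts stated in the text: 50 pressures x 100 angles
# x 10 plastic strain values, each expanded into 15 level set levels.
n_yield_points = 50 * 100 * 10
n_training = n_yield_points * 15
```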

For each level set database, we train a feed-forward neural network to approximate the initial yield function and its evolution. The yield function neural networks consist of a hidden Dense layer (100 neurons / ReLU), followed by two Multiply layers, then another hidden Dense layer (100 neurons / ReLU) and an output Dense layer (Linear). The use of Multiply layers was first introduced in [25] to increase the continuity of the activation functions of neural network functional approximators. They were shown to allow for greater control over the network's higher-order derivatives and the application of higher-order Sobolev constraints in the loss function. The layers' kernel weight matrices were initialized with a Glorot uniform distribution and the bias vectors with zeros. All the models were trained for 2000 epochs with a batch size of 128 using the NAdam optimizer, set with the default values of the Keras library [33].

The neural network predicted yield surfaces are demonstrated in Fig. 2. For each model, three surfaces are shown for three different levels of accumulated plastic strain. It is highlighted that, given an accumulated plastic strain value, we can recover the entire yield locus.

3.1.2 Level set plasticity model discovery for rate-dependent and anisotropic materials

In this section, we test the framework's capacity to make predictions on rate-dependent and anisotropic data.

To test the trained neural network's prediction of rate-dependent responses, we incorporate data from the published work [34] for steel that exhibits different yielding stresses under different strain rates. In the numerical experiments, we use experimental data collected at strain rates ranging from 0 to 0.02 s⁻¹ as the training data, sampled in a uniform grid of 10 strain rate increments. The yield surface is sampled at 25 points along the mean pressure axis, at 100 points along the angle axis, and at 10 points along the accumulated plastic strain axis (a total of 250000 sample points). The data are pre-processed into signed distance functions of 15 levels, generating 3750000 training sample points. The neural network used for this viscoplastic model training follows the same architecture as the yield function neural networks described in the previous section.

We use the experimental data collected at strain rates 10⁻⁴, 5 × 10⁻¹, and 0.02 s⁻¹ to validate the ability of the model to make blind predictions for unseen events. Figure 3 shows the results of the six predictions that the AI generated for unseen data. The left figure shows the stress-strain predictions on the uniaxial tensile tests at three different loading rates, while the right figure shows the stress-strain predictions on the simple shear test counterparts. In both cases, the predictions match well with the unseen benchmark data that are excluded from the training data set.

As for the anisotropic predictions, Figure 4 shows the machine-learning-generated mapping that predicts how the yield surface in the principal stress space evolves for different material orientations. The data we employed in this second experiment are generated from an FFT solver that simulates the polycrystal plasticity of a specimen composed of FCC crystal grains. We sample the material constitutive behavior at 10 microstructure orientations at 150 Lode angle sampling directions and pre-process the data into signed distance functions of 15 levels, generating 22500 training sample points for the projection mapping neural network. Working in the pressure-independent stress space, the network inputs the true stress invariants and the microstructural orientation information that describes the anisotropy (in the case of the polycrystals studied in this work, the polycrystal orientations as three Euler angles) and outputs the reference stress space invariants. The network has the following layer structure: Dense layer (200 neurons / ReLU), Multiply layer, Dense layer (200 neurons / ReLU), Multiply layer, followed by three more Dense layers (200 neurons / ReLU) and an output Dense layer (Linear). The layers' kernel weight matrices were initialized with a Glorot uniform distribution and the bias vectors with zeros. The model was trained for 2000 epochs with a batch size of 256 using the NAdam optimizer. The predictions of the mapping suggest that it is possible to generate a single mapping function that maps all yield surfaces obtained from different polycrystal specimens of different orientations onto a reference stress domain, denoted as (σ″1, σ″2, σ″3).

Fig. 2. AI can rediscover classical plasticity models: the J2 plasticity model with isotropic hardening (top left), the Drucker-Prager model with rotational hardening (top right), the MCC model with kinematic hardening (Bauschinger effect) (bottom left), and the Argyris model with softening (bottom right). The corresponding benchmark and predicted strain-stress curves are also demonstrated. The stress measure is in kPa.

Fig. 3. Neural network predicted viscoplastic response for increasing loading strain rates for a tension (left) and shear (right) loading test performed on mild-steel beams (experimental data obtained from [34]).

3.2 Demonstration of model discovery capacity

Yield surface discovery in the literature has been limited

by the difﬁculty of deriving mathematical expressions for

higher-complexity geometrical shapes that represent them.

Additional obstacles arise when there is a need to describe the smooth transition from the shape of the initial yield surface to that of a state with more accumulated plasticity. The algorithm's capability to automatically discover new yield surfaces and hardening mechanisms directly from the data overcomes these impediments.

To test this, we construct a ﬁctitious yield surface


Fig. 4. The framework can capture anisotropic responses by pro-

jecting anisotropic yield surfaces onto a master projection basis curve

using a neural network stress space mapping ϕNN .

database that is based on the Argyris model [32] and com-

bines the Modiﬁed Cam-Clay [31] hardening mechanism

along with a transformation of the elastic region’s cross-

section from a triangular shape to a circle. The yield surface

is sampled at a total of 50000 points and pre-processed to

generate 750000 level set sample points. The predictions for

the yield surface and underlying level set for increasing ac-

cumulated plastic strain are demonstrated in Fig. 5. Deriving

a mathematical expression for this data set is not straightforward. Even if the derivation is successful, the resultant mathematical expression might require additional material parameters that lack a physical underpinning. The capability of the neural network to approximate arbitrary functions therefore offers us a flexible and simple treatment for handling the evolution of the yield function.

Fig. 5. AI-discovered yield surfaces and hardening mechanisms as evolving level sets from synthetic data. The yield surface and the corresponding yield function level set evolve according to the increasing accumulated plastic strain.
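The pre-processing of yield-surface samples into level-set samples is not spelled out in this section; a minimal sketch, assuming a radial signed-distance construction about the stress-space origin, might look like the following (the paper's 15-fold expansion from 50000 to 750000 points would correspond to 15 scale factors per point; 5 are used here for brevity):

```python
import numpy as np

def level_set_samples(surface_pts, scales=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Expand yield-surface points into level-set training samples by
    scaling each point radially about the origin of the stress space.
    The 'level' is negative inside the surface, zero on it, and positive
    outside. This radial construction is an assumption for illustration,
    not the paper's exact pre-processing."""
    samples = []
    for p in surface_pts:
        r = np.linalg.norm(p)
        for s in scales:
            q = s * p
            samples.append((q, s * r - r))   # signed distance along the ray
    return samples

# 4 hypothetical surface points expand into 4 * len(scales) samples
pts = np.array([[1.0, 0.0], [0.0, 2.0], [-1.5, 0.0], [0.0, -1.0]])
data = level_set_samples(pts)
print(len(data))   # 20
```

Each sample pairs a stress point with a level-set value, which is what a network approximating a signed-distance-like yield function needs as a regression target.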

To analyze the sensitivity with respect to the random

neural network weight initialization, we have repeated the

supervised learning for the synthetic yield function problem

showcased in Fig. 5 ﬁve times, each with a different random

seed. The results, which are shown in Fig. 6, indicate that

the training and the resultant losses for both the training and

testing cases are close. This result suggests that the training

is not very sensitive to the random seeds. Furthermore, the

small difference in the training and testing loss of the yield

function also suggests that there is no signiﬁcant overﬁtting.

Our proposed algorithm also automates the discovery

of yield surfaces for new materials. We generate a yield

surface database for a randomly generated polycrystal mi-

crostructure through efﬁcient data sampling of the invariant

stress space with FFT solver elastoplastic simulations. To

gather the yield surface data points for the polycrystal mate-

rial, we subdivide the π-plane uniformly at 140 Lode’s angles

and sample the stress space with monotonic loading simula-

tions at each angle direction. The yield surface data points

are gathered as soon as yielding is ﬁrst detected, recording

the stress response and the accumulated plastic strain. The

FFT simulations provide 157500 sample points that are pre-

processed into 2362500 level set sample points. It is noted

that the material was observed to be pressure-independent.

Thus, sampling on the π-plane at a constant mean pressure

was enough to capture the entire stress response.
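The π-plane sampling scheme described above can be sketched as follows; the conversion from Lode angles to principal-stress directions is standard, though the scaling convention here is an illustrative choice:

```python
import numpy as np

def pi_plane_directions(n_angles=140, p_mean=0.0):
    """Uniform Lode-angle directions on the pi-plane at a constant mean
    pressure, returned as principal stress triplets. Each direction would
    seed one monotonic loading simulation in the sampling scheme above."""
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    dirs = np.sqrt(2.0 / 3.0) * np.stack(
        [np.cos(thetas),
         np.cos(thetas - 2.0 * np.pi / 3.0),
         np.cos(thetas + 2.0 * np.pi / 3.0)], axis=1)
    return dirs + p_mean            # shift off the deviatoric plane if needed

dirs = pi_plane_directions()
print(dirs.shape)                   # (140, 3)
# Every direction is purely deviatoric (zero trace) and of unit length:
print(np.allclose(dirs.sum(axis=1), 0.0),
      np.allclose(np.linalg.norm(dirs, axis=1), 1.0))
```

Because the material was observed to be pressure-independent, holding `p_mean` fixed while sweeping the 140 angles covers the relevant stress states.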

The yield surface data points are pre-processed into a level set database, and the results of the trained polycrystal neural network yield function are demonstrated in Fig. 7. The neural network parameters for the new model training in this section remain identical to those previously described. Investing the modeling effort to describe the complex yielding behavior of a material could prove futile – especially if

Fig. 6. Loss vs. epoch for the synthetic yield function shown in Fig. 5 for the training data set (top) and the testing data set for cross-validation (bottom). The test data set is mutually exclusive with the

training data set.


Fig. 7. Yield function level set of a new polycrystal microstructure

for increasing accumulated plastic strain.

the material is highly heterogeneous. Conceiving a new yield function for every new material studied can become rather impractical, and automating yield surface generation can accelerate the study of plasticity in novel materials.

3.3 Ofﬂine multiscale FFT-FEM numerical experi-

ments

In engineering practice, a constitutive law is seldom

used as a standalone forecast engine but is often incorporated

into a solver that provides a discretized numerical solution.

Here, we test whether the AI-generated models can be de-


ployed into an existing finite element solver. The yield

surface neural networks combined with a hyperelastic en-

ergy functional neural network can be readily plugged into

a strain space return mapping algorithm to make strain-stress

predictions. In this work, we utilize a linear elasticity energy

functional as the neural network that will provide the elastic

response in the algorithm. We train a two-layer feed-forward neural network that inputs the elastic volumetric strain invariant ε_v^e and the deviatoric strain invariant ε_s^e to approximate the hyperelastic energy functional ψ^e. The network is trained on 2500 data points sampled from a uniform grid of (ε_v^e, ε_s^e) pairs. The architecture consists of a hidden Dense layer (100 neurons / ReLU), followed by two Multiply layers, then another hidden Dense layer (100 neurons / ReLU), and an output Dense layer (Linear). The models were trained for 1000 epochs

with a batch size of 32 using the NAdam optimizer [35], set

with default values in the Keras library. Using a Sobolev

training framework, the model was optimized with a higher-order H² training objective – the loss function constrains the predicted energy, stress, and stiffness, similar to (9). The resulting stress predictions for the yield surfaces from the literature under random cyclic loading and unloading strain paths are shown in Fig. 2 for each approximated yield surface model.
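A minimal sketch of such an H² Sobolev objective, assuming equal weights on the three error terms (the actual weighting and form are defined by the paper's Eq. (9)):

```python
import numpy as np

def sobolev_h2_loss(psi_pred, psi_true, sig_pred, sig_true,
                    c_pred, c_true, w=(1.0, 1.0, 1.0)):
    """H2-style Sobolev loss: mean-squared errors on the energy, its first
    derivative (stress), and its second derivative (stiffness), weighted
    and summed. The equal default weights are an illustrative assumption."""
    loss = w[0] * np.mean((psi_pred - psi_true) ** 2)   # energy term
    loss += w[1] * np.mean((sig_pred - sig_true) ** 2)  # stress (gradient) term
    loss += w[2] * np.mean((c_pred - c_true) ** 2)      # stiffness (Hessian) term
    return loss

# Perfect predictions give zero loss:
z = np.zeros(5)
print(sobolev_h2_loss(z, z, z, z, z, z))   # 0.0
```

Constraining the first and second derivatives is what makes the trained energy yield smooth stress and tangent-stiffness predictions rather than only accurate energy values.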

We have also successfully incorporated the trained neu-

ral network plasticity model into a ﬁnite element solver to de-

liver an excellent match with the higher-cost FFT-FEM pre-

dictions for unseen loading paths not included in the training

data set. The discovered yield function for a randomly gen-

erated polycrystal microstructure is demonstrated in Fig. 7.

In Fig. 8, the polycrystal plasticity model trained by a neu-

ral network is used to replace the FFT solver that provides

the constitutive updates from DNS simulations at the sub-

scale level. The simulation is performed on a square plate

with a circular hole supported on frictionless rollers on the

top and bottom surfaces. Results shown in Fig. 8 indicate that the NN-FEM model is capable of replacing the computationally heavy FFT-FEM simulations (cf. [36]) at a fraction of the cost. In this offline multiscale problem, the finite element mesh contains 960 elements with 2880 integration points.

An FFT-FEM framework may take an average of 11110 seconds (approximately 3.85 seconds per integration point) to complete the incremental constitutive updates for all integration points, whereas the neural network counterpart requires an average of 230 seconds (approximately 0.08 seconds per integration point) to finish the same task on a MacBook Pro with an 8-core CPU. As for the overhead cost to generate the

training data from the FFT polycrystal simulations, the time

to generate the training data set for the polycrystal yield func-

tion (157500 yield function sample points) is approximately

5 hours.
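As a sanity check, the reported per-point costs reproduce the quoted totals and imply roughly a 48x speedup:

```python
# Reported figures: 2880 integration points, ~3.85 s/point for FFT-FEM,
# ~0.08 s/point for the neural network surrogate.
n_points = 2880
fft_total = n_points * 3.85       # ~11088 s, matching the reported 11110 s
nn_total = n_points * 0.08        # ~230 s, matching the reported 230 s
print(round(fft_total), round(nn_total), round(fft_total / nn_total))
```

The small gap between 11088 s and the reported 11110 s is consistent with the per-point figure being rounded.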

4 Discussion

The proposed algorithm provides a general approach to discovering complex yield surface shapes and their evolution directly from data. In the results section of this work, all yield functions and hardening mechanisms are predicted by neural networks without any intervention by the modeler


Fig. 8. The discovered yield function can be readily implemented in

FEM simulations, replacing the FFT solver. The accumulated plas-

tic strain proﬁle for an FEM simulation and the predicted stress re-

sponses at different points of the domain against the FFT benchmark

simulations are also shown. The stress measure is in kPa.

in the form of hand-crafted derivations. All models in this work, whether models from the plasticity literature or models designed for

new materials, followed identical data pre-processing, neural

network training, and return mapping implementation proce-

dures.

Our neural network yield functions provide a unique ad-

vantage in crafting interpretable data-driven plasticity mod-

els. The capacity to predict and visualize the entire yield

locus at every time step of an elastoplastic simulation al-

lows for the anticipation of elastic or plastic responses and

the inspection of thermodynamic consistency (e.g., convexity). In particular, by adopting a lower-dimensional stress representation (Lode's coordinates), not only is the model complexity reduced but a transparent yield surface data sampling scheme also becomes possible. The alternative of random

sampling of strain paths comes with the uncertainty of sufﬁ-

ciently visiting the yield surface in the entire stress space.


4.1 Physics underpinning for the partition of elastic and

plastic strain

Decomposing the elastoplastic behavior prediction into

two simple feed-forward neural networks – a hyperelastic en-

ergy functional and a yield function – is central to the al-

gorithm’s interpretability and allows for a clear-cut distinc-

tion of elastic and plastic behavior. This is not necessarily

true with the classical recurrent network approach, such as

the common LSTM or GRU [16,37]. When training neu-

ral networks with these architectures, the elastic and plastic

constitutive responses are often indistinguishable. This treat-

ment not only causes issues with interpretability but also renders the black-box predictions vulnerable to erroneous causality or correlation structures. For instance, experimental data of the nonlinear elastic response may inadvertently affect the predicted yielding response, as there is no explicit mechanism to distinguish the two. In contrast, the models trained on

monotonic loading data can readily predict non-monotonic

constitutive responses due to the explicitly deﬁned elastic

range whereas the black-box alternative cannot (see Fig. 2).

Furthermore, the recurrent network’s dependency on the in-

put strain rate, the importance of the sampling frequencies in

the time domain, and the more difﬁcult training due to the

vanishing or exploding gradients [38] are rarely addressed in the ma-

chine learning plasticity literature.

Note that the machine learning algorithm proposed here

does not exhibit better interpretability than the hand-crafted counterpart, but it is easier to interpret than the RNN and multi-step ANN approaches that do not provide a definite distinction between the elastic region and yielding. An ex-

ception is the recent work by Xu et al. [39], in which

the neural network partitioned the total strain into elastic and

plastic components via a partition-of-unity function. Never-

theless, when a continuous weighting function (such as a sigmoid function) is used to partition the elastic and plastic strain, it may introduce a transition zone where the material is considered both path-independent and path-dependent.

4.2 Representation of parametric space and geometri-

cal interpretation of elastoplasticity models

Another advantage of the interpretable machine learning

approach is that the geometrical interpretation is helpful for

determining the optimal data exploration strategies. Given

the fact that both real experiments and direct numerical simula-

tions are often costly to conduct, a Monte Carlo simulation

to randomly sample the parametric space for path-dependent

materials is too costly to be feasible [40]. By introducing

the level set to deﬁne the yielding criterion, however, we can

conceptualize the elastic range as a multi-dimensional ob-

ject in a Euclidean space. This feature may help us visualize the abstract concept of yielding in a Euclidean space, estimate the sufficiency of the data by defining a proper metric in the parametric vector space, and decide on a distribution of data that better captures important features, such as replicating sharp gradients, determining convexity by checking the Hessian, and ensuring connectivity of the learned models. These tasks are not necessarily impossible, but they are difficult to achieve with a black-box model.
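The convexity check mentioned above can be sketched as a positive semi-definiteness test on Hessians sampled from the learned yield function; the tolerance is an illustrative choice:

```python
import numpy as np

def is_locally_convex(hessians, tol=-1e-8):
    """Check local convexity of a learned yield function by testing whether
    each sampled Hessian is positive semi-definite (all eigenvalues
    non-negative up to a small tolerance)."""
    eigs = np.linalg.eigvalsh(np.asarray(hessians))
    return bool(np.all(eigs >= tol))

# A convex sample (identity Hessians) passes; an indefinite one fails.
print(is_locally_convex([np.eye(3), 2.0 * np.eye(3)]))      # True
print(is_locally_convex([np.diag([1.0, -0.5, 2.0])]))       # False
```

In practice, the Hessians would come from automatic differentiation of the trained network at sampled stress points rather than being supplied by hand.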

4.3 Smoothness of the machine learning plasticity

model

Training the neural networks of this work with activation functions of a higher degree of continuity and higher-order Sobolev loss function constraints allows one to control the prediction accuracy of the derivatives of the approximated functionals. This control of the stress gradient of the yield function is crucial, and the automatic differentiation used in back-propagation can help us generate sufficiently smooth elastoplastic tangent operators suitable for PDE solvers. On the other hand, classical black-box neural

network elastoplasticity approaches usually do not control

the quality of the derivatives of the trained functions. While

ﬁnite difference methods can be used to approximate the tan-

gent tensor obtained from a neural network without Sobolev

training if necessary [41], the smoothness and accuracy of

the approximated tangent cannot be guaranteed. Further-

more, the Sobolev training and higher-order activation func-

tions allow controlling the smoothness and continuity of the

yield surface. This can be a more efﬁcient alternative to the

current practice where a plasticity model with a non-smooth

yield surface either requires a specific algorithmic treatment to generate incremental constitutive updates [42] or must be modified manually into a smoothed version to bypass the numerical barrier [43, 44].

In principle, the approach may generate a sufﬁciently

smooth yield surface in parametric space of different dimen-

sions (e.g. principal stress space, strain space, porosity-stress

space). However, if the yield surface is non-smooth for phys-

ical reasons, then (1) speciﬁc supervised learning algorithms

that detect the singular point and (2) the corresponding spe-

ciﬁc treatment to handle the bifurcated stress gradient of the

yield surface are both necessary. Furthermore, unlike the

classical hand-crafted model or models generated from ge-

ometric learning (see [45]) that are designed for an entire

class of materials of similar but distinctive microstructures,

the proposed algorithm is designed to generate a surrogate

model speciﬁcally tailored for one RVE or specimen.

4.4 Comparison with parameter identiﬁcation of pre-

determined models

Note that, while both parameter identiﬁcation and su-

pervised machine learning involve solving inverse problems

and, in many cases, multi-objective optimization, the pro-

posed approach does not assume speciﬁc forms of equations

a priori for the hyperelasticity energy functional and yield

function. With a sufficiently expressive neural network architecture, the

neural network approach may offer more ﬂexibility in ﬁnd-

ing the optimal forms of equations (see universal approxi-

mation theorem [46] ). However, this ﬂexibility comes at the

expense of having to deal with the Banach space (cf. Parhi

and Nowak [47] and Weinan and Wojtowytsch [48]) of much

higher dimensions (of the neural network learned function)

than the Euclidean space for a typical parameter identiﬁca-

tion problem.


A similar analogy can be drawn between nonparamet-

ric/symbolic regression and polynomial regression where

the lack of predetermined form of the former approach of-

fers greater ﬂexibility but also increases the difﬁculty of the

inverse problem. As demonstrated in the previous work

(cf. Wang et al. [49]), even in the case where the inverse

problem is merely used to determine the optimal set of

choices among a handful of pre-determined components of

the elasto-plasticity model, the additional effort and cost to

solve the combinatorial optimization on top of the CPU time

required for the parameter identification process can be enormous.

This complexity motivates us to propose this alternative

paradigm that enables us to learn the elasto-plasticity problem in a divide-and-conquer manner, i.e., (1) learning the

elasticity ﬁrst, (2) then the initial yield function and (3) the

hardening/softening rules that evolve the yield function, all

with multilayer perceptrons. In the cases we demonstrated

here, there is no need to use recurrent neural networks that

are more difficult to train and regularize well than

the simpler multilayer perceptrons [50]. In the future, we

may explore proper ways to generate more complex rules for

the yield function evolution with recurrent neural networks,

but this is out of the scope of this study.

5 Conclusion

We propose a generalized machine learning paradigm

capable of generating pressure-sensitive and rate-dependent

plasticity models consisting of interpretable compo-

nents. The component approach enables geometrical inter-

pretation of the hyperelastic energy and yield function in

the corresponding stress and strain spaces. This treatment

allows us to examine thermodynamic constraints through geometrical interpretation (e.g., convexity) and provides a higher

degree of modularity and simulatability required to interpret

mechanisms of plastic deformation. In the numerical exper-

iments presented in this paper, we ﬁrst verify the capacity of

the paradigm to recover existing plasticity models with the

corresponding data. Then we provide additional examples

to show that the revised Hamilton-Jacobi solution formu-

lated for rate-dependent plasticity may generate models from

experimental data for steel. Finally, the machine learning

paradigm is used to generate macroscopic elasto-plasticity

surrogate model from FFT simulations of a polycrystal consisting of FCC grains. The resultant macroscopic surrogate

model is tested against FFT direct numerical simulation at

the Gauss point. The results of the numerical experiments show that the generated models are not only able to recover existing plasticity laws but are also capable of deducing new ones, with a reason-

able level of predictive and descriptive accuracy for the given

amount of data. This interpretability is necessary for ensur-

ing trustworthiness for engineering applications.

Acknowledgements

The authors would like to thank Dr. Ran Ma for pro-

viding the implementation of the polycrystal microstructure

generation and the FFT solver. The authors are supported by

the NSF CAREER grant from Mechanics of Materials and

Structures program at National Science Foundation under

grant contracts CMMI-1846875 and OAC-1940203, the Dy-

namic Materials and Interactions Program from the Air Force

Ofﬁce of Scientiﬁc Research under grant contracts FA9550-

17-1-0169 and FA9550-19-1-0318. This support is gratefully acknowledged. The views and conclusions contained

in this document are those of the authors, and should not

be interpreted as representing the ofﬁcial policies, either ex-

pressed or implied, of the sponsors, including the Army Re-

search Laboratory or the U.S. Government. The U.S. Gov-

ernment is authorized to reproduce and distribute reprints for

Government purposes notwithstanding any copyright nota-

tion herein.

References

[1] Cheng, Z., Zhou, H., Lu, Q., Gao, H., and Lu, L., 2018.

“Extra strengthening and work hardening in gradient

nanotwinned metals”. Science, 362(6414).

[2] Van der Giessen, E., and Needleman, A., 1995. “Dis-

crete dislocation plasticity: a simple planar model”.

Modelling and Simulation in Materials Science and En-

gineering, 3(5), p. 689.

[3] Aydin, A., Borja, R. I., and Eichhubl, P., 2006. “Geo-

logical and mathematical framework for failure modes

in granular rock”. Journal of Structural Geology, 28(1),

pp. 83–98.

[4] Gurson, A. L., 1977. “Continuum Theory of Duc-

tile Rupture by Void Nucleation and Growth: Part I—

Yield Criteria and Flow Rules for Porous Ductile Me-

dia”. Journal of Engineering Materials and Technol-

ogy, 99(1), 01, pp. 2–15.

[5] Martin, C., Bouvard, D., and Shima, S., 2003. “Study

of particle rearrangement during powder compaction

by the discrete element method”. Journal of the Me-

chanics and Physics of Solids, 51(4), pp. 667–693.

[6] Coulomb, C. A., 1773. Essai sur une application des règles de maximis et minimis à quelques problèmes de statique relatifs à l'architecture. Tech. rep., Mém. Div. Sav. Acad.

[7] Mises, R. v., 1913. “Mechanik der festen Körper im plastisch-deformablen Zustand”. Nachrichten von der Gesellschaft der Wissenschaften zu Göttingen, Mathematisch-Physikalische Klasse, 1913, pp. 582–592.

[8] Drucker, D. C., and Prager, W., 1952. “Soil mechan-

ics and plastic analysis or limit design”. Quarterly of

applied mathematics, 10(2), pp. 157–165.

[9] Sun, X., Bahmani, B., Vlassis, N. N., Sun, W., and

Xu, Y., 2021. “Data-driven discovery of interpretable

causal relations for deep learning material laws with

uncertainty propagation”. Granular Matter, doi: 10.1007/s10035-021-01137-y.

[10] Ehlers, W., and Scholz, B., 2007. “An inverse algorithm

for the identiﬁcation and the sensitivity analysis of the

parameters governing micropolar elasto-plastic granu-


lar material”. Archive of Applied Mechanics, 77(12),

pp. 911–931.

[11] Wang, K., Sun, W., Salager, S., Na, S., and Khaddour,

G., 2016. “Identifying material parameters for a micro-

polar plasticity model via x-ray micro-computed tomo-

graphic (ct) images: lessons learned from the curve-

ﬁtting exercises”. International Journal for Multiscale

Computational Engineering, 14(4).

[12] Jang, J., and Smyth, A. W., 2017. “Model updating of a

full-scale fe model with nonlinear constraint equations

and sensitivity-based cluster analysis for updating pa-

rameters”. Mechanical Systems and Signal Processing,

83, pp. 337–355.

[13] Ghaboussi, J., Pecknold, D. A., Zhang, M., and Haj-

Ali, R. M., 1998. “Autoprogressive training of neu-

ral network constitutive models”. International Journal

for Numerical Methods in Engineering, 42(1), pp. 105–

126.

[14] Ghaboussi, J., Garrett Jr, J., and Wu, X., 1991.

“Knowledge-based modeling of material behavior with

neural networks”. Journal of engineering mechanics,

117(1), pp. 132–153.

[15] Leﬁk, M., and Schreﬂer, B. A., 2003. “Artiﬁcial neu-

ral network as an incremental non-linear constitutive

model for a ﬁnite element code”. Computer meth-

ods in applied mechanics and engineering, 192(28-30),

pp. 3265–3283.

[16] Mozaffar, M., Bostanabad, R., Chen, W., Ehmann, K.,

Cao, J., and Bessa, M., 2019. “Deep learning predicts

path-dependent plasticity”. Proceedings of the National

Academy of Sciences, 116(52), pp. 26414–26420.

[17] Wang, K., and Sun, W., 2018. “A multiscale multi-

permeability poroplasticity model linked by recur-

sive homogenizations and deep learning”. Computer

Methods in Applied Mechanics and Engineering, 334,

pp. 337–380.

[18] Chi, H., Zhang, Y., Tang, T. L. E., Mirabella, L., Dal-

loro, L., Song, L., and Paulino, G. H., 2021. “Univer-

sal machine learning for topology optimization”. Com-

puter Methods in Applied Mechanics and Engineering,

375, p. 112739.

[19] Baker, N., Alexander, F., Bremer, T., Hagberg, A.,

Kevrekidis, Y., Najm, H., Parashar, M., Patra, A.,

Sethian, J., Wild, S., et al., 2019. Workshop report

on basic research needs for scientiﬁc machine learn-

ing: Core technologies for artiﬁcial intelligence. Tech.

rep., USDOE Ofﬁce of Science (SC), Washington, DC

(United States).

[20] Murdoch, W. J., Singh, C., Kumbier, K., Abbasi-Asl,

R., and Yu, B., 2019. “Deﬁnitions, methods, and appli-

cations in interpretable machine learning”. Proceed-

ings of the National Academy of Sciences, 116(44),

pp. 22071–22080.

[21] Liu, Y., Sun, W., Yuan, Z., and Fish, J., 2016. “A nonlo-

cal multiscale discrete-continuum model for predicting

mechanical behavior of granular materials”. Interna-

tional Journal for Numerical Methods in Engineering,

106(2), pp. 129–160.

[22] Wang, K., and Sun, W., 2016. “A semi-implicit

discrete-continuum coupling method for porous media

based on the effective stress principle at ﬁnite strain”.

Computer Methods in Applied Mechanics and Engi-

neering, 304, pp. 546–583.

[23] Feyel, F., and Chaboche, J.-L., 2000. “Fe2 multi-

scale approach for modelling the elastoviscoplastic be-

haviour of long ﬁbre sic/ti composite materials”. Com-

puter methods in applied mechanics and engineering,

183(3-4), pp. 309–330.

[24] Hartmaier, A., Buehler, M. J., and Gao, H., 2005.

“Multiscale modeling of deformation in polycrystalline

thin metal ﬁlms on substrates”. Advanced Engineering

Materials, 7(3), pp. 165–169.

[25] Vlassis, N. N., and Sun, W., 2021. “Sobolev training

of thermodynamic-informed neural networks for inter-

pretable elasto-plasticity models with level set harden-

ing”. Computer Methods in Applied Mechanics and En-

gineering, 377, p. 113695.

[26] Molnar, C., Casalicchio, G., and Bischl, B., 2018. “iml:

An r package for interpretable machine learning”. Jour-

nal of Open Source Software, 3(26), p. 786.

[27] Coombs, W. M., and Motlagh, Y. G., 2017. “Nurbs

plasticity: yield surface evolution and implicit stress in-

tegration for isotropic hardening”. Computer Methods

in Applied Mechanics and Engineering, 324, pp. 204–

220.

[28] Borja, R. I., 2013. Plasticity: modeling & computation.

Springer Science & Business Media.

[29] Bishop, C. M., et al., 1995. Neural networks for pattern

recognition. Oxford university press.

[30] Czarnecki, W. M., Osindero, S., Jaderberg, M., Świrszcz, G., and Pascanu, R., 2017. “Sobolev training for neural networks”. In Advances in Neural Information Processing Systems.

[31] Roscoe, K. H., and Burland, J. B., 1968. “On the generalized stress-strain behaviour of ‘wet’ clay”. In Engineering Plasticity (papers for a conference held in Cambridge, Mar. 1968), Cambridge University Press, pp. 535–609.

[32] Argyris, J., Faust, G., Szimmat, J., Warnke, E., and

Willam, K., 1974. “Recent developments in the ﬁnite

element analysis of prestressed concrete reactor ves-

sels”. Nuclear Engineering and Design, 28(1), pp. 42–

75.

[33] Chollet, F., et al., 2015. Keras. https://keras.io.

[34] Cowper, G. R., and Symonds, P. S., 1957. Strain-

hardening and strain-rate effects in the impact loading

of cantilever beams. Tech. rep., Brown Univ Provi-

dence Ri.

[35] Dozat, T., 2016. Incorporating Nesterov momentum into Adam.

[36] Kochmann, J., Wulﬁnghoff, S., Reese, S., Mianroodi,

J. R., and Svendsen, B., 2016. “Two-scale fe-fft-and

phase-ﬁeld-based computational modeling of bulk mi-

crostructural evolution and macroscopic material be-

havior”. Computer Methods in Applied Mechanics and

Engineering, 305, pp. 89–110.


[37] Fuchs, A., Heider, Y., Wang, K., Sun, W., and Kaliske,

M., 2021. “Dnn2: A hyper-parameter reinforcement

learning game for self-design of neural network based

elasto-plastic constitutive descriptions”. Computers &

Structures, 249, p. 106505.

[38] Pascanu, R., Mikolov, T., and Bengio, Y., 2013. “On

the difﬁculty of training recurrent neural networks”. In

International conference on machine learning, PMLR,

pp. 1310–1318.

[39] Xu, K., Huang, D. Z., and Darve, E., 2021. “Learning

constitutive relations using symmetric positive deﬁnite

neural networks”. Journal of Computational Physics,

428, p. 110072.

[40] Giunta, A., Wojtkiewicz, S., and Eldred, M., 2003.

“Overview of modern design of experiments methods

for computational simulations”. In 41st Aerospace Sci-

ences Meeting and Exhibit, p. 649.

[41] Hashash, Y., Jung, S., and Ghaboussi, J., 2004. “Nu-

merical implementation of a neural network based ma-

terial model in ﬁnite element analysis”. International

Journal for numerical methods in engineering, 59(7),

pp. 989–1005.

[42] de Souza Neto, E. A., Peric, D., and Owen, D. R., 2011.

Computational methods for plasticity: theory and ap-

plications. John Wiley & Sons.

[43] Abbo, A., and Sloan, S., 1995. “A smooth hyperbolic

approximation to the Mohr-Coulomb yield criterion”.

Computers & structures, 54(3), pp. 427–441.

[44] Matsuoka, H., and Nakai, T., 1974. “Stress-

deformation and strength characteristics of soil under

three different principal stresses”. In Proceedings of

the Japan Society of Civil Engineers, no. 232, Japan

Society of Civil Engineers, pp. 59–70.

[45] Vlassis, N. N., Ma, R., and Sun, W., 2020. “Geomet-

ric deep learning for computational mechanics part i:

Anisotropic hyperelasticity”. Computer Methods in Ap-

plied Mechanics and Engineering, 371, p. 113299.

[46] Scarselli, F., and Tsoi, A. C., 1998. “Universal approxi-

mation using feedforward neural networks: A survey of

some existing methods, and some new results”. Neural

networks, 11(1), pp. 15–37.

[47] Parhi, R., and Nowak, R. D., 2020. “Banach space

representer theorems for neural networks and ridge

splines”. arXiv preprint arXiv:2006.05626.

[48] Weinan, E., and Wojtowytsch, S., 2020. “On the

banach spaces associated with multi-layer relu net-

works: Function representation, approximation the-

ory and gradient descent dynamics”. arXiv preprint

arXiv:2007.15623.

[49] Wang, K., Sun, W., and Du, Q., 2019. “A cooperative

game for automated learning of elasto-plasticity knowl-

edge graphs and models with ai-guided experimenta-

tion”. Computational Mechanics, 64(2), pp. 467–499.

[50] Zaremba, W., Sutskever, I., and Vinyals, O., 2014. “Re-

current neural network regularization”. arXiv preprint

arXiv:1409.2329.
