Computer Methods in Applied Mechanics and Engineering manuscript No.
(will be inserted by the editor)
A neural kernel method for capturing multiscale high-dimensional
micromorphic plasticity of materials with internal structures
Zeyu Xiong ·Mian Xiao ·Nikolaos Vlassis ·WaiChing Sun
Received: July 10, 2023
Corresponding author: WaiChing Sun
Associate Professor, Department of Civil Engineering and Engineering Mechanics, Columbia University, 614 SW Mudd, Mail Code: 4709, New York, NY 10027. Tel.: 212-854-3143, Fax: 212-854-6267, E-mail: wsun@columbia.edu
Abstract This paper introduces a neural kernel method to generate machine learning plasticity models for
micropolar and micromorphic materials that lack material symmetry and have internal structures. Since
these complex materials often require higher-dimensional parametric space to be precisely characterized,
we introduce a representation learning step where we first learn a feature vector space isomorphic to a
finite-dimensional subspace of the original parametric function space from the augmented labeled data
expanded from the narrow band of the yield data. This approach simplifies the data augmentation step
and enables us to constitute the high-dimensional yield surface in a feature space spanned by the feature
kernels. In the numerical examples, we first verified the implementations with data generated from known
models, then tested the capacity of the models to discover feature spaces from meso-scale simulation data
generated from representative elementary volume (RVE) of heterogeneous materials with internal struc-
tures. The neural kernel plasticity model and other alternative machine learning approaches are compared
via a computational homogenization problem for layered geomaterials. The results indicate that the neural
kernel feature space may lead to more robust forward predictions against sparse and high-dimensional
data.
Keywords neural kernel, micropolar continua, micromorphic continua, level set plasticity
1 Introduction
When deriving models to predict path-dependent responses of materials that lack material symmetry
or exhibit complex size-dependent behaviors, the smallest number of variables (e.g., stress measures, stress
invariants, internal variables) required to replicate constitutive responses increases, and the dimension of
the parametric space in which the model is formulated also becomes higher. This increase in variables often
leads to the increase of material parameters due to the need for additional support to control the geometry
in the high-dimensional space.
As such, modelers must decide on a trade-off between simplicity and sophistication (Dafalias, 1984). Increasing the number of material parameters is often undesirable or only used as a last resort to capture the phenomenology precisely (Wang et al., 2016). This preference for simpler models is not limited to constitutive theories of solids and has a long history in science (p. 398, Newton (1687)) and philosophy (van Orman Quine, 1963; Howard et al., 2019; Kant, 1908). In the early stages of the development of plasticity theories, experimental data were relatively limited in quantity and lacked the precision afforded by state-of-the-art instruments (Herle and Gudehus, 1999; Gudehus et al., 2008; Dafalias, 1984). Hence, assumptions
on material symmetry and limitations on the number of variables used to describe the deformation mechanisms became common strategies that maximize what Dafalias (1984) described as the trade-off between simplicity and sophistication.
1.1 Previous approaches for modeling low-symmetry, higher-order, and multiphysics coupling plasticities
Increasing the precision of a material model may unavoidably increase the minimum number of material parameters necessary to describe the material behaviors. As such, many simpler models that involve more hypotheses often become the blueprint of more elaborate models of higher dimensions. This approach is often adopted by plasticity models in, for instance, the following scenarios.
1. There is a need to capture additional causal mechanisms that are not describable with the existing variables (Sun et al., 2022). For instance, the anisotropy and non-coaxiality induced by fabric evolution require incorporating the fabric tensors into the constitutive laws (Dafalias and Manzari, 2004; Zhao et al., 2005; Dafalias et al., 2006).
2. There is a need to enrich the description of a model such that the model can be further generalized for broader applications. For instance, the Drucker-Prager model can be viewed as a generalization of the von Mises model obtained by introducing the dependence of the yielding behaviors on the pressure (position of the hydrostatic axis in the principal stress space). de Borst (1993) introduces micropolar constitutive models for geomaterials by introducing additional couple stress terms and a length scale parameter into a Drucker-Prager model. Multi-physics constitutive models, such as Wheeler et al. (2002), also adopt an extension strategy where a pure solid mechanics model is enhanced by introducing additional variables such as temperature, degree of saturation, and volume fraction of voids that are necessary to capture the multiphysical coupling (Wheeler et al., 2002; Na and Sun, 2017; Yin et al., 2022; Kebria et al., 2022; Ma and Sun, 2020).
3. Finally, there have also been cases in which a model is amended to circumvent known limitations. A classic example is the usage of micropolar theory to circumvent the pathological mesh dependence of plasticity models in the softening regime (e.g., Manzari, 2004; Lin et al., 2015; Dietsche et al., 1993).
Nevertheless, this incremental strategy to amend models exhibits disadvantages. First of all, obtaining plasticity data from either representative elementary volume simulations or experiments could be costly. As such, a lack of foresight about the model dimensions may lead to a biased data acquisition strategy where the data might not be distributed well enough to calibrate and test the learned model. For models expressed in a higher-dimensional space (e.g., anisotropic elasticity models that require not just the strain invariants but also the principal directions, micropolar and micromorphic models that require higher-order terms to fully describe the kinematics), data may appear to be sparser and hence more data points are required to characterize behaviors of comparable complexity (Donoho et al., 2000).
1.2 Previous efforts in machine learning plasticity modeling
Machine learning approaches with advanced architectures, such as recurrent neural networks for sequential learning (Gorji et al., 2020; Bonatti et al., 2022; Heider et al., 2020; Wang and Sun, 2018), 1D convolutional neural networks (Vlassis and Sun, 2021), and transformers with attention mechanisms (Vaswani et al., 2017), can, in theory, be trained to replicate these high-dimensional constitutive responses. However, vanishing/exploding gradients, the increased computational cost, and the increased demand for data can all become bottlenecks of these approaches. Furthermore, the higher dimensionality of the model also makes the learned model more vulnerable to overfitting (incapable of generalized prediction) (Tibshirani, 1996) and the results more difficult to interpret properly (He and Chen, 2022; Sussillo and Barak, 2013; Vlassis and Sun, 2021).
On the other hand, there have been multiple attempts to represent plastic yield surfaces via parametrized surfaces or implicit functions. Vlassis and Sun (2021), for instance, introduce the usage of the signed distance function to implicitly represent the yield surface in the stress-internal-variable space. The additional signed distance property enables the plastic flow to have a unit gradient, which simplifies the calculation of the plastic multiplier, and allows one to incorporate the plastic flow direction into the Sobolev training of the yield function. The resultant yield function is then parametrized via an MLP architecture. Meanwhile, Coombs et al. (2016); Coombs and Motlagh (2017), and Coombs and Motlagh (2018) leverage the flexibility afforded by non-uniform rational B-splines to parametrize the yield function. Xiao and Sun (2022), on the other hand, directly parametrize the yield surface as an atlas of coordinate charts.
While the component/modular-based learning approach (Vlassis and Sun,2022;Fuhg et al.,2023) en-
ables one to reformulate the plasticity learning problem without enforcing the memory effect through the
architecture, the challenges of training and interpreting those models remain. Another less explored route to model plasticity is to use the kernel method, in which a nonlinear transformation maps the material data into a higher-dimensional feature space to make it easier to perform clustering (e.g., unsupervised learning) or insert hyperplanes (e.g., classification). The advantage of the kernel method is that it is generally more robust (due to the non-parametric nature and the implicit feature mapping without any explicit feature engineering) and less sensitive to outliers (Williams et al., 2022). More importantly, the construction of a higher-dimensional feature space also makes it possible to interpret the relations between the features and the learned plasticity models. However, a key technical barrier of the classical kernel method is that its performance is highly sensitive to the specific kernel function (e.g., polynomials, radial basis functions) used to generate the feature space. As different data sets may require different kernel functions to achieve good performance, selecting the kernel function becomes a time-consuming trial-and-error process.
1.3 Neural kernel method for high-dimensional plasticity
This paper introduces a neural kernel approach to generate micropolar and micromorphic plasticity models. Here we want to leverage the robustness and interpretability of the kernel method afforded by the feature space, while generating a data-dependent kernel tailored to our specific need to capture the high-dimensional constitutive responses of materials with complex internal structures. This treatment provides us with a unified data-driven approach to recognize the pattern of the data (through the data-dependent kernel), regardless of the data dimensions. Combined with a simple kernel ridge regression, we may generate a yield function of arbitrary input space dimension without explicitly handcrafting the feature space. This trait is important for us to automate the process of generating yield surfaces from data of dimensions higher than 3, especially when data is sparse.
1.4 Organization of the rest of the paper
The organization of the rest of the paper is as follows. For completeness, we first review the plastic-
ity theory of micropolar and micromorphic materials and the related Hill-Mandel lemma necessary for
generating the multiscale data for computational homogenization. This review explains the difficulties in
formulating yield surfaces in a high-dimensional stress space (Section 2). We then introduce the neural
kernel method formulated for the high-dimensional space. In particular, we explain both the theory of the
neural kernel method, as well as the strategy we adopted to train both the data-dependent neural network
kernels and obtain the coefficients used to interpolate the yield function in the high-dimensional feature
space. The return mapping algorithm that adopts the neural kernel yield function is also included (Section
3). This is followed by a collection of representative numerical examples in which we provide verifications
and demonstrate how the proposed approach can be applied to complex data obtained from direct numer-
ical simulations of layered pressure-sensitive materials (Section 4). We then summarize our major findings
in Section 5. Additional numerical tests and the detailed procedure necessary for third-party inspection
and validations are provided in the appendix.
2 Constitutive framework for higher-order continua
For completeness, we review the theory of micropolar and micromorphic continua (i.e., higher-order
continua). Cosserat and Cosserat (1909) are credited with introducing the first higher-order continuum theory, in which the concept of micro-rotation is introduced to describe the effects of internal structures on the constitutive responses. Different variations have since been proposed, for instance, the couple stress theory (Toupin, 1962, 1964; Mindlin et al., 1962; Green and Rivlin, 1964), which derives the energy density as a function of both the strain and the curl of the strain. The generalized micromorphic continuum theory, which introduces the concepts of micro-deformation and the corresponding energy-conjugate stress measures as a generalization of the kinematics, was introduced in the 1960s (Mindlin, 1963, 1965; Mindlin and Eshel, 1968; Eringen, 1968).
The extension of the higher-order continua theory to the finite deformation range has been formulated by
Toupin (1964) where an action density is derived such that it is invariant under the group of Euclidean dis-
placements in a Hamiltonian mechanics framework. Mühlhaus and Vardoulakis (1987) and Peerlings et al.
(2001) further extended the higher-order theories for elastoplasticity problems and examined the regular-
ization effect of higher-order continuum theories. Steinmann (1994) extends the multiplicative kinematics
theory to formulate micropolar elastoplasticity in the geometric nonlinear regime. Recently, the multiscale
micropolar (Larsson and Diebels, 2007) and micromorphic (Jänicke and Diebels, 2010) constitutive mod-
eling has been derived in the finite deformation regime, while Neff et al. (2014) have introduced a linear
relaxed version of micromorphic models to capture the wave propagation in meta-materials. A compre-
hensive review of connections between the higher-order and non-local continuum theories can be found
in Bažant and Jirásek (2002).
For simplicity, we restrict our learned models to be within the infinitesimal deformation regime. We
then introduce machine learning to generate three classes of plasticity models based on Cauchy, microp-
olar, and micromorphic continuum theories. The micro-deformation χ_ij describes the configuration of the directors, i.e., the internal micro-structures (e.g., voids and inclusions) contained in the material point sampled at an arbitrary position x, as shown in Figure 1. The material point of an effective medium is also called the representative volume element (RVE). For a Cauchy continuum, the directors are much smaller than the RVE, so the kinematic configuration of the directors χ_ij mechanically affects the RVE much less than the strain tensor,

ε_ij = (u_{i,j} + u_{j,i})/2,   (1)

such that χ_ij can be neglected. In comparison, the micromorphic theory considers that the size effect of the deformable directors is not negligible. As such, χ_ij must be considered to describe the distortion of the RVE internal structure. Micropolar continua are a sub-class of micromorphic continua in which the directors can be assumed to be rigid (e.g., rigid inclusions) such that

χ_ij = ϵ_ijk θ_k,   (2)

where θ_k describes the rigid rotation and ϵ_ijk is the Levi-Civita permutation symbol. Based on the kinematic configuration of higher-order continua described by u_i and χ_ij, the boundary value problems can be solved given the constitutive laws of micropolar and micromorphic continua, which are reviewed in Section 2.1, followed by the RVE homogenization scheme that models the constitutive law by multiscale simulation (Section 2.2) and data-driven approaches to learn the constitutive law (Section 2.3).
2.1 Boundary value problems
To solve the boundary value problems (BVP) of higher-order continua for the kinematic configuration u_i and χ_ij, we need to consider the kinematic relation, the balance of linear and angular momentum, and the constitutive law, as summarized in the upper half of Table 1 (Schröder et al., 2022). The kinematic relation defines the strain tensor as the difference between the displacement gradient u_{i,j} and the micro-deformation χ_ij or micro-rotation ϵ_ijk θ_k; the gradient of micro-deformation G_ijk or the gradient of micro-rotation κ_ij (i.e., the curvature tensor) is also introduced to describe the higher-order deformation. The balance laws of linear and angular momentum state the relationship between the Cauchy stress σ_ji and the higher-order couple stress m_ji or generalized stress ζ_ijk, which are computed given ε_ij, κ_ij, and G_ijk based on the constitutive law.
To complete the solution of the BVP, the elastoplastic constitutive law in the lower half of Table 1 needs to be found to model the relation between the kinematic modes and the higher-order stresses, i.e., between ε_ij and σ_ji, between κ_ij and m_ji, and between G_ijk and ζ_ijk. The constitutive law consists of the elasticity model, yield function, KKT condition, and plastic flow rule, as summarized in Table 1. The elasticity model provides the elastic energy functional W such that the elastic stresses are work conjugates of the kinematic modes.
Fig. 1: A microscopic representative volume element (RVE) is attached to the macroscopic material point
with position vector X; the local coordinate of the RVE is denoted by Y. The directors in the RVE describe
the internal micro-structure (e.g., voids or inclusions), and the deformed configuration of the directors
reflects the distortion of the internal structure.
Table 1: The summary of BVP components of the micropolar and micromorphic materials.

Kinematic relation: micropolar ε_ij = u_{i,j} + ϵ_ijk θ_k, κ_ij = θ_{i,j}; micromorphic ε_ij = u_{i,j} − χ_ij, G_ijk = χ_{ij,k}
Balance of linear momentum: micropolar σ_{ji,j} = 0; micromorphic σ_{ji,j} = 0
Balance of angular momentum: micropolar (1/2) ϵ_ijk σ_ji − m_{jk,j} = 0; micromorphic σ_ji + ζ_{ijk,k} = 0
Constitutive law: micropolar (ε_ij, κ_ij) → (σ_ji, m_ji); micromorphic (ε_ij, G_ijk) → (σ_ji, ζ_ijk)
Elastoplastic deformation: micropolar ε_ij = ε^e_ij + ε^p_ij, κ_ij = κ^e_ij + κ^p_ij; micromorphic ε_ij = ε^e_ij + ε^p_ij, G_ijk = G^e_ijk + G^p_ijk
Elasticity: micropolar σ_ji = ∂W(ε^e_ij, κ^e_ij)/∂ε^e_ij, m_ji = ∂W(ε^e_ij, κ^e_ij)/∂κ^e_ij; micromorphic σ_ji = ∂W(ε^e_ij, G^e_ijk)/∂ε^e_ij, ζ_ijk = ∂W(ε^e_ij, G^e_ijk)/∂G^e_ijk
Yield function: micropolar f(σ_ji, m_ji) ≤ 0; micromorphic f(σ_ji, ζ_ijk) ≤ 0
KKT condition: micropolar Λ̇ f(σ_ji, m_ji) = 0 (Λ̇ ≥ 0, f ≤ 0); micromorphic Λ̇ f(σ_ji, ζ_ijk) = 0 (Λ̇ ≥ 0, f ≤ 0)
Flow rule: micropolar ε̇^p_ij = Λ̇ ∂f/∂σ_ji, κ̇^p_ij = Λ̇ ∂f/∂m_ji; micromorphic ε̇^p_ij = Λ̇ ∂f/∂σ_ji, Ġ^p_ijk = Λ̇ ∂f/∂ζ_ijk
The core component of the plasticity model (Borja, 2013) is the yield function f that defines the onset of plastic yielding at f = 0. The elastic region in the stress space requires f < 0. When the stresses touch the yield surface, then f = 0. The permanent plastic deformation grows at the rate Λ̇ and in the direction of the gradient of f according to the associative flow rule, where Λ is called the plastic multiplier. The KKT condition shown in Table 1 implies that if the yield surface is not touched, equivalent to f < 0, then Λ̇ = 0 and no permanent plastic deformation is growing.
The micromorphic elastoplastic constitutive law can then be cast as a return mapping algorithm, i.e., given the current elastic deformation (ε^e_ij(t), G^e_ijk(t)) and the incremental deformation (Δε_ij, ΔG_ijk), find the stresses at the next time step (σ_ji(t+Δt), ζ_ijk(t+Δt)).
The first step is to find the trial stresses (σ^tr_ji, ζ^tr_ijk), assuming the incremental deformation is elastic, which is equivalent to f(σ^tr_ji, ζ^tr_ijk) < 0, as shown in Eq. (3):

[σ_ji(t+Δt); ζ_ijk(t+Δt)] = [σ^tr_ji; ζ^tr_ijk] = [∂W(ε^e_ij(t)+Δε_ij, G^e_ijk(t)+ΔG_ijk)/∂ε^e_ij; ∂W(ε^e_ij(t)+Δε_ij, G^e_ijk(t)+ΔG_ijk)/∂G^e_ijk]   if f(σ^tr_ji, ζ^tr_ijk) < 0.   (3)
If the incremental deformation is not elastic, i.e., f(σ^tr_ji, ζ^tr_ijk) ≥ 0, then a correction of the trial stresses should be made to ensure f(σ_ji(t+Δt), ζ_ijk(t+Δt)) = 0, and the final stresses can be found based on the associative plastic flow rule as shown in Eq. (5). The elasticity tensors are defined as

C^{e,σε}_{ijmn} = ∂²W/∂ε_ij ∂ε_mn,  C^{e,σG}_{ijmnl} = ∂²W/∂ε_ij ∂G_mnl,  C^{e,ζε}_{ijkmn} = ∂²W/∂ε_mn ∂G_ijk,  and  C^{e,ζG}_{ijkmnl} = ∂²W/∂G_ijk ∂G_mnl.   (4)
The incremental stress update expressed in Voigt notation reads

[σ_ji(t+Δt); ζ_ijk(t+Δt)] = [σ^tr_ji; ζ^tr_ijk] − [C^{e,σε}_{ijmn}, C^{e,σG}_{ijmnl}; C^{e,ζε}_{ijkmn}, C^{e,ζG}_{ijkmnl}] [Δε^p_mn; ΔG^p_mnl]
                          = [σ^tr_ji; ζ^tr_ijk] − ΔΛ [C^{e,σε}_{ijmn}, C^{e,σG}_{ijmnl}; C^{e,ζε}_{ijkmn}, C^{e,ζG}_{ijkmnl}] [∂f/∂σ_nm; ∂f/∂ζ_mnl].   (5)
The two equations above summarize the return mapping algorithm for micromorphic continua. Micropolar continua are a special case of micromorphic continua in which the micro-deformation is restricted to be rotational only. As such, the return mapping algorithm can be implemented by re-expressing the micro-rotation gradient κ_ij and the couple stress m_ji in terms of the micro-deformation gradient G_ijk and generalized stress ζ_ijk, i.e.,

(G_ijk, ζ_ijk) = (ϵ_ijl κ_lk, (1/2) ϵ_ijl m_lk).   (6)

The micromorphic return mapping algorithm then returns the constitutive updates, and the gradient of micro-rotation and the couple stress can then be recovered via

(κ_ij, m_ji) = ((1/2) ϵ_imn G_mnj, ϵ_imn ζ_mnj).   (7)
2.2 RVE homogenization based on Hill-Mandel’s condition
The constitutive relation of the material point at position x, shown in Figure 1, is obtained by studying the RVE domain with the local coordinate Y, as shown in Figure 1. The kinematic modes, denoted as ε̄_ij, κ̄_ij, and Ḡ_ijk, are prescribed by deforming the RVE with the displacement boundary condition shown in Eq. (8) and Figure 2 (Larsson and Diebels, 2007; Jänicke and Diebels, 2010), where the boundary condition is a linear combination of the prescribed kinematic modes. This boundary condition is admissible, as proven by Hill's lemma in Appendix A.

u_i = ε̄_ij Y_j + (1/2) Ḡ_ijk Y_j Y_k   on the RVE boundary for micromorphic continua
u_i = ε̄_ij Y_j − (1/2) ϵ_ijl κ̄_lk Y_j Y_k   on the RVE boundary for micropolar continua
u_i = ε̄_ij Y_j   on the RVE boundary for Cauchy continua   (8)
The local boundary value problem is then solved under the prescribed boundary condition to obtain the local displacement and stress fields. The homogenized stress and generalized stress (Jänicke and Diebels, 2010) can be computed as in Eq. (9), which satisfies the Hill-Mandel condition as proven in Appendix A.
Fig. 2: The characteristic micromorphic kinematic modes.
ζ̄_ijk = (1/V) ∫ (1/2) n_l σ_li Y_j Y_k dS = (1/V) ∫ (1/2) (σ_ji Y_k + σ_ki Y_j) dV   for micromorphic continua
m̄_ji = ϵ_imn ζ̄_mnj   for micropolar continua
σ̄_ji = (1/V) ∫ n_l σ_li Y_j dS = (1/V) ∫ σ_ji dV   for all continua   (9)
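To make the homogenization step in Eq. (9) concrete, the volume-integral forms of the averaged stress and generalized stress can be evaluated by quadrature over the RVE. The following numpy sketch assumes the local Cauchy stress has already been computed at a set of quadrature points with local coordinates Y and weights w; the function names and array layouts are hypothetical and only illustrate the averaging, not the authors' solver.

```python
import numpy as np

def homogenize_stresses(sigma, Y, w):
    """Volume-averaged stress and generalized stress of an RVE, Eq. (9).

    sigma : (n_qp, 3, 3) local Cauchy stress at each quadrature point
    Y     : (n_qp, 3)    local RVE coordinates of the quadrature points
    w     : (n_qp,)      quadrature weights (including the Jacobian)
    """
    V = w.sum()                                   # RVE volume
    # sigma_bar_ji = (1/V) * integral of sigma_ji dV
    sigma_bar = np.einsum('q,qji->ji', w, sigma) / V
    # zeta_bar_ijk = (1/2V) * integral of (sigma_ji Y_k + sigma_ki Y_j) dV
    zeta_bar = 0.5 * (np.einsum('q,qji,qk->ijk', w, sigma, Y)
                      + np.einsum('q,qki,qj->ijk', w, sigma, Y)) / V
    return sigma_bar, zeta_bar

def couple_stress(zeta_bar):
    """Micropolar couple stress m_bar_ji = eps_imn * zeta_bar_mnj."""
    eps = np.zeros((3, 3, 3))                     # Levi-Civita symbol
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0
    return np.einsum('imn,mnj->ji', eps, zeta_bar)
```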
2.3 Supervised learning tasks for neural kernel plasticity
The elastoplastic constitutive law can be sufficiently modeled by machine learning by training the elastic energy functional and the plastic yield function independently. The elastic energy functional, W(ε^e_ij, κ^e_ij) or W(ε^e_ij, G^e_ijk), can simply be modeled by a neural network consisting of multi-layer perceptrons (MLP), with the strain measures as the input and the elastic stored energy as the output. In our implementation, we adopt the Voigt vectorized notation used in Weeger (2021) to train the neural network elasticity models. For brevity, the supervised learning of elasticity for micromorphic continua will not be discussed in great detail. Interested readers may refer to, for instance, Vlassis and Sun (2021). On the other hand, the supervised learning for the yield function and the corresponding hardening laws is formulated in the next section.
Remark 1 (Neural Network Architecture) The architecture of the MLP, which we adopted in this study, is shown in Eq. (10). The 3-layer architecture is composed of neurons equipped with the Exponential Linear Unit (ELU) activation function. The activation function of the output layer A3 can be the ELU or the identity map, depending on how the MLP is used: A3 is an identity map except when the MLP is used to construct the neural kernel (NK) architecture.

f_NN(x) = h3 ∘ h2 ∘ h1(x),  where
h1(x) = ELU(W1 · x + b1),
h2(h1) = ELU(W2 · h1 + b2),
h3(h2) = A3(W3 · h2 + b3),
and  ELU(x) = x if x ≥ 0,  e^x − 1 if x < 0.   (10)

This neural network design is used for both the elastic stored energy functional and the yield surface because the ELU activation function is sufficiently smooth. This smoothness may improve the robustness of the optimization process and alleviate the vanishing and exploding gradient problems.
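A minimal PyTorch realization of the three-layer ELU network of Eq. (10) is sketched below. The class name, layer widths, and the elu_output switch (which toggles A3 between the identity map and the ELU) are placeholders introduced here for illustration.

```python
import torch.nn as nn

class ThreeLayerELU(nn.Module):
    """MLP of Eq. (10): h3 o h2 o h1 with ELU activations."""

    def __init__(self, dim_in, dim_hidden, dim_out, elu_output=False):
        super().__init__()
        self.h1 = nn.Linear(dim_in, dim_hidden)
        self.h2 = nn.Linear(dim_hidden, dim_hidden)
        self.h3 = nn.Linear(dim_hidden, dim_out)
        # A3 is the identity for plain regression (e.g., stored energy)
        # and the ELU when the network serves as the NK feature map.
        self.A3 = nn.ELU() if elu_output else nn.Identity()
        self.act = nn.ELU()

    def forward(self, x):
        x = self.act(self.h1(x))
        x = self.act(self.h2(x))
        return self.A3(self.h3(x))

# example: a scalar energy surrogate with a hypothetical 9-component strain input
energy_net = ThreeLayerELU(dim_in=9, dim_hidden=64, dim_out=1)
```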
3 Yield Surface Reconstruction via Neural Kernel (NK) Method
This section describes (1) how to use the neural kernel (NK) method to reconstruct the yield surface for micromorphic continua and (2) the implementation details necessary to incorporate the learned model into a return mapping algorithm. Here, we represent the yield surface in a multi-dimensional parametric space via an implicit scalar signed distance level set function f(x), such that the yield surface geometry is recovered at f(x) = 0, where x stores the higher-order stress components of the implicit yield function and the internal variables.
The general framework of our NK method follows the work of Williams et al. (2022) on Neural Kernel Fields (NKF), while we adopt deep neural networks as the kernel function architecture for generalizability to arbitrary dimensions. A general workflow of this framework is presented in Figure 3 and contains three major steps:
1. Generate the labeled narrow band level set data given the yield stress points x and plastic flow directions n, as discussed in Section 3.1,
2. Train the kernel coefficients for the kernel function associated with the NN-based feature map ϕ_θ, after which a two-step training may be needed for the NN weights and biases θ to ensure that the sign of the yield function is correct, as shown in Algorithm 1, and
3. Predict the level set yield function f_θ(x) as a linear combination of the basis kernel functions and locate the surface at f_θ(x) = 0 (see Section 3.2).
The workflow shown in Figure 3 presents 2D and 3D views of the surface for readability, but the NK model is able to reconstruct a much higher-dimensional yield surface. After training the NK-based yield function, the elastoplastic constitutive law is reproduced by the return mapping algorithm presented in Section 3.3 and Algorithm 2.
3.1 Data processing
The raw data set for surface reconstruction consists of point coordinates sampled from the ground-truth surface and the corresponding normal vectors, such that the data set is described in the form of S = {(x_i, n_i) ∈ R^d × R^d}, where x_i are point coordinates and n_i are the outward unit normal vectors of the surface at x_i; in the context of the plastic yield function, x_i are yield stress points and n_i are the plastic flow directions. For supervision purposes, we create two labeled data sets following the concept of the narrow band level set (Rosenthal et al., 2011). The first data set D is generated by perturbing the spatial coordinates of the surface points in the normal direction and labeling them by the distance to the true surface as follows:

D = {(x_i, 0) | (x_i, n_i) ∈ S} ∪ {(x_i + ϵ n_i, ϵ) | (x_i, n_i) ∈ S} ∪ {(x_i − ϵ n_i, −ϵ) | (x_i, n_i) ∈ S}   (11)

where ϵ is a small number indicating the distance of perturbation for the surface points. The first data set controls the trained level set function to be zero on the surface, with the gradient of the function equal to the unit normal, as shown in Figure 4. In our numerical examples, the data are centered and scaled such that the average norm of each data point becomes one, and ϵ is tuned as a hyperparameter between 0.01 and 0.1.
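Constructing the narrow-band set D of Eq. (11) amounts to copying each yield point twice, shifted by ±ϵ along its flow direction, and labeling the copies with ±ϵ. A short numpy sketch, assuming the yield points and unit normals are stored row-wise, is given below.

```python
import numpy as np

def narrow_band_data(x, n, eps=0.05):
    """Build the labeled narrow-band set D of Eq. (11).

    x   : (N, d) yield stress points on the surface
    n   : (N, d) unit outward normals (plastic flow directions) at x
    eps : half-width of the narrow band
    """
    pts = np.vstack([x, x + eps * n, x - eps * n])
    lab = np.concatenate([np.zeros(len(x)),        # on the surface
                          np.full(len(x), eps),    # outside, distance +eps
                          np.full(len(x), -eps)])  # inside, distance -eps
    return pts, lab
```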
The second set of labeled data V is needed to ensure the occupancy condition (Peng et al., 2020), i.e., the function should be negative inside the surface and positive outside the surface, and there should not be additional holes in the region enclosed by the surface, as shown in Figure 4. V is sampled in the d-dimensional Euclidean space excluding the points on the surface, where y_vol = 1 if x_vol is a point outside the surface, and y_vol = −1 otherwise.

V = {(x_vol, y_vol) ∈ R^d × {−1, 1} | x_vol ∉ {x | (x, n) ∈ S}}   (12)

Fig. 3: The workflow of training the neural kernel function with the yield surface data consists of 3 steps: generating the labeled narrow band level set data, training the kernel coefficients for the kernel function, and finally forward predicting the level set yield function as a linear combination of the basis kernel functions.

Fig. 4: Sketch of a narrow band level set yield function. The idea is to locally control the gradient around the boundary at f = 0 and globally control the sign of f on both sides of the boundary.
3.2 Training scheme of NK method
This subsection describes the proposed NK method for surface reconstruction in detail. Before we go
through the general framework of NK, we present a brief overview of classical kernel methods to facilitate
further demonstrations of the NK method. Classical kernel methods for regression problems adopt a pre-defined kernel function with a set of trainable kernel coefficients α_i to approximate the function y(x) given a labeled data set D = {(x_i, y_i)} as follows:

ŷ(x) = Σ_{(x_j, y_j) ∈ D} α_j Ker(x, x_j)   (13)

where the hat over ŷ indicates an approximation of the function y. The kernel coefficients α_j are trained by directly solving the following system of linear equations:

(Ker(x_i, x_j) + λ δ_ij) α_j = y_i,  ∀ (x_i, y_i) ∈ D   (14)

where λ is a tunable hyperparameter used for regularization and data denoising.
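For reference, Eqs. (13) and (14) define a standard kernel ridge regression. The sketch below uses a radial basis function kernel as one example of the pre-defined kernels mentioned above; it is an illustration of the classical method with hypothetical function names, not of the data-dependent kernel introduced later.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Pre-defined RBF kernel Ker(x, y) = exp(-gamma * |x - y|^2)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_kernel_coefficients(X, y, lam=1e-2, gamma=1.0):
    """Solve (Ker(x_i, x_j) + lam * delta_ij) alpha_j = y_i, Eq. (14)."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def predict(X_new, X, alpha, gamma=1.0):
    """Evaluate y_hat(x) = sum_j alpha_j Ker(x, x_j), Eq. (13)."""
    return rbf_kernel(X_new, X, gamma) @ alpha
```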
It is theoretically proven that the conventional kernel method does not accurately make predictions of unseen features if the spatial dimension d is much smaller than the number of data points |D| (Rakhlin and Zhai, 2019). In this sense, a neural network is adopted in the kernel method to increase the representation power of the machine learning model, where a trainable NN ϕ_θ is introduced to map the d-dimensional input space to an h-dimensional feature space. The kernel regression is then implemented in the feature space, keeping h and |D| of the same order of magnitude. The ϕ_θ implemented in Williams et al. (2022) follows an architecture similar to C-OccNet (Peng et al., 2020) and is not applicable in higher-dimensional space. For generalizability to higher-dimensional inputs, we establish ϕ_θ as a multi-layer perceptron (MLP).
Instead of the kernel map of the original feature vector x shown in Eq. (13), the arguments of the kernel function are replaced by the feature map ϕ_θ(x). As a result, we introduce the MLP weights θ in addition to the coefficients α_j as the trainable parameters, and the architecture of the neural kernel f_θ : R^d → R is described as follows:

ŷ = f_θ(x) = Σ_{x_j ∈ D} α_j K_NS(ϕ_θ(x_j), ϕ_θ(x))   (15)

where K_NS(x, y) = ψ^T(x) ψ(y) is the neural spline kernel (Cho and Saul, 2009) induced from a single-layer non-trainable neural network ψ. In this paper, K_NS is replaced by K(x, y) = x^T y such that the neural spline kernel is integrated into ϕ_θ as an additional layer, which results in the following simplified NK architecture:

ŷ = f_θ(x) = Σ_{x_j ∈ D} α_j ϕ_θ^T(x_j) ϕ_θ(x)   (16)
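With the neural spline kernel absorbed into ϕ_θ, the simplified NK of Eq. (16) reduces to a linear kernel acting on the learned features. A minimal PyTorch sketch of the forward prediction, with hypothetical tensor shapes and reusing the placeholder feature-map network from Remark 1, reads:

```python
def nk_predict(phi, x_train, alpha, x_query):
    """f_theta(x) = sum_j alpha_j phi(x_j)^T phi(x), Eq. (16).

    phi     : feature map phi_theta, e.g. ThreeLayerELU(d, 64, h, elu_output=True)
    x_train : (N, d) training points x_j, alpha : (N,) kernel coefficients
    x_query : (M, d) evaluation points
    """
    feats_train = phi(x_train)            # (N, h)
    feats_query = phi(x_query)            # (M, h)
    K = feats_query @ feats_train.T       # linear kernel in feature space
    return K @ alpha                      # (M,) predicted level set values
```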
We next present the loss function L(α_j, θ) adopted as follows:

L(α_j, θ) = Σ_{(x_i, y_i) ∈ D} (y_i − ŷ(x_i))² + γ Σ_{(x^vol_i, y^vol_i) ∈ V} (y^vol_i − tanh(ŷ(x^vol_i)/ϵ))²,  where  (α*_j, θ*) = argmin_{(α_j, θ)} L(α_j, θ),   (17)
where γ is a tunable hyperparameter. By minimizing this loss function, we achieve two goals in recovering the correct surface geometry: (1) constrain the value of f_θ to zero on the surface and the gradient of f_θ to fit the actual surface normal direction, which is satisfied as the first term goes to zero; and (2) enforce the occupancy condition so that f_θ predicts the correct sign inside and outside the yield surface, which is satisfied as the second term goes to zero. Notice that the second goal characterizes a binary classification problem, but the conventionally used binary cross entropy (BCE) loss is not included in the loss function because the logarithm becomes problematic with the negative values involved in this case.
In the training process, α_j and θ are updated in an asynchronous fashion: α_j are trained by directly solving Eq. (14), while θ is updated using gradient-based optimization given fixed α_j and ∂L/∂θ. We summarize the training routine in Algorithm 1.
Algorithm 1 The NK training routine for optimizing the trainable parameters α and θ
Require: Training data set D, occupancy data set V, learning rate η, hyperparameters λ, γ, and ϵ.
1. Set up the feature map neural network ϕ_θ(x) with the trainable parameters θ.
   Define ϕ_θ(x) = ELU(W3 · ELU(W2 · ELU(W1 · x + b1) + b2) + b3).
   Initialize θ = {W1, b1, W2, b2, W3, b3} with the pytorch default initializer.
2. Given pairs of feature vectors x_i, x_j from D, compute the kernel matrix K_ij.
   Assemble K_ij = ϕ_θ^T(x_i) ϕ_θ(x_j) for (x_i, y_i), (x_j, y_j) ∈ D.
3. Solve the linear kernel equations for the kernel coefficients α_j.
   Solve (K_ij + λ δ_ij) α_j = y_i for (x_i, y_i) ∈ D.
4. Given the level set yield function f_θ(x), compute the loss function L(α_j, θ) and its gradient ∂L/∂θ.
   Construct f_θ(x) = Σ_{(x_j, y_j) ∈ D} α_j ϕ_θ^T(x_j) ϕ_θ(x).
   Predict ŷ_i = f_θ(x_i) and ŷ^vol_i = f_θ(x^vol_i) for (x_i, y_i) ∈ D and (x^vol_i, y^vol_i) ∈ V.
   Compute L(α_j, θ) = Σ_{(x_i, y_i) ∈ D} (y_i − ŷ(x_i))² + γ Σ_{(x^vol_i, y^vol_i) ∈ V} (y^vol_i − tanh(ŷ(x^vol_i)/ϵ))².
   Differentiate L(α_j, θ) for ∂L/∂θ with loss.backward().
5. Update θ given η and ∂L/∂θ using a gradient-based optimizer such as ADAM.
   Run torch.optim.Adam(θ, η).step().
6. Repeat Steps 2-5 until the loss function converges and output f_θ(x).
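The alternating update of α and θ in Algorithm 1 can be sketched in PyTorch as follows. The function below is a full-batch illustration with hypothetical tensor names and default hyperparameter values; α is obtained by a direct linear solve and then detached so that the gradient step only updates θ, mirroring the asynchronous update described above.

```python
import torch

def train_nk(phi, x_D, y_D, x_V, y_V, lam=0.01, gamma=0.1, eps=0.05,
             lr=1e-4, n_epochs=500):
    """Alternating update of (alpha, theta) in the spirit of Algorithm 1.

    phi        : feature-map network phi_theta (e.g., a ThreeLayerELU instance)
    x_D, y_D   : narrow-band points and signed-distance labels, Eq. (11)
    x_V, y_V   : occupancy points and +/-1 labels, Eq. (12)
    """
    opt = torch.optim.Adam(phi.parameters(), lr=lr)
    eye = torch.eye(x_D.shape[0], dtype=x_D.dtype)
    for epoch in range(n_epochs):
        # Steps 2-3: kernel matrix and direct solve for alpha, held fixed below.
        F_D = phi(x_D)                                   # (N, h)
        K = F_D @ F_D.T                                  # K_ij = phi(x_i)^T phi(x_j)
        alpha = torch.linalg.solve(K + lam * eye, y_D).detach()

        # Step 4: loss of Eq. (17) on the narrow-band and occupancy sets.
        y_hat_D = K @ alpha
        y_hat_V = (phi(x_V) @ F_D.T) @ alpha
        loss = ((y_D - y_hat_D) ** 2).sum() \
             + gamma * ((y_V - torch.tanh(y_hat_V / eps)) ** 2).sum()

        # Step 5: gradient step on theta only.
        opt.zero_grad()
        loss.backward()
        opt.step()
    return phi, alpha
```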
Remark 2 (Hyperparameter Tuning)
1. γ < ϵ < r_min can be a good set of hyperparameters in Eqs. (11) and (17), where r_min is found by first centering the data, i.e., translating the data such that the centroid goes to the origin, and then computing the distance from the closest data point to the origin.
2. The output dimension h should be large enough, which is equivalent to increasing the dimension of the basis level set functions shown in Figure 3. Ideally, h should be comparable with |D|.
3. Increasing λ would also improve the convergence of the loss function when the data are noisy. In both examples of this paper, λ = 0.01 is used.
In the Sobolev training technique with deep learning, we consider the influence of the derivative of the neural network function with respect to the network input in the loss function, such that we enforce the prediction accuracy for the neural network derivative with respect to its input. For the global loss function, we directly supplement the original loss in Eq. (17) with the MSE between the ground-truth and predicted surface normal directions:

L_Sob(α, θ) = Σ_{(x_i, y_i) ∈ D} (y_i − ŷ(x_i))² + γ Σ_{(x^vol_i, y^vol_i) ∈ V} (y^vol_i − tanh(ŷ(x^vol_i)/ϵ))² + γ Σ_{(x_i, y_i) ∈ D} ||n_i − ∂ŷ(x_i)/∂x_i||²   (18)

where L_Sob is the Sobolev loss function we adopt, and γ is a hyperparameter controlling the influence of the derivative term in the global loss. We will compare NK with the MLP-based level set (MLP-LS) method, which is documented in Vlassis and Sun (2021); the details of the MLP-LS method are provided in Appendix C.
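The additional term in Eq. (18) compares the prescribed flow direction n_i with the gradient of the predicted level set, which can be obtained by automatic differentiation. A sketch of this term, reusing the hypothetical feature map and coefficients from the previous snippets, is:

```python
import torch

def sobolev_gradient_term(phi, x_train, alpha, x_surf, n_surf):
    """Sum_i || n_i - d y_hat / d x_i ||^2 evaluated at the surface points."""
    x = x_surf.clone().requires_grad_(True)
    y_hat = (phi(x) @ phi(x_train).T) @ alpha          # f_theta(x), Eq. (16)
    grad, = torch.autograd.grad(y_hat.sum(), x, create_graph=True)
    return ((n_surf - grad) ** 2).sum()
```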
3.3 Return Mapping Algorithm
The return mapping algorithm for micromorphic materials is presented in Algorithm 2, which follows the return mapping theory shown in Eqs. (3) and (5). We further assume that the yield surface is able to evolve, and this hardening process is governed by an internal variable Λ, i.e., the magnitude of the cumulative plastic strain (plastic multiplier), such that f_θ(x, Λ) represents a family of yield functions evolving with Λ. For micropolar materials, convert (κ_ij, m_ji) into (G_ijk, ζ_ijk) = (ϵ_ijl κ_lk, (1/2) ϵ_ijl m_lk) before Algorithm 2 and convert back via (κ_ij, m_ji) = ((1/2) ϵ_imn G_mnj, ϵ_imn ζ_mnj) after the return mapping. The notations ³G and ³ζ are used to represent the third-order gradient of the micro-deformation tensor and the third-order micromorphic generalized stress tensor, respectively.
The implementation of the return mapping algorithm requires the elastic energy functional W(ε, ³G) and the yield function f_θ(x, Λ), where x is the vector consisting of the components of σ and ³ζ; given the pretrained plastic yield function, the plastic flow direction ∇_x f_θ(x, Λ) can be derived by pytorch automatic differentiation, and ∂f_θ(x, Λ)/∂Λ = 0 in the case of perfect plasticity. Unless otherwise stated, f_θ(x, Λ) is derived from the pretrained machine learning model, either NK or MLP-LS, and W(ε, ³G) comes from the Sobolev training of the elastic energy functional (Vlassis et al., 2020).
Given all the necessary ingredients, Algorithm 2 first computes the elastic trial Cauchy and higher-order stresses following Eq. (3). If yielding is detected by f_θ(x^tr, Λ) ≥ 0, a system of nonlinear equations following Eq. (5) is solved to find the vectorized stresses x and the increment of the plastic multiplier ΔΛ. Otherwise, the trial states are directly output.
Algorithm 2 Return mapping algorithm for higher-order continua.
Require: Current internal variable Λ, elastic strain ε^e_n, elastic gradient of micro-deformation ³G^e_n, increments of deformation Δε, Δ³G, yield function f_θ(x, Λ), and elastic energy functional W(ε, ³G).
Outputs: Higher-order stresses σ_{n+1}, ³ζ_{n+1}, and plastic multiplier Λ at the next time step.
1. Compute the trial elastic strain, trial elastic stress, and elasticity tensor.
   Compute [ε^{e,tr}_{n+1}; ³G^{e,tr}_{n+1}] = [ε^e_n + Δε; ³G^e_n + Δ³G].
   Compute [σ^tr_{n+1}; ³ζ^tr_{n+1}] = [∂W/∂ε; ∂W/∂³G] at (ε^{e,tr}_{n+1}, ³G^{e,tr}_{n+1}).
   Vectorize x^tr = [σ^tr_{n+1}; ³ζ^tr_{n+1}].
   Matrixize C^e = [∂²W/∂ε², ∂²W/∂ε ∂³G; ∂²W/∂ε ∂³G, ∂²W/∂³G²] at (ε^{e,tr}_{n+1}, ³G^{e,tr}_{n+1}).
2. Check the yield condition and perform the return mapping if yielding is detected.
   if f_θ(x^tr, Λ) < 0 then
      σ_{n+1} = σ^tr_{n+1}, ³ζ_{n+1} = ³ζ^tr_{n+1}
   else
      Solve for x and ΔΛ such that [x − x^tr + C^e ΔΛ ∇_x f_θ(x, Λ + ΔΛ); f_θ(x, Λ + ΔΛ)] = 0.
      Extract σ_{n+1} and ³ζ_{n+1} from x = [σ_{n+1}; ³ζ_{n+1}], and update Λ = Λ + ΔΛ.
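A compact sketch of the plastic corrector branch of Algorithm 2 is given below. It treats the vectorized stress x and the increment ΔΛ as unknowns of the coupled system and hands the residual to scipy's nonlinear solver, while the flow direction ∂f/∂x is obtained by automatic differentiation; the calling convention of the pretrained yield function f_theta and the double-precision casting are assumptions made for this illustration only.

```python
import numpy as np
import torch
from scipy.optimize import fsolve

def plastic_corrector(f_theta, x_tr, Ce, Lambda):
    """Solve the coupled system of Algorithm 2 for a trial state with f >= 0.

    f_theta : callable (x, Lambda) -> scalar torch tensor (yield function value)
    x_tr    : (m,) numpy array of vectorized trial stresses [sigma_tr; zeta_tr]
    Ce      : (m, m) numpy elasticity matrix in Voigt form
    Lambda  : current plastic multiplier (internal variable)
    """
    m = x_tr.shape[0]

    def yield_and_flow(x, lam):
        xt = torch.tensor(x, dtype=torch.float64, requires_grad=True)
        f = f_theta(xt, torch.tensor(lam, dtype=torch.float64))
        grad, = torch.autograd.grad(f, xt)
        return f.item(), grad.detach().numpy()

    def residual(z):
        x, dlam = z[:m], z[m]
        f_val, df_dx = yield_and_flow(x, Lambda + dlam)
        # stress correction (Eq. (5)) and consistency condition f = 0
        return np.concatenate([x - x_tr + dlam * (Ce @ df_dx), [f_val]])

    z0 = np.concatenate([x_tr, [0.0]])      # start from the trial state
    z = fsolve(residual, z0)
    return z[:m], Lambda + z[m]             # corrected stresses, updated Lambda
```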
Remark 3 (Stress integration for non-convex yield surfaces) As pointed out by Lin and Bažant (1986), there do exist non-convex yield surfaces in the literature, such as the Argyris yield surface (cf. Argyris et al. (1974)) and the Barcelona Basic Model (cf. Sheng et al. (2011)), which purposely introduce non-convexity for the sake of matching the phenomenological responses observed from experiments. As we did not enforce the convexity of the yield function explicitly, the resultant yield functions (see Figures 11 and 14) are found to have concave regions. A robust implicit stress integration may require specific treatment to find the first intersection between the non-convex yield function and an elastic trial stress point (cf. Pedroso et al. (2008); Sheng et al. (2011)). In our case, we follow the treatment in Glüge and Bucci (2018), in which an incremental trial step smaller than the radius of curvature of the yield surface is used such that the return mapping algorithm may yield a unique corrected state.
4 Numerical Experiments
In this section, two examples of yield surface reconstruction with MLP-LS and NK are presented, each followed by a performance evaluation. In the first example, both methods are verified against the analytical micropolar J2 yield surface (Steinmann, 1995), followed by a short case study comparing the performance of both methods given limited or missing data. In the second example, both methods are validated against direct numerical simulation (DNS) data upscaled from finite element simulations. For brevity, we only present the comparisons of yield surfaces obtained from the MLP-LS and NK methods here. The results obtained from other alternative approaches (e.g., the Gaussian kernel and NURBS) are presented in the appendix.
4.1 Two-dimensional micropolar J2 plasticity model
The first example applies the MLP-LS and NK methods to reconstruct the micropolar J2 yield surface
(Steinmann,1995) with a data set inferred from the yield function in Eq. (19). This micropolar J2 plasticity
model can be viewed as a generalized version of the classical J2 plasticity, where additional terms with respect to the couple stress tensor m and the micropolar length scale l are introduced to capture the size effect. The spatial dimension is reduced to 2D assuming plane stress, i.e., σ_33 = 0, such that only the components σ_11, σ_22, σ_12, m_13, and m_23 are non-zero in the yield function, where the mean stress p = (σ_11 + σ_22)/3 and the deviatoric stress s = σ − p I.

f(σ, m) = √(s : s + m : m / l²) − √(2/3) Y   (19)

where Y indicates the yield stress. The numerical specimen used for verification follows linear micropolar elasticity (Steinmann, 1995; Vardoulakis, 2019),

σ = K tr(ε) I + 2μ ε_dev + 2μ_c ε_skw,   (20)

with bulk modulus K = 100/3 MPa, shear modulus μ = 50 MPa, coupled shear modulus μ_c = 0, and yield stress Y = √(2/3) MPa. Due to the micropolar effect, the strain tensor is divided into the hydrostatic part tr(ε) I, the deviatoric part ε_dev, and the skew-symmetric part ε_skw.
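For reference, the analytical yield function of Eq. (19) and the linear micropolar elasticity of Eq. (20) used to generate the verification data can be evaluated as in the following numpy sketch. The default parameter values follow the ones listed above (in MPa), the length scale l is a placeholder, and the plane-stress reduction to the five active components is left to the data-generation routine.

```python
import numpy as np

def micropolar_j2_yield(sigma, m, l=1.0, Y=np.sqrt(2.0 / 3.0)):
    """f(sigma, m) = sqrt(s:s + m:m / l^2) - sqrt(2/3) * Y, Eq. (19)."""
    p = np.trace(sigma) / 3.0
    s = sigma - p * np.eye(3)                      # deviatoric stress
    return np.sqrt(np.sum(s * s) + np.sum(m * m) / l**2) - np.sqrt(2.0 / 3.0) * Y

def micropolar_elasticity(eps, K=100.0 / 3.0, mu=50.0, mu_c=0.0):
    """sigma = K tr(eps) I + 2 mu eps_dev + 2 mu_c eps_skw, Eq. (20)."""
    eps_sym = 0.5 * (eps + eps.T)
    eps_skw = 0.5 * (eps - eps.T)
    eps_dev = eps_sym - np.trace(eps) / 3.0 * np.eye(3)
    return K * np.trace(eps) * np.eye(3) + 2.0 * mu * eps_dev + 2.0 * mu_c * eps_skw
```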
The performance of the two methods is examined by visualizing cross sections of the yield surface
and the results of the return mapping algorithm. According to the test results, both methods are able to
detect the yield point on the yield surface accurately and produce correct higher-order stress curves when
the trained yield function is integrated into a return mapping algorithm. An additional case study with
limited and missing data is conducted to evaluate the performance of both methods given low-quality
data.
4.1.1 Yield surface reconstruction and return mapping
The yield data points generated by Algorithm 3 in Appendix B are used to train both the MLP-LS and the NK models. The MLP-LS model, constructed with the 3-layer MLP shown in Eq. (10) and 64 neurons in each hidden layer, is trained with a batch size of 1000 for 500 epochs. The NK model, whose feature map ϕ_θ is constructed with a 3-layer MLP architecture with 64 neurons in each hidden and output layer, is trained for 500 epochs. The training and validation loss histories are shown in Figure 5, where the validation loss can be less than the training loss because the training loss function has an additional term controlling the sign of the yield function, as shown in Eq. (17).
Fig. 5: Training and validation losses when learning the analytical plastic yield function.
The smooth yield surface is reconstructed by both MLP-LS and NK. The cross sections on the σ_11-σ_22, σ_11-σ_12, and m_13-m_23 planes are shown in Figure 6, where both methods accurately reconstruct a yield surface that respects the ground truth yield points. Therefore, we consider that both methods are generalizable to higher-dimensional yield surfaces given a set of sufficient and well-distributed data.
In addition to the yield surface reconstruction, the return mapping Algorithm 2 is implemented to verify both the MLP-LS and NK methods. The material is loaded elastoplastically in a single kinematic mode, i.e., one of the uniaxial tension, shear, and bending modes, and then unloaded elastically. The test results of the return mapping, as shown in Figure 7, indicate that both methods can be integrated into the return mapping algorithm and are able to produce valid stress curves.
4.1.2 Case study with limited and missing data
It has been shown that, given a set of sufficient and well-distributed yield point data, both the MLP-LS and NK methods are able to reconstruct the yield surface accurately, as shown in Figure 6. However, when sufficient and well-distributed data are not available, the performance of the two methods is of interest. Therefore, two sets of non-ideal data are generated from the training set to test the performance of the two methods. The first data set, called the set of limited data, is sampled from the training set with a limited size via numpy.random.choice(), such that the data set is as well-distributed as the training set but has a much smaller size. The other data set, called the set of missing data, is generated by removing a cluster of yield point data from the training set, such that the data set has a missing data patch and is considered poorly distributed.
In this case study, the limited data set with 800 data points is sampled from the training set, and the missing data set, which contains 13,665 data points after the data cluster is removed, is generated by removing a cluster of data as shown in Figure 8 (MIDDLE). It is observed from Figure 8 that, given limited data that follow the same distribution as the test data, the yield surface can still be reconstructed accurately by both methods, as shown in Figure 8 (LEFT); however, when the missing data set is used for training, it is observed in Figure 8 (MIDDLE and RIGHT) that NK produces a significantly more accurate prediction of the test data and stress history than MLP-LS, which reflects its robustness against the missing data.

Fig. 6: Cross sections of the analytical yield surface.

Fig. 7: Return mapping results of uniaxial (LEFT), shear (MIDDLE), and bending (RIGHT) tests verify both the MLP-LS and NK methods.
4.2 Multiscale homogenization for anisotropic plasticity of layered clay
The second example presents the application of the neural kernel method to reconstruct micropolar
and micromorphic yield surfaces upscaled from direct numerical simulations. We select an idealized mi-
crostructure commonly used to represent shale, i.e., a layered material consisting of hard and soft constituents (Deutsch, 1989; Sone and Zoback, 2013; Na et al., 2017). Readers interested in previous mathematical and neural network modeling efforts of layered geomaterials may refer to Semnani et al. (2016); Zhao et al. (2018); Borja et al. (2020) and Xiao and Sun (2022). To the best knowledge of the authors, there has not yet been any attempt to model the micropolar and micromorphic plasticity of shale via deep learning.
4.2.1 Data preparation
The RVE data is obtained from finite element simulations on a domain consisting of layered clay mate-
rials composed of a hard and a soft constituent with intact interfaces. These layer constituents are assumed
to be Cauchy continua with elasto-plastic behaviors characterized by the classical Cam-Clay model but
with different material parameters. Since the constituents are assumed to exhibit no higher-order effects,
the higher-order effect of the effective medium is stemmed from the spatial heterogeneity of the micro-
structures. The elastic response is captured by the following elasticity energy function
ψe(εe) = p0Crexp εe
v
Cr+3
2µ0(εe
s)2,εe
v=tr(εe),εe
s=||εe1
3εe
v1|| (21)
where p0is the initial pressure, εe
v0is the initial volumetric deformation, µ0indicates a constant shear
modulus, and Cris the elastic re-compression ratio. The yield function with the hardening law is captured
by,
f(σ) = q2
M2+p(ppc),p=tr(σ)
3,q=r3
2||σp1|| (22)
where pcthe preconsolidation pressure, Mis the slope of the critical state line. The hardening law that
governs the evolution of pcis expressed as follows:
pc=pc0exp(εp
v/(CcCr)),εp
v=tr(εp)(23)
where Ccis the plastic compression index and pc0indicates the initial preconsolidation pressure.
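The constituent-level model of Eqs. (21)-(23) can be evaluated as in the short numpy sketch below; the signs follow the expressions as written above, and the parameter values are placeholders rather than those used in the DNS.

```python
import numpy as np

def cam_clay_yield(sigma, pc, M):
    """f(sigma) = q^2 / M^2 + p (p - pc), Eq. (22)."""
    p = np.trace(sigma) / 3.0
    s = sigma - p * np.eye(3)
    q = np.sqrt(1.5) * np.linalg.norm(s)
    return q**2 / M**2 + p * (p - pc)

def preconsolidation_pressure(pc0, eps_p, Cc, Cr):
    """Hardening law for pc, Eq. (23)."""
    eps_p_v = np.trace(eps_p)
    return pc0 * np.exp(eps_p_v / (Cc - Cr))
```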
This data set is generated from the same finite element domain used in Xiao and Sun (2022). The boundary value problem that generates the data set is also solved by the same finite element solver (cf. Xiao and Sun (2022)) that employs the deal.II library (Arndt et al., 2020). To capture the higher-order constitutive behaviors, data are collected by applying the admissible boundary conditions, solving the local BVP, and homogenizing the higher-order stresses according to Eqs. (8) and (9) in Section 2.
The DNS yield data are first sampled by loading the RVE at evenly parametrized deformation rates using Algorithm 4 in Appendix B, and the yield points are recorded when permanent deformation is detected.
Fig. 9: The internal structure of the RVE consists of layers of clay materials, with the material parameters
labeled.
4.2.2 Reconstruction of yield surfaces
We present the yield surfaces in a higher-dimensional stress space. The results are evaluated by com-
paring the accuracy and robustness of the yielding and hardening behaviors from unseen stress paths
predicted by the NK and MLP-LS (which serves as the benchmark model).
To present a yield surface in a higher-dimensional stress space, the cross sections of the yield surfaces
are presented by projecting the surface onto the stress planes defined by the combinations of two stress
components. Two cases are studied: micropolar and micromorphic yield surfaces in higher-dimensional
stress spaces. The results show that both MLP-LS and NK are able to capture the complex features of the
higher-dimensional yield surface where data is sufficient. However, when data is sparsely distributed at
some parts of the yield surface, NK outperforms MLP-LS in terms of extrapolating the unseen data.
4.2.3 Hyperparameters
Different sets of hyperparameters are used to learn the two yield surfaces with different dimensions.
The micropolar yield function consists of 5 stress components, learned by the deep architecture with an
input layer of 5 neurons. The MLP architecture consists of two dense hidden layers of 256 neurons and an
output layer of 1 neuron, while the NK architecture consists of three dense layers of 512 neurons. The MLP-
LS is trained with the batch size of 100 and the learning rate of 0.01 for 1000 epochs, while NK is trained for
500 epochs with the batch size of 200 and the learning rate of 0.0001, with the other hyperparameters being
γ=0.1, ϵ=0.02, λ=0.01. In the micromorphic case, the yield function consists of 9 stress components,
such that the input layer with 9 neurons is used. The MLP with the same hidden dimension is trained with
the batch size of 100 and the learning rate of 0.005 for 1000 epochs, while the NK architecture that consists
of three dense layers of 512, 512, and 1024 neurons is trained for 100 epochs with the batch size of 500,
learning rate of 0.0002, with the other hyperparameters being γ=0.1, ϵ=0.05, λ=0.01.
4.2.4 Training of neural kernels
The training and validation loss histories of different combinations of yield surfaces and models are
shown in Figure 10. The training loss is generally higher than the validation loss due to the additional term
in the training loss function Eq. (17) that controls the sign of the yield function.
The spikes in the loss history are probably due to the mini-batch gradient-descent optimization of the loss function, where the loss is not guaranteed to decrease monotonically, and the gradient evaluated on some data batches may be large enough to create instability in the loss history. In general, NK has a higher training loss but a lower validation loss than MLP-LS, which is more likely to overfit the data. The validation loss is defined by |f_θ(x_i)| given the test data x_i, which does not reflect the accuracy of prediction defined by ||x_i − x̂_i||, where x̂_i is the predicted yield point (f_θ(x̂_i) = 0), but the convergence of the validation loss reflects a decent accuracy of prediction.
Fig. 10: Training and validation losses of different combinations of yield surfaces and models. The microp-
olar and micromorphic yield surfaces can be learned by either NK or MLP-LS (LS).
4.2.5 Micropolar yield surfaces
Since the yield surface for higher-order continua depends on more than 3 variables, it is not feasible
to fully visualize the learned yield surface geometrically in a three-dimensional space. As such, we pro-
jected the high-dimensional yield surface onto 2D stress planes (where the rest of the stress components
and internal variables are fixed) to demonstrate the geometrical features of the yield surfaces in the high-
dimensional models. We then further examine the results via 2D prediction vs. ground-truth plots, samples
of stress paths, and stress-strain curves for individual stress/strain components.
Micropolar yield surfaces projected onto 2D stress planes. The micropolar yield surface is visualized by
the cross sections projected on 10 stress planes of different combinations of stress components, as shown in
Figure 11; the ground truth yield data are projected to the 10 stress planes as well and compared with the
yield points predicted by NK and MLP-LS. We observe that both NK and MLP-LS are able to capture the complex features of the higher-dimensional yield data. Furthermore, the yield surfaces parametrized by both approaches are geometrically similar in most regions, except for a few locations where the training data is sparse, e.g., the top left corner of the yield surface projected onto the first stress plane in Figure 11. This convergence of the learned functions indicates that the data is sufficiently abundant to constrain the geometry of the yield surfaces. As such, the feature extraction enabled by the data-dependent kernel of NK does not lead to significant differences in the data-rich regions.
To gain a better understanding of the difference between the two predicted surfaces, we inspect the yield surface projected onto the first stress plane and generate two additional unseen data points, as shown in Figure 12. It is observed that, although MLP-LS fits the training data more accurately, NK outperforms MLP-LS when extrapolating to the missing data. The same phenomenon is observed in Figure 8 in the previous example. The reason why NK is better at extrapolating is probably that the function space of the NK model is spanned by a controlled number of basis kernel functions, as shown in Figure 3, such that the pattern of the yield surface is recognized by the linear Galerkin projections of the data onto the finite-dimensional function space; the yield surface can then be extrapolated more reasonably based on the learned pattern.
Accuracy on unseen data. The unseen test yield data (sampled within the identical distribution of the train
data) are compared with the yield point predictions component-wisely in Figure 13.In this Prediction vs.
Ground Truth figure, a perfect result would be a straight line with a 45-degree inclination angle. In our case,
Neural kernel for micromorphic plasticity 19
Predicted Surface by MLP-LS
Predicted Surface
f
= 0.05 by MLP-LS
Predicted Surface
f
= + 0.05 by MLP-LS
Predicted Surface by NK
Predicted Surface
f
= 0.01 by NK
Predicted Surface
f
= + 0.01 by NK
Test Data Projection
Test Data on Plane
Fig. 11: Cross sections of the micropolar yield surface reconstructed by MLP-LS and NK.
Predicted Surface by MLP-LS
Predicted Surface by NK
Train Data
Unseen Data
Fig. 12: Considering the σ11-σ22 plane, although MLP-LS is able to fit the training data more accurately, NK
is able to better extrapolate the surface with missing data and fit the unseen data more accurately.
the micropolar yield surface generated by MK and MLP-LS both demonstrate sufficient accuracy, with the
MLP-LS performing slightly better. This performance difference could be attributed by the different ways
NK and MLP-LS parameterize the yield surface. In the NK case, the inductive bias obtained from the
training is generated via the finite-dimensional space spanned by the data-dependent basis kernels which
20 Zeyu Xiong et al.
is used to express the yield surface. In the MLP-LS case, the yield surface is directly parametrized by the
neural network weights. This setting is less restrictive than the kernel approach where the learned yield
function must be an element of the space spanned by the kernel basis. As such, assuming that the data
is sufficiently populated in this micropolar experiment, the performance gain of MLP-LS might be attributed to the higher expressivity of the MLP neural network (Raghu et al., 2017).
Fig. 13: Micropolar test data compared component-by-component with predictions by MLP-LS and NK.
4.2.6 Micromorphic yield surfaces
Compared to the micropolar case, the dimension of the micromorphic yield surface is higher. With
9 stress components and an internal variable as the input, the yield surface would require significantly
more data to populate the parametric space. Theoretically, the number of data points needed to produce similar performance increases exponentially with the dimension (Mohri et al., 2018), which means that if 10^3 data points are needed to reconstruct the micropolar yield surface accurately, then approximately 10^6 data points are needed in the micromorphic case to reach similar performance. This data sparsity induced by the
high dimensionality might contribute to the fact that there are so few attempts to hand-craft yield surfaces
for micromorphic continua and most of them are simply an extension of the Cauchy continuum counter-
part. As such, many micromorphic and micropolar simulations rely on the computationally expensive FEM² multiscale approach (Geers et al., 2010; Hütter, 2019). Since it is increasingly unre-
alistic to always expect that we will have sufficient data when we increase the dimensionality of the model
to incorporate the micromorphic effect, the robustness of the learning algorithms in the sparse data regime
becomes critical.
In this micromorphic example, 16904 data points (split into 13500 training data and 3404 test data) are
used to train and test the model. In other words, we have only increased the training data by 2.25 times (cf. Appendix B.2), whereas the data demand is expected to increase by three orders of magnitude. This sparsity of data is
intentional, as we would like to investigate whether the feature space generated from the data-dependent
kernel may enable us to generate a more robust inductive bias for extrapolation when the data is limited.
Micromorphic yield surfaces projected onto 2D stress planes. We first examine the NK and MLP-LS yield surfaces by projecting them onto 36 different stress planes. Comparing Figure 14 with Figure 11 shows that the micromorphic data are distributed in a much sparser manner due to the increased dimensionality. In many
of the planes shown in Figure 14, there is not even a single data point lying on the plane (the orange points). The sparsity showcased in these 2D planes indicates that the inductive bias inferred from the
rest of the data becomes the only dominant factor dictating the prediction accuracy in those data-missing
regions.
In the regions far away from the DNS data (see the $\sigma_{22}$-$\zeta_{111}$, $m_{23}$-$\zeta^{\mathrm{sym}}_{212}$, and $\zeta_{111}$-$\zeta^{\mathrm{sym}}_{212}$ planes), the MLP-LS method tends to generate yield surfaces of more complex shapes that vary significantly among different
2D stress planes. On the other hand, the NK yield functions (see Figure 14) have significantly fewer concave regions and generally maintain a convex shape. The yield surface projected on different stress planes also
exhibits more consistent geometrical patterns than those obtained from MLP-LS. This difference in the
resultant yield functions is attributed to the sparsity of the data, which makes the learned function depend
more significantly on the hypothesis sets employed by the NK and MLP-LS models (cf. Mohri et al. (2018)).
Remarkably, in this data-limited regime, the NK model seems to be capable of exploiting the structure and similarity of the data in the feature space. This exploitation of the structure and similarity of the data seems to help prevent overfitting (and hence yields the less complex yield surface) as well as enabling the learned model to be consistent with the underlying physical laws obeyed by the data. In particular,
while both the NK and MLP-LS algorithms are trained on a DNS data set compatible with thermodynamic principles, only the NK model yields a convex yield surface consistent with the thermodynamic constraint where data are sparse. This result is particularly interesting because the convexity of the yield function has not been explicitly enforced via the loss function or a specific neural network architecture design.
Accuracy on unseen data. As shown in Figure 15, both NK and MLP-LS are able to predict the unseen micromorphic test data with reasonable accuracy. However, due to the high dimensionality and sparsity of the data, similar accuracy in predicting a limited set of test data does not necessarily imply similarity in the geometry of the learned yield function.
As such, test data sampled from the same distribution as the training data may not be sufficient to paint a complete picture of the prediction performance. Instead, adversarial sampling in regions distant from the training data may provide more useful insight into the robustness of the model, as shown in the numerical examples demonstrated in Figures 8 and 12.
4.2.7 Validation exercise for constitutive responses along unseen loading paths
This subsection illustrates the hardening process via the evolution of the yield surface and validates the NK return mapping algorithm (Algorithm 2) against the benchmark DNS stress history. To realistically reproduce
the stress curve, the hardening process is modeled by introducing the magnitude of cumulative plastic
strain as the internal variable, equivalent to an additional dimension of the higher-order yield surface; the
initial yield surface expands as the internal variable increases. To further validate the constitutive modeling
with hardening, the benchmark DNS stress history as the ground truth is first generated by controlling
the deformation rate and solving the local BVP with the prescribed time-dependent boundary conditions.
Another stress history is then produced by the NK return mapping algorithm and validated against the DNS benchmark. Given that the micropolar and micromorphic data could be too sparse, the DNS simulations
that generate the data are run sequentially with the machine learning step described in Section 3. This
adaptive strategy enables us to sample additional stress data at the locations where local support of the
NK-based yield function is needed to amend the neural kernels. Note that a more rigorous active sampling
strategy could potentially be derived via deep reinforcement learning, as shown in Wang et al. (2021);
Villarreal et al. (2023). The rational design of experiments or sampling of data via deep reinforcement
learning may potentially improve the robustness of the learned yield surface, but is out of the scope of
this study. Two return mapping examples are presented, one with the micropolar and the other with the
micromorphic constitutive model.
As shown in Algorithm 2, the implementation of the NK-enabled return mapping algorithm requires
two ingredients: the elastic energy functional and the plastic yield function that evolves with an internal
variable. The elastic energy functional with respect to strain and higher-order kinematic modes is first
trained by Sobolev training with the loss function of Eq. (24), where the predicted elastic energy and its
gradients are both constrained (cf. Vlassis and Sun (2021)). To simulate the hardening process upon yielding, the yield function, expressed as a function of both the individual stress components and the internal variable Λ, is trained via the NK Algorithm 1, where Λ is the magnitude of the cumulative plastic strain.
$$
\mathcal{L} =
\begin{cases}
\sum_i \left|W_i - \hat{W}_i\right|^2 + \sum_i \left\|\boldsymbol{\sigma}_i - \dfrac{\partial \hat{W}_i}{\partial \boldsymbol{\varepsilon}}\right\|^2 + \sum_i \left\|\boldsymbol{m}_i - \dfrac{\partial \hat{W}_i}{\partial \boldsymbol{\kappa}}\right\|^2 & \text{in the micropolar case}\\[2mm]
\sum_i \left|W_i - \hat{W}_i\right|^2 + \sum_i \left\|\boldsymbol{\sigma}_i - \dfrac{\partial \hat{W}_i}{\partial \boldsymbol{\varepsilon}}\right\|^2 + \sum_i \left\|\boldsymbol{\zeta}_i - \dfrac{\partial \hat{W}_i}{\partial \boldsymbol{G}}\right\|^2 & \text{in the micromorphic case}
\end{cases}
\tag{24}
$$
The micropolar return mapping example is first presented to introduce a test case with the microstrain being neglected. The deformation rate is controlled such that $\dot{\varepsilon}_{11} = \dot{\varepsilon}_{22} = 6\times 10^{-7}$ and $\dot{\kappa}_{32} = 9.5\times 10^{-5}$ for $t < 200$,
and the opposite deformation rate is applied for $t > 200$, where $t$ is the pseudo time. The stress history is then simulated from DNS under the elastoplastic loading and elastic unloading with the prescribed boundary condition in Eq. (8); all the stress and couple stress components are homogenized.
Fig. 14: Cross sections of the micromorphic yield surface reconstructed by MLP-LS and NK.

The stress history is projected onto the stress plane of the combination of each stress component and m23, which is the
component conjugate to the main kinematic mode κ32. The yield surfaces with different internal variables
are reconstructed by NK and projected to the same stress planes, as shown in Figure 16.

Fig. 15: Micromorphic test data compared component-by-component with predictions by MLP-LS and NK.

It is observed that
as the stress propagates, the internal variable increases, and the yield surface evolves such that the elastic
region expands.
Fig. 16: Micropolar yield surface evolution with the stress path, projected to the stress planes consisting of different micropolar stress components combined with m23, where m23 is the stress component conjugate to the micropolar kinematic deformation κ32.
In addition to the DNS stress history as the ground truth, another stress history from the return map-
ping algorithm of NK is also simulated by prescribing the same deformation history. The material is first
loaded elastically such that the stress increments follow the elastic energy functional W(ε,κ); when the
yield is detected, the return mapping algorithm is initiated and the internal variable is incremented to model the hardening process; finally, elastic unloading is applied, and the stress decreases along a curve that reveals the permanent plastic deformation. Since some stress components remain zero as shown in
Figure 16, only the nonzero components are presented in Figure 17, and the consistency in the component-
wise comparison validates the micropolar NK return mapping.
Fig. 17: The history of micropolar stress components is accurately predicted given the history of the kine-
matic deformation κ32.
One interesting finding observed from Figure 17 is that the bending kinematic mode can affect the
Cauchy stress. Most handwritten models are chiral, i.e., the higher-order kinematic modes and stresses are independent of the Cauchy stress and strain (Larsson and Diebels, 2007), to avoid modeling the complex coupling effect of non-chiral materials. The geomaterial we are studying is a non-chiral material
according to the stress pattern shown in Figure 18, because under the bending mode, the tension part starts
to yield such that the stress stops increasing, while the compression part remains elastic with the normal
stresses decreasing. As a result, the homogenized σ11 and σ22 decrease under the higher-order modes, and
our NK method is able to capture this non-chiral effect.
Fig. 18: Patterns of the stress components σ11, σ22, and σ12 indicate that the geomaterial is non-chiral.
In the micromorphic case, the stress history involves more components due to the additional kinematic
modes. The deformation modes are controlled by the rates $\dot{\varepsilon}_{11} = \dot{\varepsilon}_{22} = 6\times 10^{-7}$ and $\dot{G}_{222} = 3.5\times 10^{-4}$ for $t < 200$, and the opposite rate is applied for $t > 200$. The stress history is then simulated from DNS under
the elastoplastic loading and elastic unloading with the prescribed boundary condition in Eq. (8); all the
stress and couple stress components are homogenized. The stress history is projected onto the stress plane
of the combination of each stress component and ζ222. The yield surfaces evolving with the internal variable
are reconstructed by NK and projected to the same stress planes, as shown in Figure 19.
Fig. 19: Micromorphic yield surface evolution with stress path projected to the stress planes consisting of
different micromorphic stress components combined with ζ222, where ζ222 is the stress component conju-
gate to the micromorphic kinematic deformation G222.
Similar to the micropolar case, the stress history is simulated by DNS and predicted by the NK return mapping under elastoplastic loading and elastic unloading, with more kinematic modes involved. The non-zero
components of the stress history are compared and found to be consistent as shown in Figure 20, which
validates the micromorphic NK return mapping algorithm and shows the robustness of the NK method in
modeling higher-order plasticity problems.
5 Conclusions
This paper presents the NK and MLP-LS methods to reconstruct the higher-order yield surface in order to model the path-dependent constitutive law of a material with a complex micro-structure, e.g., a layered geo-material. The DNS constitutive data are first collected by solving local BVPs over the RVE domain, and then used to train the MLP-based elastic energy functional and the NK-based narrow-band yield function, which reproduce the path-dependent constitutive relation via a return mapping algorithm.
Two examples are presented to evaluate the performance of NK compared to MLP-LS. In the first example, NK and MLP-LS are both able to reproduce a simple analytical plasticity yield model, verified by the yield surface and the path-dependent stress curves. However, in the following case study, NK significantly outperforms MLP-LS in the accuracy of the reconstructed yield surface and stress curves, given the poorly distributed data with missing patches. In the second example, the two methods are validated
by the micropolar and micromorphic DNS constitutive data of a layered geo-material. It is observed from
the micropolar yield surface that NK outperforms MLP-LS in extrapolating unseen data, which is also ob-
served in the first example. We attribute this to the fact that NK learns the pattern of the higher-dimensional surface by projecting the data onto a finite-dimensional kernel function space and then extrapolates
from the learned pattern. The micropolar return mapping results show that the material is non-chiral (i.e., the Cauchy and higher-order stresses are coupled), which is also observed from the distortion of the internal RVE structure; the NK method is able to model this non-chiral effect, which is hard to capture with handwrit-
ten models. The micromorphic results of yield surface reconstruction and return mapping further show
that the NK method is generalizable to a much higher-dimensional stress space and decently reproduces the path-dependent constitutive responses given limited and missing data.

Fig. 20: The history of micromorphic stress components is accurately predicted given the history of the kinematic deformation G222.
6 Acknowledgments
The authors would like to thank the three anonymous reviewers for the constructive feedback and sug-
gestions. WCS and NNV are primarily supported by the Department of Energy DE-NA0003962. The rest
of the authors are supported by the National Science Foundation under grant contracts CMMI-1846875
and the Dynamic Materials and Interactions Program from the Air Force Office of Scientific Research un-
der grant contracts FA9550-21-1-0391 with the major equipment supported by FA9550-21-1-0027, and the
MURI Grant No. FA9550-19-1-0318. These supports are gratefully acknowledged. The views and conclu-
sions contained in this document are those of the authors, and should not be interpreted as representing
the official policies, either expressed or implied, of the sponsors, including the U.S. Government. The U.S.
Government is authorized to reproduce and distribute reprints for Government purposes notwithstand-
ing any copyright notation herein.
Appendix A Some mathematical background of higher-order continua
This section provides the mathematical derivation of the theory in Section 2. We start with the proof of
Hill’s Lemma of micromorphic, micropolar, and Cauchy materials, which is used to derive the admissible
boundary condition for the higher-order continua. Based on the admissible boundary condition, the stress
homogenization scheme satisfying Hill-Mandel’s condition is derived.
A.1 Proof of Hill’s Lemma
The higher-order continuum is studied on an RVE domain with the local coordinate y as shown in Figure 1. Hill's Lemma of the micromorphic continuum can be derived as Eq. (25), where the notation $\bar{(\cdot)} = \frac{1}{V}\int (\cdot)\,\mathrm{d}V$.

$$
\overline{\sigma_{ji}\varepsilon_{ij}} - \bar{\sigma}_{ji}\bar{\varepsilon}_{ij} + \overline{\zeta_{ijk}G_{ijk}} - \bar{\zeta}_{ijk}\bar{G}_{ijk}
= \frac{1}{V}\int \left(n_k\sigma_{ki} - n_k\bar{\sigma}_{ki}\right)\left(u_i - \bar{u}_{i,j}Y_j - \tfrac{1}{2}\bar{G}_{ijl}Y_jY_l\right)\mathrm{d}S
+ \frac{1}{V}\int \left(n_k\zeta_{ijk} - n_k\bar{\zeta}_{ijk}\right)\left(\chi_{ij} - \bar{G}_{ijl}Y_l\right)\mathrm{d}S
+ \bar{\sigma}_{ji}\bar{\chi}_{ij}
\tag{25}
$$
The Hill's Lemma shown in Eq. (25) contains two surface integral terms. To prove the Hill's Lemma, the two terms are expanded separately by the divergence theorem. The first term of Eq. (25), named "part 1", is the surface integral regarding the Cauchy stress. The "part 1" surface integral can be expanded into two volume integrals by the divergence theorem and the product rule, with the first volume integral being zero given $\sigma_{ki,k} = \bar{\sigma}_{ki,k} = 0$, as shown in Eq. (26).

$$
\begin{aligned}
\text{part 1} &:= \frac{1}{V}\int (n_k\sigma_{ki} - n_k\bar{\sigma}_{ki})\left(u_i - \bar{u}_{i,j}Y_j - \tfrac{1}{2}\bar{G}_{ijl}Y_jY_l\right)\mathrm{d}S \\
&= \frac{1}{V}\int (\sigma_{ki,k} - \bar{\sigma}_{ki,k})\left(u_i - \bar{u}_{i,j}Y_j - \tfrac{1}{2}\bar{G}_{ijl}Y_jY_l\right)\mathrm{d}V + \frac{1}{V}\int (\sigma_{ki} - \bar{\sigma}_{ki})\left(u_i - \bar{u}_{i,j}Y_j - \tfrac{1}{2}\bar{G}_{ijl}Y_jY_l\right)_{,k}\mathrm{d}V \\
&= \frac{1}{V}\int (\sigma_{ki} - \bar{\sigma}_{ki})\left(u_i - \bar{u}_{i,j}Y_j - \tfrac{1}{2}\bar{G}_{ijl}Y_jY_l\right)_{,k}\mathrm{d}V
\end{aligned}
\tag{26}
$$
The "part 1" term can be further expanded based on Eq. (26), and some terms can be canceled given $\int(\sigma_{ki} - \bar{\sigma}_{ki})\,\mathrm{d}V = 0$ and $\bar{G}_{ijk} = \bar{G}_{ikj}$, such that the result can be simplified as shown in Eq. (27).

$$
\begin{aligned}
\text{part 1} &= \frac{1}{V}\int (\sigma_{ki} - \bar{\sigma}_{ki})\left(u_i - \bar{u}_{i,j}Y_j - \tfrac{1}{2}\bar{G}_{ijl}Y_jY_l\right)_{,k}\mathrm{d}V \\
&= \frac{1}{V}\int (\sigma_{ki} - \bar{\sigma}_{ki})\,u_{i,k}\,\mathrm{d}V - \frac{1}{V}\int (\sigma_{ki} - \bar{\sigma}_{ki})\,\mathrm{d}V\,(\bar{u}_{i,j}\delta_{jk}) - \frac{1}{V}\int (\sigma_{ki} - \bar{\sigma}_{ki})\left(\tfrac{1}{2}\bar{G}_{ijl}Y_jY_l\right)_{,k}\mathrm{d}V \\
&= \overline{\sigma_{ki}u_{i,k}} - \bar{\sigma}_{ki}\bar{u}_{i,k} - \tfrac{1}{2}\bar{G}_{ijl}\,\frac{1}{V}\int (\sigma_{ki} - \bar{\sigma}_{ki})(\delta_{kj}Y_l + \delta_{lj}Y_k)\,\mathrm{d}V && \left(\text{given } \int(\sigma_{ki}-\bar{\sigma}_{ki})\,\mathrm{d}V = 0\right) \\
&= \overline{\sigma_{ji}u_{i,j}} - \bar{\sigma}_{ji}\bar{u}_{i,j} - \bar{G}_{ijk}\,\frac{1}{V}\int \sigma_{ji}Y_k\,\mathrm{d}V && \left(\text{given } \bar{G}_{ijk} = \bar{G}_{ikj}\right)
\end{aligned}
\tag{27}
$$
The second term of Eq. (25), named "part 2", is the surface integral involving the micromorphic generalized stress. The "part 2" surface integral can be expanded by the divergence theorem and simplified similarly to the previous derivation.

$$
\begin{aligned}
\text{part 2} &:= \frac{1}{V}\int (n_k\zeta_{ijk} - n_k\bar{\zeta}_{ijk})(\chi_{ij} - \bar{G}_{ijl}Y_l)\,\mathrm{d}S \\
&= \frac{1}{V}\int n_k\zeta_{ijk}\chi_{ij}\,\mathrm{d}S + \frac{1}{V}\int n_k\bar{\zeta}_{ijk}\bar{G}_{ijl}Y_l\,\mathrm{d}S - \frac{1}{V}\int n_k\bar{\zeta}_{ijk}\chi_{ij}\,\mathrm{d}S - \frac{1}{V}\int n_k\zeta_{ijk}\bar{G}_{ijl}Y_l\,\mathrm{d}S \\
&= \frac{1}{V}\int (\zeta_{ijk,k}\chi_{ij} + \zeta_{ijk}G_{ijk})\,\mathrm{d}V + \bar{\zeta}_{ijk}\bar{G}_{ijk} - \bar{\zeta}_{ijk}\bar{G}_{ijk} - \frac{1}{V}\int \zeta_{ijk,k}\bar{G}_{ijl}Y_l\,\mathrm{d}V - \bar{\zeta}_{ijk}\bar{G}_{ijk} \\
&= \overline{\zeta_{ijk}G_{ijk}} - \bar{\zeta}_{ijk}\bar{G}_{ijk} + \frac{1}{V}\int \zeta_{ijk,k}(\chi_{ij} - \bar{G}_{ijl}Y_l)\,\mathrm{d}V
\end{aligned}
\tag{28}
$$
Adding Eqs. (27) and (28), the Hill's lemma Eq. (25) can be proven. Given the balance equation $\sigma_{ji} + \zeta_{ijk,k} = 0$, all additional terms can be canceled other than the final result consisting of five terms of volume averages, as shown in Eq. (29). This concludes the proof of Eq. (25).

$$
\begin{aligned}
&\frac{1}{V}\int (n_k\sigma_{ki} - n_k\bar{\sigma}_{ki})\left(u_i - \bar{u}_{i,j}Y_j - \tfrac{1}{2}\bar{G}_{ijl}Y_jY_l\right)\mathrm{d}S + \frac{1}{V}\int (n_k\zeta_{ijk} - n_k\bar{\zeta}_{ijk})(\chi_{ij} - \bar{G}_{ijl}Y_l)\,\mathrm{d}S \\
&= \overline{\sigma_{ji}u_{i,j}} - \bar{\sigma}_{ji}\bar{u}_{i,j} - \bar{G}_{ijk}\,\frac{1}{V}\int \sigma_{ji}Y_k\,\mathrm{d}V + \overline{\zeta_{ijk}G_{ijk}} - \bar{\zeta}_{ijk}\bar{G}_{ijk} + \frac{1}{V}\int \zeta_{ijk,k}(\chi_{ij} - \bar{G}_{ijl}Y_l)\,\mathrm{d}V \\
&= \overline{\sigma_{ji}\varepsilon_{ij}} - \bar{\sigma}_{ji}\bar{\varepsilon}_{ij} + \overline{\zeta_{ijk}G_{ijk}} - \bar{\zeta}_{ijk}\bar{G}_{ijk} - \bar{\sigma}_{ji}\bar{\chi}_{ij} + \frac{1}{V}\int (\sigma_{ji} + \zeta_{ijk,k})(\chi_{ij} - \bar{G}_{ijl}Y_l)\,\mathrm{d}V \\
&= \overline{\sigma_{ji}\varepsilon_{ij}} - \bar{\sigma}_{ji}\bar{\varepsilon}_{ij} + \overline{\zeta_{ijk}G_{ijk}} - \bar{\zeta}_{ijk}\bar{G}_{ijk} - \bar{\sigma}_{ji}\bar{\chi}_{ij}
\end{aligned}
\tag{29}
$$
A.2 Hill’s Lemma for micropolar and Cauchy continua
In the case where the micropolar continuum is considered, the generalized stress is replaced by the couple stress $m_{ji} = \epsilon_{imn}\zeta_{mnj}$, and the micro-deformation is replaced by the micro-rotation $\theta_i = \frac{1}{2}\epsilon_{ijk}\chi_{jk}$, where $\epsilon_{ijk}$ is the Levi-Civita permutation symbol. As a result, the curvature tensor becomes $\kappa_{ij} = \theta_{i,j} = \frac{1}{2}\epsilon_{imn}G_{mnj}$, and Hill's lemma can be reduced to Eq. (30).

$$
\begin{aligned}
\overline{\sigma_{ji}\varepsilon_{ij}} - \bar{\sigma}_{ji}\bar{\varepsilon}_{ij} + \overline{m_{ji}\kappa_{ij}} - \bar{m}_{ji}\bar{\kappa}_{ij}
&= \frac{1}{V}\int (n_k\sigma_{ki} - n_k\bar{\sigma}_{ki})\left(u_i - \bar{u}_{i,j}Y_j + \epsilon_{ijk}\bar{\kappa}_{kl}Y_jY_l\right)\mathrm{d}S \\
&\quad + \frac{1}{V}\int (n_k m_{ki} - n_k\bar{m}_{ki})(\theta_i - \bar{\kappa}_{ij}Y_j)\,\mathrm{d}S - \epsilon_{ijk}\bar{\sigma}_{ji}\bar{\theta}_k
\end{aligned}
\tag{30}
$$
For the Cauchy continuum, the Hill's Lemma is further reduced to Eq. (31) by neglecting the micro-deformation and the higher-order generalized stress.

$$
\overline{\sigma_{ji}\varepsilon_{ij}} - \bar{\sigma}_{ji}\bar{\varepsilon}_{ij} = \frac{1}{V}\int (n_k\sigma_{ki} - n_k\bar{\sigma}_{ki})(u_i - \bar{u}_{i,j}Y_j)\,\mathrm{d}S
\tag{31}
$$
A.3 Admissible RVE boundary conditions
The admissible boundary condition of the RVE can be derived from the Hill's Lemma, which should satisfy the Hill-Mandel's condition in the form of Eq. (32), such that the left-hand side of Eq. (25) vanishes.

$$
\overline{\sigma_{ji}\varepsilon_{ij}} - \bar{\sigma}_{ji}\bar{\varepsilon}_{ij} + \overline{\zeta_{ijk}G_{ijk}} - \bar{\zeta}_{ijk}\bar{G}_{ijk} = 0
\tag{32}
$$
The Hill-Mandel's condition is satisfied if the three terms on the right-hand side of Eq. (25), including two surface integrals and one volume-average term, become zero, as shown in Eq. (33).

$$
\begin{aligned}
&\frac{1}{V}\int (n_k\sigma_{ki} - n_k\bar{\sigma}_{ki})\left(u_i - \bar{u}_{i,j}Y_j - \tfrac{1}{2}\bar{G}_{ijl}Y_jY_l\right)\mathrm{d}S = 0 \\
&\frac{1}{V}\int (n_k\zeta_{ijk} - n_k\bar{\zeta}_{ijk})(\chi_{ij} - \bar{G}_{ijl}Y_l)\,\mathrm{d}S = 0 \\
&\bar{\sigma}_{ji}\bar{\chi}_{ij} = 0
\end{aligned}
\tag{33}
$$
One admissible boundary condition that satisfies the Hill-Mandel's condition can be derived based on the macroscopic strain tensor $\bar{\varepsilon}_{ij}$ and the gradient of micro-deformation $\bar{G}_{ijk}$ (Larsson and Diebels, 2007; Jänicke and Diebels, 2010). In the case where the base materials of the RVE are Cauchy continua (i.e., Cauchy to higher-order upscaling), $\chi_{ij} = 0$ internally and $\bar{\chi}_{ij} = 0$, such that $\bar{u}_{i,j} = \bar{\varepsilon}_{ij}$; the boundary condition shown in Eq. (34) can then be derived, which satisfies Hill-Mandel's condition by satisfying Eq. (33).

$$
u_i = \bar{\varepsilon}_{ij}Y_j + \tfrac{1}{2}\bar{G}_{ijk}Y_jY_k, \qquad \chi_{ij} = \bar{G}_{ijk}Y_k \qquad \text{on the RVE boundary}
\tag{34}
$$
The 3rd-order tensor $G_{ijk}$ is further split into the symmetric part $G^{\mathrm{sym}}_{ijk} = (G_{ijk} + G_{jik})/2$ and the skew-symmetric part $G^{\mathrm{skw}}_{ijk} = (G_{ijk} - G_{jik})/2$, such that the higher-order modes can be separated into micropolar bending modes and microstrain modes. The micropolar behavior of the material can be studied by prescribing the bending modes $\kappa_{ij} = \theta_{i,j} = \frac{1}{2}\epsilon_{imn}G_{mnj} = \frac{1}{2}\epsilon_{imn}G^{\mathrm{skw}}_{mnj}$. As a result, all the characteristic kinematic modes in the 2D case are presented in Figure 2, such that the prescribed boundary condition is a linear combination of those characteristic modes.
In the special case when the micropolar or Cauchy continuum is considered, the boundary condition is reduced to the form of Eq. (35), where $\bar{G}_{ijk}$ becomes $\epsilon_{ijl}\bar{\kappa}_{lk}$ for the micropolar continuum and becomes zero for the Cauchy continuum.

$$
\begin{aligned}
&u_i = \bar{\varepsilon}_{ij}Y_j - \tfrac{1}{2}\epsilon_{ijl}\bar{\kappa}_{lk}Y_jY_k, \quad \theta_i = \bar{\kappa}_{ij}Y_j && \text{on the RVE boundary, for the micropolar continuum} \\
&u_i = \bar{\varepsilon}_{ij}Y_j && \text{on the RVE boundary, for the Cauchy continuum}
\end{aligned}
\tag{35}
$$
A.4 Homogenization based on Hill-Mandel’s condition
The homogenization of stress and generalized stress should still satisfy Hill-Mandel's condition in the case of Cauchy to micromorphic homogenization. In the case of Cauchy to higher-order upscaling, the $\overline{\zeta_{ijk}G_{ijk}}$ term in Eq. (32) vanishes because $\zeta_{ijk}$ vanishes in Cauchy continua, and $G_{ijk}$ is redefined as $G_{ijk} = u_{i,jk}$. The Hill-Mandel's condition can be rewritten in the form of Eq. (36).

$$
\overline{\sigma_{ji}\varepsilon_{ij}} = \bar{\sigma}_{ji}\bar{\varepsilon}_{ij} + \bar{\zeta}_{ijk}\bar{G}_{ijk}
\tag{36}
$$
The left-hand side of Eq. (36) can be rewritten into surface integral terms by the divergence theorem, as shown in Eq. (37).

$$
\overline{\sigma_{ji}\varepsilon_{ij}} = \frac{1}{V}\int \sigma_{ji}u_{i,j}\,\mathrm{d}V = \frac{1}{V}\int n_l\sigma_{li}u_i\,\mathrm{d}S
\tag{37}
$$
Based on the admissible boundary condition shown in Eq. (34), Eq. (37) is rewritten in terms of $\bar{\varepsilon}_{ij}$ and $\bar{G}_{ijk}$ as shown in Eq. (38).

$$
\overline{\sigma_{ji}\varepsilon_{ij}} = \frac{1}{V}\int n_l\sigma_{li}u_i\,\mathrm{d}S = \frac{1}{V}\int n_l\sigma_{li}Y_j\,\mathrm{d}S \cdot \bar{\varepsilon}_{ij} + \frac{1}{V}\int \tfrac{1}{2}\,n_l\sigma_{li}Y_jY_k\,\mathrm{d}S \cdot \bar{G}_{ijk}
\tag{38}
$$
Subtracting Eq. (38) from Eq. (36), the homogenized stress $\bar{\sigma}_{ji}$ and generalized stress $\bar{\zeta}_{ijk}$ can be derived as in Eq. (39).

$$
\left(\bar{\sigma}_{ji} - \frac{1}{V}\int n_l\sigma_{li}Y_j\,\mathrm{d}S\right)\cdot\bar{\varepsilon}_{ij} + \left(\bar{\zeta}_{ijk} - \frac{1}{V}\int \tfrac{1}{2}\,n_l\sigma_{li}Y_jY_k\,\mathrm{d}S\right)\cdot\bar{G}_{ijk} = 0
\;\Longrightarrow\;
\begin{cases}
\bar{\sigma}_{ji} = \dfrac{1}{V}\displaystyle\int n_l\sigma_{li}Y_j\,\mathrm{d}S = \dfrac{1}{V}\displaystyle\int \sigma_{ji}\,\mathrm{d}V \\[3mm]
\bar{\zeta}_{ijk} = \dfrac{1}{V}\displaystyle\int \tfrac{1}{2}\,n_l\sigma_{li}Y_jY_k\,\mathrm{d}S = \dfrac{1}{V}\displaystyle\int \tfrac{1}{2}\left(\sigma_{ji}Y_k + \sigma_{ki}Y_j\right)\mathrm{d}V
\end{cases}
\tag{39}
$$
For the micromorphic continuum, $\bar{\sigma}_{ij}$ is the homogenized Cauchy stress conjugate to $\bar{\varepsilon}_{ij}$ and $\bar{\zeta}_{ijk}$ is the generalized stress conjugate to $\bar{G}_{ijk}$, as computed in Eq. (39). For the Cauchy continuum, only $\bar{\sigma}_{ij}$ is effective, as $G_{ijk}$ is neglected. For the micropolar continuum, the couple stress $\bar{m}_{ij}$ is derived from the skew-symmetric part of $\bar{\zeta}_{ijk}$, such that $\bar{m}_{ij} = \epsilon_{jmn}\bar{\zeta}_{mni}$ is conjugate to the curvature tensor $\bar{\kappa}_{ij} = \tfrac{1}{2}\epsilon_{imn}\bar{G}_{mnj}$. In the special case where only the symmetric part of $\bar{\zeta}_{ijk}$ is considered, $\bar{\zeta}^{\mathrm{sym}}_{ijk} = \tfrac{1}{2}(\bar{\zeta}_{ijk} + \bar{\zeta}_{jik})$ is homogenized (Biswas and Poh, 2017) to represent the microstrain generalized stress.
Appendix B Data generation
B.1 Analytical data generation
The data are generated based on the yield function shown in Eq. (19) with the yield strength $Y = 3/2$ and the length scale $l = 1$. The data are generated by first determining the loading direction $(\hat{\sigma}_{11}, \hat{\sigma}_{22}, \hat{\sigma}_{12}, \hat{m}_{13}/l, \hat{m}_{23}/l)$ in terms of the 4 parameters $(\theta_0, \theta_1, \theta_2, \theta)$; the magnitude of the yield stress is then found given the loading direction, as shown in Algorithm 3. As a result, a point cloud data set of size 20000 is sampled from the yield surface, which is then split into a training set of 15000 data points and a test set of 5000 data points.
Algorithm 3 Data sampling from the micropolar J2 plasticity model.
Require: yield function $f(\boldsymbol{\sigma}, \mathbf{m})$
1. Sample the parametric variables with equal spacing (linspace).
Iterate through $\theta_1 =$ numpy.linspace($\pi/4$, $5\pi/4$, 10) and $\theta = \theta_0 = \theta_2 =$ numpy.linspace($0$, $\pi$, 10).
2. Define the stress and couple stress directions in terms of the parametric variables.
Define $(\hat{\sigma}_1, \hat{\sigma}_2, \hat{m}_{13}/l, \hat{m}_{23}/l) = (\cos\theta_0\cos\theta_1,\ \cos\theta_0\sin\theta_1,\ \sin\theta_0\cos\theta_2,\ \sin\theta_0\sin\theta_2)$.
Compute the directional tensors
$$
\hat{\boldsymbol{\sigma}} = \begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix}\begin{bmatrix}\hat{\sigma}_1 & 0\\ 0 & \hat{\sigma}_2\end{bmatrix}\begin{bmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{bmatrix}, \qquad \hat{\mathbf{m}} = \begin{bmatrix}\hat{m}_{13}\\ \hat{m}_{23}\end{bmatrix}
$$
3. Determine and record the yield point.
Compute the yield point $(\boldsymbol{\sigma}^Y, \mathbf{m}^Y) = \dfrac{(\hat{\boldsymbol{\sigma}}, \hat{\mathbf{m}})}{f(\hat{\boldsymbol{\sigma}}, \hat{\mathbf{m}})/\left(\tfrac{2}{3}Y\right) + 1}$.
B.2 DNS data generation and sampling
To generate nonpolar, micropolar, and micromorphic yield data, the strain rate and generalized strain
rate are controlled to prescribe the displacement boundary condition; the RVE is loaded at the prescribed
rate of deformation until yield is detected, when the stress is recorded as a point on the yield surface. The
direction of the strain and generalized strain rates is parametrized by d − 1 variables, where d is the dimension of the stress space (d = 3, 5, 9 in the nonpolar, micropolar, and micromorphic cases, respectively). The loading direction is first determined by the parametric variables, and the strain rate is tuned in magnitude to make sure the RVE yields within 300 strain increments. The details of the data generation are described in Algorithm 4.
After the DNS simulation, 3185 raw data points are generated. The data set is then expanded by utilizing the symmetry of the RVE (e.g., adding the symmetry-transformed counterpart of (σ11, σ22, σ12) whenever (σ11, σ22, σ12) is in the data set in the nonpolar case), and then reduced by removing outliers and redundant data. There are finally 214 data points (split into 200 training data and 14 test data) in the nonpolar case, 7048 data points (split into 6000 training data and 1048 test data) in the micropolar case, and 16904 data points (split into 13500 training data and 3404 test data) in the micromorphic case.
Appendix C Benchmark against alternative approaches for yield surface reconstruction
For completeness, we briefly introduce three other alternative surface reconstruction approaches, which we will use to benchmark the performance against the proposed neural kernel method. The first approach is the level set plasticity (Vlassis and Sun, 2021), which fits the yield function as a signed distance function f(x) parametrized by deep neural networks (Section C.1). The second approach is the classical kernel method with a pre-determined Gaussian Process (GP) kernel (Section C.2). The last one is the Non-uniform Rational Basis Spline (NURBS) method, which fits the surface with a set of B-spline basis functions (Section C.3). The benchmark results are reported in Section C.4.
Algorithm 4 Data sampling from DNS simulation.
Require: case ∈ {"nonpolar", "micropolar", "micromorphic"}
1. Sample the parametric variables with equal spacing (linspace).
if case == "nonpolar" then $\theta =$ numpy.linspace($\pi/8$, $\pi/2$, 6) and $\theta_1 =$ numpy.linspace($3\pi/4$, $\pi$, 10)
if case == "micropolar" then $\theta, \theta_0, \theta_2 =$ numpy.linspace($0$, $\pi/2$, 5), and $\theta_1 =$ numpy.linspace($\pi/2$, $5\pi/4$, 10)
if case == "micromorphic" then $\theta, \theta_0, \theta_2, \phi_0, \phi_1, \phi_2, \phi_3 =$ numpy.linspace($\pi/12$, $5\pi/12$, 3), $\theta_1 =$ numpy.linspace($3\pi/4$, $\pi$, 4)
2. Define the deformation direction by the directional tensors $\hat{\boldsymbol{\varepsilon}}$, $\hat{\boldsymbol{\kappa}}$, and $\hat{\mathbf{G}}^{\mathrm{sym}}$.
if case == "nonpolar" then $(\hat{\varepsilon}_1, \hat{\varepsilon}_2) = (\cos\theta_1, \sin\theta_1)$
if case == "micropolar" then $(\hat{\varepsilon}_1, \hat{\varepsilon}_2, \hat{\kappa}_{13}, \hat{\kappa}_{23}) = (\cos\theta_0\cos\theta_1,\ \cos\theta_0\sin\theta_1,\ \sin\theta_0\cos\theta_2,\ \sin\theta_0\sin\theta_2)$
if case == "micromorphic" then
$$
\begin{bmatrix}\hat{\varepsilon}_1 & \hat{G}^{\mathrm{sym}}_{111}\\ \hat{\varepsilon}_2 & \hat{G}^{\mathrm{sym}}_{222}\\ \hat{\kappa}_{13} & \hat{G}^{\mathrm{sym}}_{121}\\ \hat{\kappa}_{23} & \hat{G}^{\mathrm{sym}}_{212}\end{bmatrix}
=
\begin{bmatrix}\cos\phi_0\cos\theta_0\cos\theta_1 & \sin\phi_0\cos\phi_1\cos\phi_2\\ \cos\phi_0\cos\theta_0\sin\theta_1 & \sin\phi_0\cos\phi_1\sin\phi_2\\ \cos\phi_0\sin\theta_0\cos\theta_2 & \sin\phi_0\sin\phi_1\cos\phi_3\\ \cos\phi_0\sin\theta_0\sin\theta_2 & \sin\phi_0\sin\phi_1\sin\phi_3\end{bmatrix}
$$
compute $\hat{\boldsymbol{\varepsilon}} = \begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix}\begin{bmatrix}\hat{\varepsilon}_1 & 0\\ 0 & \hat{\varepsilon}_2\end{bmatrix}\begin{bmatrix}\cos\theta & \sin\theta\\ -\sin\theta & \cos\theta\end{bmatrix}$
3. Tune the deformation rate based on $\hat{\boldsymbol{\varepsilon}}$, $\hat{\boldsymbol{\kappa}}$, and $\hat{\mathbf{G}}^{\mathrm{sym}}$.
if case == "nonpolar" then $(\dot{\varepsilon}_{11}, \dot{\varepsilon}_{22}, \dot{\varepsilon}_{12}) = 10^{-5}\,(0.3\,\hat{\varepsilon}_{11},\ 0.3\,\hat{\varepsilon}_{22},\ 0.06\,\hat{\varepsilon}_{12})$
if case == "micropolar" then $(\dot{\varepsilon}_{11}, \dot{\varepsilon}_{22}, \dot{\varepsilon}_{12}, \dot{\kappa}_{13}, \dot{\kappa}_{23}) = 10^{-5}\,(0.2\,\hat{\varepsilon}_{11},\ 0.2\,\hat{\varepsilon}_{22},\ 0.06\,\hat{\varepsilon}_{12},\ 2\,\hat{\kappa}_{13},\ 2\,\hat{\kappa}_{23})$
if case == "micromorphic" then
$$
\begin{bmatrix}\dot{\varepsilon}_{11} & \dot{\varepsilon}_{22} & \dot{\varepsilon}_{12}\\ \dot{\kappa}_{13} & \dot{\kappa}_{23} & \dot{G}^{\mathrm{sym}}_{111}\\ \dot{G}^{\mathrm{sym}}_{222} & \dot{G}^{\mathrm{sym}}_{121} & \dot{G}^{\mathrm{sym}}_{212}\end{bmatrix}
= 1.5\times 10^{-5}
\begin{bmatrix}0.2\,\hat{\varepsilon}_{11} & 0.2\,\hat{\varepsilon}_{22} & 0.06\,\hat{\varepsilon}_{12}\\ 2\,\hat{\kappa}_{13} & 2\,\hat{\kappa}_{23} & \hat{G}^{\mathrm{sym}}_{111}\\ \hat{G}^{\mathrm{sym}}_{222} & \hat{G}^{\mathrm{sym}}_{121} & \hat{G}^{\mathrm{sym}}_{212}\end{bmatrix}
$$
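As an illustration of the micropolar branch of Algorithm 4, the short sketch below builds the directional tensors and the corresponding deformation-rate vector for one choice of the parametric angles; the DNS solver call that loads the RVE at this rate until yield is omitted, and the rotation convention for the strain direction is assumed rather than taken from the paper's code.

import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def micropolar_rates(theta, theta0, theta1, theta2):
    """Directional tensors and deformation rates, micropolar branch of Algorithm 4."""
    e1 = np.cos(theta0) * np.cos(theta1)
    e2 = np.cos(theta0) * np.sin(theta1)
    k13 = np.sin(theta0) * np.cos(theta2)
    k23 = np.sin(theta0) * np.sin(theta2)
    R = rotation(theta)
    eps_hat = R @ np.diag([e1, e2]) @ R.T
    # Rate scaling factors as listed in step 3 of Algorithm 4 (micropolar case).
    return 1.0e-5 * np.array([0.2 * eps_hat[0, 0], 0.2 * eps_hat[1, 1],
                              0.06 * eps_hat[0, 1], 2.0 * k13, 2.0 * k23])

# Example: one prescribed loading direction (eps11, eps22, eps12, kappa13, kappa23 rates).
print(micropolar_rates(theta=np.pi / 4, theta0=np.pi / 4, theta1=np.pi / 2, theta2=0.0))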
C.1 MLP-based level-set method (MLP-LS)
The level set function, also called the signed distance function, is defined as the shortest signed distance of a point to a given surface satisfying Eq. (40), where the sign is positive if the point is outside the surface and negative otherwise. For example, for J2 plasticity, the yield surface is a circle on the π-plane, and the yield function is a cone, as shown in Figure 21. A deep neural network with twice-differentiable activation functions is then used to learn the scalar-valued level-set function.
$$
\|\nabla_{\boldsymbol{\sigma}} f\| = 1 \quad \text{and} \quad f = 0 \ \text{on the yield surface.}
\tag{40}
$$
Fig. 21: Level set function of J2 plasticity.
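To make the level-set target concrete, the sketch below computes the signed distance of trial stress points to a circular yield surface of radius R centered at the origin (a placeholder for the J2 example of Figure 21); such values would serve as training labels for the MLP-LS model.

import numpy as np

def signed_distance_to_circle(points, radius):
    """Signed distance to a circle of the given radius centered at the origin:
    positive outside, negative inside, zero on the yield surface."""
    return np.linalg.norm(points, axis=-1) - radius

# Placeholder trial points on a 2D cross section of the stress space.
pts = np.array([[0.5, 0.0], [1.0, 0.0], [1.5, 1.5]])
print(signed_distance_to_circle(pts, radius=1.0))   # [-0.5, 0.0, ~1.12]

# Away from the origin, the gradient of this function has unit norm,
# consistent with the eikonal property of Eq. (40).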
C.2 Classical kernel method based on Gaussian Process (GP)
The GP kernel method is exactly the same as the neural kernel method except that the GP kernel has the form of Eq. (41), which is not trainable. Gaussian processes can be further used for statistical variance analysis and uncertainty quantification, but in our application, we only use the GP kernel to predict a deterministic yield surface; please refer to Williams and Rasmussen (2006) for more mathematical background on GP.
$$
\mathrm{Ker}(\mathbf{x},\mathbf{y}) = \exp\!\left(-\frac{\|\mathbf{x}-\mathbf{y}\|^2}{2L^2}\right)
\tag{41}
$$
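As a point of reference, the sketch below fits a level-set-like function with the fixed Gaussian kernel of Eq. (41) via kernel ridge regression; the synthetic data, length scale L, and regularization value are placeholders, not the settings used in the benchmark.

import numpy as np

def gp_kernel(A, B, L=0.5):
    # Fixed (non-trainable) Gaussian kernel of Eq. (41).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * L ** 2))

# Placeholder training data: level-set values sampled around a circular yield surface.
rng = np.random.default_rng(1)
X = rng.uniform(-1.5, 1.5, size=(200, 2))
y = np.linalg.norm(X, axis=1) - 1.0            # signed distance to a unit circle

# Kernel ridge regression: alpha = (K + lam I)^{-1} y.
K = gp_kernel(X, X)
alpha = np.linalg.solve(K + 1e-6 * np.eye(len(X)), y)

# Predict the yield function value at query points.
X_query = np.array([[0.0, 0.0], [1.0, 0.0], [1.2, 0.9]])
print(gp_kernel(X_query, X) @ alpha)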
C.3 Non-uniform Rational Basis Spline (NURBS)
The other alternative method is the NURBS-based plasticity (Coombs et al., 2016) that approximates the yield function with basis spline functions, where isotropic hardening can be modeled by evolving the control points with the internal variable (Coombs and Motlagh, 2017), and a non-associative flow rule can be realized by a NURBS-based plastic potential (Coombs and Motlagh, 2018).
Basis spline (B-spline) is one popular method for fitting curves or surfaces (Knott,2000). NURBS, as
one kind of B-spline method, has been used to reconstruct the yield surface (Coombs et al.,2016), where
the NURBS model, with the control points selected in a particular way, fits the 3D ellipsoidal surface accu-
rately. In this section, we generalize the NURBS model to surface fitting in arbitrary dimensional Euclidean
spaces. We will introduce the adapted NURBS implementation from the following perspectives: (1) the ba-
sis functions, (2) parametrization of higher-dimensional yield surfaces, and (3) surface fitting with NURBS.
C.3.1 B-spline basis function in higher dimension
In general, we adopt multi-variate B-spline basis functions constructed as compositions of single-variate B-spline basis functions. To this end, we first introduce the single-variate B-spline basis functions with the independent variable denoted as $\vartheta$, which form a family of polynomials. These functions are defined piecewise given a knot vector $\mathbf{t}$ that specifies a partition of the interval $[0, 1]$: $0 = t_1 \le t_2 \le \cdots \le t_i \le \cdots \le t_q = 1$, where $q$ is the length of the knot vector. The single-variate basis functions $B(\cdot)$ of zero degree are piecewise constant functions, as follows:
$$
B_i^{(0)}(\vartheta) = \begin{cases} 1 & \text{if } t_i \le \vartheta < t_{i+1} \\ 0 & \text{else} \end{cases}, \qquad 1 \le i \le q-1
\tag{42}
$$
where the bracketed superscript indicates the degree of the polynomials and the subscript $i$ is an index within the basis family. We further express the single-variate basis functions of higher degree $k$ recursively as follows:
$$
B_i^{(k)}(\vartheta) = \frac{\vartheta - t_i}{t_{i+k} - t_i}\,B_i^{(k-1)}(\vartheta) + \frac{t_{i+k+1} - \vartheta}{t_{i+k+1} - t_{i+1}}\,B_{i+1}^{(k-1)}(\vartheta), \qquad 1 \le i \le q - k - 1
\tag{43}
$$
where we adopt $k = 3$ and the knot vector $\mathbf{t} = [0, 0, 0, 0, 0.25, 0.5, 0.75, 1, 1, 1, 1]$ in our implementation, which indicates that the length of the knot vector is $q = 11$ and the number of basis functions is $q - k - 1 = 7$. We denote the number of basis function members as $|B|$.
The composition of $B_i$ (for simplicity the superscript $(k)$ is omitted) that formulates the multi-variate B-spline basis functions is then expressed as follows:

$$
\begin{aligned}
N_i(\vartheta) &= B_i(\vartheta) && \text{for a single variable}\\
N_K(\boldsymbol{\vartheta}) &= \prod_{l=1}^{d} B_{i_l}(\vartheta_l), \quad \text{where } K = \sum_{l=1}^{d} i_l\,|B|^{d-l} && \text{for multiple variables}
\end{aligned}
\tag{44}
$$
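The recursion in Eqs. (42)-(43) and the tensor-product composition in Eq. (44) can be implemented directly. The sketch below uses the cubic case and the knot vector quoted above; the zero-based indexing and the flattening order of the multi-index are implementation choices, not taken from the paper's code.

import numpy as np

KNOTS = np.array([0, 0, 0, 0, 0.25, 0.5, 0.75, 1, 1, 1, 1], dtype=float)

def bspline_basis(i, k, t, knots=KNOTS):
    """Cox-de Boor recursion for the single-variate basis B_i^(k)(t), cf. Eqs. (42)-(43)."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = 0.0
    if knots[i + k] > knots[i]:
        left = (t - knots[i]) / (knots[i + k] - knots[i]) * bspline_basis(i, k - 1, t, knots)
    right = 0.0
    if knots[i + k + 1] > knots[i + 1]:
        right = (knots[i + k + 1] - t) / (knots[i + k + 1] - knots[i + 1]) \
                * bspline_basis(i + 1, k - 1, t, knots)
    return left + right

def multivariate_basis(theta, k=3, knots=KNOTS):
    """Tensor-product composition of Eq. (44) for a vector of angle variables theta."""
    n_basis = len(knots) - k - 1                    # |B| = 7 for the knot vector above
    uni = np.array([[bspline_basis(i, k, t, knots) for i in range(n_basis)] for t in theta])
    out = uni[0]
    for row in uni[1:]:
        out = np.outer(out, row).ravel()            # flattened multi-index K
    return out

print(multivariate_basis(np.array([0.3, 0.6])).shape)   # (49,) = |B|^2 for two angles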
C.3.2 Parametrization of higher-dimensional surface
The yield surface living in $d$-dimensional space describes a $(d-1)$-dimensional manifold and hence could be parametrized with $d-1$ independent variables $\boldsymbol{\vartheta} = (\vartheta_1, \vartheta_2, \cdots, \vartheta_{d-1})$. We adopt the generalized spherical parametrization for describing the yield surface mathematically in Eq. (45), where the independent variables are angle indicators within the range $[0, 1]$ and there is a dependent variable $\rho(\boldsymbol{\vartheta})$ indicating the radial distance.
$$
\mathbf{x} = \begin{bmatrix}\rho(\boldsymbol{\vartheta})\cos 2\pi\vartheta_1\\ \rho(\boldsymbol{\vartheta})\sin 2\pi\vartheta_1\end{bmatrix} \quad \text{for a 1D curve}
$$
$$
\mathbf{x} = \begin{bmatrix}\rho(\boldsymbol{\vartheta})\cos \pi\vartheta_1\\ \rho(\boldsymbol{\vartheta})\sin \pi\vartheta_1\cos 2\pi\vartheta_2\\ \rho(\boldsymbol{\vartheta})\sin \pi\vartheta_1\sin 2\pi\vartheta_2\end{bmatrix} \quad \text{for a 2D surface}
$$
$$
x_i = \begin{cases}
\rho(\boldsymbol{\vartheta})\left(\prod_{j=1}^{i-1}\sin\pi\vartheta_j\right)\cos\pi\vartheta_i & \text{if } i < d-2\\[1mm]
\rho(\boldsymbol{\vartheta})\left(\prod_{j=1}^{d-2}\sin\pi\vartheta_j\right)\cos 2\pi\vartheta_i & \text{if } i = d-2\\[1mm]
\rho(\boldsymbol{\vartheta})\left(\prod_{j=1}^{d-2}\sin\pi\vartheta_j\right)\sin 2\pi\vartheta_i & \text{if } i = d-1
\end{cases} \quad \text{for higher dimensions}
\tag{45}
$$
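For concreteness, the sketch below implements the 2D-surface case of Eq. (45) (a yield surface embedded in a 3-dimensional stress space), mapping the radius and the two normalized angles to Cartesian coordinates together with the inverse map used to label sampled points; the higher-dimensional cases follow the same pattern.

import numpy as np

def spherical_to_cartesian_3d(rho, theta1, theta2):
    """2D-surface case of Eq. (45); angles theta1, theta2 are normalized to [0, 1]."""
    x1 = rho * np.cos(np.pi * theta1)
    x2 = rho * np.sin(np.pi * theta1) * np.cos(2.0 * np.pi * theta2)
    x3 = rho * np.sin(np.pi * theta1) * np.sin(2.0 * np.pi * theta2)
    return np.stack([x1, x2, x3], axis=-1)

def cartesian_to_spherical_3d(x):
    """Inverse map used to convert sampled yield points into labeled data (theta, rho)."""
    rho = np.linalg.norm(x, axis=-1)
    theta1 = np.arccos(np.clip(x[..., 0] / rho, -1.0, 1.0)) / np.pi
    theta2 = np.mod(np.arctan2(x[..., 2], x[..., 1]), 2.0 * np.pi) / (2.0 * np.pi)
    return theta1, theta2, rho

pts = spherical_to_cartesian_3d(np.array([1.0, 2.0]),
                                np.array([0.25, 0.5]),
                                np.array([0.1, 0.7]))
t1, t2, r = cartesian_to_spherical_3d(pts)
print(np.round(r, 6), np.round(t1, 6), np.round(t2, 6))   # recovers the inputs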
C.3.3 Fitting the surface with NURBS
The point cloud data $\mathbf{x}^{(i)}$ sampled from the yield surface can be mapped into the labeled data $(\boldsymbol{\vartheta}^{(i)}, \rho^{(i)})$ through the parametrization shown in Eq. (45). The NURBS model is described by Eq. (46), where $\mathbf{c}\in\mathbb{R}^{|B|^{d-1}}$ is an array of control points.

$$
\rho(\boldsymbol{\vartheta}) = \sum_{j=1}^{|B|^{d-1}} N_j(\boldsymbol{\vartheta})\, c_j = \mathbf{N}(\boldsymbol{\vartheta})\cdot\mathbf{c}
\tag{46}
$$
To fit the NURBS model with the data sampled from the yield surface, the control points $\mathbf{c}$ are found such that the difference between the prediction $\hat{\rho}^{(i)} = \mathbf{N}(\boldsymbol{\vartheta}^{(i)})\cdot\mathbf{c}$ and the label $\rho^{(i)}$ is minimized. Therefore, the control points are computed through the regularized least square method as shown in Eq. (47), where $[\boldsymbol{\rho}]_i = \rho^{(i)}$ and $[\mathbf{N}(\boldsymbol{\vartheta})]_{ij} = N_j(\boldsymbol{\vartheta}^{(i)})$.
$$
\mathbf{c} = \operatorname*{argmin}_{\mathbf{C}} \sum_i \left\|\rho^{(i)} - \mathbf{N}(\boldsymbol{\vartheta}^{(i)})\cdot\mathbf{C}\right\|^2 + \lambda\|\mathbf{C}\|^2
\;\Longrightarrow\;
\mathbf{c} = \left(\mathbf{N}^{\mathrm{T}}(\boldsymbol{\vartheta})\,\mathbf{N}(\boldsymbol{\vartheta}) + \lambda\mathbf{I}\right)^{-1}\mathbf{N}^{\mathrm{T}}(\boldsymbol{\vartheta})\,\boldsymbol{\rho}.
\tag{47}
$$
C.4 Yield surface reconstructed via alternative approaches
To compare the performances of the approaches outlined in Sections C.1-C.3, we obtained the learned
yield functions via these approaches for three data sets we used in this paper, i.e., the classical J2 von Mises
yield function, a generalized J2 von Mises yield function with the additional couple stress terms, and the
micropolar DNS data inferred from finite element simulations of the layered materials. The results are
shown in Fig. 22.
Fig. 22: The cross section of the classical J2 yield surface f(σ11,σ22,σ12 =0) = 0 in 3D space (LEFT), the
micropolar J2 yield surface f(σ11,σ22,σ12 =0, m13 =0, m23 =0) = 0 in 5D space (MIDDLE), and the
micropolar DNS yield surface (RIGHT).
In the first case, the actual learned function only depends on the J2 stress and the data set is of low
dimension. As such, the results in Fig. 22(a) indicate that all four approaches (neural kernel method (red),
classical Gaussian kernel (black), level set MLP (green), and NURBS (yellow)) perform well in this simple
case. In the second case, the von Mises yield function is amended with a regularization term (see Eq. (19)). This dependence on the couple stress also breaks the symmetry of the force stress, and hence increases the dimension of the parametric space from 3 to 5. This increase in dimensionality causes the NURBS yield function to fail, but the other three approaches still perform well (see Fig. 22(b)).
In the last case, the DNS data set is used in which a portion of the data is purposely missing (see Fig. 22(c)); the dimensionality of the data is identical to the second case, but the data exhibit lower symmetry. Visual inspections reveal that the neural kernel generates a yield function with geometric features consistent with those of the data. On the other hand, the MLP level set is capable of generating non-oscillatory yield functions, whereas both the NURBS and classical GP kernel may lead to spurious oscillations in the learned function.
Note that, in the last case where data are missing in one portion of the parametric space, the robustness
of the learned function in the extrapolated regime depends strongly on the types of inductive biases of
the learning algorithm (Wilson et al.,2016). Since the neural kernel method employs a data-dependent
adaptive kernel, it is more expressive than the classical GP and the NURBS approach where the parametric
space is spanned by a pre-determined set of bases.
References
JH Argyris, G Faust, J Szimmat, EP Warnke, and KJ Willam. Recent developments in the finite element
analysis of prestressed concrete reactor vessels. Nuclear Engineering and Design, 28(1):42–75, 1974.
Daniel Arndt, Wolfgang Bangerth, Bruno Blais, Thomas C Clevenger, Marc Fehling, Alexander V Grayver,
Timo Heister, Luca Heltai, Martin Kronbichler, Matthias Maier, et al. The deal.II library, version 9.2.
Journal of Numerical Mathematics, 28(3):131–146, 2020.
Zdeněk P Bažant and Milan Jirásek. Nonlocal integral formulations of plasticity and damage: survey of
progress. Journal of engineering mechanics, 128(11):1119–1149, 2002.
Raja Biswas and Leong Hien Poh. A micromorphic computational homogenization framework for hetero-
geneous materials. Journal of the Mechanics and Physics of Solids, 102:187–208, 2017.
Colin Bonatti, Bekim Berisha, and Dirk Mohr. From cp-fft to cp-rnn: Recurrent neural network surrogate
model of crystal plasticity. International Journal of Plasticity, 158:103430, 2022.
Ronaldo I Borja. Plasticity, volume 2. Springer, 2013.
Ronaldo I Borja, Qing Yin, and Yang Zhao. Cam-clay plasticity. part ix: On the anisotropy, heterogeneity,
and viscoplasticity of shale. Computer Methods in Applied Mechanics and Engineering, 360:112695, 2020.
Youngmin Cho and Lawrence Saul. Kernel methods for deep learning. Advances in neural information
processing systems, 22, 2009.
William M Coombs and Yousef Ghaffari Motlagh. Nurbs plasticity: yield surface evolution and implicit
stress integration for isotropic hardening. Computer Methods in Applied Mechanics and Engineering, 324:
204–220, 2017.
William M Coombs and Yousef Ghaffari Motlagh. Nurbs plasticity: non-associated plastic flow. Computer
Methods in Applied Mechanics and Engineering, 336:419–443, 2018.
William M Coombs, Oscar A Petit, and Yousef Ghaffari Motlagh. Nurbs plasticity: Yield surface represen-
tation and implicit stress integration for isotropic inelasticity. Computer Methods in Applied Mechanics and
Engineering, 304:342–358, 2016.
Eugène Maurice Pierre Cosserat and François Cosserat. Théorie des corps déformables. A. Hermann et fils,
1909.
Yannis F Dafalias. Modelling cyclic plasticity: simplicity versus sophistication. Mechanics of engineering
materials, pages 153–178, 1984.
Yannis F Dafalias and Majid T Manzari. Simple plasticity sand model accounting for fabric change effects.
Journal of Engineering mechanics, 130(6):622–634, 2004.
Yannis F Dafalias, Majid T Manzari, and Achilleas G Papadimitriou. Saniclay: simple anisotropic clay
plasticity model. International Journal for Numerical and Analytical Methods in Geomechanics, 30(12):1231–
1257, 2006.
René de Borst. A generalisation of j2-flow theory for polar continua. Computer Methods in Applied Mechanics
and Engineering, 103(3):347–362, 1993.
Clayton Deutsch. Calculating effective absolute permeability in sandstone/shale sequences. SPE Formation
Evaluation, 4(03):343–348, 1989.
Andreas Dietsche, Paul Steinmann, and Kaspar Willam. Micropolar elastoplasticity and its role in localiza-
tion. International Journal of Plasticity, 9(7):813–831, 1993.
David L Donoho et al. High-dimensional data analysis: The curses and blessings of dimensionality. AMS
math challenges lecture, 1(2000):32, 2000.
A Cemal Eringen. Mechanics of micromorphic continua. In Mechanics of Generalized Continua: Proceedings
of the IUTAM-Symposium on The Generalized Cosserat Continuum and the Continuum Theory of Dislocations
with Applications, Freudenstadt and Stuttgart (Germany) 1967, pages 18–35. Springer, 1968.
Jan Niklas Fuhg, Craig M Hamel, Kyle Johnson, Reese Jones, and Nikolaos Bouklas. Modular machine
learning-based elastoplasticity: Generalization in the context of limited data. Computer Methods in Applied
Mechanics and Engineering, 407:115930, 2023.
Neural kernel for micromorphic plasticity 37
Marc GD Geers, Varvara G Kouznetsova, and WAM Brekelmans. Multi-scale computational homogeniza-
tion: Trends and challenges. Journal of computational and applied mathematics, 234(7):2175–2182, 2010.
Rainer Glüge and Sara Bucci. Does convexity of yield surfaces in plasticity have a physical significance?
Mathematics and Mechanics of Solids, 23(9):1364–1373, 2018.
Maysam B Gorji, Mojtaba Mozaffar, Julian N Heidenreich, Jian Cao, and Dirk Mohr. On the potential of
recurrent neural networks for modeling path dependent plasticity. Journal of the Mechanics and Physics of
Solids, 143:103972, 2020.
Albert Edward Green and Ronald S Rivlin. Multipolar continuum mechanics. Archive for rational mechanics
and analysis, 17:113–147, 1964.
Gerd Gudehus, Angelo Amorosi, Antonio Gens, Ivo Herle, Dimitrios Kolymbas, David Mašín, David Muir Wood, Andrzej Niemunis, Roberto Nova, Manuel Pastor, et al. The soilmodels.info project. Inter-
national Journal for Numerical and Analytical Methods in Geomechanics, 32(12):1571–1572, 2008.
Xiaolong He and Jiun-Shyan Chen. Thermodynamically consistent machine-learned internal state variable
approach for data-driven modeling of path-dependent materials. Computer Methods in Applied Mechanics
and Engineering, 402:115348, 2022.
Yousef Heider, Kun Wang, and WaiChing Sun. So (3)-invariance of informed-graph-based deep neural
network for anisotropic elastoplastic materials. Computer Methods in Applied Mechanics and Engineering,
363:112875, 2020.
Ivo Herle and Gerd Gudehus. Determination of parameters of a hypoplastic constitutive model from prop-
erties of grain assemblies. Mechanics of Cohesive-frictional Materials: An International Journal on Experiments,
Modelling and Computation of Materials and Structures, 4(5):461–486, 1999.
Don A Howard, Marco Giovanelli, et al. Einstein’s philosophy of science. Stanford encyclopedia of philosophy,
pages 1–20, 2019.
Geralf Hütter. On the micro-macro relation for the microdeformation in the homogenization towards
micromorphic and micropolar continua. Journal of the Mechanics and Physics of Solids, 127:62–79, 2019.
R Jänicke and S Diebels. Numerical homogenisation of micromorphic media. Technische Mechanik-European
Journal of Engineering Mechanics, 30(4):364–373, 2010.
Immanuel Kant. Critique of pure reason. 1781. Modern Classical Philosophers, Cambridge, MA: Houghton
Mifflin, pages 370–456, 1908.
Mahyar Malekzade Kebria, SeonHong Na, and Fan Yu. An algorithmic framework for computational
estimation of soil freezing characteristic curves. International Journal for Numerical and Analytical Methods
in Geomechanics, 46(8):1544–1565, 2022.
Gary D Knott. Interpolating cubic splines, volume 18. Springer Science & Business Media, 2000.
Ragnar Larsson and Stefan Diebels. A second-order homogenization procedure for multi-scale analysis
based on micropolar kinematics. International journal for numerical methods in engineering, 69(12):2485–
2512, 2007.
Feng-Bao Lin and Zdeněk P Bažant. Convexity of smooth yield surface of frictional material. Journal of
engineering mechanics, 112(11):1259–1262, 1986.
Jia Lin, Wei Wu, and Ronaldo I Borja. Micropolar hypoplasticity for persistent shear band in heterogeneous
granular materials. Computer Methods in Applied Mechanics and Engineering, 289:24–43, 2015.
Ran Ma and WaiChing Sun. Computational thermomechanics for crystalline rock. part ii: Chemo-damage-
plasticity and healing in strongly anisotropic polycrystals. Computer Methods in Applied Mechanics and
Engineering, 369:113184, 2020.
Majid T Manzari. Application of micropolar plasticity to post failure analysis in geomechanics. International
Journal for Numerical and Analytical Methods in Geomechanics, 28(10):1011–1032, 2004.
Raymond David Mindlin. Microstructure in linear elasticity (tech. rep.). Columbia Univ New York Dept of
Civil Engineering and Engineering Mechanics, 1963.
Raymond David Mindlin. Second gradient of strain and surface-tension in linear elasticity. International
Journal of Solids and Structures, 1(4):417–438, 1965.
Raymond David Mindlin and NN Eshel. On first strain-gradient theories in linear elasticity. Interna-
tional Journal of Solids and Structures, 4(1):109–124, 1968.
RD Mindlin, HF Tiersten, et al. Effects of couple-stresses in linear elasticity. Archive for Rational
Mechanics and analysis, 11(1):415–448, 1962.
38 Zeyu Xiong et al.
Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. Foundations of machine learning. MIT press,
2018.
Hans-Bernd Mühlhaus and Ioannis Vardoulakis. The thickness of shear bands in granular materials.
Geotechnique, 37(3):271–283, 1987.
SeonHong Na and WaiChing Sun. Computational thermo-hydro-mechanics for multiphase freezing and
thawing porous media in the finite deformation range. Computer Methods in Applied Mechanics and Engi-
neering, 318:667–700, 2017.
SeonHong Na, WaiChing Sun, Mathew D Ingraham, and Hongkyu Yoon. Effects of spatial heterogeneity
and material anisotropy on the fracture pattern and macroscopic effective toughness of mancos shale in
brazilian tests. Journal of Geophysical Research: Solid Earth, 122(8):6202–6230, 2017.
Patrizio Neff, Ionel-Dumitrel Ghiba, Angela Madeo, Luca Placidi, and Giuseppe Rosi. A unifying perspec-
tive: the relaxed linear micromorphic continuum. Continuum Mechanics and Thermodynamics, 26:639–681,
2014.
Isaac Newton. Philosophiæ naturalis principia mathematica (mathematical principles of natural philoso-
phy). London (1687), 1687.
Dorival M Pedroso, Daichao Sheng, and Scott W Sloan. Stress update algorithm for elastoplastic models
with nonconvex yield surfaces. International journal for numerical methods in engineering, 76(13):2029–2062,
2008.
Ron HJ Peerlings, Marc GD Geers, René de Borst, and WAM Brekelmans. A critical comparison of
nonlocal and gradient-enhanced softening continua. International Journal of solids and Structures, 38(44-
45):7723–7746, 2001.
Songyou Peng, Michael Niemeyer, Lars Mescheder, Marc Pollefeys, and Andreas Geiger. Convolutional
occupancy networks. In European Conference on Computer Vision, pages 523–540. Springer, 2020.
Maithra Raghu, Ben Poole, Jon Kleinberg, Surya Ganguli, and Jascha Sohl-Dickstein. On the expressive
power of deep neural networks. In international conference on machine learning, pages 2847–2854. PMLR,
2017.
Alexander Rakhlin and Xiyu Zhai. Consistency of interpolation with laplace kernels is a high-dimensional
phenomenon. In Conference on Learning Theory, pages 2595–2623. PMLR, 2019.
Paul Rosenthal, Vladimir Molchanov, and Lars Linsen. A narrow band level set method for surface extraction
from unstructured point-based volume data. Universitätsbibliothek Chemnitz, 2011.
Jörg Schröder, Mohammad Sarhil, Lisa Scheunemann, and Patrizio Neff. Lagrange and H(curl, B) based
finite element formulations for the relaxed micromorphic model. Computational Mechanics, pages 1–25,
2022.
Shabnam J Semnani, Joshua A White, and Ronaldo I Borja. Thermoplasticity and strain localization in
transversely isotropic materials based on anisotropic critical state plasticity. International Journal for Nu-
merical and Analytical Methods in Geomechanics, 40(18):2423–2449, 2016.
Daichao Sheng, Charles E Augarde, and Andrew J Abbo. A fast algorithm for finding the first intersection
with a non-convex yield surface. Computers and Geotechnics, 38(4):465–471, 2011.
Hiroki Sone and Mark D Zoback. Mechanical properties of shale-gas reservoir rocks—part 2: Ductile creep,
brittle strength, and their relation to the elastic modulus. Geophysics, 78(5):D393–D402, 2013.
Paul Steinmann. A micropolar theory of finite deformation and finite rotation multiplicative elastoplastic-
ity. International Journal of Solids and Structures, 31(8):1063–1084, 1994.
Paul Steinmann. Theory and numerics of ductile micropolar elastoplastic damage. International journal for
numerical methods in engineering, 38(4):583–606, 1995.
Xiao Sun, Bahador Bahmani, Nikolaos N Vlassis, WaiChing Sun, and Yanxun Xu. Data-driven discovery
of interpretable causal relations for deep learning material laws with uncertainty propagation. Granular
Matter, 24:1–32, 2022.
David Sussillo and Omri Barak. Opening the black box: low-dimensional dynamics in high-dimensional
recurrent neural networks. Neural computation, 25(3):626–649, 2013.
Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society:
Series B (Methodological), 58(1):267–288, 1996.
RA Toupin. Elastic materials with couple-stresses. Archive for rational mechanics and analysis, 11(1):
385–414, 1962.
Neural kernel for micromorphic plasticity 39
Richard A Toupin. Theories of elasticity with couple-stress. Archive for Rational Mechanics and Analysis, 17
(2):85–112, 1964.
Willard van Orman Quine. On simple theories of a complex world. Synthese, pages 103–106, 1963.
Ioannis Vardoulakis. Cosserat continuum mechanics. Lecture Notes in Applied and Computational Mechanics,
87, 2019.
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz
Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems,
30, 2017.
Ruben Villarreal, Nikolaos N Vlassis, Nhon N Phan, Tommie A Catanach, Reese E Jones, Nathaniel A Trask,
Sharlotte LB Kramer, and WaiChing Sun. Design of experiments for the calibration of history-dependent
models via deep reinforcement learning and an enhanced kalman filter. Computational Mechanics, 72(1):
95–124, 2023.
Nikolaos N Vlassis and WaiChing Sun. Sobolev training of thermodynamic-informed neural networks for
interpretable elasto-plasticity models with level set hardening. Computer Methods in Applied Mechanics
and Engineering, 377:113695, 2021.
Nikolaos N Vlassis and WaiChing Sun. Component-based machine learning paradigm for discovering
rate-dependent and pressure-sensitive level-set plasticity models. Journal of Applied Mechanics, 89(2),
2022.
Nikolaos N Vlassis, Ran Ma, and WaiChing Sun. Geometric deep learning for computational mechanics
part i: Anisotropic hyperelasticity. Computer Methods in Applied Mechanics and Engineering, 371:113299,
2020.
Kun Wang and WaiChing Sun. A multiscale multi-permeability poroplasticity model linked by recursive
homogenizations and deep learning. Computer Methods in Applied Mechanics and Engineering, 334:337–
380, 2018.
Kun Wang, WaiChing Sun, Simon Salager, SeonHong Na, and Ghonwa Khaddour. Identifying material pa-
rameters for a micro-polar plasticity model via x-ray micro-computed tomographic (ct) images: lessons
learned from the curve-fitting exercises. International Journal for Multiscale Computational Engineering, 14
(4), 2016.
Kun Wang, WaiChing Sun, and Qiang Du. A non-cooperative meta-modeling game for automated third-
party calibrating, validating and falsifying constitutive laws with parallelized adversarial attacks. Com-
puter Methods in Applied Mechanics and Engineering, 373:113514, 2021.
Oliver Weeger. Numerical homogenization of second gradient, linear elastic constitutive models for cubic
3d beam-lattice metamaterials. International Journal of Solids and Structures, 224:111037, 2021.
SJ Wheeler, D Gallipoli, and M Karstunen. Comments on use of the barcelona basic model for unsaturated
soils. International journal for numerical and analytical methods in Geomechanics, 26(15):1561–1571, 2002.
Christopher KI Williams and Carl Edward Rasmussen. Gaussian processes for machine learning, volume 2.
MIT press Cambridge, MA, 2006.
Francis Williams, Zan Gojcic, Sameh Khamis, Denis Zorin, Joan Bruna, Sanja Fidler, and Or Litany. Neural
fields as learnable kernels for 3d reconstruction. In Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition, pages 18500–18510, 2022.
Andrew Gordon Wilson, Zhiting Hu, Ruslan Salakhutdinov, and Eric P Xing. Deep kernel learning. In
Artificial intelligence and statistics, pages 370–378. PMLR, 2016.
Mian Xiao and WaiChing Sun. Geometric prior of multi-resolution yielding manifolds and the local closest
point projection for nearly non-smooth plasticity. Computer Methods in Applied Mechanics and Engineering,
400:115469, 2022.
Qing Yin, Edward Andò, Gioacchino Viggiani, and WaiChing Sun. Freezing-induced stiffness and strength
anisotropy in freezing clayey soil: Theory, numerical modeling, and experimental validation. Interna-
tional Journal for Numerical and Analytical Methods in Geomechanics, 46(11):2087–2114, 2022.
Jidong Zhao, Daichao Sheng, M Rouainia, and Scott W Sloan. Explicit stress integration of complex soil
models. International Journal for Numerical and Analytical Methods in Geomechanics, 29(12):1209–1229, 2005.
Yang Zhao, Shabnam J Semnani, Qing Yin, and Ronaldo I Borja. On the strength of transversely isotropic
rocks. International Journal for Numerical and Analytical Methods in Geomechanics, 42(16):1917–1934, 2018.
Article
Full-text available
This paper presents a combined experimental-modeling effort to interpret the coupled thermo-hydro-mechanical behaviors of the freezing soil, where an unconfined, fully saturated clay is frozen due to a temperature gradient. By leveraging the rich experimental data from the microCT images and the measurements taken during the freezing process, we examine not only how the growth of ice induces volumetric changes of the soil in the fully saturated specimen but also how the presence and propagation of the freezing fringe front may evolve the anisotropy of the effective media of the soil-ice mixture that cannot be otherwise captured phenomenologically in the isotropic saturation-dependent critical state models for plasticity. The resultant model is not only helpful for providing a qualitative description of how freezing affects the volumetric responses of the clayey material, but also provide a mean to generate more precise predictions for the heaving due to the freezing of the ground.
Article
Full-text available
Conventionally, neural network constitutive laws for path-dependent elasto-plastic solids are trained via supervised learning performed on recurrent neural networks, with the time history of strain as input and the stress as input. However, training a neural network to replicate path-dependent constitutive responses require significantly more amount of data due to path dependence. This demand on diverse and abundance of accurate data, as well as the lack of interpretability to guide the data generation process, could become major roadblocks for engineering applications. In this work, we attempt to simplify these training processes and improve the interpretability of the trained models by breaking down the training of material models into multiple supervised machine learning programs for elasticity, initial yielding, and hardening laws that can be conducted sequentially. To predict pressure-sensitivity and rate dependence of the plastic responses, we reformulate the Hamliton-Jacobi equation such that the yield function is parametrized in product space spanned by the principle stress, the accumulated plastic strain, and time. To test the versatility of the neural network meta-modeling framework, we conduct multiple numerical experiments where neural networks are trained and validated against (1) data generated from known benchmark models, (2) data obtained from physical experiments, and (3) data inferred from homogenizing sub-scale direct numerical simulations of microstructures. The neural network model is also incorporated into an offline FFT-FEM model to improve the efficiency of the multiscale calculations.
Article
The development of highly accurate constitutive models for materials that undergo path-dependent processes continues to be a complex challenge in computational solid mechanics. Challenges arise both in considering the appropriate model assumptions and from the viewpoint of data availability, verification, and validation. Recently, data-driven modeling approaches have been proposed that aim to establish stress-evolution laws that avoid user-chosen functional forms by relying on machine learning representations and algorithms. However, these approaches not only require a significant amount of data but also need data that probes the full stress space with a variety of complex loading paths. Furthermore, they rarely enforce all necessary thermodynamic principles as hard constraints. Hence, they are in particular not suitable for low-data or limited-data regimes, where the first arises from the cost of obtaining the data and the latter from the experimental limitations of obtaining labeled data, which is commonly the case in engineering applications. In this work, we discuss a hybrid framework that can work on a variable amount of data by relying on the modularity of the elastoplasticity formulation where each component of the model can be chosen to be either a classical phenomenological or a data-driven model depending on the amount of available information and the complexity of the response. The method is tested on synthetic uniaxial data coming from simulations as well as cyclic experimental data for structural materials. The discovered material models are found to not only interpolate well but also allow for accurate extrapolation in a thermodynamically consistent manner far outside the domain of the training data. This ability to extrapolate from limited data was the main reason for the early and continued success of phenomenological models and the main shortcoming in machine learning-enabled constitutive modeling approaches. Training aspects and details of the implementation of these models into Finite Element simulations are discussed and analyzed.
Article
Recurrent Neural Network (RNN) based surrogate models constitute an emerging class of reduced order models of history-dependent material behavior. Recently, the authors have proposed an alternative RNN formulation that provides stress-responses independent of the time-discretization of the input-path, making it appropriate for integration into explicit finite element (FE) frameworks. Herein, we apply the same methodology to 2D and 3D datasets corresponding to the effective mechanical behavior of an aluminum alloy as obtained through Crystal Plasticity simulations. In both cases, we obtain reasonable approximations of the behavior using RNN models of size ranging from 5’000 to 100’000 parameters. We also develop a methodology to reduce observed numerical instabilities of the finite element implementations.
Article
Many numerical models for simulating freezing and thawing phenomena of soil have been developed due to emerging geotechnical issues in cold regions. In particular, coupled thermo‐hydro‐mechanical (THM) analysis is used to evaluate complicated deformation, thermal, and moisture transport behavior of freezing–thawing soils. This study proposes a soil‐freezing characteristic curve (SFCC) that is robust and adaptive with various computational frameworks, including the THM approach. The proposed SFCC can also account for different soil types by incorporating the particle size distribution. Here an automatic regression scheme is adopted to update the SFCC associated with deformation and thermal changes. In addition, a smoothing algorithm is adopted to prevent a sharp change of the SFCC due to phase transition between the liquid water and crystal ice. Based on experimental works in the literature, the applicability of our model is demonstrated when the initial water contents and soil particle distribution differ. We further investigate the performance of the proposed SFCC as a constitutive model within a simplified THM framework. Our results show that the proposed model captures the desired behavior of different soil types in the freezing process, such as freezing temperature depreciation, the effect of compaction, and mechanical loading on unfrozen water content.