Computer Methods in Applied Mechanics and Engineering manuscript No.
(will be inserted by the editor)
Geometric learning for computational mechanics Part II: Graph embedding
for interpretable multiscale plasticity
Nikolaos N. Vlassis · WaiChing Sun
Received: November 7, 2022 / Accepted: date
Corresponding author: WaiChing Sun, Ph.D., Associate Professor, Department of Civil Engineering and Engineering Mechanics, Columbia University, 614 SW Mudd, Mail Code: 4709, New York, NY 10027. Tel.: 212-854-3143, Fax: 212-854-6267, E-mail: wsun@columbia.edu
Abstract The history-dependent behaviors of classical plasticity models are often driven by internal variables evolved according to phenomenological laws. The difficulty in interpreting how these internal variables represent a history of deformation, the lack of direct measurement of these internal variables for calibration and validation, and the weak physical underpinning of those phenomenological laws have long been criticized as barriers to creating realistic models. In this work, geometric machine learning on graph data (e.g. finite element solutions) is used as a means to establish a connection between nonlinear dimensional reduction techniques and plasticity models. Geometric learning-based encoding on graphs allows the embedding of rich time-history data onto a low-dimensional Euclidean space such that the evolution of plastic deformation can be predicted in the embedded feature space. A corresponding decoder can then convert these low-dimensional internal variables back into a weighted graph such that the dominating topological features of plastic deformation can be observed and analyzed.
Keywords graph convolutional neural network, internal variables, plasticity, machine learning in mechanics
1 Introduction
The composition of a macroscopic plasticity model often requires the following steps. First, there are observations of causality relations deduced by modelers to hypothesize mechanisms that lead to the plastic flow. These causality relations, along with constraints inferred from physics and universally accepted principles, lead to mathematical equations. For instance, the family of Gurson models uses the observation of void growth to formulate the yield surface (Gurson, 1977). Crystal plasticity models relate the plastic flow with slip systems to predict the anisotropic responses of single crystals (Rice, 1971; Uchic et al., 2004; Clayton, 2010; Ma and Sun, 2020; Ma et al., 2021). Granular plasticity models propose theories that relate the fabric of force chains and porosity to the onset of plastic yielding and the resultant plastic flow (Cowin, 1985; Sun, 2013; Kuhn et al., 2015; Wang and Sun, 2016, 2018; Sun et al., 2022). Finally, the mathematical equations are then either used directly in engineering analysis and design (e.g. the Mohr-Coulomb envelope) or are incorporated into a boundary value problem in which the approximate solution can be obtained from a partial differential equation solver that provides incremental updates of stress-strain relations.
However, a subtle but significant limitation of this paradigm is that it imposes on modelers the burden of describing the mechanisms verbally via terminologies or atomic facts (cf. Daitz (1953); Griffin (1964)) before they can convert the theoretical ideas into mathematical equations or computer algorithms. Our perception of mechanics is therefore limited by our language or, more precisely, the limited availability of descriptors that enable us to record hypotheses and proposed causalities (Auer et al., 2018; Wang et al., 2019, 2021; Sun et al., 2022). What if there exists a mechanism, or more precisely an evolution of spatial patterns, that dominates the macroscopic responses, but there is not yet a proper set of terminologies to describe it? What happens if the underlying mechanisms for plastic deformation or other path-dependent behaviors cannot be described in a manner that is simultaneously elegant, precise, and sufficient?
Note that macroscopic plasticity is often formulated based on the assumption that there exists an effective medium, homogenized at the macroscopic scale, that yields constitutive responses comparable to the real material. As such, not only the volume-averaged physical quantities, such as porosity and dislocation density, but also how these history-dependent patterns evolve over space and time may affect the overall constitutive responses. The evolution of patterns, however, is often more difficult to describe precisely in mathematical terms than the volume-averaged or homogenized physical quantities (such as dislocation density and porosity), and hence less likely to be incorporated into constitutive laws directly. The consequence is that, while many models can accurately capture the essence and general trend of the plasticity at the macroscopic scale, the lack of predictions of the underlying spatial patterns within the representative elementary volume makes it difficult to further improve the accuracy of constitutive models due to the insufficient precision afforded by the existing descriptors.
The classical alternative to bypass this issue is to introduce additional internal variables and the corresponding phenomenological evolution laws for these internal variables such that, as Rice (1971) points out, the internal arrangement not describable through explicit state variables can be captured (cf. Section 2.4 (Rice, 1971; Dafalias and Popov, 1976)). For instance, Karapiperis et al. (2021) criticize the implicit and ad-hoc nature of the internal variables and propose the model-free paradigm as a potential alternative. This paradigm could be feasible if new data could be generated on demand in a cost-efficient manner, or if the existing database had sufficiently dense data points to capture all types of path-dependent behaviors of an RVE. However, the curse of dimensionality, the lack of a proper distance structure, and the demand for a large amount of data could all be obstacles to a model-free inelasticity solver, as demonstrated, for instance, in He and Chen (2020); Eggersmann et al. (2019); Carrara et al. (2020). A second alternative to bypass the use of internal variables is to introduce a concurrent (e.g. Fish et al. (2007); Sun and Mota (2014); Sun et al. (2017)) or hierarchical multiscale scheme (e.g. Yvonnet and He (2007); Geers et al. (2010); Liu et al. (2016b); Sun and Wong (2018); He et al. (2022)). The trade-off of these multiscale schemes is the additional cost required to upscale the constitutive responses through computational homogenization. As such, the training of a neural network (cf. Ghaboussi et al. (1991)) or other alternatives such as Gaussian processes (e.g. Fuhg et al. (2022); Frankel et al. (2022)) can provide a cost-saving way to generate surrogates that bypass the on-the-fly computational homogenization. Nevertheless, without a set of internal variables to represent history in a low-dimensional space, how and whether history-dependent effects are sufficiently captured becomes ambiguous. For instance, early machine learning constitutive models, which are also free of internal variables, may simply employ a few previous incremental strains as inputs to introduce a path-dependent effect (Ghaboussi et al., 1991; Lefik et al., 2009). However, determining the optimal incremental steps involved in the predictions for a given loading rate is not trivial. Wang and Sun (2018) and later Mozaffar et al. (2019) introduce long short-term memory (LSTM) networks to incorporate path dependence for constitutive laws while avoiding the vanishing gradient issues that might otherwise be encountered in the training. However, adversarial examples generated from deep reinforcement learning (cf. Wang et al. (2021)) have revealed the difficulty of generating a robust extrapolation of predictions outside the sampling ranges.
This paper presents a new alternative to introducing evolution laws of internal variables to solve elastoplasticity problems, with a new focus on connecting the macroscopic internal variables with the spatial patterns of plastic deformation at the sub-scale level through machine learning. Instead of introducing internal variables that are, as stated in Rice (1971), not explicitly describable, our goal here is to create a generic framework where a graph convolutional neural network may deduce internal variables that can double as low-dimensional descriptors of the microstructural patterns of a representative elementary volume. Here, we consider a multiscale modeling problem in which high-fidelity finite element simulations of microstructures or digital image correlation generate weighted graph data. By using the encoder of a graph-based autoencoder to embed the plastic deformation data stored in a finite element mesh onto a low-dimensional vector space, we introduce an autonomous approach to represent strain history in a low-dimensional Euclidean space where evolution laws can then be deduced. Meanwhile, the graph decoder maps this low-dimensional vector space back to the weighted graph that represents the plastic patterns and therefore illustrates the dominant patterns that govern the geometry of the yield surface and the evolution of the plastic flow (DeMers and Cottrell, 1992; Hinton and Salakhutdinov, 2006; Xu and Duraisamy, 2020; He et al., 2021; Bridgman et al., 2022).
The rest of the paper is organized as follows. In Section 2, we discuss the representation of plastic distribution patterns as graphs, the graph autoencoder architecture that will be utilized to generate the encoded graph-based descriptors, as well as how they will be incorporated into the neural network elastoplastic constitutive model to make forward interpretable predictions. In Section 3, we describe the generation of the elastoplastic database of graphs for complex microstructures and the training of the graph autoencoder, as well as conduct parametric studies on the robustness of the architecture. In Section 4, we investigate the forward prediction capacity of the proposed neural network constitutive model, compare it with recurrent neural networks from the literature, and discuss the ability to decode the predicted graph-based descriptors to interpret the behavior in the microscale. Section 5 provides concluding remarks.
2 Geometric learning for graph-based multiscale plasticity descriptors

In this section, our objective is to explain how we generate graph-based internal variables to represent complex elastoplastic microstructures and how these machine learning-generated internal variables are incorporated into a component-based neural network plasticity model (Vlassis and Sun, 2022). In Section 2.1, we discuss the interpretation of the plastic distribution of a microstructure as a node-weighted undirected graph. In Section 2.2, we describe the graph convolutional filter that extracts topological information from these graphs and the overall graph autoencoder architecture for the generation of the encoded feature vector internal variables. In Section 2.3, we discuss the constitutive model components that comprise the neural network elastoplastic model. Finally, in Section 2.4, we present the return mapping algorithm that will interpret these model components to make forward interpretable elastoplastic predictions.
2.1 Low-dimensional representation of internal history variables in a finite element mesh

In finite element simulations, there is no direct access to the internal variables as a smooth field. Constitutive updates only occur at the locations of integration points. While a projection may enable the construction of smooth field data suitable for supervised learning conducted via a convolutional neural network, such a projection may introduce errors, especially when strain localization occurs (Mota et al., 2013; Na et al., 2019; Ma and Sun, 2022). As such, we use a collection of undirected weighted graphs as an alternative to represent the patterns of the plastic deformation in the RVE and bypass the need for reconstructing a smooth field in the Euclidean space.

Consider the body of a representative elementary volume discretized in finite elements. In each finite element, there is at least one integration point where the constitutive update is carried out. In our numerical examples, we consider a two-dimensional triangular mesh with one integration point per element. In this case, the integration points of the finite elements constitute the vertex set $V$, and the position and the plastic strain constitute the weight $w$. Finally, we connect a pair of graph nodes with an edge if and only if the two finite elements of this pair of integration points share at least one finite element node (see Fig. 1). Repeating this process connects all integration points $V$ across elements with an adjacent edge set $E$ to form an undirected graph $G = (V, E)$, a two-tuple that represents the topology of the integration points, where $V = \{v_1, \dots, v_N\}$ is the vertex/node set that represents the $N$ integration points of the mesh and $E \subseteq V \times V$ is the edge set that represents the element connectivity. The resultant undirected unweighted graph then represents the topology of the integration points and provides a data structure to store the history data used to generate internal variables.
We attempt to represent the irreversible/memory-dependent patterns stored in the vertices of the graph in a compressed low-dimensional space. In other words, we attempt to create a low-dimensional vector space $\mathbb{R}^{D_{enc}}$ where an element $\zeta_i \in \mathbb{R}^{D_{enc}}$ of this space may (1) represent the memory effects stored in the integration points of the graph $G$ and (2) constitute a valid internal variable, i.e., any admissible vector $\zeta$ must fulfill the following necessary conditions (Rice, 1971; Kestin and Rice, 1969).

- The plastic portion of the total strain is caused by the change in internal variables for a fixed stress (and temperature).
- The elastic portion of the total strain is caused by the change of stress (and/or temperature) while holding the internal variables fixed.
The dimension of the latent space of the vertex-weighted graph $G'$, denoted as $D_{enc}$, is a hyperparameter whose optimal value may vary on a case-by-case basis but can be fine-tuned through trial and error, as shown in Section 4.

For a multiscale model where a surrogate model is constructed to replace the direct numerical simulations at the RVE level and provide the constitutive updates, a feasible candidate is to consider the coordinates of the integration points and the plastic strain within the RVE as the vertex weights to form a vertex(node)-weighted graph, a 3-tuple $G' = (V, E, w)$. The vertex weight of a vertex-weighted graph $w : V \to \mathbb{R}^D$ maps an element of the vertex set onto a vector weight, where $D = 5$ for two-dimensional cases and $D = 9$ for three-dimensional cases. The dimension of the vertex weight vector here is the sum of the dimension of the position vector and the number of independent components of the plastic strain tensor (e.g., $2 + 3 = 5$ in two dimensions).
We then run direct numerical simulations of the RVE to collect snapshots of this vertex-weighted graph under different loading cases and prescribed boundary conditions of the RVE, with a coverage requirement similar to those used for constructing orthogonal bases for reduced-order simulations via the method of snapshots (Rowley and Marsden, 2000; Zhong and Sun, 2018; Zhong et al., 2021). Assuming that the sampling of the path-dependent behaviors is sufficient, the next step is to establish a pair of mappings between these weighted graphs and their low-dimensional representation. This pair is obtained by training a graph autoencoder in which the encoder maps onto a vector space spanned by the collection of encoded feature vectors $\zeta_i$ and the decoder maps a given vector $\zeta$ onto a weighted graph that can, in turn, be interpreted via a finite element mesh (see Fig. 1). As such, the autoencoder is used as a tool to perform nonlinear dimensionality reduction. The architecture of this graph autoencoder is described in Section 2.2.
[Figure 1 schematic: finite element mesh and graph representation, with labels for the $i$-th and $j$-th elements, an integration point, and a mesh node.]
Fig. 1: Interpretation of a finite element mesh as an undirected graph. The nodes of the graph represent the
integration points of the mesh. Two nodes are connected with an edge when the integration points they
represent are from neighboring elements - their respective triangular elements share at least one vertex.
Remark 1 In this work, the plasticity graphs can be sufficiently represented with two sets of data stored in matrices. The first set of data is stored in the adjacency matrix $A$, which records the connectivity of the graph. The adjacency matrix has dimensions $N \times N$. If nodes $n_i$ and $n_j$ of the connectivity graph are the endpoints of an edge, then the entry $A_{ij} = 1$; otherwise, $A_{ij} = 0$. The second set of data is represented by a node feature matrix $X$ that has dimensions $N \times D$, where $D$ is the number of features for each node in the vertex set with $N$ elements. Each node in the plasticity graphs has a feature vector of length $D$, which can be viewed as the graph weight of the corresponding node. Each row of the node feature matrix represents the features of a node. These two data structures will also be the actual input of the graph autoencoder described in the following section.
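As a concrete illustration, the sketch below assembles the adjacency matrix $A$ and node feature matrix $X$ from a triangular mesh with one integration point per element. It is a minimal sketch, not the authors' implementation; the array layout and function name are assumptions.

```python
import numpy as np

def mesh_to_graph(elements, centroids, plastic_strain):
    """Assemble A (N x N) and X (N x D) of the plasticity graph.

    elements:       (N, 3) mesh-node indices of each triangular element
    centroids:      (N, 2) integration-point coordinates
    plastic_strain: (N, 3) independent plastic strain components (2D case)
    """
    N = elements.shape[0]
    A = np.zeros((N, N), dtype=int)
    # Two graph nodes are adjacent iff their elements share >= 1 mesh node.
    for i in range(N):
        for j in range(i + 1, N):
            if np.intersect1d(elements[i], elements[j]).size > 0:
                A[i, j] = A[j, i] = 1
    # Node features: 2 coordinates + 3 strain components -> D = 5 in 2D.
    X = np.hstack([centroids, plastic_strain])
    return A, X
```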
2.2 Graph autoencoder for the generation of graph-based internal variables

The plasticity graphs described in the previous section will be the basis for the generation of the internal variables that characterize the plastic behavior at the macroscale. With the recent increased demand for geometric learning on graph and manifold data, many applications of autoencoders have expanded to the non-Euclidean space. Similar to the autoencoder designed for Euclidean data, a graph autoencoder can perform a variety of common tasks, such as latent representation and link prediction (Kipf and Welling, 2016b), graph embedding (Pan et al., 2018), and clustering (Wang et al., 2017). This section focuses on how a graph autoencoder architecture can be used to generate internal variables that are directly connected to the spatial patterns of an RVE. Autoencoder architectures (DeMers and Cottrell, 1992; Hinton and Salakhutdinov, 2006; Vincent et al., 2008) have been used on high-dimensional data structures to perform non-linear unsupervised dimensionality reduction. Autoencoder architectures consist of two parts, an encoder and a decoder. The encoder compresses the high-dimensional structure into a latent space of a much smaller dimensionality, extracting features and patterns from the structure, represented as an encoded feature vector. The decoder decompresses the encoded feature vector back to the original higher dimension, attempting to accurately reconstruct the autoencoder input. Convolutional autoencoders in the Euclidean space have been widely used in applications on images to perform, among others, dimensionality reduction, classification, resolution increase, and denoising tasks (Zeng et al., 2015; Lore et al., 2017; Chen et al., 2017; He et al., 2021).
[Figure 2 schematic: Original Plasticity Graph → Graph Encoder (GIN Convolution, Global Pooling, Dense) → Encoded Feature Vector → Graph Decoder (Dense, GIN Convolution) → Reconstructed Plasticity Graph.]
Fig. 2: Graph autoencoder neural network architecture. The autoencoder encodes the original node-
weighted plasticity graph into an encoded feature vector. The encoded feature vector is then decoded
to a reconstructed plasticity graph.
In this work, the plasticity simulation data are processed through a graph autoencoder $\mathcal{L}$ that aims to encode and reconstruct the plastic strain features at the graph nodes. The autoencoder inputs a plasticity graph $G$ as described in Section 2.1 and outputs its reconstruction $\hat{G} = \mathcal{L}(G)$. The capacity of the autoencoder is two-fold and is driven by the two parts that constitute the architecture, the encoder $\mathcal{L}_{enc}$ and the decoder $\mathcal{L}_{dec}$. The primary objective of the encoder is to represent, or encode, the graph structure so that it can be easily exploited by other machine learning models, such as the yield function and kinetic law architectures described in Section 2.3. The encoder interpolates the node feature data (topology and plastic strain tensor) that represent the entire domain of the simulation into a vector space of much smaller dimensionality. These encoded feature vectors represent the patterns recorded during the elastoplastic loading and can be utilized as internal variables for data-driven macroscopic elasto-plastic models. The neural network components that constitute the constitutive update algorithm are described in Section 2.3.
The decoder decodes these encoded internal variables and reconstructs the distribution of the plastic strain within the RVE represented by the corresponding decoded weighted graph. As such, the reconstructed graph node values correspond to the plastic strain values of the integration points in the finite element mesh. This connection between internal variables and graph data makes the evolution of the internal variables interpretable through decoding. As shown in the latter sections, a salient feature of the proposed framework is the capacity to indirectly predict the evolution of the high-dimensional pattern rearrangement at the microscopic scale through the kinetic law of the low-dimensional encoded feature vector. This treatment opens a new door to predicting spatial patterns of the plastic strain that are also compatible with the physical constraints of the up-scaled constitutive laws (e.g. consistency condition, incremental stress correction from plastic flow, Karush-Kuhn-Tucker conditions) upon homogenization.
2.2.1 Graph autoencoder architecture

In recent years, graph neural network (GNN) techniques (Hamilton et al., 2017b) based on node-neighborhood aggregation and graph-level pooling (Scarselli et al., 2008; Kipf and Welling, 2016a; Hamilton et al., 2017a; Defferrard et al., 2016) have become increasingly popular tools to embed graph data. In general, graph neural network layers utilize recursive aggregation or message passing methods to aggregate and interpolate the feature vectors of a node and its neighbors to compute a new feature vector. A representation of the entire graph feature can be achieved by global pooling or other graph-level operators (Li et al., 2015; Zhang et al., 2018). These graph layers extract information from the graph connectivity and graph features to be learned/encoded by other neural network layers.

The graph convolutional layers present in this work are Graph Isomorphism Network (GIN) layers introduced by Xu et al. (2018). This variant of the GNN was shown to discriminate/represent graph structures as well as the Weisfeiler-Lehman graph isomorphism test (Leman and Weisfeiler, 1968). The Weisfeiler-Lehman test checks the isomorphism between two graphs; it tests whether the graphs are topologically identical. This is achieved by a robust injective aggregation algorithm update that maps different node neighborhoods to different unique feature vectors. To maximize its representation capacity, a graph aggregation algorithm should be injective and not map two different neighborhoods of nodes to the same representation. The Graph Isomorphism Network borrows these aggregation concepts and generalizes the Weisfeiler-Lehman test. The GIN architecture attempts to maximize the capacity of graph neural networks by ensuring that two isomorphic graph structures are embedded in the same representation and two non-isomorphic ones in different representations.
The GIN layer models an injective neighborhood aggregation of multisets (features of a neighborhood of nodes) by approximating multiset functions with neural networks. The layer's formulation is based on the assumption that a neural network can serve as a universal approximator of any function, as shown by Hornik et al. (1989), and, thus, can also approximate a parametrized aggregation function. The GIN layer formulation is the following:

$$h_v^{(k)} = \text{MLP}^{(k)}\left( \left(1 + \epsilon^{(k)}\right) \cdot h_v^{(k-1)} + \sum_{u \in \mathcal{N}(v)} h_u^{(k-1)} \right), \tag{1}$$

where $h_v^{(k)}$ is the feature representation of node $v$ at the $k$-th layer of the architecture, $\epsilon$ is a learnable parameter or a fixed scalar, $\mathcal{N}(v)$ is the neighborhood of node $v$, and MLP is the multi-layer perceptron that learns the aggregation function approximation. The layer can also be formulated in terms of the adjacency matrix $A$ and the feature matrix $X$:

$$X^{(k)} = \text{MLP}^{(k)}\left( \left(A + (1 + \epsilon) \cdot I\right) \cdot X^{(k-1)} \right). \tag{2}$$
This graph operator extracts features from local neighborhoods in the plasticity graph structure. In order to obtain a graph-level representation, we also utilize a graph-level pooling operation on the features of all of the nodes. The pooling operation used was a global average pooling defined as:

$$r^{(k)} = \frac{1}{N_i} \sum_{n=1}^{N_i} h_n^{(k-1)}, \tag{3}$$

where $r^{(k)}$ is the global pooled feature vector of the graph and $N_i$ is its number of nodes. This global feature vector will be learned and encoded by the following multi-layer perceptron architecture to create the encoded feature vector of the graph.
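To make the two operators concrete, the sketch below evaluates Eq. (2) and Eq. (3) directly in their dense-matrix form; it is a minimal sketch with an assumed single-layer ReLU MLP, not the trained architecture.

```python
import numpy as np

def gin_layer(A, X, W, b, eps=0.0):
    # Eq. (2): aggregate each node with its neighbors, (1 + eps) * h_v + sum_u h_u,
    # then push the aggregate through the layer MLP (here one ReLU layer).
    agg = (A + (1.0 + eps) * np.eye(A.shape[0])) @ X
    return np.maximum(agg @ W + b, 0.0)

def global_average_pool(H):
    # Eq. (3): graph-level representation as the mean over all node features.
    return H.mean(axis=0)
```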
A schematic of the neural network architecture used on the plasticity graphs is provided in Fig. 2. The neural network consists of GIN convolutional, graph pooling, and fully-connected layers. The network's inputs include a graph adjacency matrix $A$ and a node feature matrix $X$, as described in Section 2.1. The network's output is the approximated reconstruction of the node feature matrix $\hat{X}$. All the graphs in the data set have a constant connectivity matrix, as the connectivity originates from the finite element mesh. The node features of the graph correspond to the plastic strain distribution at the current time step and change during loading. Thus, the autoencoder maps the input graph 3-tuple $G = (V, E, w)$ to the reconstructed graph $\hat{G} = (V, E, \hat{w})$.
The encoder $\mathcal{L}_{enc}$ architecture compresses the high-dimensional graph into an encoded feature vector. The graph adjacency matrix $A$ and the node feature matrix $X$ are input into a GIN layer with 64 output channels/filters; the dimensionality of the feature representation for every node is increased from $D$ to 64. This is achieved by setting up a fully-connected layer of 64 neurons that serves as the MLP of the aggregation function, as shown in Eq. (2). The activation function for this layer is set to be the Rectified Linear Unit function (ReLU). The constant $\epsilon$ is fixed and equal to zero. The convolutional layer is followed by a global average pooling layer, as described in Eq. (3), to generate a global representation of 64 features. This is fed into a fully-connected Dense layer of 64 neurons with a ReLU activation function. The output is connected to another Dense layer that produces the encoded feature vector of dimension $D_{enc}$.
The decoder $\mathcal{L}_{dec}$ architecture decompresses the encoded feature vector back to the original graph space. The first layer of the decoder is a Dense layer with $N \cdot D_{enc}$ neurons, where $N$ is the number of nodes in the graph, with a ReLU activation function. The output of this layer is reshaped to form an $N \times D_{enc}$ feature matrix. This feature matrix is passed to a GIN convolutional layer along with the adjacency matrix $A$. The GIN layer has a number of filters equal to $D$; the MLP approximator of the aggregation is a Dense layer with a width of $D$ neurons. The layer has a linear activation function. The output of the decoder is the approximated reconstructed feature matrix $\hat{X}$ of the graph.
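Putting the encoder and decoder together, the following PyTorch Geometric sketch mirrors the layer sequence described above for a single fixed-mesh graph. It is a minimal sketch under stated assumptions (batch size of one, illustrative class and variable names), not the authors' released code.

```python
import torch
from torch.nn import Linear, ReLU, Sequential
from torch_geometric.nn import GINConv, global_mean_pool

class GraphAutoencoder(torch.nn.Module):
    def __init__(self, N, D, D_enc):
        super().__init__()
        self.N, self.D_enc = N, D_enc
        # Encoder: GIN (D -> 64, ReLU MLP, eps = 0), pooling, two Dense layers.
        self.conv_enc = GINConv(Sequential(Linear(D, 64), ReLU()))
        self.dense1 = Linear(64, 64)
        self.dense2 = Linear(64, D_enc)
        # Decoder: Dense (D_enc -> N * D_enc, ReLU), linear GIN (D_enc -> D).
        self.dense_dec = Linear(D_enc, N * D_enc)
        self.conv_dec = GINConv(Linear(D_enc, D))

    def encode(self, x, edge_index, batch):
        h = self.conv_enc(x, edge_index)
        r = global_mean_pool(h, batch)               # Eq. (3)
        return self.dense2(torch.relu(self.dense1(r)))

    def decode(self, zeta, edge_index):
        h = torch.relu(self.dense_dec(zeta))
        h = h.reshape(self.N, self.D_enc)            # assumes one graph per batch
        return self.conv_dec(h, edge_index)          # reconstructed X_hat

    def forward(self, x, edge_index, batch):
        return self.decode(self.encode(x, edge_index, batch), edge_index)
```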
The reconstruction loss function for the autoencoder is set up to minimize the discrepancy between the input node feature matrix $X$ and the output approximated feature matrix $\hat{X}$ of every graph sample. Since the coordinates of the nodes remain constant, the loss function targets the accuracy of the predicted plastic strain tensor at the nodes. The loss function is modeled after a node-wise mean squared error of the features. The autoencoder function is parametrized by weights $W$ and biases $b$ such that $\mathcal{L} = \mathcal{L}(A, X \,|\, W, b)$, and the training objective can be defined as:

$$W', b' = \underset{W,b}{\operatorname{argmin}} \left( \frac{1}{M} \sum_{k=1}^{M} \frac{1}{N} \sum_{l=1}^{N} \left\| \boldsymbol{x}^{pl}_{k,l} - \hat{\boldsymbol{x}}^{pl}_{k,l} \right\|_2^2 \right), \tag{4}$$

where $M$ is the number of graph samples in the data set, $N$ is the number of graph nodes, and $\boldsymbol{x}^{pl}_{k,l}$, $\hat{\boldsymbol{x}}^{pl}_{k,l}$ are the true and approximated plastic strain tensor components of node $l$ of graph $k$, respectively.
The optimization of the autoencoder weights and biases is performed using the Adam optimizer (Kingma and Ba, 2014) with the learning rate set to $10^{-3}$ and the rest of the parameters set to the PyTorch defaults. The layers' kernel weight matrices and bias vectors were initialized using the default He normal initialization (He et al., 2015). The autoencoder was trained for 2000 epochs with a batch size of 20 graph samples. The graph autoencoder architecture is built and optimized using the PyTorch neural network library (Paszke et al., 2019) and its geometric learning extension PyTorch Geometric (Fey and Lenssen, 2019).
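For reference, a training loop consistent with the setup above could look like the following minimal sketch, assuming `model` is the autoencoder and `loader` yields PyTorch Geometric batches with fields `x`, `edge_index`, and `batch`; both names are placeholders.

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam, lr = 1e-3
mse = torch.nn.MSELoss()

for epoch in range(2000):                 # 2000 epochs, batch size 20
    for data in loader:
        optimizer.zero_grad()
        x_hat = model(data.x, data.edge_index, data.batch)
        loss = mse(x_hat, data.x)         # node-wise reconstruction error, Eq. (4)
        loss.backward()
        optimizer.step()
```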
2.3 Supervised learning problems for the graph-enhanced constitutive model

In this work, we propose the learning problems required to establish the components of the constitutive laws that are integrated together to perform elastoplasticity predictions, including the prediction of the encoded feature vector internal variables described in Section 2.2, which will be used to interpret the spatial patterns of the RVE. The formulation of the Sobolev learning problem for the elasticity model component is omitted for brevity and to avoid repetition (see Vlassis et al. (2020); Vlassis and Sun (2021b); Vlassis et al. (2022) for more implementation details). A brief discussion of the training of this model for a specific RVE is provided in Section 4.1. The reader may also refer to Vlassis et al. (2020) for an implementation of a geometric learning elasticity model intended to predict anisotropic elasticity for a family of RVEs with different microstructures.
The plastic components of the neural network-based elastoplasticity framework will be driven by three neural networks, i.e.,

1. a feed-forward neural network-based yield function that predicts the plastic yielding given the current stress and loading history;
2. a recurrent neural network-based kinetic law that infers the microstructural evolution represented by the encoded feature vector from the history of macroscopic plastic strain;
3. a feed-forward neural network that predicts the macroscopic non-associative plastic flow based on the rate of change of the encoded feature vector.

These three components will be integrated along with the hyperelastic energy functional using a strain-space return mapping algorithm to predict the material's macroscopic elastoplastic behavior, as described in Section 2.4. What follows is the formulation of the learning problems that generate these neural networks.
2.3.1 Feed-forward neural network-based yield function

In this work, we adopt the level set plasticity concept (Vlassis and Sun, 2021b,a) in which a neural network is used to generate the yield function inferred from a set of homogenized stress point data collected via microscale simulations. As such, we circumvent the need to hand-craft a yield function model by replacing the construction of the yield function with a neural network level set initialization problem. To generate the yield function, every stress point in the yield surface point cloud at each time step is paired with an encoded feature vector obtained at the same instant of the RVE simulations. The establishment of a yield function will distinguish the elastic path-independent responses from the plastic path-dependent counterparts. In addition, we also impose a restriction such that the evolution of the encoded feature vector internal variables may only occur during the plastic loading and stop upon elastic unloading. Hence, the interpreted plastic distribution in the microscale is allowed to evolve only during a plastic increment.

The yield function neural network inputs the stress state and the current accumulated plastic strain. Presumably, one may also establish a yield function in terms of stress and encoded feature vectors, where the evolution of the encoded feature vector is captured by a recurrent neural network. However, the construction of such a yield function can be complicated due to the high dimensionality.
As such, we borrow ideas from generalized plasticity models (Pastor et al., 1990) where we introduce a yield function in a low-dimensional parametric space but undertake a more elaborate effort to capture the complexity of the plastic flow by predicting the path-dependent relationships among (1) the homogenized plastic strain, (2) the encoded feature vectors that represent the dominating plastic strain patterns of the RVE, and (3) the resultant plastic flow. As such, the yield function is only represented in the two-stress-invariant $p$-$q$ space. Thus, we reduce the stress representation from six dimensions (symmetric stress tensor) to an equivalent two-dimensional representation $x(p, q)$, where $p$ is the mean pressure and $q$ the deviatoric stress invariant of the Cauchy stress tensor.
The accumulated plastic strain and the stress invariants can be seen as the coordinates of a point cloud of yield surface samples $f_\Gamma$. In previous work, the yield stress point cloud collected from experiments would be pre-processed into a yield function level set $\phi$. The evolution of this level set (hardening) would be predicted through a neural network that emulates the solution of the Hamilton-Jacobi level set extension problem. The incremental solutions of this problem are the evolving level set taken at a given monotonically increasing pseudo-time $t$, which in our case is the accumulated plastic strain internal variable $\xi$. The yield function instance $f_n$ that corresponds to a plastic strain level $\bar{\epsilon}^p_n$, and in turn to an encoded feature vector $\zeta_n$, would be the level set solution $\phi_n$ of the Hamilton-Jacobi problem.

To generate the neural network yield function, we first recast the yield function $f$ into a signed distance function $\phi$, such that $f(p, q, \xi) = \phi(p, q, \xi)$. We define a neural network approximation of the level set yield function $f$ as $\hat{f} = \hat{f}(p, q, \xi \,|\, W, b)$, parametrized by weights $W$ and biases $b$.
The training objective of the neural network optimization is modeled after the minimization of the $L^2$ norm over the yield surface points in the data set and the Eikonal equation solution that reads $|\nabla_x \phi| = 1$, while prescribing the signed distance function to be 0 at $x \in f_\Gamma$. Minimizing the Eikonal equation loss term will implicitly ensure the construction of the yield function level set. The training loss function at training samples $(x_i, \zeta_i)$ for $i \in [1, \dots, M]$ is defined as:

$$W', b' = \underset{W,b}{\operatorname{argmin}} \left( \frac{1}{M} \sum_{i=1}^{M} \left[ \left\| f_i - \hat{f}_i \right\|_2^2 + w \left( \left\| \nabla_x \hat{f}_i \right\|_2 - 1 \right)^2 \right] \right), \tag{5}$$
where $w$ is a weight hyperparameter for the Eikonal equation term. Unlike previous level set plasticity models (e.g. Vlassis and Sun (2021b)), which focus on using Sobolev training to bypass the use of the non-associative flow rule, this modeling framework predicts the history dependence of the plastic flow via a neural network non-associative flow rule. This treatment avoids the potential gradient conflict issues of multi-objective training and simplifies the neural network training (Bahmani and Sun, 2021). In this work, a feed-forward neural network is trained to predict how the current spatial patterns represented by the encoded feature vectors lead to the changes in plastic flow. Meanwhile, the encoded feature vector itself is predicted by another recurrent neural network that links the history of the homogenized plastic strain of a particular RVE to an encoded feature vector that represents the current plastic strain field within this RVE (see Section 2.3.3). Directly predicting the plastic flow from phenomenological observations between the evolution of hand-picked microstructural descriptors and the resultant macroscopic plastic flow is not a new idea. In fact, this treatment is commonly used in generalized plasticity models (Pastor et al., 1990; Dafalias and Manzari, 2004; Wang et al., 2016; Sun, 2013; Liu et al., 2016a). The key innovation here is the replacement of the hand-picked descriptors with a generic encoded feature vector approach that can efficiently reduce the dimensions of the topological information without losing the information that might otherwise compromise the accuracy of predictions.
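A loss of the form of Eq. (5) can be assembled with automatic differentiation; the sketch below is a minimal PyTorch version, assuming `f_hat` is a network over the stress-invariant coordinates $x = (p, q)$ concatenated with $\xi$, and that the Eikonal gradient is taken with respect to the stress coordinates only. The weight `w` and all names are illustrative.

```python
import torch

def yield_level_set_loss(f_hat, x, xi, f_true, w=0.1):
    # x: (M, 2) stress-invariant coordinates (p, q); xi: (M, 1) plastic strain.
    x = x.clone().requires_grad_(True)
    f_pred = f_hat(torch.cat([x, xi], dim=1))
    # Gradient of f_hat w.r.t. the (p, q) coordinates for the Eikonal term.
    grad_x = torch.autograd.grad(f_pred.sum(), x, create_graph=True)[0]
    data_term = ((f_true - f_pred) ** 2).mean()              # L2 misfit
    eikonal_term = ((grad_x.norm(dim=1) - 1.0) ** 2).mean()  # enforce |grad f| = 1
    return data_term + w * eikonal_term
```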
2.3.2 Recurrent neural network-based kinetic law for the encoded feature vector

In this work, the evolution of the encoded feature vector internal variables is characterized by a macroscopic kinetic law that takes the history of the homogenized plastic strains of an RVE as input and outputs the change of the encoded feature vector that represents the spatial patterns of the plastic strain field of this specific RVE. We assume no prior knowledge of the exact parameterization of this kinetic law. Instead, we obtain an approximation of this kinetic law by training a recurrent neural network that provides the solution of a dynamical system (Funahashi and Nakamura, 1993; Bailer-Jones et al., 1998).

Here, we hypothesize that (1) the mapping between the homogenized plastic strain and the encoded feature vector is surjective but not necessarily injective and that (2) the evolution of the encoded feature vector depends solely on the time history of the plastic strain tensor. The first condition is reasonable, as it is possible to have different plastic strain fields within the same RVE that can be homogenized to the same macroscopic plastic strain, while the second condition is adopted for simplicity.
As such, the trained recurrent neural network takes the history of the homogenized plastic strain tensor $\boldsymbol{\epsilon}^p_{\text{hist}} = \left[ \boldsymbol{\epsilon}^p_{n-\ell}, \dots, \boldsymbol{\epsilon}^p_{n-1}, \boldsymbol{\epsilon}^p_n \right]$ of length $\ell$ and outputs the encoded feature vector $\zeta_n$ at time step $n$. The relation between the plastic strain and the encoded feature vector is approximated by a neural network architecture defined as $\hat{\zeta} = \hat{\zeta}(\boldsymbol{\epsilon}^p_{\text{hist}})$ and parametrized by weights $W$ and biases $b$. The training objective for samples $\zeta_i$ for $i \in [1, \dots, M]$ is defined as:

$$W', b' = \underset{W,b}{\operatorname{argmin}} \left( \frac{1}{M} \sum_{i=1}^{M} \left\| \zeta_i - \hat{\zeta}(\boldsymbol{\epsilon}^p_{\text{hist},i}) \right\|_2^2 \right), \tag{6}$$

where the back-propagation occurs through time such that the history of the plastic strain tensor is trained as time series data.
The kinetic law neural network is utilized in the return mapping algorithm described in Section 2.4 to predict the change of the encoded feature vector at every constitutive update. Since the encoded feature vector is designed to change only during a plastic step, the kinetic law is only needed when the material is yielding. The kinetic law neural network is used in parallel with the yield function neural network described in the previous section in the return mapping scheme to make forward predictions that are consistent with the current plastic strain. At every step of the elastoplastic simulation, be it in the elastic or the plastic regime, there is access to the encoded feature vector, which can in turn be decoded into the original plasticity graph space to interpret the plastic strain distribution of the material. A demonstration and discussion of this capacity are provided in Section 4.3. The predicted encoded feature vector will also be utilized to predict the plastic flow direction of the current plastic step.
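A recurrent architecture fitting this description could be sketched as follows; the LSTM width is an assumed hyperparameter, and the class is illustrative rather than the authors' exact network.

```python
import torch

class KineticLawRNN(torch.nn.Module):
    """Map a plastic strain history (l steps, 3 independent components in 2D)
    to the encoded feature vector, as in the objective of Eq. (6)."""
    def __init__(self, strain_dim=3, hidden=64, D_enc=16):
        super().__init__()
        self.lstm = torch.nn.LSTM(strain_dim, hidden, batch_first=True)
        self.out = torch.nn.Linear(hidden, D_enc)

    def forward(self, eps_p_hist):        # shape: (batch, l, strain_dim)
        h, _ = self.lstm(eps_p_hist)
        return self.out(h[:, -1, :])      # zeta_n at the current time step
```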
2.3.3 Feed-forward neural network-based multiscale coupling plastic flow

The last constitutive model component introduced in this work is a plastic flow neural network. This network maps the encoded feature vector internal variable to the current plastic flow direction in the macroscale. Assuming the existence of a plastic potential function $\vartheta$ and a non-associative plastic flow, the following relationship holds:

$$\dot{\boldsymbol{\epsilon}}^p = \dot{\lambda} \frac{\partial \vartheta}{\partial \boldsymbol{\sigma}}, \tag{7}$$

where $\lambda$ is the plastic multiplier. The plastic flow direction is not derived from the yield function stress gradient, to allow a more precise prediction of the plastic flow made possible by the newly gained knowledge of the microstructural evolution (represented by the kinetic law of the encoded feature vector). For the general 3D case, the stress gradient of the plastic potential can be spectrally decomposed such that:

$$\frac{\partial \vartheta}{\partial \boldsymbol{\sigma}} = \sum_{A=1}^{3} g_A \, \boldsymbol{n}^A \otimes \boldsymbol{n}^A \quad \text{for } A = 1, 2, 3, \tag{8}$$

where $g_A = \partial \vartheta / \partial \sigma_A$, $\sigma_A$ is the principal stress, and $\boldsymbol{n}^A$ the respective principal direction.

In this work, the plastic potential has not been explicitly reconstructed from the plastic flow, although that might be possible, as demonstrated in Vlassis et al. (2022). Instead, a generalized plasticity framework is adopted. As such, the mapping from the change of the encoded feature vector $\delta\zeta$ to the plastic flow $g$ is approximated by a neural network architecture defined as $\hat{g} = \hat{g}(\delta\zeta)$ and parametrized by weights $W$ and biases $b$. The training objective for samples $g_i$ for $i \in [1, \dots, M]$ is defined as:
$$W', b' = \underset{W,b}{\operatorname{argmin}} \left( \frac{1}{3M} \sum_{i=1}^{M} \sum_{A=1}^{3} \left\| g_{A,i} - \hat{g}_{A,i}(\delta\zeta_i) \right\|_2^2 \right), \tag{9}$$
where $\delta\zeta_i$ is the interpolated change of the encoded feature vectors, which can be approximated via the Backward or Forward Euler method. Note that, in contrast to the kinetic law that is approximated via a recurrent neural network (see Section 2.3.2), we assume that the mapping from the encoded feature vector to the macroscopic plastic flow can be sufficiently represented with a feed-forward neural network. This is reasonable because the encoded feature vector is used to represent the entire plastic strain field within an RVE, and the plastic flow can simply be inferred by homogenizing the graph data decoded from the encoded feature vector. Presumably, one may even bypass the training of the neural network plastic flow predictor by performing exactly this calculation. This learning problem is nevertheless introduced for ease of implementation. As demonstrated in Section 4.1, training a sufficiently accurate neural network predictor for plastic flow is feasible.
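For concreteness, the flow predictor and the Euler-type increment of the encoded feature vector could be sketched as below; the layer widths are assumptions, and $D_{enc} = 16$ follows the example of Section 3.2.

```python
import torch

# Feed-forward map from the encoded feature vector increment to the three
# principal plastic flow components g_A of Eq. (8); widths are illustrative.
flow_net = torch.nn.Sequential(
    torch.nn.Linear(16, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 3),
)

def delta_zeta(zeta_n, zeta_prev):
    # Backward-Euler-style approximation of the feature vector increment.
    return zeta_n - zeta_prev
```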
2.4 Return mapping algorithm

In this section, we provide the implementation details of the return mapping algorithm that makes the forward elastoplasticity predictions. The fully implicit stress integration algorithm allows for the incorporation of the graph-based internal variables generated from the autoencoder architecture, as described in Section 2.2, into the prediction scheme. The return mapping algorithm, designed in the principal strain space, is described in Algorithm 1.

This formulation of the return mapping algorithm requires all the strain and stress measures to be in principal axes. However, this does not limit the choice of strain and stress space formulation of the constitutive law components, as the framework allows for coordinate system transformation through automatic differentiation. The automatic differentiation is facilitated with the use of the Autograd library (Maclaurin et al., 2015). The yield function is written in a two-stress-invariant formulation $(p, q)$, as described in Section 2.3.1. Through automatic differentiation and a series of chain rules, the constitutive model predictions are expressed in the principal axes, allowing for any invariant formulation of the yield function during training.
The elastoplastic behavior is modeled through a predictor-corrector scheme that integrates the elastic prediction with the corrections by the yield function neural network. It is noted that the elastic update predictions for the hyperelastic energy functional and the plasticity terms encountered in the return mapping algorithm are evaluated as neural network predictions using the offline-trained energy functional, yield function, and kinetic law via the TensorFlow (Abadi et al., 2016) and Keras (Chollet et al., 2015) machine learning libraries. The hyperelastic neural network is based on the prediction of an energy functional with interpretable first-order and second-order derivatives (stress and stiffness, respectively). The plasticity neural networks utilized in this algorithm are described in Section 2.3. Besides the capacity to predict the values of the approximated functions, these libraries also allow for the automatic evaluation of the approximated functions' derivatives that are required to perform the return mapping constitutive updates and to construct the local Newton-Raphson tangent matrix, as well as the necessary coordinate set transformation chain rules.
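As a small illustration of this chain rule, the sketch below recovers the principal-stress gradient of a yield function trained in $(p, q)$ with the Autograd library; `f_hat` is a placeholder for the trained network evaluated as a differentiable function, and the invariant definitions assume the usual sign conventions.

```python
import autograd.numpy as np
from autograd import grad

def f_of_principal(sigma, xi, f_hat):
    # sigma: array of the three principal stresses.
    p = np.sum(sigma) / 3.0                        # mean stress invariant
    q = np.sqrt(1.5) * np.linalg.norm(sigma - p)   # deviatoric stress invariant
    return f_hat(p, q, xi)

# df/dsigma_A = (df/dp)(dp/dsigma_A) + (df/dq)(dq/dsigma_A), assembled by autograd.
df_dsigma = grad(f_of_principal, 0)
```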
Algorithm 1 Return mapping algorithm in strain space for encoded feature vector internal variable plasticity.

Require: hyperelastic energy functional neural network $\hat{\psi}^e$, yield function neural network $\hat{f}$, encoded feature vector neural network $\hat{\zeta}$, and plastic flow network $\hat{g}$.

1. Compute trial elastic strain:
   Compute $\boldsymbol{\epsilon}^{e\,tr}_{n+1} = \boldsymbol{\epsilon}^e_n + \Delta\boldsymbol{\epsilon}$.
   Spectrally decompose $\boldsymbol{\epsilon}^{e\,tr}_{n+1} = \sum_{A=1}^{3} \epsilon^{e\,tr}_A \, \boldsymbol{n}^{tr,A} \otimes \boldsymbol{n}^{tr,A}$.
2. Compute trial elastic stress:
   Compute $\sigma^{tr}_A = \partial\hat{\psi}^e / \partial\epsilon^e_A$ for $A = 1, 2, 3$ and the corresponding $p^{tr}$, $q^{tr}$ at $\boldsymbol{\epsilon}^{e\,tr}_{n+1}$.
3. Check yield condition and perform return mapping if loading is plastic:
   if $\hat{f}(p^{tr}, q^{tr}, \xi_n) \le 0$ then
      Set $\boldsymbol{\sigma}_{n+1} = \sum_{A=1}^{3} \sigma^{tr}_A \, \boldsymbol{n}^{tr,A} \otimes \boldsymbol{n}^{tr,A}$ and exit.
   else
      Compute the encoded feature vector $\zeta_n = \hat{\zeta}(\boldsymbol{\epsilon}^p_{\text{hist},n})$.
      Compute the plastic flow direction $\partial\hat{\vartheta}/\partial\sigma_A = \hat{g}(\delta\zeta_n)$ for $A = 1, 2, 3$.
      Solve for $\epsilon^e_1, \epsilon^e_2, \epsilon^e_3$, and $\xi_{n+1}$ such that $\hat{f}(p, q, \xi_{n+1}) = 0$.
      Compute $\boldsymbol{\sigma}_{n+1} = \sum_{A=1}^{3} (\partial\hat{\psi}^e/\partial\epsilon^e_A) \, \boldsymbol{n}^{tr,A} \otimes \boldsymbol{n}^{tr,A}$ and exit.
   end if
The return mapping also incorporates a non-associative flow rule to update the plastic flow direction instead of using the stress gradient of the yield function. This is achieved by incorporating the predictions of the plastic flow network $\hat{g}$ described in Section 2.3.3. Through the local iteration scheme, the solution for the true elastic strain values can be retrieved using the solved-for discrete plastic multiplier $\Delta\lambda$ and the predicted flow direction as follows:

$$\epsilon^e_A = \epsilon^{e\,tr}_A - \Delta\lambda \frac{\partial\hat{\vartheta}}{\partial\sigma_A} = \epsilon^{e\,tr}_A - \Delta\lambda \, \hat{g}_A(\widehat{\delta\zeta}), \quad A = 1, 2, 3. \tag{10}$$
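The local iteration of step 3 in Algorithm 1 can then be posed as a root-finding problem; below is a minimal sketch of the residual, assuming `psi_e_grad`, `f_hat`, and `g_hat` wrap the trained networks and that the accumulated plastic strain is updated as $\xi_{n+1} = \xi_n + \Delta\lambda$ (an assumed hardening update).

```python
import numpy as np

def residual(unknowns, eps_e_tr, xi_n, d_zeta, psi_e_grad, f_hat, g_hat):
    eps_e, dlam = unknowns[:3], unknowns[3]
    g = g_hat(d_zeta)                        # plastic flow direction, Eq. (8)
    r_strain = eps_e - eps_e_tr + dlam * g   # strain correction, Eq. (10)
    sigma = psi_e_grad(eps_e)                # principal stresses from psi^e
    p = sigma.sum() / 3.0
    q = np.sqrt(1.5) * np.linalg.norm(sigma - p)
    r_yield = f_hat(p, q, xi_n + dlam)       # consistency: f = 0 at the solution
    return np.concatenate([r_strain, [r_yield]])

# A root of `residual` (e.g., via scipy.optimize.root) yields the corrected
# principal elastic strains and the discrete plastic multiplier.
```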
The return mapping algorithm requires a hyperelastic energy functional neural network $\hat{\psi}^e$, a yield function $\hat{f}$, a kinetic law $\hat{\zeta}$, and a plastic flow $\hat{g}$ neural network that are pre-trained offline. Given the elastic strain tensor at the current loading step, a trial elastic stress state is calculated using the hyperelastic energy functional neural network. The yield condition is checked for the trial elastic stress state and the current plastic strain level. If the predicted yield function is non-positive, the trial stress is in the elastic region and is the actual stress; the encoded feature vector remains constant. If the yield function is positive, the trial stress is in the inadmissible stress region and a Newton-Raphson optimization scheme is utilized to correct the stress prediction. The current encoded feature vector is predicted from the time history of plastic strain tensors and is used to predict the current plastic flow directions. The goal of the return mapping algorithm is to solve for the principal elastic strains and the plastic strain such that the predicted yield function is equal to zero and the stress update is consistent with the plastic flow. The encoded feature vector at every step can be converted back into the corresponding weighted graph via the graph decoder $\mathcal{L}_{dec}$ neural network. This weighted graph can be converted back into information on a finite element mesh and therefore enables us to interpret the microstructure.
It is noted that the return mapping algorithm formulated in the principal directions is provided in this section for generality. This setting is sufficient for isotropic materials. In our numerical examples, we only introduce two-dimensional cases to illustrate the ideas for simplicity. The generalization of the return mapping algorithm to anisotropic materials is straightforward, but the training of the yield function and the plastic flow model in the higher-dimensional parametric space is not trivial. This improvement will be considered in the future but is out of the scope of this study.
3 Training of interpretable graph embedding internal variables

In this section, we demonstrate the procedure of embedding field simulation data to construct graph-based internal variables. A graph convolutional autoencoder is used to compress the graph structures that carry the plastic deformation distribution of a microstructure. In Section 3.1, we demonstrate the process of generating the plasticity data through finite element method (FEM) simulations and post-processing them into weighted graph structures. In Section 3.2, we showcase the performance of the autoencoder architecture as well as its ability to reproduce the plasticity graph structures. Finally, in Section 3.3, we perform a sensitivity training test for the autoencoder architecture on different FEM meshes for the same microstructure.
3.1 Generation of the plasticity graph database

In this work, the autoencoders used for the generation of the graph-based internal variables and the neural network constitutive models used for the forward predictions are trained on data sets generated by FEM elastoplasticity simulations. To test the autoencoders' capacity to generate encoded feature vectors regardless of the microstructure and plastic strain distribution patterns the FEM mesh represents, we test the algorithm with two microstructures of different levels of complexity. The two microstructures A and B are demonstrated in Fig. 3(a) and (b), respectively. The outline of the microstructures is a square with a side of 1 mm. This figure also shows the meshing of the two microstructures. Microstructures A and B are discretized by 250 and 186 triangular elements, respectively, with one integration point each. An investigation of different mesh sizes and the sensitivity of the encoded feature generation is demonstrated in Section 3.3. Each integration point of the mesh corresponds to a node in the equivalent graph (also shown in Fig. 3). The integration points of neighboring elements (elements that share at least one vertex) are connected with an edge in the constructed graph, as described in Section 2.1.
The constitutive model selected for the local behavior at the material points was linear elasticity and J2 plasticity with isotropic hardening. The local behavior is predicted with an energy minimization algorithm introduced in Miehe (2002). The local optimization algorithm is omitted for brevity. The local linear elastic material has a Young's modulus of $E = 2.0799$ MPa and a Poisson ratio of $\nu = 0.3$. The local J2 plasticity has an initial yield stress of 100 kPa and a hardening modulus of $H = 0.1E$. During the simulation, the elastic, plastic, and total strain, as well as the hyperelastic energy functional and stress, are saved for every integration point. The recorded plastic strain tensor along with the initial integration point coordinates will be used as the node weight vector for the plasticity graphs, as described in Section 2.1. The strain, energy, and stress values will be volume averaged and used to train the neural network constitutive models for the homogenized response, as described in Section 2.3.
Fig. 3: Two microstructures (a and b) represented as a finite element mesh and an equivalent node-weighted undirected graph, as described in Section 2.1.

To capture varying patterns of the distribution of plastic strain, the finite element simulations were performed under various combinations of uniaxial and shear loading. The loading was enforced with displacement boundary conditions applied to all the sides of the mesh for both microstructures A and B. The combinations of displacement boundary conditions are sampled by rotating a loading displacement vector from 0° to 90°, whose components are the uniaxial displacements in the two directions for the pure axial displacement cases, and the uniaxial and shear displacements for the combined uniaxial and shear loading. The maximum displacement magnitude for the axial and shear loading vector components is $u_{\text{goal}} = 1.5 \times 10^{-3}$ mm. We sample a total of 100 loading combinations/FEM simulations for each microstructure. During each of these simulations, we record the constitutive response at every material point and post-process it into a node-weighted graph and a volume-averaged response. For every simulation, we record 100 time steps, thus collecting 10,000 training sample pairs of graphs and homogenized responses for each microstructure.
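The sampling of the loading program can be pictured with a short sketch; the vector rotation below is illustrative of the procedure, and the sample count per mode and the driver call are assumptions rather than the exact script.

```python
import numpy as np

u_goal = 1.5e-3                              # maximum displacement magnitude (mm)
angles = np.deg2rad(np.linspace(0.0, 90.0, 50))  # illustrative count per mode

for theta in angles:
    u_1 = u_goal * np.cos(theta)             # axial component, direction 1
    u_2 = u_goal * np.sin(theta)             # axial (or shear) component, direction 2
    # run_fem_simulation(u_1, u_2)           # hypothetical call; 100 recorded steps each
```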
3.2 Training of the graph autoencoder
Fig. 4: Autoencoder reconstruction training loss for microstructures A and B (a and b, respectively), as defined in Eq. (4).
In this section, we demonstrate the training performance of the autoencoder architecture on the two microstructure data sets described in Section 3.1. We also show the capacity of the autoencoder to reproduce the plasticity graphs in the training samples. The autoencoder layer architecture and training procedure for both data sets are described in Section 2.2.1. The dimension of the encoded feature vector in this example is set to $D_{enc} = 16$. An examination of the effect of the encoded feature vector size is described in Section 3.3.
[Figure 5 panels: "Internal Variable Graph Data" and "Internal Variable Graph Reconstruction" plots colored by $\bar{\epsilon}^p$, each with a scatter plot of Predicted vs. Benchmark Node $\bar{\epsilon}^p$.]
Fig. 5: Prediction of the autoencoder architecture for microstructure A for two loading paths (a and b). The graph node size and color represent the magnitude of the accumulated plastic strain $\bar{\epsilon}^p$. The node-wise predictions for $\bar{\epsilon}^p$ are also demonstrated.
The training curves for the autoencoder's reconstruction loss function of Eq. (4) are shown in Fig. 4. The autoencoder appears to have similar loss function performance for both microstructures. The autoencoder performs slightly better for microstructure A, as it is tasked to learn and reproduce patterns for a seemingly simpler microstructure compared to microstructure B. The training loss curves in this figure demonstrate the overall performance of the autoencoder architecture; the encoder and decoder components of the architecture are trained simultaneously. The capacity of the autoencoder to reconstruct the plasticity distribution patterns is explored in Fig. 5 and Fig. 6 for microstructures A and B, respectively. In these figures, we showcase the reconstruction capacity of the plastic strain for two different time steps for each microstructure. The time steps selected are from two different loading path combinations resulting in different plasticity graph patterns. We demonstrate how the autoencoder can reproduce these patterns by comparing the internal variable graph data (the autoencoder input) with the graph reconstruction (the autoencoder output). We also show the accuracy of the node-wise prediction of the accumulated plastic strain for these microstructures. It is noted that the autoencoder predicts the values of the full plastic strain tensor at the nodes. However, these plots show the accumulated plastic strain $\bar{\epsilon}^p$ values calculated from the predicted strain tensor at the nodes for easier visualization.
The autoencoder architecture provides the flexibility of utilizing its two components, the encoder $L_{enc}$ and the decoder $L_{dec}$, separately. In this section, we demonstrate the encoder's ability to compress the high-dimensional graph structure into encoded feature vector $\zeta$ time histories. In Fig. 7(a) & (c) and Fig. 8(a) & (c), we show the predicted encoded feature vector $\zeta_n$ for a single time step of a loading path for microstructures A and B respectively. These encoded feature vectors correspond to the graphs shown in Fig. 5 and Fig. 6 respectively. In Fig. 7(b) & (d) and Fig. 8(b) & (d), we demonstrate the time series of plasticity graphs encoded as time series of encoded feature vectors. It is highlighted that the encoded feature vector values do not change during the elastic/path-independent part of the loading. This is directly attributed to the fact that the plasticity graph is constant (zero plastic strain at the nodes) before yielding for all these time steps, and it will be further discussed in Section 4. The benefit of separately using the decoder $L_{dec}$ as a post-processing step to interpret predicted encoded feature vectors is also explored in the following sections.
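Reusing the hypothetical GraphAutoencoder sketch above, encoding a loading path proceeds graph by graph; in the elastic regime the stored plastic strains are zero, so the resulting rows of the $\zeta$ history repeat. The variable names below are illustrative.

```python
import torch

# Hypothetical usage of the encoder: `path_graphs` is assumed to be a list of
# (x, edge_index) pairs, one graph per recorded time step of a loading path.
model = GraphAutoencoder(n_nodes=250, n_feats=5, d_enc=16)
with torch.no_grad():
    zeta_history = torch.cat([model.encode(x, e) for (x, e) in path_graphs])
# Rows of zeta_history are identical over the pre-yield steps, since the
# plastic strain node features are zero there.
```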
Fig. 6: Predictions of the autoencoder architecture for microstructure B for two loading paths (a and b). The graph node size and color represent the magnitude of the accumulated plastic strain $\bar{\epsilon}^p$. The node-wise predictions for $\bar{\epsilon}^p$ are also demonstrated.
3.3 Mesh sensitivity and encoded feature vector dimension
In this section, we investigate the behavior of the framework for different dimensionalities of the graph data set and different levels of compression of the graph information. In the first experiment, we test the effect of the size of the input plasticity graph that is to be reconstructed by the autoencoder. We generate three data sets from finite element simulations with different mesh sizes for microstructure A. All meshes consist of the same triangular elements described in the previous section. The number of elements in each mesh, and thus of nodes in the corresponding post-processed graphs, is $N = 100$, $N = 250$, and $N = 576$. For the mesh generation, we start with the $N = 100$ mesh and refine once to obtain the $N = 250$ mesh and twice to obtain the $N = 576$ mesh. The refinement was performed automatically using the meshing software library Cubit (Blacker et al., 1994). The data sets for the $N = 100$, $N = 250$, and $N = 576$ node meshes were generated through the same FEM simulation setup and a subset of the combinations of uniaxial and shear loading paths described in Section 3.1, gathering 2500 training samples of graphs.
The results of the training experiment on the mesh sensitivity are demonstrated in Fig. 9. The figure shows the reconstruction loss of Eq. (4) for the three mesh sizes. The autoencoder architecture was identical to the one described in Section 2.2.1, with the only change being the number of input nodes. The reconstruction loss exhibits a minor improvement as the number of nodes in the graph increases. This is attributed to the density of the information available for the autoencoder to learn the patterns from, since there is a higher resolution of adjacent nodes' features. However, an increase in graph size may increase the duration of the training procedure. Since the benefit of increasing the mesh size is not significant in this set of numerical experiments, we opt for the $N = 250$ node mesh for the rest of this work.
In a second numerical experiment, we examine the effect of the size of the encoded feature vector on the capacity of the autoencoder to learn and reconstruct the plasticity distribution patterns. We re-train the autoencoder using the data set of 10000 graphs generated on the $N = 250$ node mesh for microstructure A, as described in Section 3.1. We perform three training experiments selecting different sizes for the encoded feature vector: $D_{enc} = 2$, $D_{enc} = 16$, and $D_{enc} = 32$. The autoencoder architectures are identical to the one described in Section 2.2.1. All the convolutional filters and the Dense layers have the same size. The only Dense layers that are affected are those around the encoded feature vector, whose input and output sizes are modified to accommodate the different sizes of the encoded feature vector.
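In terms of the earlier GraphAutoencoder sketch, this study amounts to a hypothetical driver of the following form:

```python
# Hypothetical driver for the encoded-vector-size study, reusing the
# GraphAutoencoder sketch above: only the layers adjacent to zeta resize.
for d_enc in (2, 16, 32):
    model = GraphAutoencoder(n_nodes=250, n_feats=5, d_enc=d_enc)
    # ... train on the 10000-graph data set and record the reconstruction loss
```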
Fig. 7: Prediction of the encoded feature vector $\zeta$ by the encoder $L_{enc}$ for microstructure A. (a,c) The encoded feature vector $\zeta_n$ for a single time step for the plastic graphs shown in Fig. 5a and Fig. 5b respectively. (b,d) The encoded feature vector $\zeta$ history for all the time steps in the loading paths of Fig. 5a and Fig. 5b respectively.
The training performance for these three training experiments is demonstrated in Fig. 10. Compared to the encoded feature vector sizes $D_{enc} = 16$ and $D_{enc} = 32$, the $D_{enc} = 2$ autoencoder architecture fails to compress the information as effectively, with a loss about two orders of magnitude higher. The maximum compression achieved for this autoencoder architecture setup appears to be two features. This dimensionality appears to be the smallest feasible encoding limit for this particular data set, and more sampling from different loading paths may increase this minimal dimensionality. Jumping from $D_{enc} = 16$ to $D_{enc} = 32$ encoded feature vector components, only a small improvement in the reconstruction capacity is observed. The reconstruction capacity is also illustrated in Fig. 11. The decoder fails to accurately reconstruct the plasticity graph from the $D_{enc} = 2$ encoded feature vector (Fig. 11a). However, for $D_{enc} = 16$ and $D_{enc} = 32$, the decoder accurately reproduces the plasticity patterns (Fig. 11b and c). It is expected that for dimensions larger than $D_{enc} = 32$ the benefit in reconstruction capacity will be minimal. Thus, the encoded feature vector dimension selected for the rest of this work is $D_{enc} = 16$, for which the dimension reduction capacity is considered adequate and computationally efficient.
Remark 2 It should be noted that the relatively small losses observed in Fig. 9 only indicate that the three reconstructions are independently successful, in the sense that the reconstructed graphs are all sufficiently close to the original ones. It is plausible that a mapping can be constructed among the latent spaces of graphs originating from different finite element meshes, as indicated by Fetty et al. (2020) and Asperti and Tonelli (2022). The manipulations of and mappings between latent spaces are out of the scope of this paper but will be investigated in the future.
Fig. 8: Prediction of the encoded feature vector $\zeta$ by the encoder $L_{enc}$ for microstructure B. (a,c) The encoded feature vector $\zeta_n$ for a single time step for the plastic graphs shown in Fig. 6a and Fig. 6b respectively. (b,d) The encoded feature vector $\zeta$ history for all the time steps in the loading paths of Fig. 6a and Fig. 6b respectively.
Fig. 9: Autoencoder reconstruction training loss for microstructure A as defined in Eq. (4), with the size of the encoded feature vector $D_{enc} = 16$ and graph sizes of $N = 100$, $N = 250$, and $N = 576$ nodes.
4 Numerical Example: Multiscale plasticity with graph internal variables
In this section, we demonstrate how the graph internal variables generated by the graph autoencoder are incorporated into the predictions of the macroscale constitutive law, as well as how they can be decoded and used to estimate and interpret the evolution of microstructures within an RVE. In Section 4.1, we demonstrate the training results for the neural network constitutive models described in Section 2.3. In Section 4.2, the predictions of these neural network constitutive models are integrated with a return mapping algorithm to make forward elastoplastic predictions, and we compare them with recurrent neural network architectures from the literature. In Section 4.3, we show the forward prediction capacity of the neural network models on unknown loading paths, and we demonstrate the behavior of the encoded feature vector prediction variables and how they can be translated back to the original graph space with the help of the graph decoder.
Fig. 10: Autoencoder reconstruction training loss for microstructure A as defined in Eq. (4), with the size of the encoded feature vector $D_{enc} = 2$, $D_{enc} = 16$, and $D_{enc} = 32$.
Fig. 11: Comparison of the autoencoder architecture's capacity to reconstruct the accumulated plastic strain pattern of microstructure A with the size of the encoded feature vector (a) $D_{enc} = 2$, (b) $D_{enc} = 16$, and (c) $D_{enc} = 32$. The graph node size and color represent the magnitude of the accumulated plastic strain $\bar{\epsilon}^p$. The node-wise predictions for $\bar{\epsilon}^p$ are also demonstrated.
4.1 Training of constitutive models
Fig. 12: Prediction and training loss of the hyperelastic energy functional $\psi^e$ constitutive models for microstructures (a) A and (b) B.
In this section, we demonstrate all the training experiments for the neural network components that will be used in the return mapping algorithm to make forward elastoplastic predictions, as described in Section 2.3. The training experiments for the hyperelastic energy functional, the yield function, and the kinetic law are performed for both microstructures A and B.
In the first training experiment, we validate the capacity to capture the hyperelastic behavior of the material in different loading directions. Neural network hyperelastic laws have previously been trained to be compatible with mechanics knowledge (e.g. polyconvexity) and thermodynamic constraints (Le et al., 2015; Klein et al., 2022; Vlassis et al., 2022). The feed-forward neural network inputs the elastic strain tensor in Voigt notation in two dimensions $(\epsilon^e_{11}, \epsilon^e_{22}, \epsilon^e_{12})$ and outputs the hyperelastic energy functional $\hat{\psi}^e$. Through differentiation of the network's prediction, the stress and stiffness are also output and are constrained with a higher-order Sobolev norm, formulated similarly to Vlassis and Sun (2021b) and omitted here for brevity. The hyperelastic network was trained and validated for each microstructure on a total of 10000 sample points (8000 used for training and 2000 for validation) that were gathered from the FEM simulations described in the previous section. The hyperelastic constitutive response was recorded in both the elastic and plastic regions. During the plastic response, the elastic increment that corresponds to the true plastic increment is recorded.
The neural network follows a multilayer perceptron architecture. It consists of a hidden Dense layer
(100 neurons/ReLU), followed by one Multiply layer, then another hidden Dense layer (100 neurons/ReLU),
and an output Dense layer (Linear). The Multiply layer was first introduced in (Vlassis and Sun,2021b) to
modify and increase the degree of continuity of the neural network’s hidden layers. It performs a simple
elementwise multiplication of a layer's output with itself. It was shown to increase the smoothness of the learned functions and also to allow for better control and reduction of the higher-order constraints, such as the $H^2$ loss function terms. The layers' kernel weight matrices were initialized with a Glorot uniform distribution and the bias vectors with zeros. The model was trained for 1000 epochs with a batch size of 100 using the Nadam optimizer, set with default values. The training curves and the predictions for all the samples in the data set are demonstrated in Fig. 12, for which similarly good performance was observed for both microstructures.
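A minimal sketch of this energy network follows, assuming the Keras API (consistent with the Dense, Multiply, and Nadam terminology used here); the plain mean squared error loss stands in for the Sobolev-type loss on the stress and stiffness that is omitted above for brevity, and the training arrays are hypothetical.

```python
from tensorflow import keras
from tensorflow.keras import layers

eps_e = keras.Input(shape=(3,))                 # (eps_11, eps_22, eps_12), Voigt
h = layers.Dense(100, activation="relu",
                 kernel_initializer="glorot_uniform",
                 bias_initializer="zeros")(eps_e)
h = layers.Multiply()([h, h])                   # elementwise self-multiplication
h = layers.Dense(100, activation="relu",
                 kernel_initializer="glorot_uniform",
                 bias_initializer="zeros")(h)
psi_e = layers.Dense(1, activation="linear")(h)  # predicted energy
energy_net = keras.Model(eps_e, psi_e)
energy_net.compile(optimizer=keras.optimizers.Nadam(), loss="mse")
# energy_net.fit(eps_train, psi_train, epochs=1000, batch_size=100)  # hypothetical arrays
```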
Similarly, the neural network yield function is trained on a total of 10000 sample points (8000 and 2000 for training and validation respectively) for each microstructure. The formulation and training objective for the feed-forward network are as described in Section 2.3.1. It consists of a hidden Dense layer (100 neurons/ReLU), followed by a Multiply layer, then another hidden Dense layer (100 neurons/ReLU) and another Multiply layer, and the output is fed into a Dense layer with a linear activation function. For all the Dense layers, the kernel weight matrix was initialized with a Glorot uniform distribution and the bias vector with zeros. Each model was trained for 1000 epochs with a batch size of 100 with the Nadam optimizer, set with default values. The Eikonal equation term weight in the loss function Eq. (5) is set to $w = 1$. The training results for the yield function neural networks are showcased in Fig. 13. The training loss curves for the two terms of the training loss function in Eq. (5) are shown in Fig. 13(a), with similar performance for both microstructures. The Eikonal equation loss term aims to ensure that the predicted yield function is a signed distance function, and a predicted instance of the level set is demonstrated in Fig. 13(b).

Fig. 13: (a) Training loss curves for the yield surface and Eikonal equation loss terms (Eq. (5)) for microstructures A and B. (b) An instance of the yield function level set prediction for microstructure A.
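Since Eq. (5) is not repeated here, the following is only a schematic TensorFlow rendering of a two-term objective of this kind; the function and variable names are hypothetical.

```python
import tensorflow as tf

w = 1.0  # Eikonal term weight used in the text

def yield_function_loss(f_net, stress_pts, f_data):
    """Hypothetical two-term loss in the spirit of Eq. (5): a level-set
    misfit plus an Eikonal residual driving |grad f| toward 1 so the
    prediction behaves as a signed distance function."""
    with tf.GradientTape() as tape:
        tape.watch(stress_pts)
        f_pred = f_net(stress_pts)
    grad_f = tape.gradient(f_pred, stress_pts)
    level_set_term = tf.reduce_mean(tf.square(f_pred - f_data))
    eikonal_term = tf.reduce_mean(tf.square(tf.norm(grad_f, axis=-1) - 1.0))
    return level_set_term + w * eikonal_term
```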
We fit the kinetic law neural networks, as described in Section 2.3.2, to predict the encoded feature vectors as a function of the plastic strain tensor time history. The architecture was trained on each data set of 10000 samples of plastic strain and encoded feature vector pairs, split into an 8000-sample training and a 2000-sample validation set. The plastic strain tensors were pre-processed into time history sequences of length $\ell = 4$. The recurrent neural network architecture is based on the Gated Recurrent Unit (GRU) (Chung et al., 2014). The network consists of two GRU hidden layers of 32 units each, with a sigmoid recurrent activation function and a tanh layer activation function. This is followed by two Dense hidden layers (100 neurons and a ReLU activation function) and an output Dense layer (16 neurons and a Linear activation function). The model was trained for 1000 epochs with a batch size of 128 using the Nadam optimizer, set with default values, and the training curves and encoded feature vector component predictions are showcased in Fig. 14.
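A Keras-style sketch of this kinetic-law network, under the same API assumption as above, could read:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sketch of the kinetic-law network: a plastic strain history of length
# ell = 4, with 3 Voigt components per step, mapped to the 16 encoded
# feature vector components.
kinetic_net = keras.Sequential([
    keras.Input(shape=(4, 3)),
    layers.GRU(32, activation="tanh", recurrent_activation="sigmoid",
               return_sequences=True),
    layers.GRU(32, activation="tanh", recurrent_activation="sigmoid"),
    layers.Dense(100, activation="relu"),
    layers.Dense(100, activation="relu"),
    layers.Dense(16, activation="linear"),
])
kinetic_net.compile(optimizer=keras.optimizers.Nadam(), loss="mse")
# kinetic_net.fit(eps_p_histories, zeta_targets, epochs=1000, batch_size=128)  # hypothetical arrays
```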
In the last training experiment, we train the plastic flow neural networks to predict the plastic flow evolution in the elastoplasticity simulations, as described in Section 2.3.3. The architecture was trained on each data set of 10000 samples of encoded feature vector and plastic flow component pairs, split into an 8000-sample training and a 2000-sample validation set. The input encoded feature vector has 16 components. The multilayer perceptron architecture consists of four hidden Dense layers of 100 neurons each and a ReLU activation function. The output of the neural network is a Dense layer with 2 neurons and a Linear activation function.
Fig. 14: (a,d) Training loss curves for the encoded feature vector $\hat{\zeta}$ kinetic law, (b,e) a prediction of the encoded feature vectors along a loading path, and (c,f) prediction of all the encoded feature vector components in the data set for microstructures A and B.
Fig. 15: Training curves for the plastic flow network for microstructures A and B.
The layers’ kernel weight matrix was initialized with a Glorot uniform distribution and the bias vector with
a zero distribution. The model was trained for 1000 epochs with a batch size of 100 using the Nadam op-
timizer. The training curves for the plastic flow networks for the two microstructures are demonstrated in
Fig. 15. As expected, these neural networks are very accurate, since they input the highly descriptive encoded feature vector that represents the entire plastic state distribution at the microscale to predict the homogenized plastic flow behavior.
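For completeness, a Keras-style sketch of this plastic flow network (same API assumption as above; layer sizes follow the text):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sketch of the plastic flow network: the 16-component encoded feature
# vector in, two homogenized plastic flow components out.
flow_net = keras.Sequential(
    [keras.Input(shape=(16,))]
    + [layers.Dense(100, activation="relu",
                    kernel_initializer="glorot_uniform",
                    bias_initializer="zeros") for _ in range(4)]
    + [layers.Dense(2, activation="linear")]
)
flow_net.compile(optimizer=keras.optimizers.Nadam(), loss="mse")
# flow_net.fit(zeta_samples, flow_components, epochs=1000, batch_size=100)  # hypothetical arrays
```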
4.2 Comparison with recurrent neural network architectures
In this section, we test the capacity and robustness of the return mapping elastoplasticity model to make forward path-dependent predictions on unseen loading paths. We compare this capacity with recurrent neural network models from the literature that are commonly used to predict time series data structures and, often, plasticity. We design training experiments for two recurrent architectures, a GRU architecture and a 1D convolutional architecture, and test all models against unseen loading paths of increasing complexity.
Fig. 16: Training loss curves for the RNN architecture, RNN architecture with data augmentation, and the
Conv1D architecture with data augmentation.
The first recurrent network used is based on the GRU layer architecture and is trained on pairs of total strain and total stress time histories. For this 2D data set, the strain state is represented by the strain tensor $\epsilon$ in Voigt notation $(\epsilon_{11}, \epsilon_{22}, \epsilon_{12})$, and the stress state is represented by the two stress invariants $(p, q)$. Specifically, the network inputs a time history of the strains at the $\ell$ previous time steps to predict the current stress state. For the $n$-th time step, the network inputs the pre-processed time history of strain tensors $[\epsilon_{n-\ell}, \ldots, \epsilon_{n-1}, \epsilon_n]$ and outputs the current stress $[p_n, q_n]$. The time history length was chosen to be $\ell = 30$. The recurrent neural network architecture (RNN) consists of a series of two GRU hidden layers (32 units each, a sigmoid recurrent activation function, and a tanh activation function) and a series of two Dense hidden layers (100 neurons and a ReLU activation function) with an output Dense layer (2 neurons and a Linear activation function). The model was trained for 1000 epochs with a batch size of 128 using the Nadam optimizer, set with default values, and a mean squared error loss function.
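The history windowing described above can be sketched with a hypothetical helper of the following form:

```python
import numpy as np

# Hypothetical windowing helper: for step n, the window
# [eps_{n-ell}, ..., eps_n] predicts the stress invariants (p_n, q_n).
def make_windows(eps, pq, ell=30):
    """eps: (T, 3) Voigt strain history; pq: (T, 2) stress invariants."""
    X = np.stack([eps[n - ell:n + 1] for n in range(ell, len(eps))])
    y = pq[ell:]
    return X, y
```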
The second recurrent architecture used is based on the 1D convolutional layer architecture, which uses a one-dimensional variation of the convolutional filter (LeCun et al., 1995; Oord et al., 2016) to extract information from time series of fixed length.
Fig. 17: (a,b) Prediction of the RNN architecture on a monotonic loading curve. (c,d) Prediction of the RNN
architecture for a loading curve with unseen random unloading and reloading paths.
Fig. 18: Prediction of the data augmented (a,b) RNN architecture, (c,d) Conv1D architecture, and (e,f) the
neural network-based return mapping algorithm for loading curves with unseen random unloading and
reloading paths.
The plasticity data time series are pre-processed and input in the same way as for the previously described RNN architecture, with a time history length $\ell = 30$. The 1D convolutional neural network architecture (Conv1D) consists of a series of three 1D convolutional layers with 32, 64, and 128 filters respectively, with a kernel size of 4 and ReLU activation functions. The output of the convolutional filters is flattened and fed into two Dense layers (100 neurons and ReLU activation functions) and an output Dense layer (2 neurons and a Linear activation). The model was trained for 100 epochs with a batch size of 128, the Nadam optimizer, and a mean squared error loss function.
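A Keras-style sketch of this baseline (same API assumption; the window length $\ell + 1 = 31$ follows from the history definition above):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sketch of the Conv1D baseline: filter counts and kernel size follow the
# text; the window holds 31 strain states of 3 Voigt components each.
conv1d_net = keras.Sequential([
    keras.Input(shape=(31, 3)),
    layers.Conv1D(32, kernel_size=4, activation="relu"),
    layers.Conv1D(64, kernel_size=4, activation="relu"),
    layers.Conv1D(128, kernel_size=4, activation="relu"),
    layers.Flatten(),
    layers.Dense(100, activation="relu"),
    layers.Dense(100, activation="relu"),
    layers.Dense(2, activation="linear"),   # (p, q)
])
conv1d_net.compile(optimizer=keras.optimizers.Nadam(), loss="mse")
```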
In a first training experiment, before comparing with the neural network return mapping model, we train the RNN architecture with the same data set for microstructure A that the return mapping constitutive model was trained on, as described in Section 3.1. It is noted that all the loading paths in that data set were monotonic. The 10000 samples of strain-stress constitutive responses were pre-processed into 10000 strain time history samples and their corresponding stress responses; 8000 were used for training and 2000 for validation. The training loss curve for the RNN architecture is shown in Fig. 16. To test the capacity of the architecture to make forward predictions, we first test the RNN on unseen monotonic loading paths. The results can be seen in Fig. 17 (a & b), where the RNN architecture can robustly predict monotonic loading patterns that resemble the ones used for training. However, when we introduce several random unloading and reloading paths (Fig. 17 (c & d)), the RNN architecture cannot make accurate predictions, which is expected, as it has not seen these types of patterns in the training data set.
Thus, we design another training experiment for the recurrent architectures to compare more fairly with the return mapping algorithm. We augment the training data set for the RNN and Conv1D architectures by introducing random elastic loading and unloading paths of random lengths to the previously monotonic data set. The data set now includes 32400 samples of strain and stress responses: the previous 10000 samples and 22400 samples generated through the augmentation procedure. Of these, 25920 samples are used for training and 6480 for validation.
The training curves for the Augmented RNN and Conv1D architectures are shown in Fig. 16.
We can now compare these augmented architectures with the neural network return mapping algorithm. It is noted that all the return mapping algorithm models were still trained on the 10000-sample data set, as described in Section 4.1. The results of the comparison experiment for unseen loading paths are demonstrated in Fig. 18. In Fig. 18 (a, b, c, d), the augmented RNN and Conv1D architectures can now recognize unloading and reloading behaviors qualitatively better. While the predictions for several unseen unloading-reloading paths are more accurate than for others, the recurrent architectures are not robust in distinguishing the elastic from the plastic region. In some elastic paths, we observe path-dependent phenomena that are potentially attributed to the networks not having seen an elastic region at the specific strain states. The return mapping algorithm model is observed to be both accurate and robust in these predictions (Fig. 18 (e & f)), even though it was trained on about a third of the data used for the recurrent architectures. This is achieved by decoupling the elastic and plastic behavior with the help of the yield function neural network, allowing the model to recognize when the elastic and when the elastoplastic behaviors should be used for the predictions.
Another important point is that the black-box recurrent neural networks do not allow access to other plasticity metrics during the forward prediction simulations. Unless another network is trained to predict metrics such as the accumulated plastic strain or other descriptors that evolve in the plastic region, we can neither monitor nor estimate how the microstructure evolves upon yielding. In the following section, we demonstrate how the encoded feature vectors can be interpreted with the help of the graph decoder to gain a complete understanding of the predicted plastic state distribution within the RVE.
4.3 Interpretable multiscale plasticity of complex microstructures
Fig. 19: (a,b,c) Prediction of the stress invariants $p$, $q$, and the accumulated plastic strain $\bar{\epsilon}^p$ using the return mapping algorithm for a monotonic loading of microstructure A. (d,e) Prediction of all the encoded feature vector $\zeta$ components and the corresponding decoded internal variable graph for a monotonic loading of microstructure A.
Fig. 20: (a,b,c) Prediction of the stress invariants $p$, $q$, and the accumulated plastic strain $\bar{\epsilon}^p$ using the return mapping algorithm for a monotonic loading of microstructure B. (d,e) Prediction of all the encoded feature vector $\zeta$ components and the corresponding decoded internal variable graph for a monotonic loading of microstructure B.
Fig. 21: Prediction of the plastic flow components for two loading cases for microstructures A and B (a & b and c & d respectively). Both the plastic flow predicted via the yield function stress gradient $\partial \hat{f} / \partial \sigma_A$ and the non-associative plastic flow $\partial \hat{\vartheta} / \partial \sigma_A$ predictions are shown.
In this last section, we demonstrate the capacity of the models to make forward predictions for unseen loading paths and interpret them at the microstructure level. The return mapping algorithm predicts not only the strain-stress response of the material but also the plastic strain response and the encoded feature vector variables. These can then be decoded by the graph decoder $L_{dec}$ to interpret the microstructures' elastoplastic behavior. We provide tests of unseen loading path simulations for both microstructures A and B. The training of the constitutive models used to make the forward predictions is described in Section 4.1, and the decoder used for each microstructure is described in Section 3.2.
We first test the models' capacity to make predictions of the plastic state on monotonic data. We demonstrate the results for the predicted stress state in Fig. 19 (a & b) and Fig. 20 (a & b) for microstructures A and B respectively. We also record the homogenized plastic strain tensor of the microstructures.
Fig. 22: Prediction of the deviatoric stress $q$, the accumulated plastic strain $\bar{\epsilon}^p$, and the encoded feature vector $\zeta$ components using the return mapping algorithm for a cyclic loading of microstructure A.
Fig. 23: Prediction of the deviatoric stress $q$, the accumulated plastic strain $\bar{\epsilon}^p$, and the encoded feature vector $\zeta$ components using the return mapping algorithm for a cyclic loading of microstructure B.
For simplicity, we demonstrate the predicted accumulated plastic strain measure $\bar{\epsilon}^p$ in Fig. 19(c) and Fig. 20(c). Using the trained kinetic law neural network, we can make forward predictions of the encoded feature vectors $\hat{\zeta}$ that are consistent with the current predicted homogenized plastic state. The results of these predictions are shown in Fig. 19(d) and Fig. 20(d). These predicted curves are a close match to the benchmark data and closely capture the behavior at the macroscale. We can now interpret this homogenized behavior as the corresponding one at the microscale. Using the trained decoder for each microstructure, we can recover the plastic strain distributions, as shown in Fig. 19(e) and Fig. 20(e) for microstructures A and B respectively. It is noted that, while only the node-wise prediction of the accumulated plastic strain $\bar{\epsilon}^p$ is shown for simplicity of presentation, the decoder recovers the entire plastic strain tensor $\epsilon^p$. The node-wise predictions of the plastic strain are accurate, and the decoder can qualitatively capture the general plastic distribution patterns and the plastic strain localization nodes in the microstructure.
We also demonstrate the capacity of the model to predict the plastic flow with the help of the encoded feature vector internal variables. In Fig. 21, we compare the plastic flow components computed using the neural network yield function stress gradient $\partial \hat{f} / \partial \sigma_A$ and the plastic potential stress gradient $\partial \hat{\vartheta} / \partial \sigma_A$ predicted by the plastic flow network $\hat{g}$ against the benchmark simulations. The results demonstrated are for two blind prediction curves for each microstructure: Fig. 21(a) corresponds to Fig. 19 and Fig. 21(c) corresponds to Fig. 20. The accuracy of the $\hat{g}$ network predictions of the flow is higher than that of the yield function stress gradient. This is attributed to the decoupling of the yielding and hardening from the plastic flow directions, allowing more flexibility for the neural networks to fit these complex laws. The network $\hat{g}$ also utilizes the highly descriptive encoded feature vector $\hat{\zeta}$ input, which allows for more refined control of the plastic flow than the volume-averaged accumulated plastic strain metric used in the yield function formulation.
Finally, we conduct a similar blind test experiment but with added blind unloading and reloading
elastic paths in the loading strains. The results for microstructures A and B are demonstrated in Fig. 22
and Fig. 23 respectively. As discussed in Section 4.2, the model does not have any difficulty recognizing
the elastic and plastic regions of the loading path, since the behaviors are distinguished with the help of the neural network yield function. This also constrains the evolution of the plastic strain and the encoded feature vector to occur only during plastic loading. Since the kinetic law neural network is a feed-forward architecture, there is no history dependence, and no change in the plastic strain corresponds to no change in the encoded feature vector. The decoder architecture is also path-independent, so no change in the encoded feature vector corresponds to no change in the respective decoded plastic graph. This is also achieved by the specific way the encoded feature vectors are constructed. The input node features in the autoencoder are the mesh node coordinates and the plastic strains, as described in Section 2.1. This specific design ensures that the plastic graph does not evolve during elastic unloading/reloading and prevents any artificial memory effect in the elastic regime. The lack of a memory effect in the elastic regime is necessary for the encoded feature vectors to serve as internal variables for rate-independent plasticity models, where the history-dependent effect is only triggered once the yield criterion is met. This would not be the case if other integration point data, such as the total strain or stress, were incorporated into the graph encoder.
Note that this switch between path-independent and path-dependent behaviors may also have im-
plications for other neural network constitutive laws. In particular, if a black-box recurrent neural net-
work is used to forecast history-dependent stress-strain responses, then one must ensure that the history-
dependent effect is not manifested in the elastic region. For instance, if the LSTM architecture is used, then
one must ensure that the forget gate is trained to turn on to filter out any potential artificial influence of
the strain history.
5 Conclusion and future perspectives
The macroscopic inelastic behaviors of materials are manifestations of the evolution of microstructures. As such, the major challenges for the macroscopic models used in engineering practice are not just establishing sophisticated physics-informed predictions between input and output, but also exploring an efficient and economical way to represent the evolution of microstructures in a lower-dimensional space where it is easier to establish models. In this work, we introduce a graph autoencoder framework that deduces internal variables that can effectively represent graph data obtained from representative elementary volumes. The encoder component of the autoencoder architecture maps data stored in a weighted graph onto a vector space spanned by low-dimensional descriptors that represent the key spatial features governing the macroscopic responses, while the decoder component of the architecture projects these encoded feature vector descriptors back onto the graph space for a robust interpretation of the material's behavior at the microscopic scale.
By establishing this multiscale connection through the graph autoencoder, we introduce a new internal variable-driven plasticity framework in which the long-lasting issue of the lack of interpretability of internal variables can be resolved. A particularly important aspect of this framework is that it allows us to bypass the process of hand-picking descriptors or geometric measures to establish theories for plastic flow. This flexibility afforded by the graph autoencoder identifies a set of internal variables that are tailored to the specific way the microstructures evolve, without hand-crafting physical measures or descriptors of the mechanisms. Note that, while the graphs that store the data may be high-dimensional, the macroscopic surrogate model is built in a low-dimensional space. Introducing a higher-dimensional parametric space for the yield function and the plastic flow (by, for instance, assuming no material symmetry or incorporating higher-order kinematics) may potentially lead to an even more accurate and precise model, but at the expense of increased difficulty in training and validation. Another interesting aspect is to study the robustness of the proposed framework against different types of noise stored in different data structures. In this work, we have not focused on mechanisms to generalize the learned model against noise, as the graph data are obtained through direct numerical simulations. Exploring different options to properly de-noise different forms of stored data, such as point sets (e.g. sensors distributed in three-dimensional objects), graphs (e.g. finite element solutions or network data), and manifolds, using autoencoders or other techniques, is another important topic we have not investigated. Research in these directions is currently in progress.
6 Acknowledgments
The authors are supported by the National Science Foundation under grant contracts CMMI-1846875
and OAC-1940203, and the Dynamic Materials and Interactions Program from the Air Force Office of Sci-
entific Research under grant contracts FA9550-19-1-0318, FA9550-21-1-0391 and FA9550-21-1-0027, with
additional support provided to WCS by the Department of Energy grant DE-NA0003962. This support is gratefully acknowledged. The views and conclusions contained in this document are those of the authors,
and should not be interpreted as representing the official policies, either expressed or implied, of the spon-
sors, including the Army Research Laboratory or the U.S. Government. The U.S. Government is authorized
to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation
herein.
7 Data availability
The data that support the findings of this study are available from the corresponding author upon
request.
References
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S Corrado,
Andy Davis, Jeffrey Dean, Matthieu Devin, et al. Tensorflow: Large-scale machine learning on heteroge-
neous distributed systems. arXiv preprint arXiv:1603.04467, 2016.
Andrea Asperti and Valerio Tonelli. Comparing the latent space of generative models. Neural Computing
and Applications, pages 1–18, 2022.
Sören Auer, Viktor Kovtun, Manuel Prinz, Anna Kasprzik, Markus Stocker, and Maria Esther Vidal. Towards a knowledge graph for science. In Proceedings of the 8th International Conference on Web Intelligence, Mining and Semantics. ACM, June 2018. doi: 10.1145/3227609.3227689. URL https://doi.org/10.1145/3227609.3227689.
Bahador Bahmani and WaiChing Sun. Training multi-objective/multi-task collocation physics-informed
neural network with student/teachers transfer learnings. arXiv preprint arXiv:2107.11496, 2021.
Coryn AL Bailer-Jones, David JC MacKay, and Philip J Withers. A recurrent neural network for modelling
dynamical systems. network: computation in neural systems, 9(4):531, 1998.
Ted D Blacker, William J Bohnhoff, and Tony L Edwards. Cubit mesh generation environment. volume 1:
Users manual. Technical report, Sandia National Labs., Albuquerque, NM (United States), 1994.
Wyatt Bridgman, Xiaoxuan Zhang, Greg Teichert, Mohammad Khalil, Krishna Garikipati, and Reese Jones.
A heteroencoder architecture for prediction of failure locations in porous metals using variational infer-
ence. arXiv preprint arXiv:2202.00078, 2022.
Pietro Carrara, Laura De Lorenzis, Laurent Stainier, and Michael Ortiz. Data-driven fracture mechanics.
Computer Methods in Applied Mechanics and Engineering, 372:113390, 2020.
Min Chen, Xiaobo Shi, Yin Zhang, Di Wu, and Mohsen Guizani. Deep feature learning for medical image
analysis with convolutional autoencoder neural network. IEEE Transactions on Big Data, 7(4):750–758,
2017.
François Chollet et al. Keras. https://keras.io, 2015.
Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. Empirical evaluation of gated
recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555, 2014.
John D Clayton. Nonlinear mechanics of crystals, volume 177. Springer Science & Business Media, 2010.
Stephen C Cowin. The relationship between the elasticity tensor and the fabric tensor. Mechanics of materi-
als, 4(2):137–147, 1985.
Yannis F Dafalias and Majid T Manzari. Simple plasticity sand model accounting for fabric change effects.
Journal of Engineering mechanics, 130(6):622–634, 2004.
YF Dafalias and EP Popov. Plastic internal variables formalism of cyclic plasticity. 1976.
Edna Daitz. The picture theory of meaning. Mind, 62(246):184–201, 1953.
Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. Convolutional neural networks on graphs with fast localized spectral filtering. Advances in neural information processing systems, 29, 2016.
David DeMers and Garrison Cottrell. Non-linear dimensionality reduction. Advances in neural information
processing systems, 5, 1992.
Robert Eggersmann, Trenton Kirchdoerfer, Stefanie Reese, Laurent Stainier, and Michael Ortiz. Model-free
data-driven inelasticity. Computer Methods in Applied Mechanics and Engineering, 350:81–99, 2019.
Lukas Fetty, Mikael Bylund, Peter Kuess, Gerd Heilemann, Tufve Nyholm, Dietmar Georg, and Tommy Löfstedt. Latent space manipulation for high-resolution medical image synthesis via the stylegan. Zeitschrift für Medizinische Physik, 30(4):305–314, 2020.
Matthias Fey and Jan E. Lenssen. Fast graph representation learning with PyTorch Geometric. In ICLR
Workshop on Representation Learning on Graphs and Manifolds, 2019.
Jacob Fish, Mohan A Nuggehally, Mark S Shephard, Catalin R Picu, Santiago Badia, Michael L Parks, and
Max Gunzburger. Concurrent atc coupling based on a blend of the continuum stress and the atomistic
force. Computer methods in applied mechanics and engineering, 196(45-48):4548–4560, 2007.
Ari Frankel, Craig M Hamel, Dan Bolintineanu, Kevin Long, and Sharlotte Kramer. Machine learning
constitutive models of elastomeric foams. Computer Methods in Applied Mechanics and Engineering, 391:
114492, 2022.
Jan N Fuhg, Michele Marino, and Nikolaos Bouklas. Local approximate gaussian process regression for
data-driven constitutive models: development and comparison with neural networks. Computer Methods
in Applied Mechanics and Engineering, 388:114217, 2022.
Ken-ichi Funahashi and Yuichi Nakamura. Approximation of dynamical systems by continuous time re-
current neural networks. Neural networks, 6(6):801–806, 1993.
Marc GD Geers, Varvara G Kouznetsova, and WAM1402 Brekelmans. Multi-scale computational homoge-
nization: Trends and challenges. Journal of computational and applied mathematics, 234(7):2175–2182, 2010.
Jamshid Ghaboussi, JH Garrett Jr, and Xiping Wu. Knowledge-based modeling of material behavior with
neural networks. Journal of engineering mechanics, 117(1):132–153, 1991.
James Griffin. Wittgenstein’s logical atomism. 1964.
Arthur L Gurson. Continuum theory of ductile rupture by void nucleation and growth: Part i—yield
criteria and flow rules for porous ductile media. 1977.
Will Hamilton, Zhitao Ying, and Jure Leskovec. Inductive representation learning on large graphs. Ad-
vances in neural information processing systems, 30, 2017a.
William L Hamilton, Rex Ying, and Jure Leskovec. Representation learning on graphs: Methods and appli-
cations. arXiv preprint arXiv:1709.05584, 2017b.
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-
level performance on imagenet classification. In Proceedings of the IEEE international conference on computer
vision, pages 1026–1034, 2015.
Qizhi He and Jiun-Shyan Chen. A physics-constrained data-driven approach based on locally convex
reconstruction for noisy database. Computer Methods in Applied Mechanics and Engineering, 363:112791,
2020.
Xiaolong He, Qizhi He, and Jiun-Shyan Chen. Deep autoencoders for physics-constrained data-driven
nonlinear materials modeling. Computer Methods in Applied Mechanics and Engineering, 385:114034, 2021.
Xiaolong He, Karan Taneja, Jiun-Shyan Chen, Chung-Hao Lee, John Hodgson, Vadim Malis, Usha Sinha,
and Shantanu Sinha. Multiscale modeling of passive material influences on deformation and force out-
put of skeletal muscles. International Journal for Numerical Methods in Biomedical Engineering, 38(4):e3571,
2022.
Geoffrey E Hinton and Ruslan R Salakhutdinov. Reducing the dimensionality of data with neural net-
works. science, 313(5786):504–507, 2006.
Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal
approximators. Neural networks, 2(5):359–366, 1989.
K Karapiperis, L Stainier, M Ortiz, and JE Andrade. Data-driven multiscale modeling in mechanics. Journal
of the Mechanics and Physics of Solids, 147:104239, 2021.
Joseph Kestin and James R Rice. Paradoxes in the application of thermodynamics to strained solids. Citeseer,
1969.
Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint
arXiv:1412.6980, 2014.
Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. arXiv
preprint arXiv:1609.02907, 2016a.
Thomas N Kipf and Max Welling. Variational graph auto-encoders. arXiv preprint arXiv:1611.07308, 2016b.
Dominik K Klein, Mauricio Fernández, Robert J Martin, Patrizio Neff, and Oliver Weeger. Polyconvex anisotropic hyperelasticity with neural networks. Journal of the Mechanics and Physics of Solids, 159:104703, 2022.
Matthew R Kuhn, WaiChing Sun, and Qi Wang. Stress-induced anisotropy in granular materials: fabric,
stiffness, and permeability. Acta Geotechnica, 10(4):399–419, 2015.
BA Le, Julien Yvonnet, and Q-C He. Computational homogenization of nonlinear elastic materials using
neural networks. International Journal for Numerical Methods in Engineering, 104(12):1061–1084, 2015.
Yann LeCun, Yoshua Bengio, et al. Convolutional networks for images, speech, and time series. The
handbook of brain theory and neural networks, 3361(10):1995, 1995.
M Lefik, DP Boso, and BA Schrefler. Artificial neural networks in numerical modelling of composites.
Computer Methods in Applied Mechanics and Engineering, 198(21-26):1785–1804, 2009.
AA Leman and Boris Weisfeiler. A reduction of a graph to a canonical form and an algebra arising during
this reduction. Nauchno-Technicheskaya Informatsiya, 2(9):12–16, 1968.
Yujia Li, Daniel Tarlow, Marc Brockschmidt, and Richard Zemel. Gated graph sequence neural networks.
arXiv preprint arXiv:1511.05493, 2015.
Yang Liu, WaiChing Sun, and Jacob Fish. Determining material parameters for critical state plasticity
models based on multilevel extended digital database. Journal of Applied Mechanics, 83(1), 2016a.
Yang Liu, WaiChing Sun, Zifeng Yuan, and Jacob Fish. A nonlocal multiscale discrete-continuum model
for predicting mechanical behavior of granular materials. International Journal for Numerical Methods in
Engineering, 106(2):129–160, 2016b.
Kin Gwn Lore, Adedotun Akintayo, and Soumik Sarkar. Llnet: A deep autoencoder approach to natural
low-light image enhancement. Pattern Recognition, 61:650–662, 2017.
Ran Ma and WaiChing Sun. Computational thermomechanics for crystalline rock. part ii: Chemo-damage-
plasticity and healing in strongly anisotropic polycrystals. Computer Methods in Applied Mechanics and
Engineering, 369:113184, 2020.
Ran Ma and WaiChing Sun. A finite micro-rotation material point method for micropolar solid and fluid
dynamics with three-dimensional evolving contacts and free surfaces. Computer Methods in Applied Me-
chanics and Engineering, 391:114540, 2022.
Ran Ma, WaiChing Sun, and Catalin R Picu. Atomistic-model informed pressure-sensitive crystal plasticity
for crystalline hmx. International Journal of Solids and Structures, 232:111170, 2021.
Dougal Maclaurin, David Duvenaud, and Ryan P Adams. Autograd: Effortless gradients in numpy. In
ICML 2015 AutoML workshop, volume 238, 2015.
Christian Miehe. Strain-driven homogenization of inelastic microstructures and composites based on an
incremental variational formulation. International Journal for numerical methods in engineering, 55(11):1285–
1322, 2002.
Alejandro Mota, WaiChing Sun, Jakob T Ostien, James W Foulk, and Kevin N Long. Lie-group interpola-
tion and variational recovery for internal variables. Computational Mechanics, 52(6):1281–1299, 2013.
M Mozaffar, R Bostanabad, W Chen, K Ehmann, Jian Cao, and MA Bessa. Deep learning predicts path-
dependent plasticity. Proceedings of the National Academy of Sciences, 116(52):26414–26420, 2019.
SeonHong Na, Eric C Bryant, and WaiChing Sun. A configurational force for adaptive re-meshing of
gradient-enhanced poromechanics problems with history-dependent variables. Computer Methods in
Applied Mechanics and Engineering, 357:112572, 2019.
Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal
Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. Wavenet: A generative model for raw audio.
arXiv preprint arXiv:1609.03499, 2016.
Shirui Pan, Ruiqi Hu, Guodong Long, Jing Jiang, Lina Yao, and Chengqi Zhang. Adversarially regularized
graph autoencoder for graph embedding. arXiv preprint arXiv:1802.04407, 2018.
M Pastor, OC Zienkiewicz, and AHC0702 Chan. Generalized plasticity and the modelling of soil behaviour.
International Journal for numerical and analytical methods in geomechanics, 14(3):151–190, 1990.
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. Pytorch: An imperative style, high-performance deep learning library. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 8024–8035. Curran Associates, Inc., 2019. URL http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf.
James R Rice. Inelastic constitutive relations for solids: an internal-variable theory and its application to
metal plasticity. Journal of the Mechanics and Physics of Solids, 19(6):433–455, 1971.
Clarence W Rowley and Jerrold E Marsden. Reconstruction equations and the Karhunen–Loève expansion for systems with symmetry. Physica D: Nonlinear Phenomena, 142(1-2):1–19, 2000.
Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. The graph
neural network model. IEEE transactions on neural networks, 20(1):61–80, 2008.
WaiChing Sun. A unified method to predict diffuse and localized instabilities in sands. Geomechanics and
Geoengineering, 8(2):65–75, 2013.
WaiChing Sun and Alejandro Mota. A multiscale overlapped coupling formulation for large-deformation
strain localization. Computational Mechanics, 54(3):803–820, 2014.
WaiChing Sun and Teng-fong Wong. Prediction of permeability and formation factor of sandstone with
hybrid lattice boltzmann/finite element simulation on microtomographic images. International Journal of
Rock Mechanics and Mining Sciences, 106:269–277, 2018.
WaiChing Sun, Zhijun Cai, and Jinhyun Choo. Mixed arlequin method for multiscale poromechanics prob-
lems. International Journal for Numerical Methods in Engineering, 111(7):624–659, 2017.
Xiao Sun, Bahador Bahmani, Nikolaos N Vlassis, WaiChing Sun, and Yanxun Xu. Data-driven discovery
of interpretable causal relations for deep learning material laws with uncertainty propagation. Granular
Matter, 24(1):1–32, 2022.
Michael D Uchic, Dennis M Dimiduk, Jeffrey N Florando, and William D Nix. Sample dimensions influence
strength and crystal plasticity. Science, 305(5686):986–989, 2004.
Pascal Vincent, Hugo Larochelle, Yoshua Bengio, and Pierre-Antoine Manzagol. Extracting and composing
robust features with denoising autoencoders. In Proceedings of the 25th international conference on Machine
learning, pages 1096–1103, 2008.
Nikolaos N. Vlassis and WaiChing Sun. Component-Based Machine Learning Paradigm for Discovering Rate-Dependent and Pressure-Sensitive Level-Set Plasticity Models. Journal of Applied Mechanics, 89(2):021003, 11 2021a. ISSN 0021-8936. doi: 10.1115/1.4052684. URL https://doi.org/10.1115/1.4052684.
Nikolaos N Vlassis and WaiChing Sun. Sobolev training of thermodynamic-informed neural networks for
interpretable elasto-plasticity models with level set hardening. Computer Methods in Applied Mechanics
and Engineering, 377:113695, 2021b.
Nikolaos N Vlassis and WaiChing Sun. Component-based machine learning paradigm for discovering
rate-dependent and pressure-sensitive level-set plasticity models. Journal of Applied Mechanics, 89(2),
2022.
Nikolaos N Vlassis, Ran Ma, and WaiChing Sun. Geometric deep learning for computational mechanics
part i: Anisotropic hyperelasticity. Computer Methods in Applied Mechanics and Engineering, 371:113299,
2020.
Nikolaos N Vlassis, Puhan Zhao, Ran Ma, Tommy Sewell, and WaiChing Sun. Molecular dynamics inferred
transfer learning models for finite-strain hyperelasticity of monoclinic crystals: Sobolev training and
validations against physical constraints. International Journal for Numerical Methods in Engineering, 2022.
Chun Wang, Shirui Pan, Guodong Long, Xingquan Zhu, and Jing Jiang. Mgae: Marginalized graph au-
toencoder for graph clustering. In Proceedings of the 2017 ACM on Conference on Information and Knowledge
Management, pages 889–898, 2017.
Kun Wang and WaiChing Sun. A semi-implicit discrete-continuum coupling method for porous media
based on the effective stress principle at finite strain. Computer Methods in Applied Mechanics and Engi-
neering, 304:546–583, 2016.
Kun Wang and WaiChing Sun. A multiscale multi-permeability poroplasticity model linked by recursive
homogenizations and deep learning. Computer Methods in Applied Mechanics and Engineering, 334:337–
380, 2018.
Kun Wang, WaiChing Sun, Simon Salager, SeonHong Na, and Ghonwa Khaddour. Identifying material pa-
rameters for a micro-polar plasticity model via x-ray micro-computed tomographic (ct) images: lessons
learned from the curve-fitting exercises. International Journal for Multiscale Computational Engineering, 14
(4), 2016.
Kun Wang, WaiChing Sun, and Qiang Du. A cooperative game for automated learning of elasto-plasticity
knowledge graphs and models with ai-guided experimentation. Computational Mechanics, 64(2):467–499,
2019.
Kun Wang, WaiChing Sun, and Qiang Du. A non-cooperative meta-modeling game for automated third-
party calibrating, validating and falsifying constitutive laws with parallelized adversarial attacks. Com-
puter Methods in Applied Mechanics and Engineering, 373:113514, 2021.
Jiayang Xu and Karthik Duraisamy. Multi-level convolutional autoencoder networks for parametric pre-
diction of spatio-temporal dynamics. Computer Methods in Applied Mechanics and Engineering, 372:113379,
2020.
Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks?
arXiv preprint arXiv:1810.00826, 2018.
Julien Yvonnet and Q-C He. The reduced model multiscale method (r3m) for the non-linear homogeniza-
tion of hyperelastic media at finite strains. Journal of Computational Physics, 223(1):341–368, 2007.
Kun Zeng, Jun Yu, Ruxin Wang, Cuihua Li, and Dacheng Tao. Coupled deep autoencoder for single image
super-resolution. IEEE transactions on cybernetics, 47(1):27–37, 2015.
Muhan Zhang, Zhicheng Cui, Marion Neumann, and Yixin Chen. An end-to-end deep learning architec-
ture for graph classification. In Thirty-second AAAI conference on artificial intelligence, 2018.
Xinran Zhong and WaiChing Sun. An adaptive reduced-dimensional discrete element model for dynamic
responses of granular materials with high-frequency noises. International Journal for Multiscale Computa-
tional Engineering, 16(4), 2018.
Xinran Zhong, WaiChing Sun, and Ying Dai. A reduced-dimensional explicit discrete element solver for
simulating granular mixing problems. Granular Matter, 23(1):1–13, 2021.
This paper introduces an explicit material point method designed specifically for simulating the micropolar continuum dynamics in the finite deformation and finite microrotation regime. The material point method enables us to simulate large deformation problems while circumventing the potential mesh distortion without remeshing. To eliminate rotational motion damping and loss of angular momentum during the projection, we introduce the mapping for microinertia and angular momentum between particles and grids through the affine particle-in-cell approach. The microrotation and the curvature at each particle are updated through zero-order forward integration of the microgyration and its spatial gradient. We show that the microinertia and the angular momentum are conserved during the projections between particles and grids in our formulation. We verify the formulation and implementation by comparing with the analytical dispersion relation of micropolar waves under the small strain and small microrotation, as well as the analytical soliton solution for solids undergoing large deformation and large microrotation. We also demonstrate the capacity of the proposed computational framework to handle a wide spectrum of simulations that exhibit size effects in the geometrical nonlinear regime through three representative numerical examples, i.e., a cantilever beam torsion problem, a fragment-impact penetration problem, and a micropolar fluid discharging problem.
Article
Full-text available
Conventionally, neural network constitutive laws for path-dependent elasto-plastic solids are trained via supervised learning performed on recurrent neural networks, with the time history of strain as input and the stress as input. However, training a neural network to replicate path-dependent constitutive responses require significantly more amount of data due to path dependence. This demand on diverse and abundance of accurate data, as well as the lack of interpretability to guide the data generation process, could become major roadblocks for engineering applications. In this work, we attempt to simplify these training processes and improve the interpretability of the trained models by breaking down the training of material models into multiple supervised machine learning programs for elasticity, initial yielding, and hardening laws that can be conducted sequentially. To predict pressure-sensitivity and rate dependence of the plastic responses, we reformulate the Hamliton-Jacobi equation such that the yield function is parametrized in product space spanned by the principle stress, the accumulated plastic strain, and time. To test the versatility of the neural network meta-modeling framework, we conduct multiple numerical experiments where neural networks are trained and validated against (1) data generated from known benchmark models, (2) data obtained from physical experiments, and (3) data inferred from homogenizing sub-scale direct numerical simulations of microstructures. The neural network model is also incorporated into an offline FFT-FEM model to improve the efficiency of the multiscale calculations.
Article
Full-text available
Cyclotetramethylene-Tetranitramine (HMX) is a secondary explosive used in military and civilian applications. Its plastic deformation is of importance in the initiation of the decomposition reaction, but the details of plasticity are not yet fully understood. It has been recently shown that both the elastic constants and the critical resolved shear stress for plastic deformation are pressure sensitive. Since initiation takes place during shock loading, the pressure sensitivity of plasticity is highly relevant. In this work, we examine the pressure-sensitivity of the dynamic mechanical behavior of HMX. To this end, we use an elastic-plastic continuum constitutive model of single crystal HMX in which the anisotropic elastic constants and direction-dependent yield stress are rendered pressure-sensitive. The pressure sensitivity is calibrated based on input from molecular models. We observe that accounting for pressure sensitivity changes significantly the profile of the elastic-plastic wave and the wave propagation speed upon impact. The accumulated dissipation profile and the total dissipation also exhibit profound differences between the simulations that take account of the pressure-dependence of the plastic deformation and the pressure independent counterpart. 2 Ran Ma et al.
Article
Full-text available
We introduce a deep learning framework designed to train smoothed elastoplasticity models with interpretable components, such as stored elastic energy function, field surface, and plastic flow that may evolve based on a set of deep neural network predictions. By recasting the yield function as an evolving level set, we introduce a deep learning approach to deduce the solutions of the Hamilton-Jacobi equation that governs the hardening/softening mechanism. This machine learning hardening law may recover any classical hand-crafted hardening rules and discover new mechanisms that are either unbeknownst or difficult to express with mathematical expressions. Leveraging Sobolev training to gain control over the derivatives of the learned functions, the resultant machine learning elastoplasticity models are thermody-namically consistent, interpretable, while exhibiting excellent learning capacity. Using a 3D FFT solver to create a polycrystal database, numerical experiments are conducted and the implementations of each component of the models are individually verified. Our numerical experiments reveal that this new approach provides more robust and accurate forward predictions of cyclic stress paths than those obtained from black-box deep neural network models such as the recurrent neural network, the 1D convolutional neural network, and the multi-step feed-forward models.
Article
Full-text available
We present a reduced-dimensional proper orthogonal decomposition (POD) solver to accelerate discrete element method (DEM) simulations of the granular mixing problem. We employ the method of snapshots to create a low-dimensional solution space from previous DEM simulations. By reducing the dimensionality of the problem, we accelerate the calculations of the incremental solution with fewer degrees of freedom (DOF), while enabling a larger stable time step due to the filtering of low-energy mode. We analyze two feasible strategies to generate the reduced-dimensional basis, one generating by finding the orthogonal basis from the global snapshots captured at the same location in the parametric domains ; another one employing the known POD bases from the closest known cases. Our results show that, when POD bases are generated via the local strategy, the reduced-order model is a more efficient alternative to the full-scale simulations for extrapolating behaviors in the parametric domain. Numerical examples of granular mixing problems are presented to demonstrate the efficiency and accuracy of the proposed approach.
Article
In this work we employ an encoder–decoder convolutional neural network to predict the failure locations of porous metal tension specimens based only on their initial porosities. The process we model is complex, with a progression from initial void nucleation, to saturation, and ultimately failure. The objective of predicting failure locations presents an extreme case of class imbalance since most of the material in the specimens does not fail. In response to this challenge, we develop and demonstrate the effectiveness of data- and loss-based regularization methods. Since there is considerable sensitivity of the failure location to the particular configuration of voids, we also use variational inference to provide uncertainties for the neural network predictions. We connect the deterministic and Bayesian convolutional neural network formulations to explain how variational inference regularizes the training and predictions. We demonstrate that the resulting predicted variances are effective in ranking the locations that are most likely to fail in any given specimen.
Article
Flexible foams are a class of materials often used in transportation systems to mitigate mechanical shocks and vibrations. Polydispersity causes these foams to have a complex microstructure composed of a matrix polymer material and nearly spherical voids. This structure leads to a complex material response upon loading where the load for a given displacement becomes highly dependent upon the current volume fraction of voids in the deformed foam. This complex behavior makes it challenging to develop constitutive models for flexible foams where typically only the homogenized response of the foam is considered during model development, calibration, and eventual deployment. To overcome these challenges we utilize a micromechanics finite element method simulation-informed machine learning framework to develop new constitutive models with idealized foam microstructures of moderate density in mind. Several different machine learned models will be presented and validated against micromechanics finite element simulation which were absent from the training dataset. Specifically, traditional data-driven machine learned regression models will be compared with machine learned models which learn the deviation of representative volume element data from a traditional homogenized constitutive model. A discussion on the strengths and weaknesses of each of the approaches will be presented.
Article
In the present work, two machine learning based constitutive models for finite deformations are proposed. Using input convex neural networks, the models are hyperelastic, anisotropic and fulfill the polyconvexity condition, which implies ellipticity and thus ensures material stability. The first constitutive model is based on a set of polyconvex, anisotropic and objective invariants. The second approach is formulated in terms of the deformation gradient, its cofactor and determinant, uses group symmetrization to fulfill the material symmetry condition, and data augmentation to fulfill objectivity approximately. The extension of the dataset for the data augmentation approach is based on mechanical considerations and does not require additional experimental or simulation data. The models are calibrated with highly challenging simulation data of cubic lattice metamaterials, including finite deformations and lattice instabilities. A moderate amount of calibration data is used, based on deformations which are commonly applied in experimental investigations. While the invariant-based model shows drawbacks for several deformation modes, the model based on the deformation gradient alone is able to reproduce and predict the effective material behavior very well and exhibits excellent generalization capabilities. In addition, the models are calibrated with transversely isotropic data, generated with an analytical polyconvex potential. For this case, both models show excellent results, demonstrating the straightforward applicability of the polyconvex neural network constitutive models to other symmetry groups.