MeshingNet3D: Efficient Generation of Adapted Tetrahedral Meshes for Computational Mechanics

Zheyan Zhang, Peter K. Jimack, He Wang
School of Computing, University of Leeds, UK
We describe a new algorithm for the generation of high quality tetrahedral
meshes using artificial neural networks. The goal is to generate close-to-
optimal meshes in the sense that the error in the computed finite element
(FE) solution (for a target system of partial differential equations (PDEs))
is as small as it could be for a prescribed number of nodes or elements in
the mesh. In this paper we illustrate and investigate our proposed approach
by considering the equations of linear elasticity, solved on a variety of three-
dimensional geometries. This class of PDE is selected due to its equivalence
to an energy minimization problem, which therefore allows a quantitative
measure of the relative accuracy of different meshes (by comparing the energy
associated with the respective FE solutions on these meshes). Once the
algorithm has been introduced it is evaluated on a variety of test problems,
each with its own distinctive features and geometric constraints, in order to
demonstrate its effectiveness and computational efficiency.
Keywords: Optimal mesh generation, Finite element methods, Machine
learning, Artificial neural networks
Preprint submitted to Advances in Engineering Software May 11, 2021
1. Introduction
The finite element method (FEM) is one of the most widely used approaches for solving systems of partial differential equations (PDEs), which arise across multiple applications in computational mechanics [1, 2]. The key feature in determining the efficiency of the FEM on any given problem is the quality of the mesh: in general terms, the finer the mesh the better the solution but the greater the computational cost of obtaining it. This trade-off has led to a vast body of research into the generation of high-quality FE meshes over decades. Typically, the objective is either to generate a mesh for which the corresponding FE solution has a prescribed accuracy using a minimal number of degrees of freedom (e.g. [3]), or to generate the best possible mesh for a predetermined number of degrees of freedom (e.g. [4]). In this paper we focus primarily on the latter; however, the two approaches are very closely related.
Interest in the use of data-driven methods to obtain solutions of PDEs has grown significantly in recent years, largely due to the increase in computing power that supports the application of deep neural networks (NNs) [5, 6]. In this work, however, we do not aim to apply NNs to estimate PDE solutions directly: instead we consider their use to estimate optimal meshes on which to compute traditional FE approximations. The rationale for this is our hypothesis that, for a given approximation error, a larger representation error can be tolerated in a NN to estimate the FE meshes than for a NN to estimate a family of PDE solutions directly. We present a universal deep-learning-based mesh generation system, MeshingNet3D, that extends our initial 2D ideas [7] by building upon classical a posteriori error estimation techniques and adopting a new local coordinate system. Consequently, MeshingNet3D is able to guide non-uniform mesh generation for a wide range of PDE systems with rich variations of geometries, boundary conditions and PDE parameters.
In the remainder of this section we provide brief overviews of classical non-uniform mesh generation methods, artificial neural networks and mean value coordinates (which are core to the generality of our algorithm). Key areas of related research are also highlighted. Section 2 then describes our methodology in full, whilst Section 3 provides detailed validation and testing. The paper concludes with a discussion of our findings and of the outlook for further developments.
1.1. Non-uniform mesh generation
When applying the FEM to approximate the solution of a computational mechanics problem, it is necessary to define both the type of elements and the computational mesh upon which the approximation is sought. The simplest elements are piecewise linear functions on simplexes (triangles in 2D and tetrahedra in 3D); however, other choices are widely used. In 3D, these include higher order Lagrange elements (also defined on tetrahedra), trilinear and triquadratic elements (defined on hexahedra) and more general elements associated with discontinuous Galerkin methods, which may be applied on hybrid meshes [8]. In this paper we restrict our consideration to unstructured tetrahedral meshes [9, 10]. Structured meshes of hexahedra do have some advantages, such as requiring less memory; however, they are less flexible when considering complex geometries or when targeting highly non-uniform meshes with optimal approximation properties, which is the goal of this work.
When the volumes of the elements in a given mesh are approximately equal, and the aspect ratio of each tetrahedron is bounded by a small constant, we refer to the mesh as being uniform. Theoretical results about the asymptotic convergence of the FEM typically hold for sequences of finer and finer uniform meshes [11]. For many problems, however, such meshes are not the best choice, since the error in the corresponding FE solution may be much greater on some elements than on others. In such cases it would usually be far better to have more elements in the “high error” regions and fewer elements in the “low error” regions. The resulting mesh may have the same number of elements in total (with a non-uniform size distribution) but permit a much more accurate FE representation of the true solution. Ideally, we would like to identify an element size distribution to ensure that a prescribed global error tolerance can be obtained with the fewest possible number of elements [12]. In practice this is attempted through the use of prior knowledge to control the mesh size distribution (e.g. geometrical information or a priori analysis [13]), or through an iterative process based around a posteriori error estimates for intermediate solutions [3].
This iterative approach to mesh generation consists of three steps: (i) compute an FE solution on a coarse mesh; (ii) estimate the error locally throughout this solution; (iii) adapt the mesh based upon this estimate. At the next iteration these steps are repeated, beginning with the mesh produced in (iii).
There is a large body of work on the development of cheap and reliable a posteriori error estimators. Popular approaches include those which involve solving a set of local problems on each element, or on small patches of elements, to directly estimate the error function [14, 15], and those based upon the recovery of derivatives of the solution field by sampling at particular points and then interpolating with a higher degree polynomial [16, 17]. For example, in the context of linear elasticity problems, the elasticity energy density of a computed solution is evaluated at each element and the recovered energy density value at each vertex is defined to be the average of the values on its adjacent elements. The local stress error is then proportional to the difference between the recovered piecewise linear energy density and the original piecewise constant values [16]. Considering its wide application in engineering practice, this “ZZ error estimate” has been used as the baseline in our work, to generate comparative data (and meshes) from which MeshingNet3D will be trained, and against which it will be evaluated.
The third step in the iterative approach to mesh generation is to adapt the existing coarse mesh based upon the estimated local error distribution. This may be achieved through the creation of an entirely new mesh, with target element sizes guided by the local error estimate [9], or via local adaptivity. In the latter case the mesh can be moved locally (r-refinement) [4] and/or locally refined/coarsened (h-refinement) [18]. No matter the type of refinement, the iterative process will generally require multiple passes to obtain a high quality final mesh. This therefore becomes a time-consuming pre-processing step, which we seek to avoid in this work.
1.2. Deep neural networks
Artificial neural networks (ANNs) are used to approximate mappings between specified inputs and outputs. They achieve this through a composition that is loosely based upon the neurons in a biological brain: there are a number of layers of nodes which are connected in a predetermined manner, and each node combines inputs received from the previous layer to generate an output that is passed to the next layer (with the first layer representing the input vector and the last layer the output vector). A number of free parameters are associated with each node, defining the action of that node, and these are prescribed based upon the minimization of a chosen training loss function. This learning problem is therefore equivalent to a nonlinear, multivariate optimization. Furthermore, since the loss function is designed to be differentiable with respect to the network parameters, an ANN can be trained using gradient descent methods such as stochastic gradient descent.
In recent years, with the developments in parallel hardware, so-called deep neural networks (DNNs), with many layers and very large numbers of parameters, have proven to be remarkably effective at high-level tasks such as object recognition [20]. Within computational science, DNNs have also been explored to solve ordinary differential equations (ODEs) and PDEs in both supervised [21, 22] and unsupervised [23] settings. In the latter case the network parameters are evaluated based upon a residual minimization, rather than using a labelled training data set, as in the supervised case. In each approach, however, whilst the results are very promising, it is difficult to obtain high accuracy in the solutions and it is currently not possible to provide any guarantees on the accuracy. Consequently, rather than solving PDEs directly, our focus in what follows is to use DNNs to provide an estimate of the optimal finite element mesh, with the goal of obtaining the most efficient possible finite element solution.
1.3. Mean value coordinates
An important feature of our algorithm is the use of mean value coordinates (MVCs). These are a generalization of the barycentric coordinate system for simplexes [24, 25], to polygons in 2D and polyhedra with triangular faces in 3D [26], whereby the coordinates of any point within the polygon/polyhedron may be expressed as a convex combination of the positions of the boundary vertices. Consequently, all interior points in the neighbourhood of an arbitrary boundary vertex have a high value of the corresponding MVC component. MVCs also have a number of properties that make them attractive choices as input parameters to a DNN, for example their local smoothness with respect to spatial variations, as well as being both scale and rotationally invariant.
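To make the idea concrete, the following sketch evaluates mean value coordinates for a point strictly inside a 2D polygon using the standard tangent-half-angle weights. It covers the 2D case only (the 3D polyhedral construction is more involved), and the function name and argument layout are illustrative assumptions rather than anything prescribed by the paper.

```python
import numpy as np

def mean_value_coordinates_2d(polygon, point):
    """Mean value coordinates of `point` with respect to the polygon vertices.

    polygon: (n, 2) vertices listed in order around the boundary.
    point:   (2,) location strictly inside the polygon.
    """
    polygon = np.asarray(polygon, dtype=float)
    d = polygon - np.asarray(point, dtype=float)   # vectors point -> vertex i
    r = np.linalg.norm(d, axis=1)                  # distances to the vertices
    d_next = np.roll(d, -1, axis=0)                # vectors point -> vertex i+1
    # Signed angle alpha_i subtended at the point by edge (v_i, v_{i+1}).
    cross = d[:, 0] * d_next[:, 1] - d[:, 1] * d_next[:, 0]
    dot = np.sum(d * d_next, axis=1)
    t = np.tan(np.arctan2(cross, dot) / 2.0)
    # w_i = (tan(alpha_{i-1} / 2) + tan(alpha_i / 2)) / r_i
    w = (np.roll(t, 1) + t) / r
    return w / w.sum()

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
centre_coords = mean_value_coordinates_2d(square, (0.5, 0.5))
other_coords = mean_value_coordinates_2d(square, (0.3, 0.6))
```

The coordinates sum to one and reproduce the point as a weighted combination of the vertices, so a network input built from them describes a location relative to the domain boundary rather than in absolute Cartesian terms.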
1.4. Related work
As noted above, recent developments in DNNs have led to renewed interest in the application of machine learning (ML) to the direct solution of PDEs and PDE systems [27]. The majority of this research is based upon supervised learning strategies, such as [28], which require the use of a conventional solver to generate training data. Once trained, the NN is able to solve problems of the same type much more quickly than the original solver. Recently there has also been a growth in interest in the development and application of unsupervised learning methods, which act as independent PDE solvers without the need to refer to external supervisory information. These have been investigated particularly in the context of physics-informed algorithms [29, 30] or those targeting high-dimensional PDEs [5, 6]. However, the issue still remains that, whether solving computational mechanics problems directly via supervised or unsupervised learning, current capabilities do not provide any a priori guarantees of accuracy. Indeed, even when such a solution has been produced, it is not generally possible to estimate how accurate it is.
Some previous authors have also considered the application of ML to mesh-related problems, sharing our aim of enhancing traditional FE solvers rather than replacing them completely. Examples include mesh quality assessment [31, 32] and mesh partitioning algorithms, such as [33], to complement parallel distributed solvers. Research into mesh generation using ML has also been undertaken, both for pure shape representation [34, 35], and as the basis for an efficient finite element solver [36]. This latter approach is based upon a self-organizing network but is restricted to fitting (and optimizing) a mesh of a fixed topology to a prescribed geometry. In [37] a recurrent network is used to enhance the traditional iterative approach to mesh generation through the use of ML to control the mesh adaptivity step. The authors are able to show results that match the quality of iteratively refined meshes using conventional error estimates and refinement strategies. Whilst these works each replace some aspects of the conventional mesh generation step with a NN, none of them addresses the specific problem tackled in this paper, where we seek to generate a single, non-uniform, tetrahedral mesh that provides a pseudo-optimal finite element representation of the solution of an unseen problem.
In this context of using ML to guide high-quality non-uniform mesh generation there is relatively little prior research. In [38], for example, early knowledge-based approaches were considered, though with limited success. Time-dependent remeshing is studied by [39], where an NN is used to undertake time-series predictions that identify areas of greater (and less) refinement at different times, though on a domain with a simple geometry. In [40, 41] NNs were applied successfully to generate high quality finite element meshes for elliptic PDEs; however, the input vectors are highly problem-dependent, requiring specific a priori knowledge of the geometries being considered. The challenge of using DNNs to generate pseudo-optimal FE meshes on quite general geometries was first considered in [7] for selected two-dimensional PDEs. This paper extends these ideas to problems in three dimensions, to consider PDE systems with rich variation in geometry, boundary conditions and material properties.
2. Methodology
The goal of this research is to develop a robust, and widely applicable, mesh generation procedure for the efficient FE solution of systems of elliptic PDEs. Our particular emphasis here is on the equations of linear elasticity; however, the approach described in this section may equally be applied to any family of problems for which a reliable a posteriori local error estimator is available to support the training phase for our neural network. In the first subsection we provide an overview of our methodology, with further details on the software design, training data generation and the training of the deep network given in the following subsections.
2.1. Theory
We seek to automatically generate high quality FE meshes for arbitrary instances within a given family of PDE problems, where each instance is defined by the domain geometry G (from a predefined family of possible geometries), the PDE parameters M, and the applied boundary conditions B. For any given mesh, the corresponding FE solution is assumed to be a unique solution, for which we have available a means of determining the local error. This computed a posteriori error is also assumed to be unique, and provides a mechanism for determining a desired FE element size for each location within the domain. Consequently, in order to generate a pseudo-optimal FE mesh we seek to estimate a mapping F that represents an ideal spatial distribution of the FE element sizes:
$$F : X \to S \qquad (1)$$
Here, X is the specified location in the domain and S is the target element size (for example average edge length) at X. Noting that we define each instance by its specific geometry, parameter values and boundary conditions, we may express this mapping more precisely as:
$$F : X \to S(G, B, M; X) \qquad (2)$$
Our goal is to make use of offline training to create a neural network that is able to learn the mapping
$$F : G, B, M, X \to S \qquad (3)$$
After training, the NN is able to predict a pseudo-optimal mesh-size distribution for unseen problems. Specifically, given G, B, M for any problem, and an arbitrary sample point X, the NN outputs a target element size at that sample point. This is precisely the information required by a 3D mesh generator in order to generate a non-uniform, unstructured finite element mesh.
2.2. Software and evaluation
In this paper we use Tetgen for tetrahedral mesh generation [9], and FreeFem++ to assemble and solve the corresponding global FE systems [42]. The input to Tetgen includes a .poly file containing vertices and edges of polygons that define the boundary of the computational domain. From this file Tetgen is able to generate a uniform mesh based upon a single parameter, indicating a constant target element size. To generate a non-uniform mesh, Tetgen reads a background mesh from .b.ele, .b.node and .b.face files, and an element size list file that defines target element sizes corresponding to the vertices of the background mesh. Having defined a valid mesh, FreeFem++ is able to solve variational problems that are user defined. To do this it executes a .edp script file containing information such as: how to import the mesh; what type of finite element to use; what the specific variational form is; and which solver to apply. In this paper all examples are based upon the use of linear tetrahedral elements (as generated by Tetgen) and the Lamé solver (for the equations of linear elasticity).
To mass produce training problems a simple script has been produced that allows an appropriate .poly file to be generated for a given geometry G. Then, for each geometry this calls FreeFem++ to obtain linear elasticity solutions for a range of material parameters (M) and boundary conditions (B). Note that FreeFem++ not only solves the elasticity equations but also computes the total stored energy, which may be used to evaluate the quality of a given FE solution. This is because the underlying PDE system corresponds to an energy minimization problem, so the analytic solution minimizes the energy functional over all functions from the appropriate Sobolev space ($(H^1_E)^3$ in this case). On the other hand, the FE solution minimizes the energy over all functions in the space of piecewise linear functions on the given tetrahedral mesh. Since this is a subspace of $(H^1_E)^3$, the energy corresponding to the FE solution is never lower than the energy of the analytic solution. Therefore, the lower the energy associated with the computed FE solution the better the solution. Consequently, the quality of any given set of tetrahedral meshes, for a particular problem, may be ranked based upon the computed energy corresponding to the finite element solution on each mesh: the lower the energy the better the mesh. We will use this observation as part of the evaluation of our approach.
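The ranking argument can be written out in a few lines. The notation below is introduced here for illustration and is not taken from the paper: a(·,·) is assumed to be the symmetric, coercive bilinear form of linear elasticity, l(·) the load functional, V the continuous space (with homogeneous essential boundary conditions, so that V is a linear space) and V_h the piecewise linear FE subspace on a given mesh.

```latex
% Energy functional and its minimizers over the full space and the FE subspace
J(v) = \tfrac{1}{2}\, a(v, v) - l(v), \qquad
u = \arg\min_{v \in V} J(v), \qquad
u_h = \arg\min_{v \in V_h} J(v), \qquad V_h \subset V = (H^1_E)^3 .

% Using a(u, v) = l(v) for all v \in V and the symmetry of a,
J(u_h) - J(u)
  = \tfrac{1}{2}\, a(u_h, u_h) - a(u, u_h) + \tfrac{1}{2}\, a(u, u)
  = \tfrac{1}{2}\, a(u_h - u,\, u_h - u) \;\ge\; 0 .

% Hence, for FE solutions u_{h_1}, u_{h_2} computed on two different meshes,
J(u_{h_1}) < J(u_{h_2})
\iff
a(u - u_{h_1},\, u - u_{h_1}) < a(u - u_{h_2},\, u - u_{h_2}) .
```

Under these assumptions the gap between the FE energy and the exact energy is half the squared energy norm of the error, so comparing computed energies ranks meshes by energy-norm error without needing the exact solution.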
Figure 1: Illustration of the training data for MeshingNet3D: each individual problem is defined by the geometry G, the PDE parameters M and the boundary condition parameters B (not shown here). However, for each such problem there are multiple sample points, X, in the domain, with the corresponding local mesh size S specified.
2.3. Data generation
Training data is required in order to sample the mapping of equation (2). Each training problem is defined by parameters that uniquely define the geometry (G), PDE parameters (M) and boundary conditions (B). For each such problem, multiple training data are generated by specifying numerous points, X, at which the target mesh size is given. This is illustrated in Figure 1, for which there are 3000 test problems, each of which generates multiple inputs, corresponding to different points X in the domain. The precise number of points X is problem-dependent, but should be sufficient to represent the spatial mesh size variation throughout the domain (too many points will not decrease the training performance but will slow down the data generation). For each input the generated output, used to train the NN, is the target mesh size, S, for that point and that problem.
The value of S is computed using a variation of the iterative approach to mesh generation, based upon a posteriori error estimation, described in Subsection 1.1. For each problem we generate a relatively coarse uniform mesh and compute the corresponding FE solution and error estimate. In this work we use the “ZZ” energy estimate of [16]. However, for different problems or different quantities of interest, other choices are possible. For each sample point, X, the estimated local error, E(X), can be converted to a target element size using an inverse relationship such as
$$S(X) = \frac{K}{E(X)} \qquad (4)$$
for some scaling coefficient K. This is the value of S(X) used to define (2), as illustrated in Figure 1. The effect of the scaling coefficient is to control the total number of elements in the non-uniform mesh that is generated based upon the target local size distribution S(X). Hence, for each test problem, K may be adjusted iteratively in order to obtain a target number of elements in the non-uniform mesh (or a target total error in the FE solution).
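The iterative adjustment of K can be sketched as a bisection on a monotone function. In the sketch below the mesh generator is replaced by a crude stand-in, `estimated_element_count`, which assumes each region of the coarse mesh is filled with regular tetrahedra of the requested edge length; this estimate, the function names and the tolerance are all assumptions for illustration, and in practice the count would come from running the mesh generator itself.

```python
import numpy as np

def target_sizes(error, K):
    """Inverse relationship between local error and target size: S = K / E."""
    return K / np.asarray(error, dtype=float)

def estimated_element_count(sizes, volumes):
    """Stand-in for the mesher: coarse-element volume / regular-tet volume."""
    tet_volume = sizes ** 3 / (6.0 * np.sqrt(2.0))   # regular tetrahedron, edge s
    return float(np.sum(volumes / tet_volume))

def tune_K(error, volumes, target_count, K_lo=1e-12, K_hi=1e12,
           tol=0.01, max_iter=200):
    """Bisect (in log space) for K giving roughly `target_count` elements."""
    volumes = np.asarray(volumes, dtype=float)
    K = np.sqrt(K_lo * K_hi)
    for _ in range(max_iter):
        K = np.sqrt(K_lo * K_hi)
        n = estimated_element_count(target_sizes(error, K), volumes)
        if abs(n - target_count) <= tol * target_count:
            break
        if n > target_count:   # too many elements: sizes too small, raise K
            K_lo = K
        else:                  # too few elements: lower K
            K_hi = K
    return K

error = np.array([1.0, 2.0, 4.0, 8.0])   # hypothetical local error estimates
volumes = np.ones(4)                     # hypothetical coarse-element volumes
K = tune_K(error, volumes, target_count=1000)
sizes = target_sizes(error, K)
```

Because the element count decreases monotonically as K grows, any bracketing search of this kind converges; the regions with the largest estimated error receive the smallest target sizes.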
Note that the precise definition of the input vector in Figure 1 has to be problem dependent: a parameterization of the family of domains is required to define G; the number of free parameters in the PDE systems has to be predetermined; and the possible boundary conditions must also be parameterized. For each example shown in the Experiments section of this paper a different input vector has therefore been prescribed. Nevertheless, the methodology described here is shown to work robustly in all settings.
The final component required for the data generation is the algorithm to select the sample points X for each of the training problems. This is achieved via two steps: first an initial non-uniform mesh with predefined target element number is generated (e.g. by Tetgen) based upon the a posteriori error computed on the coarse uniform mesh; then we sample a fixed percentage of the elements of this mesh (we find that 10% is adequate), choosing each X to be the MVCs of the centroids of the sampled elements. Note that the advantage of sampling from the non-uniform, rather than the uniform, mesh is that the training data is weighted based upon the error distribution: our experiments show this to be advantageous.
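The second step, sampling a fixed fraction of elements and taking their centroids, might look as follows. This is a sketch under assumed data structures (a `vertices` array of coordinates and a `tets` array of vertex indices); it returns Cartesian centroids only, and the subsequent conversion of each centroid to MVCs is not shown.

```python
import numpy as np

def sample_centroids(vertices, tets, fraction=0.10, seed=0):
    """Pick a random `fraction` of tetrahedra and return their centroids.

    vertices: (n_vertices, 3) coordinates; tets: (n_tets, 4) vertex indices.
    """
    vertices = np.asarray(vertices, dtype=float)
    tets = np.asarray(tets)
    rng = np.random.default_rng(seed)
    n_samples = max(1, int(round(fraction * len(tets))))
    chosen = rng.choice(len(tets), size=n_samples, replace=False)
    centroids = vertices[tets[chosen]].mean(axis=1)   # mean of 4 corners
    return chosen, centroids

# Toy input: 30 copies of the unit tetrahedron, so every centroid is known.
vertices = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
tets = np.tile([0, 1, 2, 3], (30, 1))
chosen, centroids = sample_centroids(vertices, tets)
```

Sampling elements uniformly at random from the adapted mesh places more sample points where elements are small, which is how the weighting by error distribution mentioned above arises.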
2.4. Training and using the neural network
The deep learning platform that we use in this work is Keras [43] based on Tensorflow [44]. Our networks are fully connected, typically with six hidden layers, though we find that our results are not especially sensitive to the number of layers or the precise number of neurons per layer. We do observe, however, that it is advantageous to first increase and then decrease the number of neurons per layer as we pass forward through the network. The activation functions selected in this model are rectified linear functions [45] for the hidden units, with linear activation in the output layer. Before training, the input data is linearly normalised and 10% is selected for validation (monitoring the validation loss during training can help to identify and prevent over-fitting). The training itself uses mean square error loss and the stochastic gradient descent optimiser, Adam [46], with batch sizes of 128: for each of the examples considered in this paper this takes no longer than 3 hours on an Nvidia RTX 2070 graphics card.
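The data preparation and network shape described above can be illustrated without any deep learning framework. The NumPy sketch below is not the Keras model used in the paper and omits training with Adam entirely: it only shows one possible linear (min-max) normalisation, a 10% validation split, and a forward pass through a fully connected network with ReLU hidden units and a linear output. The hidden widths are those quoted for the clamped beam example in Section 3.1.2; the input dimension of 12, the initialisation and all function names are assumptions.

```python
import numpy as np

def normalise(X):
    """Min-max scale each input column to [0, 1] (one 'linear' normalisation)."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)   # guard against constant columns
    return (X - lo) / span

def split_validation(X, y, fraction=0.10, seed=0):
    """Hold out a random `fraction` of the samples for validation."""
    order = np.random.default_rng(seed).permutation(len(X))
    n_val = int(round(fraction * len(X)))
    val, train = order[:n_val], order[n_val:]
    return X[train], y[train], X[val], y[val]

def init_mlp(n_inputs, hidden=(32, 64, 128, 64, 32, 8), seed=0):
    """Random weights for a fully connected net with one linear output."""
    rng = np.random.default_rng(seed)
    widths = (n_inputs, *hidden, 1)
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(widths[:-1], widths[1:])]

def forward(params, X):
    """ReLU on every hidden layer, linear activation on the output layer."""
    a = X
    for W, b in params[:-1]:
        a = np.maximum(a @ W + b, 0.0)
    W, b = params[-1]
    return (a @ W + b)[:, 0]

def mse(pred, target):
    """Mean square error loss."""
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(1)
X, y = rng.random((200, 12)), rng.random(200)   # synthetic stand-in data
Xn = normalise(X)
X_train, y_train, X_val, y_val = split_validation(Xn, y)
params = init_mlp(n_inputs=12)
val_loss = mse(forward(params, X_val), y_val)
```

Tracking `val_loss` alongside the training loss after each epoch is the monitoring step referred to above for spotting over-fitting.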
Once trained, the NN can be used to guide mesh generation for unseen problems in real time. Given a new problem, defined by G, M and B, a uniform background mesh is generated based upon G alone. For each element in this background mesh we compute the MVC of its centre and concatenate this with the problem parameters to form an input vector for the NN. The corresponding output is the target element size at the centre of that element. The background mesh, with its associated target element size distribution, is then used to allow TetGen to generate the desired non-uniform mesh. If the total number of elements in this mesh is outside of the required range then each S(X) may be scaled linearly before generating an updated non-uniform mesh. In this way, an adapted tetrahedral mesh of a specified size is generated directly, without the need to compute a sequence of FE solutions and a posteriori error estimates, as would otherwise be the case.
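The two bookkeeping steps in this procedure, assembling the network inputs and rescaling the predicted sizes, can be sketched as below. The function names are hypothetical, and the cube-root rule for choosing the linear scale factor is an assumption of this sketch (it follows from the element count in 3D scaling roughly like one over the size cubed); the text above only states that the sizes are scaled linearly.

```python
import numpy as np

def build_inputs(problem_params, point_coords):
    """One input row per background element: [problem parameters, coordinates]."""
    point_coords = np.atleast_2d(np.asarray(point_coords, dtype=float))
    tiled = np.tile(np.asarray(problem_params, dtype=float),
                    (len(point_coords), 1))
    return np.hstack([tiled, point_coords])

def rescale_sizes(sizes, current_count, target_count):
    """Scale all target sizes by one factor to steer the element count.

    Assumes count ~ 1 / size**3, so the factor is (current / target)**(1/3).
    """
    factor = (current_count / target_count) ** (1.0 / 3.0)
    return np.asarray(sizes, dtype=float) * factor

# Two hypothetical problem parameters and two element-centre coordinates.
inputs = build_inputs([1.0, 2.0], [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
# The mesh came out with 8000 elements but 1000 were wanted: double the sizes.
coarser = rescale_sizes([1.0, 2.0], current_count=8000, target_count=1000)
```

Each row of `inputs` would be passed to the trained network, and the rescaled sizes written back to the background mesh before the mesh generator is run again.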
3. Computational Experiments
We present four computational tests which allow us to analyse the performance of MeshingNet3D across a range of different problems, geometries, boundary conditions and PDE parameters. The first and the third case involve prismatic geometries, which permit the description of spatial locations based upon “2.5D MVCs”. These are composed of regular 2D MVCs in the x-y planes plus an additional z-coordinate. The second case uses general 3D Cartesian coordinates, whilst the final example uses general polyhedral geometries and fully 3D MVCs.
For each of the examples we provide a brief description of the problem, followed by a discussion of the network topology used (including the specific input vector) and the training undertaken. We then present results based upon 500 unseen test problems. These results compare the FE solutions computed on the NN-guided mesh with those computed on a “ground truth” mesh of similar size, generated using the same ZZ a posteriori error estimator that was applied to train the network. We also compare against the FE solution computed on a uniform mesh with a similar number of elements. To facilitate these comparisons, for each of the 500 test problems, we compute the difference between the total energy of the FE solution on the NN-guided mesh and that of the FE solution on the comparison mesh. We then provide a histogram to illustrate the proportion of the test cases in different binned error ranges. A negative value of the difference indicates that the solution on the NN-based mesh has a lower energy and is therefore superior.
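A minimal sketch of this comparison is given below. It assumes the energy differences are expressed as a percentage of the magnitude of the ground truth energy; the function names, bin edges and sample values are illustrative only and are not results from the paper.

```python
import numpy as np

def relative_energy_difference(e_nn, e_cmp, e_norm):
    """(E_nn - E_cmp) as a percentage of |E_norm|; negative favours the NN mesh."""
    e_nn, e_cmp, e_norm = (np.asarray(a, dtype=float)
                           for a in (e_nn, e_cmp, e_norm))
    return 100.0 * (e_nn - e_cmp) / np.abs(e_norm)

def binned_proportions(diff_percent, bin_edges):
    """Proportion of test cases falling in each bin of the histogram.

    Values outside the outermost bin edges are not counted by np.histogram.
    """
    counts, _ = np.histogram(diff_percent, bins=bin_edges)
    return counts / len(diff_percent)

# Made-up energies for two test problems (not taken from the paper).
e_nn = [-1.02, -0.99]
e_gt = [-1.00, -1.00]
diff = relative_energy_difference(e_nn, e_gt, e_gt)
props = binned_proportions(diff, bin_edges=[-5.0, 0.0, 5.0])
```

Dividing by the absolute value of the normalising energy keeps the sign convention intact even when the total energy itself is negative.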
3.1. Clamped beam
We consider the problem of an over-hanging beam (under gravity), with different cross sections (G) and variable boundary conditions (B). In this case the material parameters (M) are not varied (the specific inputs to the Lamé solver in FreeFEM++ being: density = 8000, Young’s modulus = $210 \times 10^9$ and Poisson’s ratio = 0.27).
3.1.1. Problem specification
The beam is a right prism with a convex quadrilateral cross section as illustrated in Figure 2. This cross section has vertices at $(x_0, y_0) = (0, 0)$ and $(x_1, y_1) = (0, 2)$, and also at $(x_2, y_2)$ and $(x_3, y_3)$, which are randomly sampled within $x_2 \in (1.5, 2.5)$, $y_2 \in (1.5, 2.5)$, $x_3 \in (-0.5, 0.5)$ and $y_3 \in (1.5, 2.5)$ for each problem. The length of the beam is fixed ($0 \le z \le 6$) and a boundary shear, with components $(f_x, f_y, 0)$, is applied at the face $z = 6$. The face $z = 0$ is clamped and the bottom face is clamped between $z = 0$ and $z = \zeta$, where $2 < \zeta < 4$ (randomly sampled for each problem). All other boundaries are free, subject to zero normal stress. Hence the input vector for this problem requires values for $x_2$, $y_2$, $x_3$, $y_3$, $\zeta$, $f_x$ and $f_y$, along with the MVCs of the point at which the mesh spacing is required. In these examples, the parameters $f_x$ and $f_y$ are constrained to lie in the range $(-10^6, 10^6)$.
3.1.2. Network information
In this example our fully-connected network has six hidden layers with 32, 64, 128, 64, 32 and 8 neurons respectively. Training data is generated based upon solving 3000 individual problems, each of which is obtained using a random choice for each input parameter (selected uniformly from its range), leading to 10,740,746 individual input-output pairs. Of these, 10% are selected for validation and the remainder are used for training using a batch size of 128. The training takes 10 epochs, meaning that each item of data has been used an average of 10 times. Figure 13 shows the rates of convergence for the training, along with the corresponding validation curve.
Figure 2: The geometry and boundary conditions for the Clamped beam, with constant
cross section along the z-axis. The gravity is uniformly distributed over the volume. The
surfaces bounded by four vertices with blue triangles are clamped.
3.1.3. Results
Figure 3 demonstrates that the NN-guided meshes generally perform at least as well as the ground truth meshes (generated from explicitly-computed a posteriori error estimates) and, as expected, much better than uniform meshes. Two typical examples are shown in Figure 4, which compares NN-guided meshes (bottom) with their ground-truth counterparts (top). In each case the high mesh density near $y = 0$ and $z = \zeta$ is easily captured. More significantly, however, high and low mesh density regions are captured well throughout the domain, with a smooth variation between these regions.
3.2. Laminar material
In this example we consider a variation of the previous problem for which the material parameters (M) are now permitted to vary but the geometry (G) and the boundary conditions (B) are kept fixed.
Figure 3: For the Clamped beam, FE energies of neural network (NN) generated meshes
versus uniform mesh FE energies and ground truth (GT) energies. The height of each bar
represents the proportion of experiment results in the energy range shown on the x-axis
(as a percentage of the ground truth energy).
3.2.1. Problem specification
A beam of dimensions 1 × 1 × 5 is composed of two horizontal layers, as
illustrated in Figure 5. Each layer has a Young’s modulus (E_top and E_bot)
between 10^9 and 10^11, and a Poisson’s ratio (ν_top and ν_bot) between 0.05
and 0.45. The densities of the two materials are both 8000 and the interface
between the layers is at a height y = h ∈ (0.2, 0.8). Half of the bottom surface
(y = 0, 0 < z < 2.5) is clamped, as is the surface z = 0. On the surface
z = 5 a traction of amplitude 10000 is applied in the x direction, with all
other boundaries free to displace under zero normal-stress conditions. Hence
the input vector for this problem requires values for E_top, E_bot, ν_top, ν_bot
and h, along with the coordinates of the point at which the mesh spacing
is required. We actually use log10(E_top) and log10(E_bot) as the first two
input parameters.
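The input encoding described in this specification can be sketched as follows. This is only an illustration: the function name `laminar_input` and the position of the point coordinates after the five problem parameters are assumptions, since the text fixes only that the two logarithms come first.

```python
import math

def laminar_input(e_top, e_bot, nu_top, nu_bot, h, point):
    """Assemble one network input for the laminar beam.

    The two Young's moduli enter through their base-10 logarithms, so a
    range of 1e9..1e11 maps to 9..11. `point` is the (x, y, z) location
    at which the mesh spacing is requested.
    """
    x, y, z = point
    return [math.log10(e_top), math.log10(e_bot), nu_top, nu_bot, h, x, y, z]

# Hypothetical query at the centre of the 1 x 1 x 5 beam.
vec = laminar_input(1e10, 1e9, 0.30, 0.20, 0.5, (0.5, 0.5, 2.5))
print(vec)
```

Taking logarithms keeps the two stiffness inputs on a scale comparable to the other entries, which span ranges of order one.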
3.2.2. Network information
In this example our fully-connected network has five hidden layers with 32,
64, 32, 16 and 8 neurons respectively. Training data is generated based upon
solving 3000 individual problems, each of which is obtained using a random
choice for each input parameter (selected uniformly from its range), leading
to 19,719,750 individual input-output pairs. Of these, 10% are selected for
validation and the remainder are used for training using a batch size of 128.
The training takes 15 epochs, and Figure 13 shows the rates of convergence
for this training, along with the corresponding validation curve.
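Drawing the random training problems might look like the sketch below. It assumes, as a labelled guess, that the Young's moduli are sampled uniformly in log10 space (matching the network inputs); sampling uniformly in E itself would be an equally valid reading of the text. The names `LAMINAR_RANGES` and `sample_laminar_problem` are hypothetical.

```python
import random

# Parameter ranges quoted in the problem specification.
LAMINAR_RANGES = {
    "log10_e_top": (9.0, 11.0),   # assumed: uniform in the exponent
    "log10_e_bot": (9.0, 11.0),
    "nu_top": (0.05, 0.45),
    "nu_bot": (0.05, 0.45),
    "h": (0.2, 0.8),
}

def sample_laminar_problem(rng):
    """Draw each parameter independently and uniformly from its range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in LAMINAR_RANGES.items()}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
problems = [sample_laminar_problem(rng) for _ in range(3000)]
print(problems[0])
```

Each sampled problem would then be solved and adapted to produce the many input-output pairs (one per sampled point) that make up the training set.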
3.2.3. Results
Figure 6 demonstrates that, as in the previous example, the NN-guided
meshes typically perform on a par with the ground truth meshes, and much
better than uniform meshes. Two typical examples are shown in Figure 7:
for (a) and (c)
(log10(E_top), log10(E_bot), ν_top, ν_bot, h) = (10.82, 9.17, 0.34, 0.20, 0.34),
and for (b) and (d)
(log10(E_top), log10(E_bot), ν_top, ν_bot, h) = (9.17, 10.33, 0.44, 0.21, 0.41).
In the first example the top layer has the higher Young’s modulus, which
leads to a higher mesh density in this layer (for both the NN-guided and
the ground-truth meshes). Conversely, in the second example the bottom
material is stiffer than the top and we see a very different distribution of the
element size. In each case there is a strong correlation between the NN-guided
mesh and the ground-truth case.
Figure 4: For the Clamped beam, ground truth meshes (top) and NN-guided meshes
(bottom) for two test cases.
Figure 5: The boundary conditions and loads of the laminar material, where the height
of the interface is random.
Figure 6: For the Laminar material, FE energies of neural network (NN) generated
meshes versus uniform mesh FE energies and ground truth (GT) energies. The height of
each bar represents the proportion of experiment results in the energy range shown on the
x-axis (as a percentage of the ground truth energy).
3.3. Hex-bolt with a hole
We consider the problem of a hex-bolt (under torque), with different cross
sections (G). In this case the material parameters (M) are not varied (the
specific inputs to the Lamé solver in FreeFem++ being: density = 8000,
Young’s modulus = 210 × 10^9 and Poisson’s ratio = 0.27).
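For readers who want the Lamé parameters that correspond to these inputs, the standard isotropic conversions are λ = Eν / ((1 + ν)(1 − 2ν)) and μ = E / (2(1 + ν)). The sketch below simply evaluates them. The function name is hypothetical, and the units are whatever consistent system the solver inputs are expressed in, which the text does not state.

```python
def lame_parameters(young, poisson):
    """Convert Young's modulus and Poisson's ratio to the Lame pair (lambda, mu)."""
    lam = young * poisson / ((1.0 + poisson) * (1.0 - 2.0 * poisson))
    mu = young / (2.0 * (1.0 + poisson))
    return lam, mu

# Material constants quoted above.
lam, mu = lame_parameters(210e9, 0.27)
print(f"lambda = {lam:.4e}, mu = {mu:.4e}")
```

The conversion breaks down as the Poisson's ratio approaches 0.5, where λ grows without bound; the values used in these experiments stay well away from that limit.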
3.3.1. Problem specification432
A regular hexagonal prism has an octagonal prism hole inside it where433
the height of the prism is h= 4 (Figure 8 left). On the cross section, the434
edge length of the regular hexagon is 4 and the octagon is coaxial with the435
hexagon. The eight vertices of the octagon lie on the same circle, whose436
radius varies r(0.2,1.0). The arc angles between vertices are random.437
Linearly distributed pressures are applied to create a torque on the top (p =
10000x + 10000) and bottom (p = −10000x − 10000) surfaces. The eight
surfaces of the hole are clamped. The input vectors for this problem include
the position of the octagon’s eight vertices and the MVCs of the target point
expressed with respect to both the vertices of the outer hexagon and the
inner octagon (combined with its z coordinate, z ∈ (1.0, 0.0)).
Figure 7: (a)(c) and (b)(d) are two problems in the laminar material experiments. (a)
and (b) are ground truth meshes and (c) and (d) are non-uniform meshes guided by the
neural network.
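One way to generate the octagonal hole cross section used in the hex-bolt problem is sketched below: eight angles are drawn at random, sorted, and mapped onto a circle of radius r. The text says only that the arc angles between vertices are random, so drawing the angles independently and uniformly on [0, 2π) is an assumption, as is the function name `random_octagon`. The experiments may well use a different scheme, for example one that avoids very short edges.

```python
import math
import random

def random_octagon(rng, r_min=0.2, r_max=1.0, n_vertices=8):
    """Return (r, vertices) for a polygon inscribed in a circle of radius r,
    centred on the prism axis, with random arc angles between vertices."""
    r = rng.uniform(r_min, r_max)
    # Sorting the angles gives vertices in counter-clockwise order.
    angles = sorted(rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_vertices))
    vertices = [(r * math.cos(a), r * math.sin(a)) for a in angles]
    return r, vertices

r, verts = random_octagon(random.Random(1))
for x, y in verts:
    print(f"({x:+.3f}, {y:+.3f})")
```

The sixteen vertex coordinates produced this way correspond to the "position of the octagon's eight vertices" part of the input vector; the MVC entries would be computed separately for each target point.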
3.3.2. Network information
In this example our fully-connected network has four hidden layers with
32, 64, 16 and 8 neurons respectively. Training data is generated based upon
solving 3000 individual problems, each of which is obtained using a random
choice for each input parameter (selected uniformly from its range), leading
to 10,748,618 individual input-output pairs. Of these, 10% are selected for
validation and the remainder are used for training using a batch size of 128.
The training takes 10 epochs and Figure 13 shows the convergence for this
training, along with the corresponding validation curve.
3.3.3. Results
Figure 9 shows that the MeshingNet3D meshes are again better than
uniform meshes and that the NN mesh energies are very close to those of
the ground truth. As illustrated in Figure 10, the NN can successfully guide
non-uniform mesh generation on very different geometries. This example
also illustrates the success of the proposed approach on non-simply-connected
domains. Note that the second problem (on the right) in Figure 10 illustrates
one of the worst performing cases for the NN mesh relative to the ground
truth: here, the NN mesh is more uniform than the ground truth (though
still a vast improvement on a standard uniform mesh).
Figure 8: The boundary conditions and loads of the hex bolt (left) and irregular
polyhedron (right). On the hex bolt, the eight surfaces of the hole are clamped and
linearly distributed pressure is applied on the top and bottom surfaces.
3.4. Irregular polyhedron
We now consider the problem of mesh generation on arbitrary twelve-faced
polyhedra, with a range of geometries (G) and variable boundary
conditions (B). In this case the material parameters (M) are not varied (the
specific inputs to the Lamé solver in FreeFem++ being: density = 8000,
Young’s modulus = 210 × 10^9 and Poisson’s ratio = 0.27).
Figure 9: For the hex-bolt with a hole, FE energies of neural network (NN) generated meshes
versus uniform mesh FE energies and ground truth (GT) energies. The height of each bar
represents the proportion of experiment results in the energy range shown on the x-axis
(as a percentage of the ground truth energy).
3.4.1. Problem specification
An irregular polyhedron with twelve triangular faces and eight vertices is
illustrated in Figure 8 (right). The four “bottom” vertices are constrained to be
co-planar and one of the two bottom triangular surfaces (i.e. the two triangles
whose union is bounded by the four co-planar vertices) is clamped. In all
training and testing problems the geometries are subject to the restriction
that the four bottom vertices always lie in the same plane. A normal pressure
of amplitude 10000 is applied on the two “top” surfaces (i.e. the triangular
faces whose union is bounded by the other four vertices) and zero normal
stress is applied on the other nine triangular faces. The input vectors for
this problem define the Cartesian coordinates of the eight vertices and the
corresponding MVCs of the point at which the mesh spacing is required.
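The co-planarity restriction on the four bottom vertices can be checked with a scalar triple product: four points a, b, c, d lie in one plane exactly when (b − a) · ((c − a) × (d − a)) = 0. The sketch below is a generic geometric check, not code from MeshingNet3D; the function name and the tolerance are illustrative choices.

```python
def are_coplanar(a, b, c, d, tol=1e-9):
    """True if four 3D points lie in a common plane (within tol)."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    # Cross product v x w.
    cross = (v[1] * w[2] - v[2] * w[1],
             v[2] * w[0] - v[0] * w[2],
             v[0] * w[1] - v[1] * w[0])
    # Scalar triple product u . (v x w): six times the signed tetrahedron volume.
    triple = u[0] * cross[0] + u[1] * cross[1] + u[2] * cross[2]
    return abs(triple) <= tol

# Four points in the plane z = 0 pass; lifting one of them fails.
print(are_coplanar((0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)))
print(are_coplanar((0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0.5)))
```

A randomly generated geometry could be passed through such a check (or constructed so that it holds by design) before being used as a training or testing problem.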
3.4.2. Network information
In this example our fully-connected network has five hidden layers with
32, 64, 32, 16 and 8 neurons respectively. Training data is generated based
upon solving 3000 individual problems, each of which is obtained using a
random choice for each input parameter, leading to 7,383,999 individual
input-output pairs. Of these, 10% are selected for validation and the remainder are
used for training using a batch size of 128. We use the network after training
for 10 epochs and Figure 13 shows the convergence for this training, along with
the corresponding validation curve.
3.4.3. Results
From Figure 11 it is clear that the MeshingNet3D meshes are significantly
better than uniform meshes and that the solution energies are relatively
close to those of the ground truth, though in some cases the ground
truth mesh is slightly superior. One such example is shown in Figure 12
(three views of the same problem), where we see that the NN mesh appears
to be more conservative in some aspects of its local refinement. Nevertheless,
even in this worst-case scenario, the MeshingNet3D mesh generally has the
same regions of refinement as the ground truth mesh.
3.5. Discussion
Across the four experiments described in this section we have shown
results over a range of geometries, boundary conditions and material
parameters. For each problem the input layer of the NN is necessarily of a different
dimension, which is dependent on the problem specification (along with the
MVCs of the target point), whereas the output is always a single value
representing the predicted mesh spacing at the target point. The number and
size of the hidden layers is not a critical choice, but does naturally have some
impact on the performance of the network.
As an example, to illustrate this, Table 1 shows the performance of five
different networks when applied to the fourth of the test problems above.
In each case the networks have been trained on the same data set, with
validation losses having converged after 10 epochs. The networks are then
used to compute meshes on the same testing set of 500 unseen problems
and the finite element solutions computed on all meshes. The energy of
each solution is normalised against the energy of the finite element solution
computed on the “ground truth” mesh so as to allow a meaningful average
to be taken across all 500 cases. This is the value shown in the “normalised
average energy” column of Table 1: so, the lower this energy the better the
meshes are on average. The results shown in Subsection 3.4 are generated
using NN3 from the table but NN2 and NN4 produced meshes of very similar
quality. The network denoted by NN1 appears to have too few degrees of
freedom to be able to model the non-uniform mesh patterns satisfactorily,
whereas the network denoted by NN5 likely has too many degrees of freedom
for the size of our training data set.
Note that our NNs are always “spindly”, with the greatest number of
neurons in the inner layers. We find from experiment that this kind of network
appears to have the best performance for the set of tasks considered in this
work. Given that our problems have a relatively small number of inputs and
a very small number of outputs (typically one) this is perhaps not surprising:
to capture the highly nonlinear relationships between the inputs and the
mesh spacing across the domain, significant complexity must be introduced
into the network between the input and output layers.
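To make the notion of "degrees of freedom" concrete, the sketch below counts the trainable weights and biases of each fully connected architecture listed in Table 1. The input width of 32 is an assumption (24 vertex coordinates plus 8 MVCs for the irregular polyhedron, inferred from the problem specification rather than stated), and the helper name is hypothetical.

```python
def count_parameters(n_inputs, hidden_layers, n_outputs=1):
    """Number of weights and biases in a fully connected network."""
    sizes = [n_inputs, *hidden_layers, n_outputs]
    # Each layer has (fan_in + 1) * fan_out parameters: weights plus one bias per neuron.
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))

# Hidden-layer widths as listed in Table 1.
ARCHITECTURES = {
    "NN1": [32, 16, 8],
    "NN2": [32, 64, 16, 8],
    "NN3": [32, 64, 32, 16, 8],
    "NN4": [32, 64, 128, 32, 16, 8],
    "NN5": [32, 64, 128, 64, 32, 16, 8],
}

N_INPUTS = 32  # assumed: 8 vertices x 3 coordinates + 8 MVCs

for name, hidden in ARCHITECTURES.items():
    print(name, count_parameters(N_INPUTS, hidden))
```

Under this assumption even the largest network has only tens of thousands of parameters, small compared with the millions of input-output pairs available for training.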
NN    NN structure            training epochs   normalised average energy
NN1   32-16-8                 10                9.0 × 10^3
NN2   32-64-16-8              10                8.1 × 10^3
NN3   32-64-32-16-8           10                7.9 × 10^3
NN4   32-64-128-32-16-8       10                8.1 × 10^3
NN5   32-64-128-64-32-16-8    10                8.6 × 10^3
Table 1: Comparison of 5 different fully connected NNs based upon normalised average
energies of the finite element solutions. NN3 gives the lowest average energy and therefore
provides the best mean performance.
Finally, we note that MeshingNet3D has the potential to make simulations
more efficient for designers who use pre-built 3D models provided
within Computer Aided Design (CAD) software to accelerate design. From
screws and bolts, to washers and bearings, CAD can not only define
geometries but also materials. Embedding pre-trained MeshingNet3D in these
CAD libraries could save meshing cost and provide high-quality non-uniform
meshes. Similarly, MeshingNet3D can help parametric design where the NN
is pre-trained for each geometry topology: under the guidance of the NN an
appropriate mesh is generated in response to each iteration of the design. To
implement this efficiently the challenge will be in defining a suitable family
of boundary conditions as NN inputs, where forces due to interacting objects
are unknown a priori. However, for components in a specific assembly, if
contacts are defined, the load may be inferred by data-driven methods.
4. Conclusions
We have proposed a new framework for the generation of non-uniform
three-dimensional finite element meshes. This is designed to produce meshes
of the same quality as those obtained using traditional approaches, based
upon a posteriori error estimates and local mesh refinement, but at a
substantially reduced computational cost. This has been implemented as
MeshingNet3D, building upon the 3D mesh generator Tetgen and the finite
element package FreeFem++. By selecting the linear elasticity solver within
FreeFem++ we have been able to undertake quantitative comparisons of
different meshes based upon the energy minimization property of the
elastostatic equations. Specifically, we can compare any two meshes by solving the
finite element system on each mesh and then computing the stored energy of
the solutions: the lower one being superior.
We have assessed the performance of MeshingNet3D on four different
problem families for which the optimal finite element mesh is generally highly
non-uniform. In all cases we are able to demonstrate the capability to
generate meshes which are not only substantially better than uniform meshes
for the same geometry, but which are comparable in quality to non-uniform
meshes that are generated based upon the traditional (and expensive)
approach of undertaking a sequence of local adaptive steps involving finite
element solves and a posteriori error estimates. Perhaps not surprisingly, the
benefits of MeshingNet3D are most apparent on those problems for which
the optimal finite element mesh is far from uniform.
The main limitation of our approach is associated with the need to define
a different set of inputs for each family of problems that is to be considered.
Hence, for each new family of problems being considered, it is necessary
to define a set of inputs that fully reflects the richness of that family, and
then to undertake training for a new network. Furthermore, as with most
supervised learning approaches, there is a trade-off to be made between the
level of generality of the family of problems that the user of MeshingNet3D
wishes to consider and the amount of work that must be undertaken in the
training phase of the algorithm. Nevertheless, in situations where many
solutions are required for large numbers of related problems (such as design
and optimization problems for example) this is likely to be a worthwhile
expense. Finally, we note that, in cases where engineers may have limited
confidence in their ability to define the most appropriate inputs (to define
the geometry or boundary conditions for example), data analysis techniques
such as principal component analysis may be used to find the most critical
[1] P. M. Gresho, R. L. Sani, Incompressible flow and the finite element585
method. volume 1: Advection-diffusion and isothermal laminar flow586
[2] O. C. Zienkiewicz, R. L. Taylor, The finite element method for solid and588
structural mechanics, Elsevier, 2005.589
[3] R. Stevenson, Optimality of a standard adaptive finite element method,590
Foundations of Computational Mathematics 7 (2007) 245–269.591
[4] R. Mahmood, P. K. Jimack, Locally optimal unstructured finite element592
meshes in 3 dimensions, Computers & structures 82 (2004) 2105–2116.593
[5] E. Weinan, B. Yu, The deep ritz method: a deep learning-based nu-594
merical algorithm for solving variational problems, Communications in595
Mathematics and Statistics 6 (2018) 1–12.596
[6] J. Sirignano, K. Spiliopoulos, Dgm: A deep learning algorithm for solv-597
ing partial differential equations, Journal of computational physics 375598
(2018) 1339–1364.599
[7] Z. Zhang, Y. Wang, P. K. Jimack, H. Wang, Meshingnet: A new600
mesh generation method based on deep learning, arXiv preprint601
arXiv:2004.07016 (2020).602
[8] J. Chan, Z. Wang, A. Modave, J.-F. Remacle, T. Warburton, Gpu-603
accelerated discontinuous galerkin methods on hybrid meshes, Journal604
of Computational Physics 318 (2016) 142–168.605
[9] H. Si, Tetgen, a delaunay-based quality tetrahedral mesh generator,606
ACM Transactions on Mathematical Software (TOMS) 41 (2015) 11.607
[10] C. Geuzaine, J.-F. Remacle, Gmsh: A 3-d finite element mesh generator608
with built-in pre-and post-processing facilities, International journal for609
numerical methods in engineering 79 (2009) 1309–1331.610
[11] G. Strang, G. J. Fix, An analysis of the finite element method (1973).611
[12] W. D¨orfler, A convergent adaptive algorithm for poisson’s equation,612
SIAM Journal on Numerical Analysis 33 (1996) 1106–1124.613
[13] T. Apel, O. Benedix, D. Sirch, B. Vexler, A priori mesh grading for an614
elliptic problem with dirac right-hand side, SIAM journal on numerical615
analysis 49 (2011) 992–1005.616
[14] M. Ainsworth, J. T. Oden, A posteriori error estimation in finite element617
analysis, volume 37, John Wiley & Sons, 2011.618
[15] R. E. Bank, A. Weiser, Some a posteriori error estimators for elliptic619
partial differential equations, Mathematics of computation 44 (1985)620
[16] O. C. Zienkiewicz, J. Z. Zhu, A simple error estimator and adaptive622
procedure for practical engineerng analysis, International journal for623
numerical methods in engineering 24 (1987) 337–357.624
[17] O. C. Zienkiewicz, J. Z. Zhu, The superconvergent patch recovery and a625
posteriori error estimates. part 1: The recovery technique, International626
Journal for Numerical Methods in Engineering 33 (1992) 1331–1364.627
[18] W. Speares, M. Berzins, A 3d unstructured mesh adaptation algorithm628
for time-dependent shock-dominated problems, International Journal629
for Numerical Methods in Fluids 25 (1997) 81–104.630
[19] L. Bottou, Large-scale machine learning with stochastic gradient de-631
scent, in: Proceedings of COMPSTAT’2010, Springer, 2010, pp. 177–632
[20] A. Krizhevsky, I. Sutskever, G. E. Hinton, Imagenet classification with634
deep convolutional neural networks, in: Advances in neural information635
processing systems, pp. 1097–1105.636
[21] T. Q. Chen, Y. Rubanova, J. Bettencourt, D. K. Duvenaud, Neural637
ordinary differential equations, in: Advances in neural information pro-638
cessing systems, pp. 6571–6583.639
[22] Z. Long, Y. Lu, X. Ma, B. Dong, Pde-net: Learning pdes from data,640
arXiv preprint arXiv:1710.09668 (2017).641
[23] J. Han, A. Jentzen, E. Weinan, Solving high-dimensional partial dif-642
ferential equations using deep learning, Proceedings of the National643
Academy of Sciences 115 (2018) 8505–8510.644
[24] K. Hormann, M. S. Floater, Mean value coordinates for arbitrary planar645
polygons, ACM Transactions on Graphics (TOG) 25 (2006) 1424–1441.646
[25] M. S. Floater, Mean value coordinates, Computer aided geometric647
design 20 (2003) 19–27.648
[26] M. S. Floater, G. K´os, M. Reimers, Mean value coordinates in 3d,649
Computer Aided Geometric Design 22 (2005) 623–631.650
[27] S. L. Brunton, B. R. Noack, P. Koumoutsakos, Machine learning for651
fluid mechanics, Annual Review of Fluid Mechanics 52 (2020) 477–508.652
[28] W. Tang, T. Shan, X. Dang, M. Li, F. Yang, S. Xu, J. Wu, Study on653
a poisson’s equation solver based on deep learning technique, in: 2017654
IEEE Electrical Design of Advanced Packaging and Systems Symposium655
(EDAPS), IEEE, pp. 1–3.656
[29] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural657
networks: A deep learning framework for solving forward and inverse658
problems involving nonlinear partial differential equations, Journal of659
Computational Physics 378 (2019) 686–707.660
[30] L. Sun, H. Gao, S. Pan, J.-X. Wang, Surrogate modeling for fluid flows661
based on physics-constrained deep learning without simulation data,662
Computer Methods in Applied Mechanics and Engineering 361 (2020)663
[31] S. Iqbal, G. F. Carey, Neural nets for mesh assessment, Technical Re-665
port, TEXAS UNIV AT AUSTIN, 2005.666
[32] X. Chen, J. Liu, Y. Pang, J. Chen, L. Chi, C. Gong, Developing a new667
mesh quality evaluation method based on convolutional neural network,668
Engineering Applications of Computational Fluid Mechanics 14 (2020)669
[33] A. Bahreininejad, B. Topping, A. Khan, Finite element mesh partition-671
ing using neural networks, Advances in Engineering Software 27 (1996)672
[34] Y. Feng, Y. Feng, H. You, X. Zhao, Y. Gao, Meshnet: Mesh neural674
network for 3d shape representation, in: Proceedings of the AAAI Con-675
ference on Artificial Intelligence, volume 33, pp. 8279–8286.676
[35] W. Yifan, N. Aigerman, V. G. Kim, S. Chaudhuri, O. Sorkine-Hornung,677
Neural cages for detail-preserving 3d deformations, in: Proceedings of678
the IEEE/CVF Conference on Computer Vision and Pattern Recogni-679
tion, pp. 75–83.680
[36] L. Manevitz, M. Yousef, D. Givoli, Finite–element mesh generation681
using self–organizing neural networks, Computer-Aided Civil and In-682
frastructure Engineering 12 (1997) 233–250.683
[37] J. Bohn, M. Feischl, Recurrent neural networks as optimal mesh refine-684
ment strategies, arXiv preprint arXiv:1909.04275 (2019).685
[38] B. Dolˇsak, A. Jezernik, I. Bratko, A knowledge base for finite element686
mesh design, Artificial intelligence in engineering 9 (1994) 19–27.687
[39] L. Manevitz, A. Bitar, D. Givoli, Neural network time series forecasting688
of finite-element mesh adaptation, Neurocomputing 63 (2005) 447–463.689
[40] R. Chedid, N. Najjar, Automatic finite-element mesh generation us-690
ing artificial neural networks-part i: Prediction of mesh density, IEEE691
Transactions on Magnetics 32 (1996) 5173–5178.692
[41] D. Dyck, D. Lowther, S. McFee, Determining an approximate finite ele-693
ment mesh density using neural network techniques, IEEE transactions694
on magnetics 28 (1992) 1767–1770.695
[42] F. Hecht, New development in freefem++, J. Numer. Math. 20 (2012)696
[43] F. Chollet, et al., Keras,, 2015.698
[44] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S.699
Corrado, A. Davis, J. Dean, M. Devin, et al., Tensorflow: Large-scale700
machine learning on heterogeneous distributed systems, arXiv preprint701
arXiv:1603.04467 (2016).702
[45] V. Nair, G. E. Hinton, Rectified linear units improve restricted boltz-703
mann machines, in: ICML.704
[46] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization,705
arXiv preprint arXiv:1412.6980 (2014).706
Figure 10: For the hex-bolt experiment, ground truth meshes (top) and NN meshes (bottom).
The left and right are two problems that differ only in their geometries.
Figure 11: For the irregular polyhedron, FE energies of neural network (NN) generated
meshes versus uniform mesh FE energies and ground truth (GT) energies. The height of
each bar represents the proportion of experiment results in the energy range shown on the
x-axis (as a percentage of the ground truth energy).
Figure 12: A ground truth mesh (a, c and e) and the corresponding NN mesh (b, d and f)
selected from the 500 testing problems, shown in front (a and b), right (c and d) and bottom
(e and f) views.
Figure 13: Training and validation loss of the four experiments.