MeshingNet3D: Efficient Generation of Adapted Tetrahedral Meshes for Computational Mechanics

Zheyan Zhang, Peter K. Jimack, He Wang

School of Computing, University of Leeds, UK

Abstract

We describe a new algorithm for the generation of high quality tetrahedral meshes using artificial neural networks. The goal is to generate close-to-optimal meshes in the sense that the error in the computed finite element (FE) solution (for a target system of partial differential equations (PDEs)) is as small as it could be for a prescribed number of nodes or elements in the mesh. In this paper we illustrate and investigate our proposed approach by considering the equations of linear elasticity, solved on a variety of three-dimensional geometries. This class of PDE is selected due to its equivalence to an energy minimization problem, which therefore allows a quantitative measure of the relative accuracy of different meshes (by comparing the energy associated with the respective FE solutions on these meshes). Once the algorithm has been introduced it is evaluated on a variety of test problems, each with its own distinctive features and geometric constraints, in order to demonstrate its effectiveness and computational efficiency.

Keywords: Optimal mesh generation, Finite element methods, Machine learning, Artificial neural networks

1. Introduction

The finite element method (FEM) is one of the most widely used approaches for solving systems of partial differential equations (PDEs), which arise across multiple applications in computational mechanics [1, 2]. The key feature in determining the efficiency of the FEM on any given problem is the quality of the mesh: in general terms, the finer the mesh the better the solution but the greater the computational cost of obtaining it. This trade-off has led to a vast body of research, over decades, into the generation of high-quality FE meshes. Typically, the objective is either to generate a mesh for which the corresponding FE solution has a prescribed accuracy using a minimal number of degrees of freedom (e.g. [3]), or to generate the best possible mesh for a predetermined number of degrees of freedom (e.g. [4]). In this paper we focus primarily on the latter; however, the two approaches are very closely related.

Preprint submitted to Advances in Engineering Software, May 11, 2021

Interest in the use of data-driven methods to obtain solutions of PDEs has grown significantly in recent years, largely due to the increase in computing power that supports the application of deep neural networks (NNs) [5, 6]. In this work, however, we do not aim to apply NNs to estimate PDE solutions directly: instead we consider their use to estimate optimal meshes on which to compute traditional FE approximations. The rationale for this is our hypothesis that, for a given approximation error, a larger representation error can be tolerated in a NN that estimates the FE mesh than in a NN that estimates a family of PDE solutions directly. We present a universal deep-learning-based mesh generation system, MeshingNet3D, that extends our initial 2D ideas [7] by building upon classical a posteriori error estimation techniques and adopting a new local coordinate system. Consequently, MeshingNet3D is able to guide non-uniform mesh generation for a wide range of PDE systems with rich variations of geometries, boundary conditions and PDE parameters.

In the remainder of this section we provide brief overviews of classical non-uniform mesh generation methods, artificial neural networks and mean value coordinates (which are core to the generality of our algorithm). Key areas of related research are also highlighted. Section 2 then describes our methodology in full, whilst Section 3 provides detailed validation and testing. The paper concludes with a discussion of our findings and of the outlook for further developments.

1.1. Non-uniform mesh generation

When applying the FEM to approximate the solution of a computational mechanics problem, it is necessary to define both the type of elements and the computational mesh upon which the approximation is sought. The simplest elements are piecewise linear functions on simplexes (triangles in 2D and tetrahedra in 3D); however, other choices are widely used. In 3D, these include higher order Lagrange elements (also defined on tetrahedra), trilinear and triquadratic elements (defined on hexahedra) and more general elements associated with discontinuous Galerkin methods, which may be applied on hybrid meshes [8]. In this paper we restrict our consideration to unstructured tetrahedral meshes [9, 10]. Structured meshes of hexahedra do have some advantages, such as requiring less memory; however, they are less flexible when considering complex geometries or when targeting highly non-uniform meshes with optimal approximation properties, which is the goal of this work.

When the volumes of the elements in a given mesh are approximately equal, and the aspect ratio of each tetrahedron is bounded by a small constant, we refer to the mesh as being uniform. Theoretical results about the asymptotic convergence of the FEM typically hold for sequences of finer and finer uniform meshes [11]. For many problems, however, such meshes are not the best choice, since the error in the corresponding FE solution may be much greater on some elements than on others. In such cases it would usually be far better to have more elements in the “high error” regions and fewer elements in the “low error” regions. The resulting mesh may have the same number of elements in total (with a non-uniform size distribution) but permit a much more accurate FE representation of the true solution. Ideally, we would like to identify an element size distribution which ensures that a prescribed global error tolerance can be obtained with the fewest possible elements [12]. In practice this is attempted through the use of prior knowledge to control the mesh size distribution (e.g. geometrical information or a priori analysis [13]), or through an iterative process based around a posteriori error estimates for intermediate solutions [3].

This iterative approach to mesh generation consists of three steps: (i) compute an FE solution on a coarse mesh; (ii) estimate the error locally throughout this solution; (iii) adapt the mesh based upon this estimate. At the next iteration these steps are repeated, beginning with the mesh produced in (iii).
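
The three-step loop can be sketched on a deliberately simple 1D stand-in. This is an illustration only and not the FE workflow used in this paper: the "solution" is replaced by piecewise linear interpolation of a known function, the local error indicator is the interpolation error at each element midpoint, and adaptation is bisection of the worst elements. The function names and the 50% marking threshold are assumptions made for the example.

```python
import math


def local_errors(f, nodes):
    """Step (ii): midpoint interpolation error on each element (toy indicator)."""
    errors = []
    for a, b in zip(nodes[:-1], nodes[1:]):
        mid = 0.5 * (a + b)
        interpolant = 0.5 * (f(a) + f(b))  # linear interpolant at the midpoint
        errors.append(abs(f(mid) - interpolant))
    return errors


def adapt(nodes, errors, fraction=0.5):
    """Step (iii): bisect every element whose error exceeds fraction * max error."""
    threshold = fraction * max(errors)
    new_nodes = [nodes[0]]
    for (a, b), err in zip(zip(nodes[:-1], nodes[1:]), errors):
        if err > threshold:
            new_nodes.append(0.5 * (a + b))
        new_nodes.append(b)
    return new_nodes


def iterate_mesh(f, nodes, passes):
    """Repeat the loop, starting each pass from the mesh of the previous pass."""
    for _ in range(passes):
        errors = local_errors(f, nodes)  # steps (i) and (ii) merged in this toy
        if max(errors) == 0.0:
            break
        nodes = adapt(nodes, errors)
    return nodes


coarse = [i / 4 for i in range(5)]  # uniform mesh on [0, 1]
adapted = iterate_mesh(math.sqrt, coarse, 5)
sizes = [b - a for a, b in zip(adapted[:-1], adapted[1:])]
print(len(adapted), min(sizes), max(sizes))
```

For the square-root function the refinement concentrates at the origin, where the derivative is unbounded, which is the qualitative behaviour an error-driven loop is meant to produce. Each pass needs a fresh evaluation of the indicator, which is the cost the approach in this paper seeks to avoid.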

There is a large body of work on the development of cheap and reliable a posteriori error estimators. Popular approaches include those which involve solving a set of local problems on each element, or on small patches of elements, to directly estimate the error function [14, 15], and those based upon the recovery of derivatives of the solution field by sampling at particular points and then interpolating with a higher degree polynomial [16, 17]. For example, in the context of linear elasticity problems, the elasticity energy density of a computed solution is evaluated on each element and the recovered energy density value at each vertex is defined to be the average of the values on its adjacent elements. The local stress error is then proportional to the difference between the recovered piecewise linear energy density and the original piecewise constant values [16]. Considering its wide application in engineering practice, this “ZZ error estimate” has been used as the baseline in our work, to generate comparative data (and meshes) from which MeshingNet3D will be trained, and against which it will be evaluated.
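
The recovery idea can be illustrated on a 1D mesh. The following is a minimal sketch under simplifying assumptions, not the estimator of [16] as it is implemented in any particular solver: each element carries one piecewise constant value (standing in for the energy density), each vertex receives the plain average of the values on its adjacent elements, and the indicator on an element is the largest gap between the recovered linear field and the element's own constant value. All function names are hypothetical.

```python
def recover_vertex_values(element_values):
    """Average the piecewise constant values of the elements touching each vertex."""
    n = len(element_values)
    recovered = []
    for v in range(n + 1):
        adjacent = []
        if v > 0:
            adjacent.append(element_values[v - 1])  # element to the left
        if v < n:
            adjacent.append(element_values[v])      # element to the right
        recovered.append(sum(adjacent) / len(adjacent))
    return recovered


def recovery_indicator(element_values):
    """Per-element gap between the recovered linear field and the constant value."""
    recovered = recover_vertex_values(element_values)
    indicators = []
    for e, value in enumerate(element_values):
        # A linear function attains its extreme deviation at an end point.
        gap = max(abs(recovered[e] - value), abs(recovered[e + 1] - value))
        indicators.append(gap)
    return indicators


density = [1.0, 1.0, 1.0, 4.0, 9.0]
print(recover_vertex_values(density))
print(recovery_indicator(density))
```

Where the piecewise constant field is flat the indicator vanishes, and it grows where neighbouring elements disagree, which is where a finer mesh is needed.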

The third step in the iterative approach to mesh generation is to adapt the existing coarse mesh based upon the estimated local error distribution. This may be achieved through the creation of an entirely new mesh, with target element sizes guided by the local error estimate [9], or via local adaptivity. In the latter case the mesh nodes can be moved locally (r-refinement) [4] and/or the mesh can be locally refined/coarsened (h-refinement) [18]. No matter the type of refinement, the iterative process will generally require multiple passes to obtain a high quality final mesh. This therefore becomes a time-consuming pre-processing step, which we seek to avoid in this work.

1.2. Deep neural networks

Artificial neural networks (ANNs) are used to approximate mappings between specified inputs and outputs. They achieve this through a composition that is loosely based upon the neurons in a biological brain: there are a number of layers of nodes which are connected in a predetermined manner, and each node combines inputs received from the previous layer to generate an output that is passed to the next layer (with the first layer representing the input vector and the last layer the output vector). A number of free parameters are associated with each node, defining the action of that node, and these are prescribed based upon the minimization of a chosen training loss function. This learning problem is therefore equivalent to a nonlinear, multivariate optimization. Furthermore, since the loss function is designed to be differentiable with respect to the network parameters, an ANN can be trained using gradient-based methods such as stochastic gradient descent [19].

In recent years, with the developments in parallel hardware, so-called deep neural networks (DNNs), with many layers and very large numbers of parameters, have proved to be remarkably effective at high-level tasks such as object recognition [20]. Within computational science, DNNs have also been explored to solve ordinary differential equations (ODEs) and PDEs in both supervised [21, 22] and unsupervised [23] settings. In the latter case the network parameters are determined by a residual minimization, rather than by using a labelled training data set as in the supervised case. In each approach, however, whilst the results are very promising, it is difficult to obtain high accuracy in the solutions and it is currently not possible to provide any guarantees on the accuracy. Consequently, rather than solving PDEs directly, our focus in what follows is to use DNNs to provide an estimate of the optimal finite element mesh, with the goal of obtaining the most efficient possible finite element solution.

1.3. Mean value coordinates

An important feature of our algorithm is the use of mean value coordinates (MVCs). These are a generalization of the barycentric coordinate system for simplexes [24, 25] to polygons in 2D and polyhedra with triangular faces in 3D [26], whereby the coordinates of any point within the polygon/polyhedron may be expressed as a convex combination of the positions of the boundary vertices. Consequently, all interior points in the neighbourhood of an arbitrary boundary vertex have a high value of the corresponding MVC component. MVCs also have a number of properties that make them attractive choices as input parameters to a DNN, for example their local smoothness with respect to spatial variations, as well as being both scale and rotationally invariant.
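
To make the definition concrete, the following is a minimal sketch of mean value coordinates in the 2D polygon case, using the standard tangent-of-half-angle weights. It is an illustration rather than the implementation used in this work: the function name is hypothetical, the point is assumed to lie strictly inside the polygon, and the 3D polyhedral case is not covered.

```python
import math


def mean_value_coordinates(point, polygon):
    """2D mean value coordinates of a point strictly inside a polygon.

    polygon is a list of (x, y) vertices in order; the weights follow
    w_i = (tan(a_{i-1} / 2) + tan(a_i / 2)) / |v_i - point|,
    where a_i is the angle at the point between v_i and v_{i+1}.
    """
    n = len(polygon)
    px, py = point
    offsets = [(vx - px, vy - py) for vx, vy in polygon]
    radii = [math.hypot(dx, dy) for dx, dy in offsets]

    half_tans = []
    for i in range(n):
        ax, ay = offsets[i]
        bx, by = offsets[(i + 1) % n]
        cross = ax * by - ay * bx
        dot = ax * bx + ay * by
        angle = math.atan2(cross, dot)  # signed angle from v_i to v_{i+1}
        half_tans.append(math.tan(0.5 * angle))

    # half_tans[i - 1] wraps around to the last entry when i == 0.
    weights = [(half_tans[i - 1] + half_tans[i]) / radii[i] for i in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(mean_value_coordinates((0.5, 0.5), square))
print(mean_value_coordinates((0.01, 0.01), square))
```

The coordinates sum to one and reproduce the point as a weighted combination of the vertices; a point close to a vertex is dominated by that vertex's component, as described above.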

1.4. Related work

As noted above, recent developments in DNNs have led to renewed interest in the application of machine learning (ML) to the direct solution of PDEs and PDE systems [27]. The majority of this research is based upon supervised learning strategies, such as [28], which require the use of a conventional solver to generate training data. Once trained, the NN is able to solve problems of the same type much more quickly than the original solver. Recently there has also been a growth in interest in the development and application of unsupervised learning methods, which act as independent PDE solvers without the need to refer to external supervisory information. These have been investigated particularly in the context of physics-informed algorithms [29, 30] or those targeting high-dimensional PDEs [5, 6]. However, the issue still remains that, whether solving computational mechanics problems directly via supervised or unsupervised learning, current capabilities do not provide any a priori guarantees of accuracy. Indeed, even when such a solution has been produced, it is not generally possible to estimate how accurate it is.

Some previous authors have also considered the application of ML to mesh-related problems, sharing our aim of enhancing traditional FE solvers rather than replacing them completely. Examples include mesh quality assessment [31, 32] and mesh partitioning algorithms, such as [33], to complement parallel distributed solvers. Research into mesh generation using ML has also been undertaken, both for pure shape representation [34, 35] and as the basis for an efficient finite element solver [36]. This latter approach is based upon a self-organizing network but is restricted to fitting (and optimizing) a mesh of a fixed topology to a prescribed geometry. In [37] a recurrent network is used to enhance the traditional iterative approach to mesh generation through the use of ML to control the mesh adaptivity step. The authors are able to show results that match the quality of iteratively refined meshes using conventional error estimates and refinement strategies. Whilst these works each replace some aspects of the conventional mesh generation step with a NN, none of them addresses the specific problem tackled in this paper, where we seek to generate a single, non-uniform, tetrahedral mesh that provides a pseudo-optimal finite element representation of the solution of an unseen problem.

In this context of using ML to guide high-quality non-uniform mesh generation there is relatively little prior research. In [38], for example, early knowledge-based approaches were considered, though with limited success. Time-dependent remeshing is studied in [39], where an NN is used to undertake time-series predictions that identify areas of greater (and lesser) refinement at different times, though on a domain with a simple geometry. In [40, 41] NNs were applied successfully to generate high quality finite element meshes for elliptic PDEs; however, the input vectors are highly problem-dependent, requiring specific a priori knowledge of the geometries being considered. The challenge of using DNNs to generate pseudo-optimal FE meshes on quite general geometries was first considered in [7] for selected two-dimensional PDEs. This paper extends these ideas to problems in three dimensions, to consider PDE systems with rich variation in geometry, boundary conditions and material properties.

2. Methodology

The goal of this research is to develop a robust, and widely applicable, mesh generation procedure for the efficient FE solution of systems of elliptic PDEs. Our particular emphasis here is on the equations of linear elasticity; however, the approach described in this section may equally be applied to any family of problems for which a reliable a posteriori local error estimator is available to support the training phase for our neural network. In the first subsection we provide an overview of our methodology, with further details on the software design, training data generation and the training of the deep network given in the following subsections.

2.1. Theory

We seek to automatically generate high quality FE meshes for arbitrary instances within a given family of PDE problems, where each instance is defined by the domain geometry G (from a predefined family of possible geometries), the PDE parameters M, and the applied boundary conditions B. For any given mesh, the corresponding FE solution is assumed to be a unique solution, for which we have available a means of determining the local error. This computed a posteriori error is also assumed to be unique, and provides a mechanism for determining a desired FE element size for each location within the domain. Consequently, in order to generate a pseudo-optimal FE mesh we seek to estimate a mapping F that represents an ideal spatial distribution of the FE element sizes:

$$F : X \to S \tag{1}$$

Here, X is the specified location in the domain and S is the target element size (for example average edge length) at X. Noting that we define each instance by its specific geometry, parameter values and boundary conditions, we may express this mapping more precisely as:

$$F : X \to S(G, B, M; X) \tag{2}$$

Our goal is to make use of offline training to create a neural network that is able to learn the mapping

$$F : G, B, M, X \to S \tag{3}$$

After training, the NN is able to predict a pseudo-optimal mesh-size distribution for unseen problems. Specifically, given G, B, M for any problem, and an arbitrary sample point X, the NN outputs a target element size at that sample point. This is precisely the information required by a 3D mesh generator in order to generate a non-uniform, unstructured finite element mesh.


2.2. Software and evaluation

In this paper we use TetGen for tetrahedral mesh generation [9], and FreeFem++ to assemble and solve the corresponding global FE systems [42]. The input to TetGen includes a .poly file containing vertices and edges of polygons that define the boundary of the computational domain. From this file TetGen is able to generate a uniform mesh based upon a single parameter, indicating a constant target element size. To generate a non-uniform mesh, TetGen reads a background mesh from .b.ele, .b.node and .b.face files, and an element size list file, .b.mtr, that defines target element sizes corresponding to the vertices of the background mesh. Having defined a valid mesh, FreeFem++ is able to solve user-defined variational problems. To do this it executes a .edp script file containing information such as: how to import the mesh; what type of finite element to use; what the specific variational form is; and which solver to apply. In this paper all examples are based upon the use of linear tetrahedral elements (as generated by TetGen) and the Lamé solver (for the equations of linear elasticity).

To mass-produce training problems a simple script has been written that allows an appropriate .poly file to be generated for a given geometry G. Then, for each geometry, this calls FreeFem++ to obtain linear elasticity solutions for a range of material parameters (M) and boundary conditions (B). Note that FreeFem++ not only solves the elasticity equations but also computes the total stored energy, which may be used to evaluate the quality of a given FE solution. This is because the underlying PDE system corresponds to an energy minimization problem, so the analytic solution minimizes the energy functional over all functions from the appropriate Sobolev space ($(H^1_E)^3$ in this case). On the other hand, the FE solution minimizes the energy over all functions in the space of piecewise linear functions on the given tetrahedral mesh. Since this is a subspace of $(H^1_E)^3$, the energy corresponding to the FE solution can never be lower than the energy of the analytic solution. Therefore, the lower the energy associated with the computed FE solution the better the solution. Consequently, the quality of any given set of tetrahedral meshes, for a particular problem, may be ranked based upon the computed energy corresponding to the finite element solution on each mesh: the lower the energy the better the mesh. We will use this observation as part of the evaluation of our approach.
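
The ranking argument can be written out explicitly. The following is the standard derivation for a symmetric, coercive bilinear form, taking homogeneous essential boundary conditions for simplicity; the symbols $a(\cdot,\cdot)$, $\ell(\cdot)$, $J$, $V$ and $V_h$ are notation introduced here for illustration only.

```latex
\begin{align*}
J(v) &= \tfrac{1}{2}\, a(v, v) - \ell(v), \qquad
u = \arg\min_{v \in V} J(v), \qquad
u_h = \arg\min_{v_h \in V_h} J(v_h), \qquad V_h \subset V, \\
a(u, v) &= \ell(v) \quad \text{for all } v \in V
  \quad\Longrightarrow\quad \ell(u_h) = a(u, u_h), \;\; \ell(u) = a(u, u), \\
J(u_h) - J(u) &= \tfrac{1}{2}\, a(u_h, u_h) - a(u, u_h) + \tfrac{1}{2}\, a(u, u)
  = \tfrac{1}{2}\, a(u_h - u,\, u_h - u) \;\ge\; 0 .
\end{align*}
```

The energy gap is therefore half the squared energy norm of the error, so comparing the energies of FE solutions on two meshes for the same problem compares their errors in the energy norm.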


Figure 1: Illustration of the training data for MeshingNet3D: each individual problem is defined by the geometry G, the PDE parameters M and the boundary condition parameters B (not shown here). However, for each such problem there are multiple sample points, X, in the domain, with the corresponding local mesh size S specified.

2.3. Data generation

Training data is required in order to sample the mapping of equation (2). Each training problem is defined by parameters that uniquely define the geometry (G), PDE parameters (M) and boundary conditions (B). For each such problem, multiple training data are generated by specifying numerous points, X, at which the target mesh size is given. This is illustrated in Figure 1, for which there are 3000 test problems, each of which generates multiple inputs, corresponding to different points X in the domain. The precise number of points X is problem-dependent, but it should be sufficient to represent the spatial mesh size variation throughout the domain (too many points will not decrease the training performance but will slow down the data generation). For each input the generated output, used to train the NN, is the target mesh size, S, for that point and that problem.

The value of S is computed using a variation of the iterative approach to mesh generation, based upon a posteriori error estimation, described in Subsection 1.1. For each problem we generate a relatively coarse uniform mesh and compute the corresponding FE solution and error estimate. In this work we use the “ZZ” energy estimate of [16]. However, for different problems or different quantities of interest, other choices are possible. For each sample point, X, the estimated local error, E(X), can be converted to a target element size using an inverse relationship such as

$$S(X) = \frac{K}{E(X)}, \tag{4}$$

for some scaling coefficient K. This is the value of S(X) used to define (2), as illustrated in Figure 1. The effect of the scaling coefficient is to control the total number of elements in the non-uniform mesh that is generated based upon the target local size distribution S(X). Hence, for each test problem, K may be adjusted iteratively in order to obtain a target number of elements in the non-uniform mesh (or a target total error in the FE solution).
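
The role of K can be sketched as follows. In practice the element count comes from the mesh generator itself; here it is replaced by a crude proxy, which is an assumption made purely for illustration: each background cell of known volume is imagined to be filled with regular tetrahedra whose edge length equals the local target size. All function names are hypothetical, and the error values are assumed to be strictly positive.

```python
import math

# Volume of a regular tetrahedron with unit edge length: 1 / (6 * sqrt(2)).
UNIT_TET_VOLUME = 1.0 / (6.0 * math.sqrt(2.0))


def target_sizes(errors, k):
    """Equation (4): S(X) = K / E(X) at each sample point."""
    return [k / e for e in errors]


def estimated_element_count(sizes, volumes):
    """Proxy count: each background cell is filled with regular tetrahedra of edge S."""
    return sum(v / (UNIT_TET_VOLUME * s ** 3) for s, v in zip(sizes, volumes))


def calibrate_k(errors, volumes, target_count, k=1.0, tol=1e-3, max_iter=20):
    """Adjust K until the proxy element count is within tol of the target."""
    for _ in range(max_iter):
        count = estimated_element_count(target_sizes(errors, k), volumes)
        if abs(count - target_count) <= tol * target_count:
            break
        # Sizes scale with K, so the count scales with K**-3.
        k *= (count / target_count) ** (1.0 / 3.0)
    return k


errors = [0.5, 1.0, 2.0, 4.0]
volumes = [1.0, 1.0, 1.0, 1.0]
k = calibrate_k(errors, volumes, target_count=1000.0)
print(k, target_sizes(errors, k))
```

With this proxy a single correction is enough, because the count depends on K through an exact power law. With a real mesh generator the relationship is only approximate, which is why the adjustment is described as iterative.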

Note that the precise definition of the input vector in Figure 1 has to be problem dependent: a parameterization of the family of domains is required to define G; the number of free parameters in the PDE systems has to be predetermined; and the possible boundary conditions must also be parameterized. For each example shown in the Computational Experiments section of this paper a different input vector has therefore been prescribed. Nevertheless, the methodology described here is shown to work robustly in all settings.

The final component required for the data generation is the algorithm to select the sample points X for each of the training problems. This is achieved via two steps: first an initial non-uniform mesh with a predefined target number of elements is generated (e.g. by TetGen) based upon the a posteriori error computed on the coarse uniform mesh; then we sample a fixed percentage of the elements of this mesh (we find that 10% is adequate), choosing each X to be the MVCs of the centroid of a sampled element. Note that the advantage of sampling from the non-uniform, rather than the uniform, mesh is that the training data is weighted based upon the error distribution: our experiments show this to be advantageous.
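
The second step can be sketched in a few lines. This is an illustration under stated assumptions: the mesh is represented as a plain list of node coordinates plus a list of four-node index tuples, the function names are hypothetical, and the subsequent conversion of each centroid to MVCs is not shown.

```python
import random


def centroid(vertices):
    """Centroid of a tetrahedron given its four (x, y, z) vertices."""
    return tuple(sum(v[axis] for v in vertices) / 4.0 for axis in range(3))


def sample_centroids(nodes, tetrahedra, fraction=0.1, seed=0):
    """Pick a fixed fraction of elements at random and return their centroids."""
    rng = random.Random(seed)
    count = max(1, round(fraction * len(tetrahedra)))
    chosen = rng.sample(range(len(tetrahedra)), count)
    return [centroid([nodes[i] for i in tetrahedra[e]]) for e in chosen]


# Toy data: 20 disconnected unit tetrahedra laid out along the x-axis.
nodes = [pt for e in range(20)
         for pt in [(e, 0.0, 0.0), (e + 1.0, 0.0, 0.0), (e, 1.0, 0.0), (e, 0.0, 1.0)]]
tetrahedra = [(4 * e, 4 * e + 1, 4 * e + 2, 4 * e + 3) for e in range(20)]
samples = sample_centroids(nodes, tetrahedra)
print(samples)
```

Because elements are drawn uniformly at random, regions where the non-uniform mesh is fine contribute more sample points, which is the error weighting described above.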

2.4. Training and using the neural network

The deep learning platform that we use in this work is Keras [43] based on TensorFlow [44]. Our networks are fully connected, typically with six hidden layers, though we find that our results are not especially sensitive to the number of layers or the precise number of neurons per layer. We do observe, however, that it is advantageous to first increase and then decrease the number of neurons per layer as we pass forward through the network. The activation functions selected in this model are rectified linear functions [45] for the hidden units, with linear activation in the output layer. Before training, the input data is linearly normalised and 10% is selected for validation (monitoring the validation loss during training can help to identify and prevent over-fitting). The training itself uses mean square error loss and the stochastic gradient descent optimiser, Adam [46], with batch sizes of 128: for each of the examples considered in this paper this takes no longer than 3 hours on an NVIDIA RTX 2070 graphics card.
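
The building blocks named in this paragraph can be illustrated without any deep learning library. The sketch below is plain Python rather than the Keras model used in this work: it shows a fully connected forward pass with rectified linear hidden units and a linear output, a mean square error, and min-max scaling as one possible reading of "linearly normalised" (an assumption). The tiny weights are hypothetical values chosen so the arithmetic is easy to follow.

```python
def relu(values):
    """Rectified linear activation applied to each entry."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: weights[j] holds the input weights of neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(inputs, layers):
    """ReLU on every hidden layer, linear activation on the final layer."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return dense(activations, weights, biases)


def min_max_normalise(column):
    """Linearly rescale one input feature to the range [0, 1]."""
    low, high = min(column), max(column)
    return [(v - low) / (high - low) for v in column]


def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Hypothetical network: 2 inputs, one hidden layer of 2 neurons, 1 linear output.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.5]),
]
print(forward([2.0, 0.5], layers))
```

Training adjusts the weights and biases to reduce the mean square error over batches of input-output pairs; that optimisation step is left to the library and is not reproduced here.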

Once trained, the NN can be used to guide mesh generation for unseen problems in real time. Given a new problem, defined by G, M and B, a uniform background mesh is generated based upon G alone. For each element in this background mesh we compute the MVC of its centre and concatenate this with the problem parameters to form an input vector for the NN. The corresponding output is the target element size at the centre of that element. The background mesh, with its associated target element size distribution, is then used to allow TetGen to generate the desired non-uniform mesh. If the total number of elements in this mesh is outside of the required range then each S(X) may be scaled linearly before generating an updated non-uniform mesh. In this way, an adapted tetrahedral mesh of a specified size is generated directly, without the need to compute a sequence of FE solutions and a posteriori error estimates, as would otherwise be the case.

3. Computational Experiments

We present four computational tests which allow us to analyse the performance of MeshingNet3D across a range of different problems, geometries, boundary conditions and PDE parameters. The first and the third cases involve prismatic geometries, which permit the description of spatial locations based upon “2.5D MVCs”. These are composed of regular 2D MVCs in the x-y planes plus an additional z-coordinate. The second case uses general 3D Cartesian coordinates, whilst the final example uses general polyhedral geometries and fully 3D MVCs.

For each of the examples we provide a brief description of the problem, followed by a discussion of the network topology used (including the specific input vector) and the training undertaken. We then present results based upon 500 unseen test problems. These results compare the FE solutions computed on the NN-guided mesh with those computed on a “ground truth” mesh of similar size, generated using the same ZZ a posteriori error estimator that was applied to train the network. We also compare against the FE solution computed on a uniform mesh with a similar number of elements. To facilitate these comparisons, for each of the 500 test problems, we compute the difference between the total energy of the FE solution on the NN-guided mesh and that of the FE solution on the comparison mesh. We then provide a histogram to illustrate the proportion of the test cases in different binned error ranges. A negative value of the difference indicates that the solution on the NN-based mesh has a lower energy and is therefore superior.

3.1. Clamped beam

We consider the problem of an over-hanging beam (under gravity), with different cross sections (G) and variable boundary conditions (B). In this case the material parameters (M) are not varied (the specific inputs to the Lamé solver in FreeFem++ being: density = 8000, Young’s modulus = $210 \times 10^9$ and Poisson’s ratio = 0.27).

3.1.1. Problem specification

The beam is a right prism with a convex quadrilateral cross section as illustrated in Figure 2. This cross section has vertices at $(x_0, y_0) = (0, 0)$ and $(x_1, y_1) = (0, 2)$, and also at $(x_2, y_2)$ and $(x_3, y_3)$, which are randomly sampled within $x_2 \in (1.5, 2.5)$, $y_2 \in (1.5, 2.5)$, $x_3 \in (-0.5, 0.5)$ and $y_3 \in (1.5, 2.5)$ for each problem. The length of the beam is fixed ($0 \le z \le 6$) and a boundary shear, with components $(f_x, f_y, 0)$, is applied at the face $z = 6$. The face $z = 0$ is clamped and the bottom face is clamped between $z = 0$ and $z = \zeta$, where $2 < \zeta < 4$ (randomly sampled for each problem). All other boundaries are free, subject to zero normal stress. Hence the input vector for this problem requires values for $x_2$, $y_2$, $x_3$, $y_3$, $\zeta$, $f_x$ and $f_y$, along with the MVCs of the point at which the mesh spacing is required. In these examples, the parameters $f_x$ and $f_y$ are constrained to lie in the range $(-10^6, 10^6)$.
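
The parameterization above can be sketched as a small sampler. The ranges are those stated in the text and the draws are uniform; the function names, the dictionary keys and the layout of the point coordinates (here assumed to be four 2D MVCs followed by a z-coordinate) are hypothetical choices made for illustration.

```python
import random


def sample_clamped_beam_problem(rng):
    """Draw one random problem from the parameter ranges given in the text."""
    return {
        "x2": rng.uniform(1.5, 2.5),
        "y2": rng.uniform(1.5, 2.5),
        "x3": rng.uniform(-0.5, 0.5),
        "y3": rng.uniform(1.5, 2.5),
        "zeta": rng.uniform(2.0, 4.0),
        "fx": rng.uniform(-1e6, 1e6),
        "fy": rng.uniform(-1e6, 1e6),
    }


def input_vector(problem, point_coordinates):
    """Concatenate the seven problem parameters with the coordinates of one point."""
    keys = ("x2", "y2", "x3", "y3", "zeta", "fx", "fy")
    return [problem[k] for k in keys] + list(point_coordinates)


problem = sample_clamped_beam_problem(random.Random(1))
vector = input_vector(problem, [0.25, 0.25, 0.25, 0.25, 3.0])
print(len(vector))
```

Every sample point of a given problem shares the same first seven entries, so only the trailing coordinates change within a problem.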

3.1.2. Network information

In this example our fully-connected network has six hidden layers with 32, 64, 128, 64, 32 and 8 neurons respectively. Training data is generated based upon solving 3000 individual problems, each of which is obtained using a random choice for each input parameter (selected uniformly from its range), leading to 10,740,746 individual input-output pairs. Of these, 10% are selected for validation and the remainder are used for training using a batch size of 128. The training takes 10 epochs, meaning that each item of data has been used an average of 10 times. Figure 13 shows the rates of convergence for the training, along with the corresponding validation curve.
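
The scale of this training run follows from simple arithmetic on the figures quoted above. In the sketch below the way the 10% split is rounded is an assumption, so the derived counts are indicative rather than exact.

```python
import math

total_pairs = 10_740_746   # input-output pairs reported for this example
problems = 3000            # training problems solved
batch_size = 128

validation_pairs = total_pairs // 10              # 10% held out (rounded down here)
training_pairs = total_pairs - validation_pairs
steps_per_epoch = math.ceil(training_pairs / batch_size)
pairs_per_problem = total_pairs / problems

print(validation_pairs, training_pairs, steps_per_epoch, round(pairs_per_problem))
```

Each problem therefore contributes a few thousand sample points on average, and one epoch corresponds to tens of thousands of gradient updates.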


Figure 2: The geometry and boundary conditions for the Clamped beam, with constant cross section along the z-axis. The gravity is uniformly distributed over the volume. The surfaces bounded by four vertices with blue triangles are clamped.

3.1.3. Results

Figure 3 demonstrates that the NN-guided meshes generally perform at least as well as the ground truth meshes (generated from explicitly-computed a posteriori error estimates) and, as expected, much better than uniform meshes. Two typical examples are shown in Figure 4, which compares NN-guided meshes (bottom) with their ground-truth counterparts (top). In each case the high mesh density near $y = 0$ and $z = \zeta$ is easily captured. More significantly, however, high and low mesh density regions are captured well throughout the domain, with a smooth variation between these regions.

3.2. Laminar material

In this example we consider a variation of the previous problem for which the material parameters (M) are now permitted to vary but the geometry (G) and the boundary conditions (B) are kept fixed.

3.2.1. Problem specification

A beam of dimensions 1 ×1×5 is composed of two horizontal layers, as394

illustrated in Figure 5. Each layer has a Young’s modulus (Etop and Ebot)395

between 109and 1011, and a Poisson’s ratio (νtop and νbot) between 0.05396

and 0.45. The densities of the two materials are both 8000 and the interface397

between the layers is at a height y=h∈(0.2,0.8). Half of the bottom surface398

13

Figure 3: For the Clamped beam, FE energies of neural network (NN) generated meshes

versus uniform mesh FE energies and ground truth (GT) energies. The height of each bar

represents the proportion of experiment results in the energy range shown on the x-axis

(as a percentage of the ground truth energy).

(y= 0, 0 < z < 2.5) is clamped, as is the surface z= 0. On the surface399

z= 5 a traction of amplitude 10000 is applied in the xdirection, with all400

other boundaries free to displace under zero normal-stress conditions. Hence401

the input vector for this problem requires values for Etop,Ebot,νtop,νbot

402

and h, along with the coordinates of the point at which the mesh spacing403

is required. We actually use log10 (Etop) and log10 (Ebot) as the ﬁrst two404

input parameters.405
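The input vector described above can be assembled as in the following sketch. The function name `laminar_input_vector` and the ordering of the entries after the two logarithms are assumptions made for illustration; only the use of $\log_{10}$ for the Young’s moduli is taken from the text.

```python
import math


def laminar_input_vector(e_top, e_bot, nu_top, nu_bot, h, point):
    """Assemble the eight network inputs for one target point.

    Young's moduli enter through their base-10 logarithms, as stated in the
    text; the ordering of the remaining entries is an assumption.
    """
    x, y, z = point
    return [math.log10(e_top), math.log10(e_bot), nu_top, nu_bot, h, x, y, z]


# Hypothetical sample: stiffer top layer, interface at y = 0.34.
vec = laminar_input_vector(1e10, 1e9, 0.34, 0.20, 0.34, (0.5, 0.5, 2.5))
```

Taking logarithms maps the two-decade range of each modulus onto the interval [9, 11], which is comparable in width to the ranges of the other inputs.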

3.2.2. Network information

In this example our fully-connected network has five hidden layers with 32, 64, 32, 16 and 8 neurons respectively. Training data is generated based upon solving 3000 individual problems, each of which is obtained using a random choice for each input parameter (selected uniformly from its range), leading to 19,719,750 individual input-output pairs. Of these, 10% are selected for validation and the remainder are used for training using a batch size of 128. The training takes 15 epochs, and Figure 13 shows the rates of convergence for this training, along with the corresponding validation curve.
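The data-set sizes implied by these figures can be worked out directly. The short calculation below is illustrative only; it assumes that the final, smaller batch of each epoch is kept rather than dropped.

```python
import math

total_pairs = 19_719_750  # input-output pairs reported in the text
n_problems = 3000         # FE problems solved to generate the data
batch_size = 128

validation_pairs = total_pairs * 10 // 100       # 10% held out for validation
training_pairs = total_pairs - validation_pairs  # remainder used for training
# Assumption: the last partial batch of each epoch is kept.
batches_per_epoch = math.ceil(training_pairs / batch_size)
average_pairs_per_problem = total_pairs / n_problems
```

Under these assumptions each solved problem contributes roughly 6,600 sample points on average, and one epoch consists of about 139,000 gradient updates.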

3.2.3. Results

Figure 6 demonstrates that, as in the previous example, the NN-guided meshes typically perform on a par with the ground truth meshes, and much better than uniform meshes. Two typical examples are shown in Figure 7: in cases (a) and (c)

$$(\log_{10}(E_{\mathrm{top}}), \log_{10}(E_{\mathrm{bot}}), \nu_{\mathrm{top}}, \nu_{\mathrm{bot}}, h) = (10.82, 9.17, 0.34, 0.20, 0.34),$$

and for (b) and (d)

$$(\log_{10}(E_{\mathrm{top}}), \log_{10}(E_{\mathrm{bot}}), \nu_{\mathrm{top}}, \nu_{\mathrm{bot}}, h) = (9.17, 10.33, 0.44, 0.21, 0.41).$$

In the first example the top layer has the higher Young’s modulus, which leads to a higher mesh density in this layer (for both the NN-guided and the ground-truth meshes). Conversely, in the second example the bottom material is stiffer than the top and we see a very different distribution of the element size. In each case there is a strong correlation between the NN-guided mesh and the ground-truth case.

Figure 4: For the Clamped beam, ground truth meshes (top) and NN-guided meshes (bottom) for two test cases.

Figure 5: The boundary conditions and loads of the laminar material, where the height of the interface is random.

Figure 6: For the Laminar material, FE energies of neural network (NN) generated meshes versus uniform mesh FE energies and ground truth (GT) energies. The height of each bar represents the proportion of experiment results in the energy range shown on the x-axis (as a percentage of the ground truth energy).

3.3. Hex-bolt with a hole

We consider the problem of a hex-bolt (under torque), with different cross sections (G). In this case the material parameters (M) are not varied (the specific inputs to the Lamé solver in FreeFem++ being: density = 8000, Young’s modulus = $210 \times 10^{9}$ and Poisson’s ratio = 0.27).

Figure 7: Two problems from the laminar material experiments: (a) and (c) show the first, (b) and (d) the second. (a) and (b) are ground truth meshes; (c) and (d) are non-uniform meshes guided by the neural network.

3.3.1. Problem specification

A regular hexagonal prism of height $h = 4$ contains an octagonal prismatic hole (Figure 8, left). On the cross section, the edge length of the regular hexagon is 4 and the octagon is coaxial with the hexagon. The eight vertices of the octagon lie on the same circle, whose radius $r$ varies in $(0.2, 1.0)$. The arc angles between vertices are random. Linearly distributed pressures are applied to create a torque on the top ($p = -10000x + 10000$) and bottom ($p = -10000x - 10000$) surfaces. The eight surfaces of the hole are clamped. The input vectors for this problem include the position of the octagon’s eight vertices and the MVCs of the target point expressed with respect to both the vertices of the outer hexagon and the inner octagon (combined with its $z$ coordinate, $z \in (-1.0, 0.0)$).
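One way to generate a random hole of the kind described above is sketched below. The sampling scheme (sorting uniformly drawn polar angles) and the names `sample_octagon`, `top_pressure` and `bottom_pressure` are assumptions made for illustration, since the text does not state how the arc angles are drawn; the two pressure expressions are those given in the text.

```python
import math
import random


def sample_octagon(rng, r_min=0.2, r_max=1.0, n_vertices=8):
    """Sample vertices on a circle of random radius with random arc angles.

    Assumption: arc angles come from sorting uniformly drawn polar angles;
    the sampling scheme actually used is not stated in the text.
    """
    r = rng.uniform(r_min, r_max)
    angles = sorted(rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_vertices))
    vertices = [(r * math.cos(a), r * math.sin(a)) for a in angles]
    return r, vertices


def top_pressure(x):
    """Linearly varying pressure on the top surface, as given in the text."""
    return -10000.0 * x + 10000.0


def bottom_pressure(x):
    """Linearly varying pressure on the bottom surface, as given in the text."""
    return -10000.0 * x - 10000.0


rng = random.Random(0)  # fixed seed so the sketch is reproducible
radius, octagon = sample_octagon(rng)
flat_inputs = [c for vertex in octagon for c in vertex]  # 16 numbers for the hole
```

A practical generator would probably also reject samples in which two vertices almost coincide, since these produce very thin clamped faces; that check is omitted here.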

3.3.2. Network information

In this example our fully-connected network has four hidden layers with 32, 64, 16 and 8 neurons respectively. Training data is generated based upon solving 3000 individual problems, each of which is obtained using a random choice for each input parameter (selected uniformly from its range), leading to 10,748,618 individual input-output pairs. Of these, 10% are selected for validation and the remainder are used for training using a batch size of 128. The training takes 10 epochs and Figure 13 shows the convergence for this training, along with the corresponding validation curve.

3.3.3. Results

Figure 9 shows that the MeshingNet3D meshes are again better than uniform meshes and that the NN mesh energies are very close to those of the ground truth. As illustrated in Figure 10, the NN can successfully guide non-uniform mesh generation on very different geometries. This example also illustrates the success of the proposed approach on non-simply-connected domains. Note that the second problem (on the right) in Figure 10 illustrates one of the worst performing cases for the NN mesh relative to the ground truth: here, the NN mesh is more uniform than the ground truth (though still a vast improvement on a standard uniform mesh).

Figure 8: The boundary conditions and loads of the hex-bolt (left) and irregular polyhedron (right). On the hex-bolt, the eight surfaces of the hole are clamped and linearly distributed pressure is applied on the top and bottom surfaces.

3.4. Irregular polyhedron

We now consider the problem of mesh generation on arbitrary twelve-faced polyhedra, with a range of geometries (G) and variable boundary conditions (B). In this case the material parameters (M) are not varied (the specific inputs to the Lamé solver in FreeFem++ being: density = 8000, Young’s modulus = $210 \times 10^{9}$ and Poisson’s ratio = 0.27).

3.4.1. Problem specification

An irregular polyhedron with twelve triangular faces and eight vertices is illustrated in Figure 8 (right). The four “bottom” vertices are constrained to be co-planar and one of the two bottom triangular surfaces (i.e. the two triangles whose union is bounded by the four co-planar vertices) is clamped. This co-planarity restriction applies to the geometries of all training and testing problems. A normal pressure of amplitude 10000 is applied on the two “top” surfaces (i.e. the triangular faces whose union is bounded by the other four vertices) and zero normal stress is applied on the other nine triangular faces. The input vectors for this problem define the Cartesian coordinates of the eight vertices and the corresponding MVCs of the point at which the mesh spacing is required.

Figure 9: For the hex-bolt with a hole, FE energies of neural network (NN) generated meshes versus uniform mesh FE energies and ground truth (GT) energies. The height of each bar represents the proportion of experiment results in the energy range shown on the x-axis (as a percentage of the ground truth energy).

3.4.2. Network information

In this example our fully-connected network has five hidden layers with 32, 64, 32, 16 and 8 neurons respectively. Training data is generated based upon solving 3000 individual problems, each of which is obtained using a random choice for each input parameter, leading to 7,383,999 individual input-output pairs. Of these, 10% are selected for validation and the remainder are used for training using a batch size of 128. We use the network after training for 10 epochs and Figure 13 shows the convergence for this training, along with the corresponding validation curve.

3.4.3. Results

From Figure 11 it is clear that the MeshingNet3D meshes are significantly better than uniform meshes and that the solution energies are relatively close to those of the ground truth, though in some cases the ground truth mesh is slightly superior. One such example is shown in Figure 12 (three views of the same problem), where we see that the NN mesh appears to be more conservative in some aspects of its local refinement. Nevertheless, even in this worst-case scenario, the MeshingNet3D mesh generally has the same regions of refinement as the ground truth mesh.

3.5. Discussion

Across the four experiments described in this section we have shown results over a range of geometries, boundary conditions and material parameters. For each problem the input layer of the NN is necessarily of a different dimension, which is dependent on the problem specification (along with the MVCs of the target point), whereas the output is always a single value representing the predicted mesh spacing at the target point. The number and size of the hidden layers is not a critical choice, but does naturally have some impact on the performance of the network.
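To make the role of the MVCs in the input vector more concrete, the sketch below evaluates two-dimensional mean value coordinates using Floater’s tangent formula. It is an illustration only: it assumes a convex polygon with counter-clockwise vertices and a point strictly inside it, the function name `mean_value_coordinates` is hypothetical, and the three-dimensional variant needed for polyhedra is more involved and is not shown.

```python
import math


def mean_value_coordinates(polygon, point):
    """Mean value coordinates of `point` with respect to a convex polygon
    given as counter-clockwise (x, y) vertices.

    Valid for points strictly inside the polygon.
    """
    px, py = point
    n = len(polygon)
    # Vectors from the point to each vertex, and their lengths.
    d = [(vx - px, vy - py) for vx, vy in polygon]
    r = [math.hypot(dx, dy) for dx, dy in d]
    # tan(alpha_i / 2), where alpha_i is the angle subtended at the point
    # by the edge from vertex i to vertex i + 1.
    t = []
    for i in range(n):
        j = (i + 1) % n
        cross = d[i][0] * d[j][1] - d[i][1] * d[j][0]
        dot = d[i][0] * d[j][0] + d[i][1] * d[j][1]
        t.append(math.tan(0.5 * math.atan2(cross, dot)))
    # Unnormalised weights; t[i - 1] wraps around to the last edge for i = 0.
    w = [(t[i - 1] + t[i]) / r[i] for i in range(n)]
    total = sum(w)
    return [wi / total for wi in w]


# Regular hexagon with edge length 4 (so circumradius 4) and a sample point.
hexagon = [(4.0 * math.cos(k * math.pi / 3), 4.0 * math.sin(k * math.pi / 3))
           for k in range(6)]
lam = mean_value_coordinates(hexagon, (1.0, 0.5))
```

The coordinates sum to one and reproduce the point as a weighted combination of the vertices, so they encode the position of the target point relative to the boundary rather than in absolute terms.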

To illustrate this, Table 1 shows the performance of five different networks when applied to the fourth of the test problems above. In each case the networks have been trained on the same data set, with validation losses having converged after 10 epochs. The networks are then used to compute meshes on the same testing set of 500 unseen problems, and the finite element solutions are computed on all meshes. The energy of each solution is normalised against the energy of the finite element solution computed on the “ground truth” mesh so as to allow a meaningful average to be taken across all 500 cases. This is the value shown in the “normalised average energy” column of Table 1: the lower this energy, the better the meshes are on average. The results shown in Subsection 3.4 are generated using NN3 from the table, but NN2 and NN4 produced meshes of very similar quality. The network denoted by NN1 appears to have too few degrees of freedom to be able to model the non-uniform mesh patterns satisfactorily, whereas the network denoted by NN5 likely has too many degrees of freedom for the size of our training data set.
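The phrase “degrees of freedom” can be made concrete by counting the trainable weights and biases of each structure listed in Table 1. The sketch below is illustrative: the input dimension of 32 (eight vertices with three coordinates each, plus eight MVCs) is an assumption about how the inputs of the fourth test problem are laid out, and `count_parameters` is a hypothetical helper.

```python
def count_parameters(input_dim, hidden_layers, output_dim=1):
    """Number of weights plus biases in a fully-connected network."""
    sizes = [input_dim, *hidden_layers, output_dim]
    return sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))


# Hidden-layer structures as listed in Table 1.
structures = {
    "NN1": [32, 16, 8],
    "NN2": [32, 64, 16, 8],
    "NN3": [32, 64, 32, 16, 8],
    "NN4": [32, 64, 128, 32, 16, 8],
    "NN5": [32, 64, 128, 64, 32, 16, 8],
}

INPUT_DIM = 32  # hypothetical: 8 vertices x 3 coordinates + 8 MVCs
counts = {name: count_parameters(INPUT_DIM, layers)
          for name, layers in structures.items()}
```

Under this assumption the parameter count grows by more than a factor of ten from NN1 to NN5, while the best-performing structure, NN3, sits between the two extremes.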

Note that our NNs are always “spindly”, with the greatest number of neurons in the inner layers. We find from experiment that this kind of network appears to have the best performance for the set of tasks considered in this work. Given that our problems have a relatively small number of inputs and a very small number of outputs (typically one) this is perhaps not surprising: to capture the highly nonlinear relationships between the inputs and the mesh spacing across the domain, significant complexity must be introduced into the network between the input and output layers.

NN     NN structure             Training epochs    Normalised average energy
NN1    32-16-8                  10                 $9.0 \times 10^{-3}$
NN2    32-64-16-8               10                 $8.1 \times 10^{-3}$
NN3    32-64-32-16-8            10                 $7.9 \times 10^{-3}$
NN4    32-64-128-32-16-8        10                 $8.1 \times 10^{-3}$
NN5    32-64-128-64-32-16-8     10                 $8.6 \times 10^{-3}$

Table 1: Comparison of 5 different fully connected NNs based upon normalised average energies of the finite element solutions. NN3 gives the lowest average energy and therefore provides the best mean performance.

Finally, we note that MeshingNet3D has the potential to make simulations more efficient for designers who use pre-built 3D models provided within Computer Aided Design (CAD) software to accelerate design. From screws and bolts to washers and bearings, CAD libraries can define not only geometries but also materials. Embedding pre-trained MeshingNet3D in these CAD libraries could save meshing cost and provide high-quality non-uniform meshes. Similarly, MeshingNet3D can help parametric design where the NN is pre-trained for each geometry topology: under the guidance of the NN an appropriate mesh is generated in response to each iteration of the design. To implement this efficiently the challenge will be in defining a suitable family of boundary conditions as NN inputs, where forces due to interacting objects are unknown a priori. However, for components in a specific assembly, if contacts are defined, the load may be inferred by data-driven methods.

4. Conclusions

We have proposed a new framework for the generation of non-uniform three-dimensional finite element meshes. The framework is designed to produce meshes of the same quality as those obtained using traditional approaches, based upon a posteriori error estimates and local mesh refinement, but at a substantially reduced computational cost. It has been implemented as MeshingNet3D, building upon the 3D mesh generator TetGen and the finite element package FreeFem++. By selecting the linear elasticity solver within FreeFem++ we have been able to undertake quantitative comparisons of different meshes based upon the energy minimization property of the elastostatic equations. Specifically, we can compare any two meshes by solving the finite element system on each mesh and then computing the stored energy of the solutions: the mesh yielding the lower energy is superior.
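For reference, the reasoning behind this comparison can be written out in standard variational form. The symbols below ($\Pi$, $a$, $\sigma$, $\varepsilon$, $f$, $t$, $\Gamma_N$, $V$, $V_h$) follow common textbook usage and are assumptions rather than notation taken from this paper; “energy” is taken here to mean the total potential energy, and the clamped boundaries are assumed to be built into the space $V$.

```latex
\Pi(v) = \tfrac{1}{2}\, a(v, v)
       - \int_{\Omega} f \cdot v \,\mathrm{d}\Omega
       - \int_{\Gamma_N} t \cdot v \,\mathrm{d}\Gamma,
\qquad
a(v, w) = \int_{\Omega} \sigma(v) : \varepsilon(w)\,\mathrm{d}\Omega,

u = \arg\min_{v \in V} \Pi(v),
\qquad
u_h = \arg\min_{v_h \in V_h \subset V} \Pi(v_h),

\Pi(u_h) - \Pi(u) = \tfrac{1}{2}\, a(u - u_h,\, u - u_h) \;\ge\; 0 .
```

Under these assumptions the energy of any FE solution lies above that of the exact solution by half the squared energy-norm error, so of two meshes the one giving the lower energy has the smaller error in that norm.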

We have assessed the performance of MeshingNet3D on four different problem families for which the optimal finite element mesh is generally highly non-uniform. In all cases we are able to demonstrate the capability to generate meshes which are not only substantially better than uniform meshes for the same geometry, but which are comparable in quality to non-uniform meshes that are generated based upon the traditional (and expensive) approach of undertaking a sequence of local adaptive steps involving finite element solves and a posteriori error estimates. Perhaps not surprisingly, the benefits of MeshingNet3D are most apparent on those problems for which the optimal finite element mesh is far from uniform.

The main limitation of our approach is associated with the need to define a different set of inputs for each family of problems that is to be considered. Hence, for each new family of problems, it is necessary to define a set of inputs that fully reflects the richness of that family, and then to undertake training for a new network. Furthermore, as with most supervised learning approaches, there is a trade-off to be made between the level of generality of the family of problems that the user of MeshingNet3D wishes to consider and the amount of work that must be undertaken in the training phase of the algorithm. Nevertheless, in situations where many solutions are required for large numbers of related problems (such as design and optimization problems) this is likely to be a worthwhile expense. Finally, we note that, in cases where engineers may have limited confidence in their ability to define the most appropriate inputs (to define the geometry or boundary conditions, for example), data analysis techniques such as principal component analysis may be used to find the most critical parameters.


Figure 10: Hex-bolt experiment: ground truth meshes (top) and NN meshes (bottom). The left and right columns show two problems that differ only in geometry.

Figure 11: For the irregular polyhedron, FE energies of neural network (NN) generated meshes versus uniform mesh FE energies and ground truth (GT) energies. The height of each bar represents the proportion of experiment results in the energy range shown on the x-axis (as a percentage of the ground truth energy).

Figure 12: A ground truth mesh (a, c and e) and the corresponding NN mesh (b, d and f), selected from the 500 testing problems, shown in front (a and b), right (c and d) and bottom (e and f) views.

Figure 13: Training and validation loss of the four experiments.