
# FEniCS-HPC: Automated Predictive High-Performance Finite Element Computing with Applications in Aerodynamics


Johan Hoffman^{1,2,3}, Johan Jansson^{2,1,4}, and Niclas Jansson^{1,5}

1 Computational Technology Laboratory, School of Computer Science and Communication, KTH, Stockholm, Sweden
2 BCAM - Basque Center for Applied Mathematics, Bilbao, Spain
3 jhoffman@kth.se  4 jjan@kth.se  5 njansson@kth.se

Abstract. Developing multiphysics finite element methods (FEM) and scalable HPC implementations can be very challenging in terms of software complexity and performance, even more so with the addition of goal-oriented adaptive mesh refinement. To manage the complexity, we present in this work general adaptive stabilized methods with automated implementation in the FEniCS-HPC automated open source software framework. This allows taking the weak form of a partial differential equation (PDE) as input in near-mathematical notation and automatically generating the low-level implementation source code and the auxiliary equations and quantities necessary for the adaptivity. We demonstrate new optimal strong scaling results for the whole adaptive framework applied to turbulent flow on massively parallel architectures, down to 25000 vertices per core on ca. 5000 cores with the MPI-based PETSc backend, and, for assembly, down to 500 vertices per core on ca. 20000 cores with the PGAS-based JANPACK backend. To demonstrate the power of combining this scalability with an adaptive methodology that allows prediction of gross quantities in turbulent flow, we present an application in aerodynamics: a full DLR-F11 aircraft simulated in connection with the HiLift-PW2 benchmarking workshop, with a good match to experiments.

Keywords: FEM, adaptive, turbulence

1 Introduction

As computational methods are applied to simulate ever more advanced problems of coupled physical processes, and supercomputing hardware develops towards massively parallel heterogeneous systems, it is a major challenge to manage the complexity and performance of methods, algorithms and software implementations. Adaptive methods based on quantitative error control pose additional challenges. For simulation based on partial differential equation (PDE) models, the finite element method (FEM) offers a general approach to numerical discretization, which opens for automation of algorithms and software implementation.

In this paper we present the FEniCS-HPC open source software framework, whose goal is to combine the generality of FEM with performance, by optimization of generic algorithms [4, 2, 13]. We demonstrate the performance of FEniCS-HPC in an application to subsonic aerodynamics.

We give an overview of the methodology and the FEniCS-HPC framework. Key aspects of the framework include:

1. Automated discretization, where the weak form of a PDE in mathematical notation is translated into a system of algebraic equations using code generation.

2. Automated error control, which ensures that the discretization error e = u - U in a given quantity is smaller than a given tolerance, by adaptive mesh refinement based on duality-based a posteriori error estimates. An a posteriori error estimate and error indicators are automatically generated from the weak form of the PDE, by directly using the error representation.

3. Automated modeling, which includes a residual-based implicit turbulence model, where the turbulent dissipation comes only from the numerical stabilization, as well as treating the fluid and solid in fluid-structure interaction (FSI) as one continuum, with a phase indicator function tracked by a moving mesh, and implicitly modeling contact.

We demonstrate new optimal strong scaling results for the whole adaptive framework applied to turbulent flow on massively parallel architectures, down to 25000 vertices per core on ca. 5000 cores with the MPI-based PETSc backend, and, for assembly, down to 500 vertices per core on ca. 20000 cores with the PGAS-based JANPACK backend. We also present an application in aerodynamics of a full DLR-F11 aircraft in connection with the HiLift-PW2 benchmarking workshop, with a good match to experiments.

1.1 The FEniCS project and state of the art

The software described here is part of the FEniCS project [2], whose goal is to automate the scientific software process by relying on general implementations and code generation, for robustness and to enable a high speed of software development.

Deal.II [1] is a software framework with a similar goal, implementing general FEM-based solution of PDE in C++, where users write the "numerical integration loop" for weak forms to compute the linear systems. The framework runs on supercomputers with optimal strong scaling. Deal.II is based on quadrilateral (2D) and hexahedral (3D) meshes, whereas FEniCS is based on simplicial meshes (triangles in 2D and tetrahedra in 3D).

Another FEM software framework with a similar goal is FreeFEM++ [3], which has a high-level syntax close to mathematical notation, and which has demonstrated optimal strong scaling up to ca. 100 cores.


2 The FEniCS-HPC framework

FEniCS-HPC is a problem-solving environment (PSE) for automated solution of PDE by the FEM, with a high-level interface to the basic concepts of FEM: weak forms, meshes, refinement and sparse linear algebra, and with HPC concepts such as partitioning and load balancing abstracted away.

The framework is based on components with clearly defined responsibilities. A compact description of the main components follows, with their dependencies shown in the dependency diagram in Figure 1:

FIAT: Automated generation of finite element spaces $V$ and basis functions $\phi \in V$ on the reference cell, and numerical integration, with the FInite element Automated Tabulator (FIAT) [13, 12], where a finite element is the triple $e = (K, V, L)$: $K$ is a cell in a mesh $\mathcal{T}$, $V$ is a finite-dimensional function space, and $L$ is a set of degrees of freedom.
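As an illustration of what such a tabulation provides (a minimal numpy sketch of ours, not FIAT's actual API; the function name tabulate_p1 is hypothetical), consider the P1 Lagrange basis on the reference tetrahedron:

import numpy as np

# P1 Lagrange on the reference tetrahedron with vertices
# (0,0,0), (1,0,0), (0,1,0), (0,0,1):
#   phi_0 = 1 - x - y - z,  phi_1 = x,  phi_2 = y,  phi_3 = z
def tabulate_p1(points):
    """Tabulate values and (constant) gradients of the P1 basis at
    reference-cell points, the kind of data FIAT hands to FFC."""
    pts = np.atleast_2d(points)
    x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]
    values = np.stack([1.0 - x - y - z, x, y, z])    # shape (4, npts)
    grads = np.array([[-1.0, -1.0, -1.0],
                      [ 1.0,  0.0,  0.0],
                      [ 0.0,  1.0,  0.0],
                      [ 0.0,  0.0,  1.0]])           # shape (4, 3)
    return values, grads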

FFC+UFL: Automated evaluation of weak forms in mathematical notation on one cell, based on code generation with the Unified Form Language (UFL) and the FEniCS Form Compiler (FFC) [13, 11], using the basis functions $\phi \in V$ from FIAT. For example, in the case of the Laplacian operator,

$A^K_{ij} = a^K(\phi_i, \phi_j) = \int_K \nabla\phi_i \cdot \nabla\phi_j \,\mathrm{d}x = \mathrm{lhs}\Big(\int_K r(\phi_i, \phi_j)\,\mathrm{d}x\Big)$

where $A^K$ is the element stiffness matrix and $r(\cdot, \cdot)$ is the weak residual.
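To make concrete what the generated cell-level code computes, here is a numpy sketch (our illustration under the assumption of P1 elements, not the FFC-generated source): since P1 gradients are constant on a tetrahedron, one-point quadrature is exact and the element matrix reduces to a volume-scaled Gram matrix of the physical gradients.

import numpy as np

def element_stiffness(vertices):
    """A^K_ij = integral over K of grad(phi_i) . grad(phi_j) dx for P1
    on a tetrahedron K: push reference gradients forward through the
    inverse Jacobian of the affine map and scale by the cell volume."""
    v = np.asarray(vertices, dtype=float)   # (4, 3) vertex coordinates
    J = (v[1:] - v[0]).T                    # Jacobian of x = v0 + J X
    grads_ref = np.array([[-1.0, -1.0, -1.0],
                          [ 1.0,  0.0,  0.0],
                          [ 0.0,  1.0,  0.0],
                          [ 0.0,  0.0,  1.0]])
    grads = grads_ref @ np.linalg.inv(J)    # physical gradients, (4, 3)
    volume = abs(np.linalg.det(J)) / 6.0
    return volume * (grads @ grads.T)       # (4, 4) element matrix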

DOLFIN-HPC: Automated high-performance assembly of weak forms, an interface to linear algebra for the discrete systems, and mesh refinement, on a distributed mesh $\mathcal{T}_\Omega$ [10]:

A = 0
for all cells K in T_Omega:
    A += A^K
solve Ax = b
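A minimal serial sketch of this assembly loop using scipy's COO format, where duplicate (i, j) entries are summed on conversion, realizing A += A^K (the names are our own; DOLFIN-HPC performs the same scatter in parallel into a distributed matrix):

import scipy.sparse as sp

def assemble(cell_dofs, element_tensor, n_dofs):
    """Global assembly A = sum_K A^K: compute each local element matrix
    and scatter it into the global matrix via the cell-to-dof map."""
    rows, cols, vals = [], [], []
    for K, dofs in enumerate(cell_dofs):
        AK = element_tensor(K)              # local (n_loc, n_loc) matrix
        for i, gi in enumerate(dofs):
            for j, gj in enumerate(dofs):
                rows.append(gi); cols.append(gj); vals.append(AK[i, j])
    # duplicate (row, col) pairs are summed when converting to CSR
    return sp.coo_matrix((vals, (rows, cols)), shape=(n_dofs, n_dofs)).tocsr()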

Unicorn: Automated Unified Continuum modeling, with Unicorn choosing a specific weak residual form for the incompressible balance equations of mass and momentum, with example visualizations of an aircraft simulation (below left) and of turbulent FSI in vocal folds (below right) [4]:

$r_{UC}((v, q), (u, p)) = (v, \rho(\partial_t u + (u \cdot \nabla)u) + \nabla \cdot \sigma - g) + (q, \nabla \cdot u) + LS((v, q), (u, p))$

where $LS$ is a least-squares stabilizing term described in [7].


Fig. 1: FEniCS-HPC component dependency diagram.

A user of FEniCS-HPC writes the weak forms in the UFL language, compiles them with FFC, and includes the generated code in a high-level "solver" written in C++ with DOLFIN-HPC, which reads in a mesh, assembles the forms, solves linear systems, refines the mesh, etc. The Unicorn solver for adaptive computation of turbulent flow and FSI is developed as part of FEniCS-HPC.

2.1 Solving PDE problems in FEniCS-HPC

Poisson's equation. To solve Poisson's equation in weak form,

$\int_\Omega \nabla u \cdot \nabla v \,\mathrm{d}x - \int_\Omega f v \,\mathrm{d}x = 0 \quad \forall v \in V,$

in the framework, we first define the weak form in a UFL "form file", closely mapping the mathematical notation (see Figure 2). The form file is then compiled to low-level C++ source code for assembling the local element matrix and vector with FFC. Finally, we use DOLFIN-HPC to write a high-level "solver" in C++, composing the different abstractions: a mesh is defined, the global matrix and vector are assembled by interfacing to the generated source code, the linear system is solved through an abstract parallel linear algebra interface (using PETSc as the default backend), and the solution function is saved to disk. The source code for an example solver is presented in Figure 2.

Q = FiniteElement("CG", "tetrahedron", 1)
v = TestFunction(Q)     # test basis function
u = TrialFunction(Q)    # trial basis function
f = Coefficient(Q)      # function

# Bilinear and linear forms
a = dot(grad(v), grad(u))*dx
L = v*f*dx

// Define mesh, BCs and coefficients
PoissonBoundary boundary;
PoissonBoundaryValue u0(mesh);
SourceFunction f(mesh);
DirichletBC bc(u0, mesh, boundary);

// Define PDE
PoissonBilinearForm a;
PoissonLinearForm L(f);
LinearPDE pde(a, L, mesh, bc);

// Solve PDE
Function u;
pde.solve(u);

// Save solution to file
File file("poisson.pvd");
file << u;

Fig. 2: Poisson solver in FEniCS-HPC with the weak form in the UFL language (top) and the solver in C++ using DOLFIN-HPC (bottom).


The incompressible Navier-Stokes equations. We formulate the General Galerkin (G2) method for the incompressible Navier-Stokes equations (1) in UFL by a direct input of the weak residual. We can automatically derive the Jacobian in a quasi-Newton fixed-point formulation, and also automatically linearize and generate the adjoint problem needed for adaptive error control. These examples are presented in Figure 3.

V = VectorElement("CG", "tetrahedron", 1)
Q = FiniteElement("CG", "tetrahedron", 1)
v = TestFunction(V); q = TestFunction(Q)
u_ = TrialFunction(V); p_ = TrialFunction(Q)
u = Coefficient(V); p = Coefficient(Q)
u0 = Coefficient(V); um = 0.5*(u + u0)

# Momentum and continuity weak residuals
r_m = (inner(u - u0, v)/k + \
       (nu*inner(grad(um), grad(v)) + \
        inner(grad(p) + grad(um)*um, v)))*dx + LS_u*dx
r_c = inner(div(u), q)*dx + LS_p*dx

# Newton's method: J u_{i+1} = J u_i - F(u_i)
a = derivative(r_m, u, u_)
L = action(a, u) - r_m

# Adjoint problem (stationary part) for r_m
a_adjoint = adjoint(derivative(r_m - inner(u, v)/k*dx, u))
L_adjoint_c = derivative(action(r_c, p), u, v)
L_adjoint = inner(psi_m, v)*dx - L_adjoint_c

Fig. 3: Example of weak forms in UFL notation for the cG(1)cG(1) method for the incompressible Navier-Stokes equations (top) together with the adjoint problem (bottom).

3 Parallelization strategy and performance

The parallelization is based on a fully distributed mesh approach, where everything from preprocessing, assembly of linear systems, postprocessing and refinement is performed in parallel, without representing the entire problem or any pre-/postprocessing step on a single core.

The initial data distribution is defined by a graph partitioning of the corresponding dual graph of the mesh. Each core is assigned a set of whole elements, and the vertex overlap between cores is represented as ghosted entities.
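As a sketch of the dual-graph construction (our illustration, not the DOLFIN-HPC source): two tetrahedra are adjacent in the dual graph exactly when they share a facet, and shared facets whose two cells land on different partitions are where the ghosted entities arise.

from collections import defaultdict
import numpy as np

def dual_graph(cells):
    """Dual graph of a tetrahedral mesh: one graph vertex per cell and
    an edge between each pair of cells sharing a facet; this is the
    graph handed to the partitioner for the initial distribution."""
    facet_to_cells = defaultdict(list)
    for k, cell in enumerate(cells):
        for i in range(4):                       # the 4 facets of a tet
            facet = tuple(sorted(np.delete(cell, i)))
            facet_to_cells[facet].append(k)
    adjacency = defaultdict(set)
    for cs in facet_to_cells.values():
        if len(cs) == 2:                         # interior facet
            a, b = cs
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency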

3.1 Parallel assembly

The assembly of the global matrix is performed in a straightforward fashion: each core computes the local matrices of its local elements and adds them to the global matrix. Since we assign whole elements to each core, we can minimize data dependency during assembly. Furthermore, we renumber all the degrees of freedom such that a minimal amount of communication is required when modifying entries in the sparse matrix.
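A minimal mpi4py sketch of such ownership-based renumbering (our illustration of the idea, not DOLFIN-HPC's actual scheme): each rank receives a contiguous block of global indices for the dofs it owns via a prefix sum, so the matrix rows a core owns are contiguous and off-process updates are minimized; ghost dofs would subsequently fetch their indices from the owning rank (omitted here).

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD

def number_owned_dofs(n_owned):
    """Assign this rank a contiguous block of global dof indices
    [offset, offset + n_owned) via an exclusive prefix sum."""
    offset = comm.scan(n_owned) - n_owned   # inclusive scan minus own count
    return np.arange(offset, offset + n_owned)

# Example (run under mpirun): rank r, owning r + 1 dofs, gets the
# block directly after those of ranks 0 .. r-1.
print(comm.rank, number_owned_dofs(comm.rank + 1))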


3.2 Solution of discrete system

The FEM discretization generates a nonlinear algebraic equation system to be solved at each time step. In Unicorn we solve this by iterating between the velocity and pressure equations with a Picard or quasi-Newton iteration [6].

Each iteration in turn generates a linear system to be solved. We use simple Krylov solvers and preconditioners which scale well to many cores, typically BiCGSTAB with a block-Jacobi preconditioner, where each block is solved with ILU(0).
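Schematically, the per-time-step iteration looks as follows (a sketch with the two linear solves abstracted as callables, not Unicorn's actual implementation):

import numpy as np

def picard(solve_momentum, solve_continuity, u, p, tol=1e-6, max_iter=100):
    """Fixed-point (Picard) iteration between the velocity and pressure
    equations for one time step; each solve callable stands in for an
    assembled linear (Krylov) solve in the real code."""
    for _ in range(max_iter):
        u_new = solve_momentum(u, p)        # momentum solve, pressure frozen
        p_new = solve_continuity(u_new, p)  # continuity solve, velocity frozen
        increment = np.linalg.norm(u_new - u) + np.linalg.norm(p_new - p)
        u, p = u_new, p_new
        if increment < tol:                 # fixed point reached
            break
    return u, p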

3.3 Mesh refinement

Local mesh refinement is based on a parallelization of the well-known recursive longest-edge bisection method [15]. The parallelization splits the refinement into two phases. First, a local serial refinement phase bisects all elements marked for refinement on each core (concurrently), leaving several hanging nodes on the shared interface between cores. The second phase propagates these hanging nodes onto adjacent cores.

The algorithm iterates between local refinement and global propagation until all cores are free of hanging nodes. For an efficient implementation, one has to detect when all cores are idling at the same time. Our implementation uses a fully distributed termination detection scheme, which includes termination detection in the global propagation step by using recursive doubling or hypercube exchange type communication patterns [10]. Moreover, the termination detection algorithm has no central point of control, hence no bottlenecks, less message contention, and no problems with load imbalance.
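A serial 2D sketch of recursive longest-edge bisection with the same two-phase structure (our illustration; in the parallel algorithm the second phase is the propagation of hanging nodes across core boundaries):

import numpy as np

def refine(points, cells, marked):
    """Rivara-style longest-edge bisection of marked triangles (2D)."""
    pts = [np.asarray(p, dtype=float) for p in points]
    tris = [tuple(c) for c in cells]
    mid = {}  # frozenset({a, b}) -> index of the edge-midpoint vertex

    def bisect(t):
        # split t across the midpoint of its longest edge
        i = max(range(3),
                key=lambda j: np.linalg.norm(pts[t[j]] - pts[t[(j + 1) % 3]]))
        a, b, c = t[i], t[(i + 1) % 3], t[(i + 2) % 3]
        e = frozenset((a, b))
        if e not in mid:
            pts.append(0.5 * (pts[a] + pts[b]))
            mid[e] = len(pts) - 1
        m = mid[e]
        return (a, m, c), (m, b, c)

    work = set(marked)
    while work:
        # phase 1: bisect every triangle currently marked
        tris = [s for k, t in enumerate(tris)
                for s in (bisect(t) if k in work else (t,))]
        # phase 2: re-mark any triangle left with a hanging node,
        # i.e. an edge that has been split but whose midpoint it lacks
        work = {k for k, t in enumerate(tris)
                for j in range(3)
                if frozenset((t[j], t[(j + 1) % 3])) in mid
                and mid[frozenset((t[j], t[(j + 1) % 3]))] not in t}
    return np.array(pts), np.array(tris)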

Dynamic load balancing. In order to sustain a good load balance across several adaptive iterations, dynamic load balancing is needed. DOLFIN-HPC is equipped with a scratch-and-remap type load balancer, based on the widely used PLUM scheme [14], where the new partitions are assigned in an optimal way by solving the maximally weighted bipartite graph problem. We have improved the scheme such that it scales linearly to thousands of cores [10, 8].

Furthermore, we have extended the load balancer with an a priori workload estimation. With a dry run of the refinement algorithm, we add weights to a dual graph of the mesh, corresponding to the workload after refinement. Finally, we repartition the unrefined mesh according to the weighted dual graph and redistribute the new partitions before the refinement.
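To illustrate the idea of the a priori workload estimation (a toy stand-in: the PLUM-based balancer instead uses weighted graph partitioning plus an optimal remap), weight each cell by its predicted post-refinement work, e.g. the number of children the dry run says it will produce, and balance those weights:

import numpy as np

def weighted_repartition(weights, n_parts):
    """Greedy balancing of predicted per-cell workloads: assign cells,
    heaviest first, to the currently lightest part; returns part[i],
    the part owning cell i. A stand-in for a weighted partitioner."""
    order = np.argsort(weights)[::-1]       # heaviest cells first
    load = np.zeros(n_parts)
    part = np.empty(len(weights), dtype=int)
    for i in order:
        p = int(np.argmin(load))            # lightest part so far
        part[i] = p
        load[p] += weights[i]
    return part

# Example: cells predicted to split into 4, 1, 2, 1, 4 children, 2 parts.
print(weighted_repartition(np.array([4, 1, 2, 1, 4]), 2))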

4 Strong scalability

To be able to take advantage of the supercomputers available today, the entire solver in FEniCS-HPC needs to demonstrate good strong scaling to at least several thousand cores. For planned "exascale" systems with many millions of cores, strong scalability has to be attained for at least hundreds of thousands of cores.


In this section we analyze scaling results using the PETSc parallel linear algebra backend, based on pure MPI, and the JANPACK backend, based on PGAS.

In Figure 4 we present strong scalability results with the pure-MPI PETSc backend for the full G2 method for the turbulent incompressible Navier-Stokes equations (1) (assembling the linear systems and solving the momentum and continuity equations) in 3D, on a mesh with 147M vertices, on the Hornet Cray XC40 computer. We observe near-optimal scaling to ca. 4.6 kcores for all the main algorithms (assembly and linear solves). Going from 4.6 kcores to 9.2 kcores we start to see a degradation in the scaling, with a speedup of ca. 0.7, and from 9.2 kcores to 18.4 kcores the speedup is 0.5. It is clear that it is mainly the assembly that shows degraded scaling.

In Figure 5 we present results for assembling four different equations using the JANPACK backend, where FEniCS-HPC runs in a hybrid MPI+PGAS mode. We observe that for large numbers of cores, the low-latency one-sided communication of PGAS languages, in combination with our new sparse matrix format [9], greatly improves the scalability.

Fig. 4: Strong scalability test for the full G2 method for the incompressible turbulent Navier-Stokes equations (assemble linear systems and solve momentum and continuity) in 3D on a Cray XC40.

5 Unicorn simulation of a full aircraft

In the Unicorn component we implement the full G2 method and fix the weak residual to the cG(1)cG(1) stabilized space-time method for the incompressible Navier-Stokes equations (or a general stress for FSI).

In a cG(1)cG(1) method [7] we seek an approximate space-time solution $\hat{U} = (U, P)$ which is continuous piecewise linear in space and time (equivalent to the implicit Crank-Nicolson method). With $I$ a time interval with subintervals $I_n = (t_{n-1}, t_n)$, $W^n$ a standard spatial finite element space of continuous piecewise linear functions, and $W^n_0$ the functions in $W^n$ which are zero on the boundary $\Gamma$, the cG(1)cG(1) method for constant-density incompressible flow with homogeneous Dirichlet boundary conditions for the velocity takes the form: for $n = 1, \ldots, N$, find $(U^n, P^n) \equiv (U(t_n), P(t_n))$ with $U^n \in V^n_0 \equiv [W^n_0]^3$ and $P^n \in W^n$, such that

$r((U, P), (v, q)) = ((U^n - U^{n-1})k_n^{-1} + (\bar{U}^n \cdot \nabla)\bar{U}^n, v) + (2\nu\epsilon(\bar{U}^n), \epsilon(v)) - (P, \nabla \cdot v) + (\nabla \cdot \bar{U}^n, q) + LS = 0 \quad \forall \hat{v} = (v, q) \in V^n_0 \times W^n \qquad (1)$

where $\bar{U}^n = \frac{1}{2}(U^n + U^{n-1})$ is piecewise constant in time over $I_n$ and $LS$ is a least-squares stabilizing term described in [7].


Fig. 5: Sparse matrix assembly timings for four different equations on a Cray XC40: runtime (seconds) vs. number of cores for the PETSc and JANPACK backends, for 2D convection-diffusion (214M cells), 3D Poisson (317M cells), 3D Navier-Stokes (80M cells) and 3D linear elasticity (14M cells).


We formulate a new general adjoint-based method for adaptive error control, based on the following error representation and adjoint weak bilinear and linear forms, with the error $\hat{e} = \hat{u} - \hat{U}$, adjoint solution $\hat{\phi}$, output quantity $\psi$, the hat signifying the full velocity-pressure vector $\hat{U} = (U, P)$, and $r_G = r - LS$:

$(\hat{e}, \psi) = r'(\hat{e}, \hat{\phi}) = r_G(\hat{U}; \hat{\phi}), \quad a_{\mathrm{adjoint}}(v, \hat{\phi}) = r'(v, \hat{\phi}), \quad L_{\mathrm{adjoint}}(v) = (v, \psi) \qquad (2)$
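Schematically, one iteration of the resulting adaptive algorithm drives mesh refinement with per-cell indicators derived from the error representation (2); in the sketch below the solve and residual routines are hypothetical stand-ins, while the marking logic is spelled out:

import numpy as np

def mark_cells(eta, fraction=0.1):
    """Mark the given fraction of cells with the largest error
    indicators eta_K for refinement."""
    n_mark = max(1, int(fraction * len(eta)))
    return np.argsort(eta)[::-1][:n_mark]

# One adaptive iteration (solve_primal, solve_adjoint, cellwise_residual
# and refine are stand-ins for the corresponding FEniCS-HPC steps):
#   U    = solve_primal(mesh)                       # primal solution (U, P)
#   phi  = solve_adjoint(mesh, U, psi)              # adjoint for output psi
#   eta  = np.abs(cellwise_residual(mesh, U, phi))  # per-cell indicators
#   mesh = refine(mesh, mark_cells(eta))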

We have used our adaptive finite element methodology for turbulent flow and the FEniCS-HPC software to solve the incompressible Navier-Stokes equations for the flow past a full high-lift aircraft model (DLR-F11) with complex geometry, at realistic Reynolds numbers for take-off and landing. This work is an extension of our contributed simulation results to the 2nd AIAA CFD High-Lift Prediction Workshop (HiLiftPW-2), held in San Diego, California, in 2013 [5].

In the following results we focus on the angle of attack $\alpha = 18.5^\circ$. To quantify mesh convergence we plot the coefficients and their relative error compared to the experimental values (serving as the reference) versus the number of vertices in the meshes, and plot meshes and volume renderings of quantities related to the adaptivity in Figure 6.

We see that our adaptive computational results come very close to the experimental results on the finest mesh, with a relative error under 1% for $c_l$ and $c_d$. For other angles of attack we observe similar results, presented in [5].


Fig. 6: Plots for the aircraft simulation at $\alpha = 18.5^\circ$. Lift coefficient, $c_l$, and drag coefficient, $c_d$, vs. angle of attack, $\alpha$, for the different meshes from the iterative adaptive method (left). Slice aligned with the angle of attack showing the tetrahedra of the starting mesh versus the finest adaptive mesh (top right). Volume rendering of the velocity residual and the adjoint velocity magnitude (bottom right).

6 Summary

We have given an overview of the general FEniCS-HPC software framework for automated solution of PDE, which takes the weak form as input in near-mathematical notation, with automated discretization and a new simple method for adaptive error control, suitable for parallel implementation. On the Hornet Cray XC40 supercomputer we have demonstrated new optimal strong scaling results for the whole adaptive framework applied to turbulent flow on massively parallel architectures, down to 25000 vertices per core on ca. 5000 cores with the MPI-based PETSc backend, and, for assembly, down to 500 vertices per core on ca. 20000 cores with the PGAS-based JANPACK backend.

Using the Unicorn component in FEniCS-HPC we have simulated the aerodynamics of a full DLR-F11 aircraft in connection with the HiLift-PW2 benchmarking workshop. We find that the simulation results compare very well with experimental data; moreover, we show mesh convergence by the adaptive method while using a low number of spatial degrees of freedom.

Acknowledgments

This research has been supported by EU-FET grant EUNISON 308874, the European Research Council, the Swedish Foundation for Strategic Research, the Swedish Research Council, the Basque Excellence Research Center (BERC 2014-2017) program of the Basque Government, the Spanish Ministry of Economy and Competitiveness MINECO (BCAM Severo Ochoa accreditation SEV-2013-0323), and the Spanish MINECO project MTM2013-40824.

We acknowledge PRACE for awarding us access to the supercomputer resources Hermit, Hornet and SuperMUC, based in Germany at the High Performance Computing Center Stuttgart (HLRS) and the Leibniz Supercomputing Center (LRZ), to resources from the Swedish National Infrastructure for Computing (SNIC) at PDC – Center for High-Performance Computing, and to resources provided by the "Red Española de Supercomputación" and the "Barcelona Supercomputing Center – Centro Nacional de Supercomputación".

We would also like to acknowledge the FEniCS and FEniCS-HPC developers globally.

References

1. W. Bangerth, R. Hartmann, and G. Kanschat. deal.II — a general-purpose object-oriented finite element library. ACM Trans. Math. Softw., 33(4), 2007.
2. FEniCS. FEniCS project, 2003. http://www.fenicsproject.org.
3. F. Hecht. New development in FreeFem++. J. Numer. Math., 20, 2012.
4. J. Hoffman, J. Jansson, R. Vilela de Abreu, N. C. Degirmenci, N. Jansson, K. Müller, M. Nazarov, and J. H. Spühler. Unicorn: Parallel adaptive finite element simulation of turbulent flow and fluid-structure interaction for deforming domains and complex geometry. Comput. Fluids, 80:310–319, 2013.
5. J. Hoffman, J. Jansson, N. Jansson, and R. Vilela De Abreu. Towards a parameter-free method for high Reynolds number turbulent flow simulation based on adaptive finite element approximation. Comput. Methods Appl. Mech. Eng., 288:60–74, 2015.
6. J. Hoffman, J. Jansson, and M. Stöckli. Unified continuum modeling of fluid-structure interaction. Math. Mod. Meth. Appl. S., 2011.
7. J. Hoffman and C. Johnson. Computational Turbulent Incompressible Flow, volume 4 of Applied Mathematics: Body and Soul. Springer, 2007.
8. N. Jansson. High Performance Adaptive Finite Element Methods: With Applications in Aerodynamics. PhD thesis, KTH Royal Institute of Technology, 2013.
9. N. Jansson. Optimizing sparse matrix assembly in finite element solvers with one-sided communication. In High Performance Computing for Computational Science – VECPAR 2012, volume 7851 of Lecture Notes in Computer Science. Springer Berlin Heidelberg, 2013.
10. N. Jansson, J. Hoffman, and J. Jansson. Framework for massively parallel adaptive finite element computational fluid dynamics on tetrahedral meshes. SIAM J. Sci. Comput., 34(1):C24–C41, 2012.
11. R. C. Kirby and A. Logg. A compiler for variational forms. ACM Trans. Math. Softw., 32(3):417–444, 2006.
12. R. C. Kirby. Algorithm 839: FIAT, a new paradigm for computing finite element basis functions. ACM Trans. Math. Softw., 2004.
13. A. Logg, K.-A. Mardal, G. N. Wells, et al. Automated Solution of Differential Equations by the Finite Element Method. Springer, 2012.
14. L. Oliker. PLUM: Parallel load balancing for unstructured adaptive meshes. Technical Report RIACS-TR-98-01, RIACS, NASA Ames Research Center, 1998.
15. M. C. Rivara. New longest-edge algorithms for the refinement and/or improvement of unstructured triangulations. Int. J. Numer. Meth. Eng., 1997.
