FEniCS-HPC: Automated predictive high-performance finite element computing with applications in aerodynamics

Johan Hoffman 1,2,3, Johan Jansson 2,1,4, and Niclas Jansson 1,5

1 Computational Technology Laboratory, School of Computer Science and Communication, KTH, Stockholm, Sweden
2 BCAM - Basque Center for Applied Mathematics, Bilbao, Spain
3 jhoffman@kth.se
4 jjan@kth.se
5 njansson@kth.se
Abstract. Developing multiphysics finite element methods (FEM) and scalable HPC implementations can be very challenging in terms of software complexity and performance, even more so with the addition of goal-oriented adaptive mesh refinement. To manage the complexity, in this work we present general adaptive stabilized methods with automated implementation in the FEniCS-HPC automated open source software framework. This allows taking the weak form of a partial differential equation (PDE) as input in near-mathematical notation and automatically generating the low-level implementation source code and the auxiliary equations and quantities necessary for the adaptivity. We demonstrate new optimal strong scaling results for the whole adaptive framework applied to turbulent flow on massively parallel architectures, down to 25000 vertices per core with ca. 5000 cores with the MPI-based PETSc backend, and for assembly down to 500 vertices per core with ca. 20000 cores with the PGAS-based JANPACK backend. As a demonstration of the power of combining this scalability with an adaptive methodology that allows prediction of gross quantities in turbulent flow, we present an application in aerodynamics of a full DLR-F11 aircraft in connection with the HiLiftPW-2 benchmarking workshop, with a good match to experiments.
Keywords: FEM, adaptive, turbulence
1 Introduction
As computational methods are applied to simulate ever more advanced problems of coupled physical processes, and supercomputing hardware is developed towards massively parallel heterogeneous systems, it is a major challenge to manage the complexity and performance of methods, algorithms and software implementations. Adaptive methods based on quantitative error control pose additional challenges. For simulation based on partial differential equation (PDE) models, the finite element method (FEM) offers a general approach to numerical discretisation which opens for automation of algorithms and software implementation.

In this paper we present the FEniCS-HPC open source software framework, with the goal to combine the generality of FEM with performance, by optimisation of generic algorithms [4, 2, 13]. We demonstrate the performance of FEniCS-HPC in an application to subsonic aerodynamics.
We give an overview of the methodology and the FEniCS-HPC framework. Key aspects of the framework include:

1. Automated discretization, where the weak form of a PDE in mathematical notation is translated into a system of algebraic equations using code generation.

2. Automated error control, which ensures that the discretization error e = u - U in a given quantity is smaller than a given tolerance, by adaptive mesh refinement based on duality-based a posteriori error estimates. An a posteriori error estimate and error indicators are automatically generated from the weak form of the PDE, by directly using the error representation.

3. Automated modeling, which includes a residual-based implicit turbulence model, where the turbulent dissipation comes only from the numerical stabilization, as well as treating the fluid and solid in fluid-structure interaction (FSI) as one continuum, with a phase indicator function tracked by a moving mesh, and implicitly modeling contact.
We demonstrate new optimal strong scaling results for the whole adaptive framework applied to turbulent flow on massively parallel architectures, down to 25000 vertices per core with ca. 5000 cores with the MPI-based PETSc backend, and for assembly down to 500 vertices per core with ca. 20000 cores with the PGAS-based JANPACK backend. We also present an application in aerodynamics of a full DLR-F11 aircraft in connection with the HiLiftPW-2 benchmarking workshop, with a good match to experiments.
1.1 The FEniCS project and state of the art

The software described here is part of the FEniCS project [2], with the goal to automate the scientific software process by relying on general implementations and code generation, for robustness and to enable high speed of software development.

Deal.II [1] is a software framework with a similar goal, implementing general FEM-based solution of PDE in C++, where users write the “numerical integration loop” for weak forms to compute the linear systems. The framework runs on supercomputers with optimal strong scaling. Deal.II is based on quadrilateral (2D) and hexahedral (3D) meshes, whereas FEniCS is based on simplicial meshes (triangles in 2D and tetrahedra in 3D).

Another FEM software framework with a similar goal is FreeFEM++ [3], which has a high-level syntax close to mathematical notation, and has demonstrated optimal strong scaling up to ca. 100 cores.
2 The FEniCS-HPC framework

FEniCS-HPC is a problem-solving environment (PSE) for automated solution of PDE by the FEM, with a high-level interface for the basic concepts of FEM (weak forms, meshes, refinement, sparse linear algebra), and with HPC concepts such as partitioning and load balancing abstracted away.

The framework is based on components with clearly defined responsibilities. A compact description of the main components follows, with their dependencies shown in the dependency diagram in Figure 1:
FIAT: Automated generation of finite element spaces V and basis functions φ_V on the reference cell, and numerical integration, with the FInite element Automated Tabulator (FIAT) [13, 12]. A finite element is defined as a triple

e = (K, V, L)

where K is a cell in a mesh T, V is a finite-dimensional function space, and L is a set of degrees of freedom.
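To make this concrete, the following is a minimal sketch of the kind of tabulation FIAT automates, assuming a standard FIAT installation (module paths can differ between FIAT versions):

# Tabulate P1 Lagrange basis functions on the reference tetrahedron.
from FIAT.reference_element import ufc_simplex
from FIAT.lagrange import Lagrange

K = ufc_simplex(3)                 # reference tetrahedron K
element = Lagrange(K, 1)           # P1 function space V with nodal dofs L
points = [(0.25, 0.25, 0.25)]      # an evaluation point (the barycenter)
tab = element.tabulate(1, points)  # basis values and first derivatives
values = tab[(0, 0, 0)]            # array of shape (4 basis functions, 1 point)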
FFC+UFL: Automated evaluation of weak forms in mathematical notation on one cell, based on code generation with the Unified Form Language (UFL) and the FEniCS Form Compiler (FFC) [13, 11], using the basis functions φ_V from FIAT. For example, in the case of the Laplacian operator,

A^K_ij = a^K(φ_i, φ_j) = ∫_K ∇φ_i · ∇φ_j dx = ∫_K lhs(r(φ_i, φ_j) dx)

where A^K is the element stiffness matrix and r(·, ·) is the weak residual.
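The lhs splitting used above can be expressed directly in UFL; the following is a minimal sketch, assuming a legacy UFL version contemporary with FEniCS-HPC (the element and function names mirror Figure 2):

# Split a weak residual into its bilinear and linear parts.
from ufl import (FiniteElement, TestFunction, TrialFunction, Coefficient,
                 tetrahedron, dot, grad, dx, lhs, rhs)

Q = FiniteElement("CG", tetrahedron, 1)
v = TestFunction(Q)
u = TrialFunction(Q)
f = Coefficient(Q)

r = dot(grad(u), grad(v))*dx - f*v*dx  # weak residual of Poisson's equation
a = lhs(r)  # bilinear part: dot(grad(u), grad(v))*dx
L = rhs(r)  # linear part: f*v*dx (rhs negates, so a(u, v) = L(v) is consistent)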
DOLFIN-HPC: Automated high-performance assembly of weak forms, with interfaces to linear algebra for the discrete systems and to mesh refinement on a distributed mesh T [10]:

A = 0
for all cells K ∈ T:
    A += A^K
solve Ax = b
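The assembly loop above can be sketched as follows (a dense, serial illustration with hypothetical helper names; DOLFIN-HPC's actual assembler is sparse, distributed and written in C++):

# Schematic global assembly: A = sum over cells K of the element matrix A^K.
import numpy as np

def assemble(cells, dofmap, element_tensor, ndofs):
    A = np.zeros((ndofs, ndofs))     # A = 0
    for K in cells:                  # for all cells K in T
        AK = element_tensor(K)       # element matrix from FFC-generated code
        dofs = dofmap(K)             # local-to-global degree-of-freedom map
        A[np.ix_(dofs, dofs)] += AK  # A += A^K (scatter-add)
    return A                         # then solve Ax = b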
Unicorn: Automated Unified Continuum modeling, with Unicorn choosing a specific weak residual form for the incompressible balance equations of mass and momentum, with example visualizations of an aircraft simulation and of turbulent FSI in the vocal folds [4]:

r_UC((v, q), (u, p)) = (v, ρ(∂_t u + (u · ∇)u) + ∇ · σ - g) + (q, ∇ · u) + LS((v, q), (u, p))

where LS is a least-squares stabilizing term described in [7].
Fig. 1: FEniCS-HPC component dependency diagram.
A user of FEniCS-HPC writes the weak forms in the UFL language, compiles them with FFC, and includes them in a high-level “solver” written in C++ with DOLFIN-HPC to read in a mesh, assemble the forms, solve linear systems, refine the mesh, etc. The Unicorn solver for adaptive computation of turbulent flow and FSI is developed as part of FEniCS-HPC.
2.1 Solving PDE problems in FEniCS-HPC
Poisson’s equation. To solve Poisson’s equation in weak form, (∇u, ∇v) - (f, v) = 0 ∀v ∈ V, in the framework we first define the weak form in a UFL “form file”, closely mapping mathematical notation (see Figure 2). The form file is then compiled to low-level C++ source code for assembling the local element matrix and vector with FFC. Finally, we use DOLFIN-HPC to write a high-level “solver” in C++, composing the different abstractions: a mesh is defined, the global matrix and vector are assembled by interfacing to the generated source code, the linear system is solved through an abstract parallel linear algebra interface (using PETSc as the default back-end), and the solution function is saved to disk. The source code for an example solver is presented in Figure 2.
Q = FiniteElement("CG", "tetrahedron", 1)
v = TestFunction(Q)   # test basis function
u = TrialFunction(Q)  # trial basis function
f = Coefficient(Q)    # source function

# Bilinear and linear forms
a = dot(grad(v), grad(u))*dx
L = v*f*dx

// Define mesh, BCs and coefficients
PoissonBoundary boundary;
PoissonBoundaryValue u0(mesh);
SourceFunction f(mesh);
DirichletBC bc(u0, mesh, boundary);

// Define PDE
PoissonBilinearForm a;
PoissonLinearForm L(f);
LinearPDE pde(a, L, mesh, bc);

// Solve PDE
Function u;
pde.solve(u);

// Save solution to file
File file("poisson.pvd");
file << u;

Fig. 2: Poisson solver in FEniCS-HPC with the weak form in the UFL language (top) and the solver in C++ using DOLFIN-HPC (bottom).
The incompressible Navier-Stokes equations. We formulate the General Galerkin (G2) method for the incompressible Navier-Stokes equations (1) in UFL by direct input of the weak residual. We can automatically derive the Jacobian in a quasi-Newton fixed-point formulation, and also automatically linearize and generate the adjoint problem needed for adaptive error control. These examples are presented in Figure 3.
V = VectorElement("CG", "tetrahedron", 1)
Q = FiniteElement("CG", "tetrahedron", 1)
v = TestFunction(V); q = TestFunction(Q)
u_ = TrialFunction(V); p_ = TrialFunction(Q)
u = Coefficient(V); p = Coefficient(Q)
u0 = Coefficient(V); um = 0.5*(u + u0)

# Momentum and continuity weak residuals
r_m = (inner(u - u0, v)/k + \
      (nu*inner(grad(um), grad(v)) + \
       inner(grad(p) + grad(um)*um, v)))*dx + LS_u*dx
r_c = inner(div(u), q)*dx + LS_p*dx

# Newton's method: J u_{i+1} = J u_i - F(u_i)
a = derivative(r_m, u, u_)
L = action(a, u) - r_m

# Adjoint problem (stationary part) for r_m
a_adjoint = adjoint(derivative(r_m - inner(u, v)/k*dx, u))
L_adjoint_c = derivative(action(r_c, p), u, v)
L_adjoint = inner(psi_m, v)*dx - L_adjoint_c
Fig. 3: Example of weak forms in UFL notation for the cG(1)cG(1) method for the incompressible Navier-Stokes equations, together with the quasi-Newton linearization and the adjoint problem.
3 Parallelization strategy and performance

The parallelization is based on a fully distributed mesh approach, where everything from preprocessing, assembly of linear systems and postprocessing to refinement is performed in parallel, without representing the entire problem or any pre-/postprocessing step on a single core.

Initial data distribution is defined by the graph partitioning of the corresponding dual graph of the mesh. Each core is assigned a set of whole elements, and the vertex overlap between cores is represented as ghosted entities.
3.1 Parallel assembly
Assembly of the global matrix is performed in a straightforward fashion: each core computes the local matrices of its local elements and adds them to the global matrix. Since we assign whole elements to each core, data dependencies during assembly are minimized. Furthermore, we renumber all the degrees of freedom such that a minimal amount of communication is required when modifying entries in the sparse matrix.
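This strategy can be sketched as follows (illustrative Python with hypothetical names, not the DOLFIN-HPC API; in practice the caching and exchange of off-process entries is handled by the linear algebra backend, e.g. PETSc's matrix assembly phases):

# Schematic owner-computes parallel assembly over wholly-owned cells.
def parallel_assemble(local_cells, dofmap, element_tensor, A):
    for K in local_cells:        # each cell is owned by exactly one core
        AK = element_tensor(K)   # local element matrix (generated code)
        dofs = dofmap(K)         # renumbered so most indices are core-local
        A.add(AK, dofs)          # local rows are updated in place; off-process
                                 # entries are cached instead of sent one by one
    A.finalize()                 # a single communication phase flushes caches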
3.2 Solution of discrete system

The FEM discretization generates a nonlinear algebraic equation system to be solved for each time step. In Unicorn we solve this by iterating between the velocity and pressure equations with a Picard or quasi-Newton iteration [6].

Each iteration in turn generates a linear system to be solved. We use simple Krylov solvers and preconditioners that scale well to many cores, typically BiCGSTAB with a block-Jacobi preconditioner, where each block is solved with ILU(0).
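With the PETSc backend, this solver configuration corresponds to standard PETSc runtime options; as a sketch, assuming the solver forwards its options to PETSc:

-ksp_type bcgs      # BiCGSTAB Krylov solver
-pc_type bjacobi    # block-Jacobi preconditioner, one block per core
-sub_pc_type ilu    # ILU factorization on each block (zero fill levels by default)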
3.3 Mesh refinement

Local mesh refinement is based on a parallelization of the well-known recursive longest-edge bisection method [15]. The parallelization splits the refinement into two phases. First, a local serial refinement phase bisects all elements marked for refinement on each core (concurrently), leaving several hanging nodes on the shared interface between cores. The second phase propagates these hanging nodes onto adjacent cores.

The algorithm iterates between local refinement and global propagation until all cores are free of hanging nodes. For an efficient implementation, one has to detect when all cores are idle at the same time. Our implementation uses a fully distributed termination detection scheme, which includes termination detection in the global propagation step by using recursive doubling or hypercube exchange type communication patterns [10]. The termination detection algorithm has no central point of control, hence no bottlenecks, less message contention, and no problems with load imbalance.
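The two-phase structure can be sketched as follows (illustrative Python with mpi4py-style collectives; local_bisection and propagate are hypothetical helpers, and the collective check stands in for the fully distributed termination detection of [10]):

# Schematic two-phase parallel refinement loop (illustration only).
def parallel_refine(mesh, marked, comm):
    while True:
        # Phase 1: concurrent serial longest-edge bisection on each core,
        # leaving hanging nodes on the shared interfaces between cores.
        hanging = local_bisection(mesh, marked)
        # Phase 2: propagate the hanging nodes onto the adjacent cores,
        # which marks further elements for refinement there.
        marked = propagate(mesh, hanging, comm)
        # Stop when no core has work left; shown as a collective reduction
        # for clarity, whereas the real scheme has no central control point.
        if comm.allreduce(len(marked)) == 0:
            return mesh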
Dynamic load balancing. In order to sustain good load balance across several adaptive iterations, dynamic load balancing is needed. DOLFIN-HPC is equipped with a scratch-and-remap type load balancer, based on the widely used PLUM scheme [14], where the new partitions are assigned in an optimal way by solving the maximally weighted bipartite graph problem. We have improved the scheme such that it scales linearly to thousands of cores [10, 8].

Furthermore, we have extended the load balancer with an a priori workload estimation. With a dry run of the refinement algorithm, we add weights to a dual graph of the mesh, corresponding to the workload after refinement. Finally, we repartition the unrefined mesh according to the weighted dual graph and redistribute the new partitions before the refinement.
4 Strong scalability

To be able to take advantage of available supercomputers today, the entire solver in FEniCS-HPC needs to demonstrate good strong scaling to at least several thousand cores. For planned “exascale” systems with many millions of cores, strong scalability has to be attained for at least hundreds of thousands of cores.
In this section we analyze scaling results using the PETSc parallel linear algebra backend, based on pure MPI, and the JANPACK backend, based on PGAS.

In Figure 4 we present strong scalability results with the pure-MPI PETSc backend for the full G2 method for the turbulent incompressible Navier-Stokes equations (1) (assembly of the linear systems and solution of the momentum and continuity equations) in 3D, on a mesh with 147M vertices, on the Hornet Cray XC40 computer. We observe near-optimal scaling to ca. 4.6 kcores for all the main algorithms (assembly and linear solves). Going from 4.6 kcores to 9.2 kcores we start to see a degradation in the scaling, with a speedup of ca. 0.7, and from 9.2 kcores to 18.4 kcores the speedup is 0.5. It is clear that it is mainly the assembly that shows degraded scaling.

In Figure 5 we present results for assembling four different equations using the JANPACK backend, where FEniCS-HPC runs in a hybrid MPI+PGAS mode. We observe that for large numbers of cores, the low-latency one-sided communication of PGAS languages, in combination with our new sparse matrix format [9], greatly improves the scalability.
Fig. 4: Strong scalability test for the full G2 method for incompressible turbulent Navier-Stokes equations (assembly of the linear systems and solution of momentum and continuity) in 3D on a Cray XC40.
5 Unicorn simulation of a full aircraft

In the Unicorn component we implement the full G2 method, fixing the weak residual to the cG(1)cG(1) stabilized space-time method for the incompressible Navier-Stokes equations (or a general stress for FSI).
Fig. 5: Sparse matrix assembly timings (runtime in seconds versus number of cores) for four different equations on a Cray XC40, comparing the PETSc and JANPACK backends: 2D convection-diffusion (214M cells), 3D Poisson (317M cells), 3D Navier-Stokes (80M cells) and 3D linear elasticity (14M cells).

In a cG(1)cG(1) method [7] we seek an approximate space-time solution Û = (U, P) which is continuous piecewise linear in space and time (equivalent to the implicit Crank-Nicolson method). With I a time interval with subintervals I_n = (t_{n-1}, t_n), W^n a standard spatial finite element space of continuous piecewise linear functions, and W^n_0 the functions in W^n which are zero on the boundary Γ, the cG(1)cG(1) method for constant density incompressible flow with homogeneous Dirichlet boundary conditions for the velocity takes the form: for n = 1, ..., N, find (U^n, P^n) ≡ (U(t_n), P(t_n)) with U^n ∈ V^n_0 ≡ [W^n_0]^3 and P^n ∈ W^n, such that

r((U, P), (v, q)) = ((U^n - U^{n-1}) k_n^{-1} + (Ū^n · ∇)Ū^n, v) + (2ν ε(Ū^n), ε(v))
                  - (P, ∇ · v) + (∇ · Ū^n, q) + LS = 0,   ∀ v̂ = (v, q) ∈ V^n_0 × W^n    (1)

where Ū^n = 1/2 (U^n + U^{n-1}) is piecewise constant in time over I_n and LS is a least-squares stabilizing term described in [7].
We formulate a new general adjoint-based method for adaptive error control, based on the following error representation and adjoint weak bilinear and linear forms, with the error ê = û - Û, adjoint solution φ̂, output quantity ψ, the hat signifying the full velocity-pressure vector Û = (U, P), and r_G = r - LS:

(ê, ψ) = r(ê, φ̂) = -r_G(Û; φ̂),   a_adjoint(v, φ̂) = r(v, φ̂),   L_adjoint(v) = (v, ψ)    (2)
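The resulting adaptive algorithm alternates primal solves, adjoint solves and refinement. A schematic sketch follows (illustrative Python; all helper names and the 10% marking fraction are assumptions, not Unicorn's actual interface):

# Schematic goal-oriented adaptive loop built on (1) and (2).
def adaptive_solve(mesh, psi, TOL, max_iter=20):
    for it in range(max_iter):
        U = solve_primal(mesh)             # cG(1)cG(1) solve of (1)
        phi = solve_adjoint(mesh, U, psi)  # adjoint problem from (2)
        eta = indicators(mesh, U, phi)     # cellwise residuals weighted by phi
        if sum(eta) < TOL:                 # estimated error in the output psi
            break
        marked = top_fraction(eta, 0.1)    # mark cells with largest indicators
        mesh = refine(mesh, marked)        # parallel bisection of Sect. 3.3
    return U, mesh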
We have used our adaptive finite element methodology for turbulent flow and the FEniCS-HPC software to solve the incompressible Navier-Stokes equations for the flow past a full high-lift aircraft model (DLR-F11) with complex geometry at realistic Reynolds numbers for take-off and landing. This work is an extension of our contributed simulation results to the 2nd AIAA CFD High-Lift Prediction Workshop (HiLiftPW-2), in San Diego, California, in 2013 [5].

In the following results we focus on the angle of attack α = 18.5°. To quantify mesh convergence we plot the coefficients and their relative error compared to the experimental values (serving as the reference) versus the number of vertices in the meshes, and plot meshes and volume renderings of quantities related to the adaptivity in Figure 6.

We see that our adaptive computational results come very close to the experimental results on the finest mesh, with a relative error under 1% for cl and cd. For other angles we observe similar results, presented in [5].
Fig. 6: Plots for the aircraft simulation at α = 18.5°. Lift coefficient, cl, and drag coefficient, cd, vs. angle of attack, α, for the different meshes from the iterative adaptive method (left). Slice aligned with the angle of attack showing the tetrahedra of the starting mesh versus the finest adaptive mesh (top right). Volume rendering of the velocity residual and adjoint velocity magnitude (bottom right).
6 Summary
We have given an overview of the general FEniCS-HPC software framework for automated solution of PDE, taking the weak form as input in near-mathematical notation, with automated discretization and a new simple method for adaptive error control, suitable for parallel implementation. On the Hornet Cray XC40 supercomputer we demonstrate new optimal strong scaling results for the whole adaptive framework applied to turbulent flow on massively parallel architectures, down to 25000 vertices per core with ca. 5000 cores with the MPI-based PETSc backend, and for assembly down to 500 vertices per core with ca. 20000 cores with the PGAS-based JANPACK backend.

Using the Unicorn component in FEniCS-HPC we have simulated the aerodynamics of a full DLR-F11 aircraft in connection with the HiLiftPW-2 benchmarking workshop. We find that the simulation results compare very well with experimental data; moreover, we show mesh convergence by the adaptive method, while using a low number of spatial degrees of freedom.
Acknowledgments
This research has been supported by EU-FET grant EUNISON 308874, the European Research Council, the Swedish Foundation for Strategic Research, the Swedish Research Council, the Basque Excellence Research Center (BERC 2014-2017) program of the Basque Government, the Spanish Ministry of Economy and Competitiveness MINECO (BCAM Severo Ochoa accreditation SEV-2013-0323), and the Spanish MINECO project MTM2013-40824.

We acknowledge PRACE for awarding us access to the supercomputer resources Hermit, Hornet and SuperMUC, based in Germany at the High Performance Computing Center Stuttgart (HLRS) and the Leibniz Supercomputing Centre (LRZ), to resources from the Swedish National Infrastructure for Computing (SNIC) at PDC – Center for High-Performance Computing, and to resources provided by the “Red Española de Supercomputación” and the “Barcelona Supercomputing Center – Centro Nacional de Supercomputación”.

We would also like to acknowledge the FEniCS and FEniCS-HPC developers globally.
References
1. W. Bangerth, R. Hartmann, and G. Kanschat. deal.II — a general-purpose object-oriented finite element library. ACM Trans. Math. Softw., 33(4), 2007.
2. FEniCS. FEniCS project, 2003. http://www.fenicsproject.org.
3. F. Hecht. New development in FreeFem++. J. Numer. Math., 20, 2012.
4. J. Hoffman, J. Jansson, R. Vilela de Abreu, N. C. Degirmenci, N. Jansson, K. Müller, M. Nazarov, and J. H. Spühler. Unicorn: Parallel adaptive finite element simulation of turbulent flow and fluid-structure interaction for deforming domains and complex geometry. Comput. Fluids, 80:310–319, 2013.
5. J. Hoffman, J. Jansson, N. Jansson, and R. Vilela De Abreu. Towards a parameter-free method for high Reynolds number turbulent flow simulation based on adaptive finite element approximation. Comput. Methods Appl. Mech. Engrg., 288:60–74, 2015.
6. J. Hoffman, J. Jansson, and M. Stöckli. Unified continuum modeling of fluid-structure interaction. Math. Mod. Meth. Appl. S., 2011.
7. J. Hoffman and C. Johnson. Computational Turbulent Incompressible Flow, volume 4 of Applied Mathematics: Body and Soul. Springer, 2007.
8. N. Jansson. High Performance Adaptive Finite Element Methods: With Applications in Aerodynamics. PhD thesis, KTH Royal Institute of Technology, 2013.
9. N. Jansson. Optimizing sparse matrix assembly in finite element solvers with one-sided communication. In High Performance Computing for Computational Science – VECPAR 2012, volume 7851 of Lecture Notes in Computer Science. Springer, 2013.
10. N. Jansson, J. Hoffman, and J. Jansson. Framework for massively parallel adaptive finite element computational fluid dynamics on tetrahedral meshes. SIAM J. Sci. Comput., 34(1):C24–C41, 2012.
11. R. C. Kirby and A. Logg. A compiler for variational forms. ACM Trans. Math. Softw., 32(3):417–444, 2006.
12. R. C. Kirby. Algorithm 839: FIAT, a new paradigm for computing finite element basis functions. ACM Trans. Math. Softw., 2004.
13. A. Logg, K.-A. Mardal, G. N. Wells, et al. Automated Solution of Differential Equations by the Finite Element Method. Springer, 2012.
14. L. Oliker. PLUM: Parallel load balancing for unstructured adaptive meshes. Technical Report RIACS-TR-98-01, RIACS, NASA Ames Research Center, 1998.
15. M. C. Rivara. New longest-edge algorithms for the refinement and/or improvement of unstructured triangulations. Int. J. Numer. Meth. Eng., 1997.
