Highlights
An Efficient Sliding Mesh Interface Method for High-Order Discontinuous Galerkin Schemes
Jakob Dürrwächter, Marius Kurz, Patrick Kopper, Daniel Kempf, Claus-Dieter Munz, Andrea Beck
• An efficient parallelization strategy for a high-order accurate sliding mesh method
• Investigation of the method's scaling behavior on high performance computing systems
• A wall-resolved large eddy simulation of a 1-1/2 stage turbine
An Efficient Sliding Mesh Interface Method for High-Order Discontinuous Galerkin Schemes
Jakob Dürrwächter^{a,1,*}, Marius Kurz^{a,1}, Patrick Kopper^{b}, Daniel Kempf^{a}, Claus-Dieter Munz^{a}, Andrea Beck^{c}
^a Institute of Aerodynamics and Gas Dynamics, University of Stuttgart, Pfaffenwaldring 21, 70569 Stuttgart, Germany
^b Institute of Aircraft Propulsion Systems, University of Stuttgart, Pfaffenwaldring 6, 70569 Stuttgart, Germany
^c Laboratory of Fluid Dynamics and Technical Flows, University of Magdeburg "Otto von Guericke", Universitätsplatz 2, 39106 Magdeburg, Germany
Abstract
Sliding meshes are a powerful method to treat deformed domains in computational fluid dynamics, where different
parts of the domain are in relative motion. In this paper, we present an efficient implementation of a sliding mesh
method into a discontinuous Galerkin compressible Navier-Stokes solver and its application to a large eddy simulation
of a 1-1/2 stage turbine. The method is based on the mortar method and is high-order accurate. It can handle
three-dimensional sliding mesh interfaces with various interface shapes. For plane interfaces, which are the most
common case, conservativity and free-stream preservation are ensured. We put an emphasis on efficient parallel
implementation. Our implementation generates little computational and storage overhead. Inter-node communication
via MPI in a dynamically changing mesh topology is reduced to a bare minimum by ensuring a priori information
about communication partners and data sorting. We provide performance and scaling results showing the capability
of the implementation strategy. Apart from analytical validation computations and convergence results, we present a
wall-resolved implicit LES of the 1-1/2 stage Aachen turbine test case as a large scale practical application example.
Keywords: Sliding mesh, Discontinuous Galerkin, High-order methods, High-performance computing, Large eddy
simulation, Turbine flow
1. Introduction
High-order methods have gained significant popularity over the last decade due to their potential advantages in
terms of accuracy and efficiency for many applications [1]. A variety of high-order, element-based schemes has been
proposed in the literature, such as reconstruction-based approaches like the WENO schemes [2, 3], h/p finite element
schemes [4, 5], the flux reconstruction method [6] and the spectral difference method [7, 8], with the latter two also
being closely related to the discontinuous Galerkin methods [9, 10]. In particular, the discontinuous Galerkin spectral
element method (DGSEM) [11, 12] has shown its suitability for simulations of unsteady turbulent flows in large-scale
applications and high-performance computing (HPC) as a consequence of its high order of accuracy as well as its
excellent scaling properties [1, 13, 14, 15, 16]. While the development of novel formulations and efficient schemes is
still ongoing, there is sufficient evidence from published results that DG and related methods have reached a certain
level of maturity. Thus, adding new features to the existing formulations and exploring them, while retaining the
original properties, has now become another focus of ongoing development.
^* Corresponding author
Email addresses: jd@iag.uni-stuttgart.de (Jakob Dürrwächter), m.kurz@iag.uni-stuttgart.de (Marius Kurz),
kopper@ila.uni-stuttgart.de (Patrick Kopper), kempf@iag.uni-stuttgart.de (Daniel Kempf), munz@iag.uni-stuttgart.de
(Claus-Dieter Munz), beck@iag.uni-stuttgart.de (Andrea Beck)
^1 J. Dürrwächter and M. Kurz share first authorship.
Preprint submitted to Computers & Fluids, December 11, 2020
Many fields of engineering interest, such as turbomachinery, wind turbines and rotorcraft, are characterized by
moving geometries and large periodic displacements. A common technique to incorporate this movement into numerical
schemes is the arbitrary Lagrangian-Eulerian (ALE) approach [17], which introduces a time-dependent mapping from
an arbitrarily deformed domain to the undeformed reference. Mesh movement can also be used for fully Lagrangian
formulations [18] and has in this context also been considered on curvilinear meshes [19, 20]. Since the mesh topology
and connectivity remain unchanged for the ALE approach, it is often limited to small or moderate relative displace-
ments to retain valid grid cells. In order to accommodate larger displacements, the mesh topology has to be dynamic.
To this end, several approaches based on the ALE method have been developed: A quite recent high-order accu-
rate family of methods allows for rather general topology changes and large displacements. It allows for re-meshing
through a continuous mesh movement between timesteps, which leads to straight polyhedral space-time meshes (two-
dimensional in space), which are unstructured both in space and time [21, 22]. Hanging nodes and sliding lines [23]
have also been incorporated into this approach [24]. In the overset mesh (Chimera) method [25, 26, 27, 28, 29], sev-
eral independent meshes inside the computational domain are overlaid, which can be moved independently and are
coupled by means of the overlapping elements and associated inter-mesh interpolation operators. The sliding mesh
approach, on the other hand, is less general, which allows it to be conceptually simpler, easier to implement and
potentially more computationally efficient. In the sliding mesh method [30, 31, 32, 33, 34], the computational domain is
divided into non-overlapping sub-domains, which can slide along a common interface while preserving the mesh ge-
ometry inside each sub-domain. This approach can be seen as a special case of the Chimera method, where the overlap
area is restricted to a linear (in 2D) or planar (in 3D) shared region, and the movement is prescribed accordingly. For
both approaches, special care must be taken to construct high-order variants, which guarantee global conservation and
overall error scaling behavior, especially if the consistent geometry representation of curved interfaces has to be taken
into account [35, 36, 37, 38, 39, 40, 41]. Schematics of the discussed methods are shown in Figure 1.
In this work, we thus focus on the challenge of combining the sliding mesh approach with a high-order discontinuous
Galerkin solver. This requires not only a conservative, high-order accurate method at the sub-domain interface, but
also the design of an efficient and dynamic parallelization strategy which retains the scaling of the static version.
These efforts then give us the ability to conduct high-order sliding mesh simulations at an industrial scale, and
provide a novel tool for investigating the highly non-linear and unsteady flow physics in these scenarios. The
implementation considered in the present work is open source and can be retrieved from GitHub^2.
Figure 1: Schematics of common moving mesh techniques: the arbitrary Lagrangian-Eulerian approach (ALE) (left), the sliding mesh method
(middle) and the overset mesh or Chimera method (right)
The method was used to investigate an interesting and challenging industrial application, the Aachen 1-1/2 stage axial
flow turbine [42, 43, 44]. It serves to study both complex turbulent flow phenomena and stage interaction, the
prediction of which is crucial to a turbine's performance. Multi-stage turbomachines consist of several alternating rows
of static and rotating blades. The large deformations due to the unbounded relative displacement between static and
passing rotating blade rows pose serious challenges to numerical codes, which can, however, be handled well with
the sliding mesh method. The Aachen 1-1/2 stage turbine is a subsonic rig featuring a stator-rotor-stator configuration
with identical blade geometries for both stators, thereby offering the opportunity to study the influences of transient
behavior and wake interaction in the same setup.
The outline of this paper is as follows: The governing equations on the moving domain are given in Section 2 and
the DGSEM scheme as well as the treatment of non-conforming meshes are discussed in Section 3. Section 4 gives
details on the proposed implementation and parallelization strategy for the sliding mesh method. In Section 5, we
demonstrate the error convergence of the method, its scaling properties and the suitability for large-scale applications.
To this end, an implicit, wall-resolved large eddy simulation (LES) of the 1-1/2 stage turbine test case Aachen Turbine
is presented and discussed. Section 6 concludes the paper, and we give an outlook on further developments.
^2 https://github.com/flexi-framework/flexi-extensions/tree/sliding-mesh
2. Governing Equations
In this work, we consider the three-dimensional compressible Navier-Stokes equations, which can be written in
conservative form as
\begin{equation}
U_t + \nabla_x \cdot F^c(U) - \nabla_x \cdot F^v(U, \nabla_x U) = 0, \tag{1}
\end{equation}
with the vector of conserved variables in usual notation $U = [\rho, \rho v_1, \rho v_2, \rho v_3, \rho e]^T$ and its
time derivative $U_t$, the convective fluxes $F^c$, the viscous fluxes $F^v$ and the differential operator $\nabla_x$
with respect to the physical coordinates $x = [x_1, x_2, x_3]^T$. To account for the moving frame of reference in case
of mesh movement, the fluxes can be written in an arbitrary Lagrangian-Eulerian (ALE) formulation [45], which yields
the fluxes with columns $i = 1, 2, 3$ as
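\begin{equation}
F^c_i =
\begin{bmatrix}
\rho v_i \\ \rho v_1 v_i + \delta_{1i}\, p \\ \rho v_2 v_i + \delta_{2i}\, p \\ \rho v_3 v_i + \delta_{3i}\, p \\ \rho e v_i + p v_i
\end{bmatrix}
- v^g_i
\begin{bmatrix}
\rho \\ \rho v_1 \\ \rho v_2 \\ \rho v_3 \\ \rho e
\end{bmatrix},
\qquad
F^v_i =
\begin{bmatrix}
0 \\ \tau_{1i} \\ \tau_{2i} \\ \tau_{3i} \\ \tau_{ij} v_j - q_i
\end{bmatrix}. \tag{2}
\end{equation}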
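Here, $v^g = [v^g_1, v^g_2, v^g_3]^T$ denotes the velocity of the grid in Cartesian coordinates, which induces an
additional flux contribution, and $\delta$ denotes the Kronecker delta. The stress tensor $\tau_{ij}$ and the heat
flux $q_i$ can be written as
\begin{equation}
\tau_{ij} = \mu \left( \frac{\partial v_i}{\partial x_j} + \frac{\partial v_j}{\partial x_i} \right) + \lambda\, \delta_{ij} \frac{\partial v_k}{\partial x_k}, \tag{3}
\end{equation}
\begin{equation}
q_i = -k \frac{\partial T}{\partial x_i}, \tag{4}
\end{equation}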
where $k$ denotes the heat conductivity and $T$ the static temperature. Moreover, Stokes' hypothesis states
$\lambda = -\frac{2}{3}\mu$ with $\mu$ as dynamic viscosity. The equation system is closed with the perfect gas
assumption, which yields the equation of state
\begin{equation}
p = \rho R T = \rho (\gamma - 1) \left[ e - \frac{1}{2} \left( v_1^2 + v_2^2 + v_3^2 \right) \right] \tag{5}
\end{equation}
with the ratio of specific heats $\gamma$ and the specific gas constant $R$.
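As an illustration of Eqs. (2) and (5), the following minimal sketch evaluates the pressure and one column of the ALE convective flux for a single state vector. The function names and the value $\gamma = 1.4$ are assumptions made for this example and are not taken from FLEXI.

```python
import numpy as np

GAMMA = 1.4  # ratio of specific heats (assumed value for air)

def pressure(U, gamma=GAMMA):
    """Pressure from U = [rho, rho*v1, rho*v2, rho*v3, rho*e], cf. Eq. (5)."""
    rho, mom, rho_e = U[0], U[1:4], U[4]
    kinetic = 0.5 * np.dot(mom, mom) / rho        # 0.5 * rho * |v|^2
    return (gamma - 1.0) * (rho_e - kinetic)

def ale_convective_flux(U, v_grid, i, gamma=GAMMA):
    """Column i (0-based) of the ALE convective flux F^c_i - v^g_i * U, cf. Eq. (2)."""
    U = np.asarray(U, dtype=float)
    v = U[1:4] / U[0]
    p = pressure(U, gamma)
    flux = v[i] * U                # rho*v_i, rho*v_j*v_i and rho*e*v_i
    flux[1 + i] += p               # delta_{ji} * p in the momentum rows
    flux[4] += p * v[i]            # pressure work in the energy row
    return flux - v_grid[i] * U    # contribution of the grid velocity
```

If the grid moves with the local fluid velocity, the advective part cancels and only the pressure terms remain, which gives a quick plausibility check of the sign of the grid velocity term.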
3. Numerical Methods
3.1. Discontinuous Galerkin Spectral Element Method
For the discontinuous Galerkin spectral element method (DGSEM), the computational domain is subdivided into
non-overlapping elements. Each element is mapped into the reference element $E = [-1, 1]^3$ using a polynomial
mapping $x = x(\xi)$ with degree $N_{geo}$ as described in more detail in [46]. By defining the Jacobian of the
mapping $J = \det(\partial x_i / \partial \xi_j)$ and introducing the contravariant fluxes $\mathcal{F}$ (again, see
[46] for details), Eq. (1) can be written in the reference space as
\begin{equation}
J(\xi)\, U_t + \nabla_\xi \cdot \mathcal{F}(U, \nabla_\xi U) = 0. \tag{6}
\end{equation}
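The projection of Eq. (6) onto the polynomial test space, spanned by test functions $\phi(\xi)$ in the reference
element $E$, and subsequent integration by parts yields the weak formulation as
\begin{equation}
\int_E J(\xi)\, U_t\, \phi(\xi)\, dE + \oint_{\partial E} (\mathcal{F} \cdot N)\, \phi(\xi)\, dS - \int_E \mathcal{F}(U) \cdot \nabla_\xi \phi(\xi)\, dE = 0, \tag{7}
\end{equation}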
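where $N$ is the outward pointing face normal vector. For DGSEM, the solution $U$ and the fluxes $\mathcal{F}$ are
approximated by polynomial basis functions, which are the tensor product of one-dimensional nodal Lagrange polynomials
$\ell(\xi^k)$ that satisfy the cardinal property $\ell_i(\xi^k_j) = \delta_{ij}$ on a given set of interpolation
points $\{\xi^k_j\}$ with $j = 0, ..., N$ and $k$ indicating the spatial dimension. The solution and the fluxes in
three dimensions can therefore be written as
\begin{equation}
U \approx \sum_{i,j,k=0}^{N} \hat{U}_{ijk}\, \ell_i(\xi^1)\, \ell_j(\xi^2)\, \ell_k(\xi^3), \tag{8}
\end{equation}
\begin{equation}
\mathcal{F} \approx \sum_{i,j,k=0}^{N} \hat{\mathcal{F}}_{ijk}\, \ell_i(\xi^1)\, \ell_j(\xi^2)\, \ell_k(\xi^3). \tag{9}
\end{equation}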
The test functions are chosen identical to the basis functions and the integrals in Eq. (7) are evaluated numerically
with collocation of the integration points and interpolation points, for which the Legendre-Gauss and the
Legendre-Gauss-Lobatto points are common choices. The choice of shared nodes for both operators leads to a highly
efficient numerical scheme with significantly reduced operation counts in 2D and 3D cases. The gradients of the
solution vector for evaluation of the viscous fluxes are obtained with the lifting method by Bassi and Rebay, commonly
referred to as their "first method" [47]. The individual elements are only weakly coupled by the fluxes across the
elements' faces in the surface integral, which are approximated by a Riemann solver. An appropriate Runge-Kutta method
is used to advance the solution in time.
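The collocation idea can be illustrated with a short sketch that constructs Legendre-Gauss-Lobatto points and weights and evaluates the nodal Lagrange basis. The helper names are hypothetical and chosen for this example only. Because the basis is cardinal at the quadrature points, the collocated mass matrix reduces to a diagonal matrix holding the quadrature weights.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def lgl_nodes_weights(N):
    """Legendre-Gauss-Lobatto nodes and weights on [-1, 1] for degree N >= 1."""
    PN = leg.Legendre.basis(N)
    interior = PN.deriv().roots()                  # interior nodes: roots of P_N'
    x = np.concatenate(([-1.0], np.sort(interior.real), [1.0]))
    w = 2.0 / (N * (N + 1) * PN(x) ** 2)
    return x, w

def lagrange_basis(nodes, x):
    """Matrix L[i, m] = l_i(x[m]) of the Lagrange polynomials through `nodes`."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    L = np.ones((len(nodes), len(x)))
    for i, xi in enumerate(nodes):
        for j, xj in enumerate(nodes):
            if i != j:
                L[i] *= (x - xj) / (xi - xj)
    return L

def collocated_mass_matrix(N):
    """Mass matrix evaluated with the LGL quadrature at the interpolation points."""
    x, w = lgl_nodes_weights(N)
    L = lagrange_basis(x, x)                       # identity by the cardinal property
    return (L * w) @ L.T                           # equals diag(w)
```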
Throughout this work, all computations were carried out using Legendre-Gauss-Lobatto interpolation points in
conjunction with the Split-DG formulation by Pirozzoli [48] and Roe's approximate Riemann solver [49] with an entropy
fix by Harten and Hyman [50]. Unless stated otherwise, a fourth-order low-storage Runge-Kutta method [51] is
employed for time integration. A complete description of the method, its implementation and parallelization in the code
framework FLEXI as well as validation and application examples can be found in [14].
3.2. Non-conforming Meshes
The sliding mesh approach, representing the focus of this work, naturally introduces non-conforming element in-
terfaces (sometimes also called hanging nodes) at the sub-domain interface. Even if the initial topology is conforming,
the relative movement of adjacent meshes creates a time-dependent interface architecture, in which element neighbors
constantly change. In this section, we briefly summarize the static case of such a non-conforming mesh, and extend it
to the moving case in Section 3.3.
A common approach for the coupling of non-conforming domains in spectral element schemes is the mortar method,
originally proposed by Mavriplis in [52] for the incompressible Navier-Stokes equations and applied to compressible
flows by Kopriva in [53, 54]. It was recently shown by Laughton et al. [55], that this mortar approach yields superior
accuracy in comparison to interpolation-based methods, especially for underresolved flows. In the mortar method, the
non-conforming interface is subdivided into two-dimensional mortars in such a way that each mortar has only one
adjacent element face on each side of the interface, as depicted in Figure 2. The sub-domains do not interact directly
with each other across the interface. Instead, the solution on each element face at the interface is first projected onto
its adjacent mortars. The unique fluxes across the interface are then computed on the mortars and the two mortar
fluxes corresponding to each element face are projected back onto these respective element faces on both sides of the
interface. A more detailed description of the method implemented in FLEXI can be found in e.g. [14].
For the configuration depicted in Figure 2, the solution points of the mortars and the element face line up along the
dotted lines in Figure 2(b). This is intended by design - it stems from the fact that we choose the same solution
representation on the mortars as for the usual elements. This choice entails that for the chosen tensor product basis,
the problem decomposes into individual one-dimensional operations along the dotted lines, as shown in Figure 2(c).
Therefore, only the one-dimensional case along one representative line will be considered in the following.
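The decomposition into one-dimensional operations can be sketched as follows: applying a 1D operator along one direction of a face gives the same result as applying the corresponding full two-dimensional (Kronecker product) operator to the flattened face data. The function names are hypothetical; `P` stands for any 1D interpolation or projection matrix.

```python
import numpy as np

def apply_along_first_direction(P, face):
    """out[a, j] = sum_i P[a, i] * face[i, j]: one 1D operation per line j."""
    return P @ face

def apply_as_full_2d_operator(P, face):
    """Same result, written as one large operator on the flattened face."""
    n = face.shape[1]
    full = np.kron(P, np.eye(n))                   # (rows*n) x (cols*n) operator
    return (full @ face.ravel()).reshape(P.shape[0], n)
```

The line-wise form needs only the small 1D matrix, which is why only the one-dimensional operators have to be built and stored.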
Figure 2: Left (a): Schematic view of the used mortar method in three dimensions. Middle (b): The distribution of the
solution points of the element face $\Omega^1$ and the mortars $\Xi^1$, $\Xi^2$ with dotted lines to indicate
alignment. Right (c): The resulting quasi-one-dimensional mortar configuration. Additional spacing between the
elements is added for clarification and the mortars are shown hatched.
Projection Domain → Mortar
Analogous to Eq. (8), the polynomial approximation of the solution on the domain faces $\Omega^k$ is given by
\begin{equation}
U^{\Omega^k} \approx \sum_{i=0}^{N} \hat{U}^{\Omega^k}_i\, \ell_i(\xi), \tag{10}
\end{equation}
where $\xi$ denotes the one-dimensional coordinate on the element side. To express the solution on the mortars
$\Xi^k$, we define a local coordinate $z^k \in [-1, 1]$ such that
\begin{equation}
\xi =
\begin{cases}
\sigma + \dfrac{1 - \sigma}{2} \left( z^1 + 1 \right) & \text{for } \xi > \sigma, \\[1ex]
-1 + \dfrac{1 + \sigma}{2} \left( z^2 + 1 \right) & \text{for } \xi \le \sigma,
\end{cases} \tag{11}
\end{equation}
where $\sigma$ denotes the position of the hanging node in reference space as depicted in Figure 2(c). As shown in
[53], an unweighted $L_2$-projection from the side onto the mortars is sufficient to ensure conservation for
straight-edged elements. Using the same basis functions as for the solution $U$ in Eq. (10) on each mortar, the left
and right solution of the mortar $\Xi^k$ can be written as
\begin{equation}
Q^{\Xi^k, L/R} \approx \sum_{i=0}^{N} \hat{Q}^{\Xi^k, L/R}_i\, \ell_i(z^k). \tag{12}
\end{equation}
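The coordinate mapping of Eq. (11) can be written as a small helper, sketched here under the assumption that mortar 1 covers the part of the face above the hanging node and mortar 2 the part below it. The function name is hypothetical.

```python
import numpy as np

def xi_from_mortar(z, sigma, k):
    """Face coordinate xi for the local mortar coordinate z in [-1, 1], cf. Eq. (11)."""
    z = np.asarray(z, dtype=float)
    if k == 1:                                     # mortar 1 covers [sigma, 1]
        return sigma + 0.5 * (1.0 - sigma) * (z + 1.0)
    if k == 2:                                     # mortar 2 covers [-1, sigma]
        return -1.0 + 0.5 * (1.0 + sigma) * (z + 1.0)
    raise ValueError("k must be 1 or 2")
```

The factors $(1 - \sigma)/2$ and $(1 + \sigma)/2$ are the Jacobians of the two mappings, i.e. half the mortar lengths in reference space.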
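Inserting the polynomial representations Eq. (10) and Eq. (12) into the $L_2$-projection of the solution
$U^{\Omega^1}$ on side $\Omega^1$ onto mortar $\Xi^1$ then reads
\begin{equation}
\int_{-1}^{1} \left( \sum_{i=0}^{N} \hat{Q}^{\Xi^1, L}_i\, \ell_i(z^1) - \sum_{i=0}^{N} \hat{U}^{\Omega^1}_i\, \ell_i(\xi) \right) \ell_j(z^1)\, dz^1 = 0 \quad \text{for } j = 0, ..., N. \tag{13}
\end{equation}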
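Introducing the definitions
\begin{equation}
M_{ij} := \int_{-1}^{1} \ell_i(z^1)\, \ell_j(z^1)\, dz^1 \quad \text{for } i, j = 0, ..., N, \tag{14}
\end{equation}
\begin{equation}
S^{\Omega^1 \to \Xi^1}_{ij} := \int_{-1}^{1} \ell_i\!\left(\xi(z^1)\right) \ell_j(z^1)\, dz^1 \quad \text{for } i, j = 0, ..., N, \tag{15}
\end{equation}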
finally leads to the projection operation as
\begin{equation}
\hat{Q}^{\Xi^1, L} = M^{-1} S^{\Omega^1 \to \Xi^1}\, \hat{U}^{\Omega^1} := P^{\Omega^1 \to \Xi^1}\, \hat{U}^{\Omega^1}, \tag{16}
\end{equation}
where $P^{\Omega^1 \to \Xi^1}$ is defined as the projection matrix from face $\Omega^1$ to mortar $\Xi^1$. The
projection from face $\Omega^1$ onto mortar $\Xi^2$ can be obtained accordingly. In addition, it is worth noting that
the mass matrix $M$ depends only on the basis functions used and is therefore independent of the interface
configuration. Furthermore, the projection from the element faces onto the mortars is formally exact, provided that
the same polynomial degree $N$ is employed for the solution on mortars and element faces. Since the lifting routine
employs a DGSEM discretization for the gradients, these can be obtained by the same procedure as is described here.
The resulting gradients on the element faces can be projected onto the mortars analogously using Eq. (16).
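A minimal numerical sketch of the face-to-mortar projection is given below. It assumes Legendre-Gauss points as collocated interpolation and quadrature points (the computations in this work use Legendre-Gauss-Lobatto points instead) and stores the matrices such that the row index belongs to the mortar test function and the column index to the face basis function. All names are hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def lagrange_basis(nodes, x):
    """Matrix L[i, m] = l_i(x[m]) of the Lagrange polynomials through `nodes`."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    L = np.ones((len(nodes), len(x)))
    for i, xi in enumerate(nodes):
        for j, xj in enumerate(nodes):
            if i != j:
                L[i] *= (x - xj) / (xi - xj)
    return L

def face_to_mortar(N, sigma, k):
    """Projection matrix from the face onto mortar k (1: upper part, 2: lower part)."""
    nodes, w = leggauss(N + 1)                     # exact for polynomials up to degree 2N+1
    if k == 1:
        xi = sigma + 0.5 * (1.0 - sigma) * (nodes + 1.0)
    else:
        xi = -1.0 + 0.5 * (1.0 + sigma) * (nodes + 1.0)
    Lz = lagrange_basis(nodes, nodes)              # l_j(z_m), identity matrix
    Lxi = lagrange_basis(nodes, xi)                # l_i(xi(z_m))
    M = (Lz * w) @ Lz.T                            # M[j, i] = int l_j(z) l_i(z) dz
    S = (Lz * w) @ Lxi.T                           # S[j, i] = int l_j(z) l_i(xi(z)) dz
    return np.linalg.solve(M, S), nodes, xi
```

Since the projection is formally exact for equal polynomial degrees, applying the matrix to the nodal values of any polynomial of degree at most $N$ reproduces that polynomial at the mortar points.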
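Projection Mortar → Domain
Once the left and the right solution on the mortars $\Xi^1$ and $\Xi^2$ are obtained, the left and right fluxes
$\mathcal{F}^{\Xi^{1/2}, L/R}$ can be evaluated. The resulting fluxes are projected back onto the domain face
$\Omega^1$ using an $L_2$-projection in the form of
\begin{equation}
\int_{\sigma}^{1} \left( \mathcal{F}^{\Omega^1}(\xi) - \mathcal{F}^{\Xi^1, L}(z^1) \right) \ell_j(\xi)\, d\xi
+ \int_{-1}^{\sigma} \left( \mathcal{F}^{\Omega^1}(\xi) - \mathcal{F}^{\Xi^2, L}(z^2) \right) \ell_j(\xi)\, d\xi = 0
\quad \text{for } j = 0, ..., N, \tag{17}
\end{equation}
where the integration limits follow the mortar extents defined in Eq. (11). By inserting the polynomial
representation Eq. (9) of the fluxes and reordering we obtain
\begin{equation}
\hat{\mathcal{F}}^{\Omega^1} = P^{\Xi^1 \to \Omega^1} \hat{\mathcal{F}}^{\Xi^1, L} + P^{\Xi^2 \to \Omega^1} \hat{\mathcal{F}}^{\Xi^2, L}
:= \frac{1 - \sigma}{2} M^{-1} S^{\Xi^1 \to \Omega^1} \hat{\mathcal{F}}^{\Xi^1, L} + \frac{1 + \sigma}{2} M^{-1} S^{\Xi^2 \to \Omega^1} \hat{\mathcal{F}}^{\Xi^2, L}, \tag{18}
\end{equation}
where $M$ is identical to the matrix defined in Eq. (14) and the matrices $S^{\Xi^1 \to \Omega^1}$ and
$S^{\Xi^2 \to \Omega^1}$ are the transposes of $S^{\Omega^1 \to \Xi^1}$ and $S^{\Omega^1 \to \Xi^2}$ in Eq. (15),
respectively.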
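The back projection of Eq. (18) and its conservation property can be checked with the following sketch, again assuming collocated Legendre-Gauss points and hypothetical helper names. The integral of the projected face flux equals the sum of the two mortar flux integrals, each scaled by the Jacobian of its mortar.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def lagrange_basis(nodes, x):
    """Matrix L[i, m] = l_i(x[m]) of the Lagrange polynomials through `nodes`."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    L = np.ones((len(nodes), len(x)))
    for i, xi in enumerate(nodes):
        for j, xj in enumerate(nodes):
            if i != j:
                L[i] *= (x - xj) / (xi - xj)
    return L

def mortar_to_face(N, sigma):
    """Matrices projecting the fluxes of mortar 1 and mortar 2 back onto the face."""
    nodes, w = leggauss(N + 1)
    M = np.diag(w)                                 # mass matrix for collocated Gauss points
    jacobians = (0.5 * (1.0 - sigma), 0.5 * (1.0 + sigma))
    xis = (sigma + jacobians[0] * (nodes + 1.0),   # mortar 1 covers [sigma, 1]
           -1.0 + jacobians[1] * (nodes + 1.0))    # mortar 2 covers [-1, sigma]
    P = []
    for jac, xi in zip(jacobians, xis):
        S_back = lagrange_basis(nodes, xi) * w     # S_back[j, i] = int l_j(xi(z)) l_i(z) dz
        P.append(jac * np.linalg.solve(M, S_back))
    return P, xis, nodes, w
```

A second plausibility check is that mortar fluxes sampled from one common polynomial of degree at most $N$ are returned unchanged on the face.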
3.3. Sliding Mesh Interface
We briefly lay out the sliding mesh idea as proposed and described in detail in [35]. At a sliding mesh interface,
two sub-domains perform a sliding relative motion along the interface. For simplicity, it can be assumed that one
sub-domain is static while the other is moving (although in principle, both sub-domains can be moving). We first
consider two-dimensional cases, i.e. where the interface is a 1D object, and we restrict ourselves to periodic interfaces.
For straight interfaces, periodicity can be ensured with periodic boundary conditions. The alternative is a circular
interface, where one sub-domain is inside the circle and the other outside, and the relative motion is a rotation.
Starting from an initial configuration where element faces are conforming and equi-spaced along the interface, the
relative motion leads to a non-conforming pattern, as shown in Figure 3. We note that our algorithm can also work
with non-equispaced interface elements, but for all the applications we envision, there is no reason to suggest that
such a mesh topology is necessary. We will thus stick to the described spacing for the rest of this work.
Figure 3: Schematic of the straight sliding mesh interface. The gap between the domains serves to show the mortar
configuration at the interface (gray). The striped mortars are identical for periodic boundaries at the interface.
Extending the idea for the static case from Section 3.2 to the moving case follows these steps:
1. Introduce a dynamic definition of the interface configuration.
2. Introduce two-sided mortars between the non-conforming sub-domains as shown in Figure 3.
3. Interpolate the solution from the faces of both sub-domains onto these mortars.
4. Solve the Riemann problem on the mortars to obtain numerical fluxes.
5. Project the numerical flux of the mortars back onto the faces of both sub-domains.
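The five steps can be sketched for a strongly simplified setting: a periodic straight interface with one constant value per face (polynomial degree zero) and a scalar advection flux. This is an illustration only and not the high-order operator chain described above; for constant data, the interpolation onto the mortars is a plain copy and the back projection a length-weighted average. The function name, the inputs `n` (whole faces passed) and `s` (fraction of the current face), and the choice of a Rusanov flux are assumptions made for this example.

```python
import numpy as np

def sliding_interface_fluxes_p0(u_static, u_moving, n, s, a=1.0):
    """Steps 1-5 for piecewise-constant face data and the flux f(u) = a*u."""
    nf = len(u_static)
    i = np.arange(nf)
    # Steps 1 and 2: static face i touches the moving faces i-n-1 and i-n (periodic).
    partner = np.stack(((i - n - 1) % nf, (i - n) % nf), axis=1)
    length = np.array([s, 1.0 - s])                # mortar lengths relative to a face
    # Step 3: bring both states to the mortars (a copy for constant data).
    u_right = np.repeat(u_static[:, None], 2, axis=1)    # static side
    u_left = u_moving[partner]                           # moving side
    # Step 4: Riemann problem on every mortar (Rusanov flux).
    f_mortar = 0.5 * a * (u_left + u_right) - 0.5 * abs(a) * (u_right - u_left)
    # Step 5: project back as length-weighted average of the two mortars per face.
    f_static = f_mortar @ length
    f_moving = np.zeros(nf)
    np.add.at(f_moving, partner, f_mortar * length)
    return f_static, f_moving
```

Because every mortar flux enters exactly one static and one moving face with the same weight, the fluxes summed over all faces agree on both sides, which is the discrete conservation property in this simplified setting.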
Due to the relative motion, the interface is now necessarily dynamic, i.e. once the definition has been updated to reflect
that situation in item 1, the rest of the algorithm can follow the static procedure for this instant in time, and compute
the appropriate projection and interpolation operators. The dynamic definition of the interface has to account for two
aspects: Firstly, the size of the overlap regions of elements changes, i.e. the mortar definition has to be adjusted
accordingly. Secondly, the neighboring information across the interface is now also dynamic and changes during
the computation. The restriction to equidistant spacing at the interface has two consequences: Each element face is
always represented by two mortars (with the exception of the singular moments of conforming sub-domains, which
are in practice handled by one mortar of the size of the element face, and the other of size 0). Moreover, the position
of the hanging nodes $\sigma$ in reference space is the same for all faces, such that the same interpolation and
projection matrices can be used for all faces along the interface.
Figure 4: Possible mesh geometries in three spatial dimensions for the described sliding mesh method. The moving
sub-domains are scaled or shifted to reveal the structured mesh at the interfaces. From left to right, (a) to (d), the
interface geometries allow for translational movement, rotation with a conical interface, rotation with a cylindrical
radial interface and rotation with a plane annular axial interface.
For three-dimensional domains, the interfaces become two-dimensional objects. A coordinate perpendicular to the
interface movement can be introduced (in principle, the relative movement is not confined to one coordinate, but
in the present work, we restrict ourselves to this case). Perpendicular to the movement, several element layers can
be introduced, but the face mesh at the interface has to be structured. For circular and straight interfaces, three-
dimensional equivalents are shown in Figure 4 along with an axial interface of two annular sub-domains with relative
rotation. Note that for the mortar method as well as the DG scheme itself, information is only exchanged via the
surface fluxes. Hence, elements with only a single vertex (or a single edge in 3D) situated at the interface do not
interfere with the sliding mesh interface and thus do not need any special treatment. Such a non-interfering vertex/edge
at the interface can be seen in Figure 4(a).
The geometric conservation law (GCL) states that mesh movement does not induce artificial perturbations in a constant
solution [17]. For our ALE formulation, it was ensured that the underlying scheme itself satisfies the GCL by solving
the discrete GCL with a DGSEM discretization, as is discussed in more detail in [17, 56]. Since the sliding mesh
interface by definition allows only for tangential movements, it should satisfy the GCL by construction. Following
[22], it was verified numerically that the GCL is indeed fulfilled exactly for planar sliding mesh interfaces, such as
Figure 4(a) and Figure 4(d). For curved interfaces (e.g. Figure 4(b) and Figure 4(c)), the approximation of the circular
geometry by polynomials causes minor surface normal velocity components, which are different for non-conforming
elements, and thus lead to perturbations in a constant solution. These are, however, minuscule for typical mesh
resolutions as is demonstrated in Section 5.1, and they diminish with the geometric order of accuracy $N_{geo} + 1$ if
mesh resolution is increased. This issue is a subject of ongoing research.
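The origin of these normal velocity components can be made visible with a small two-dimensional experiment, sketched here under simplifying assumptions: a circle is split into equal arcs, each arc is interpolated by a polynomial of degree $N_{geo}$ through equidistant points, and the velocity of a rigid rotation is projected onto the normal of the approximate face. The function name and the equidistant geometry nodes are assumptions made for this illustration.

```python
import numpy as np

def max_normal_grid_velocity(n_elems, n_geo, omega=1.0, radius=1.0):
    """Largest |v_g . n| on one polynomial face of a circle split into n_elems arcs."""
    dtheta = 2.0 * np.pi / n_elems
    xi_nodes = np.linspace(-1.0, 1.0, n_geo + 1)           # geometry nodes (assumed)
    theta = 0.5 * dtheta * (xi_nodes + 1.0)
    cx = np.polyfit(xi_nodes, radius * np.cos(theta), n_geo)   # interpolating polynomials
    cy = np.polyfit(xi_nodes, radius * np.sin(theta), n_geo)
    xi = np.linspace(-1.0, 1.0, 201)
    x, y = np.polyval(cx, xi), np.polyval(cy, xi)
    tx, ty = np.polyval(np.polyder(cx), xi), np.polyval(np.polyder(cy), xi)
    norm = np.hypot(tx, ty)
    nx, ny = ty / norm, -tx / norm                          # unit normal of the face
    vx, vy = -omega * y, omega * x                          # rigid rotation about the origin
    return np.max(np.abs(vx * nx + vy * ny))
```

On the exact circle the rotation velocity is purely tangential, so the returned value measures only the geometry approximation error. It shrinks when the number of arcs or the geometric degree is increased.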
4. Parallel Sliding Mesh Implementation
With the mathematical operators for the sliding mesh interface in place, we now present a strategy for an efficient
implementation in our in-house code FLEXI and possibly other element-based high-order schemes. It is designed
to minimize the thread-level computational overhead, but more importantly to keep communication as efficient as
possible. To this end, global communication is avoided altogether and local communication is kept to a minimum by
passing only data and no metadata like indices or identifiers.
In Section 4.1, a very brief introduction to the MPI-based parallelization strategy of our baseline code is given.
The basic approach to the sliding mesh implementation is given in Section 4.2. The most challenging aspect of the
parallelization is to generate information about size and sorting of a set of data for each passed message in a setting
where the communication partners are dynamically changing. To facilitate the description of our approach to this
dynamic configuration, some index definitions are introduced in Section 4.3. On this basis, the data sorting and index
mapping is described in Section 4.4. An illustrative example for the index mapping is given in Appendix A.
4.1. Prerequisites: FLEXI Parallelization
In order to formulate requirements for the sliding mesh implementation, some principles of the underlying FLEXI
code are first briefly laid out. FLEXI uses a pure distributed memory (MPI) parallelization. In the mesh building
process using our in-house High-Order Pre-Processor (HOPR) [57], elements are sorted along a space filling curve [14].
During mesh decomposition in FLEXI, one or several complete DG elements are assigned to each process following
the sorting along the space filling curve. This yields compact sub-domains for each process. No element is
split between two processes. At each interface between two elements, one of the elements is defined as primary and
the other as replica with respect to that interface. We note that the connection between elements in a DG scheme is
achieved by the numerical flux function akin to a finite volume scheme. If the two elements adjacent to an interface
are handled by two different processes, two communication steps are to be carried out during each computation of a
DG operator to compute a common interface flux: First, the solution $U$ on the boundary is passed from the replica
to the primary element. Second, the Riemann flux is calculated on the primary element and the result is passed back
to the replica element (two more analogous communication steps are required for the lifting procedure for the
computation of viscous terms). The solution $U$ and the fluxes $\mathcal{F}$ at the boundary are stored in separate
arrays for primary and replica (yielding four arrays $U_{primary}$, $U_{replica}$, $\mathcal{F}_{primary}$ and
$\mathcal{F}_{replica}$). In these arrays, the values are sorted according to an element interface index iface, which
is assigned to each element interface during the initialization phase of the simulation. iface is process-local.
For all element interfaces forming a process interface (i.e. where the neighboring element is handled by another
process), these interface indices iface are sorted in a nested manner by two criteria:
• The outer sorting is by the rank of the neighboring process, i.e. all faces where the opposing elements are
handled by one specific other rank are grouped together. This ensures that the data sent to this other process is
contiguous in memory.
• For each set of faces shared between two processes, it has to be ensured that the order of those faces within the
set is the same on both processes, e.g. that the face which is the first in the set on one process is the first in
that set on the other process as well.
In FLEXI mesh files, each face has a unique global ID. It is read in for each face by every process. Simply
sorting each set of faces shared between two processes by this global face ID ensures that the data communicated
between two processes is sorted consistently.
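The nested sorting can be sketched with a single composite sort key. The data structure and names below are hypothetical and only serve to illustrate the two criteria; they do not reflect FLEXI's actual data layout.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MpiFace:
    global_id: int       # unique face ID as stored in the mesh file
    neighbor_rank: int   # rank that handles the element on the other side

def order_mpi_faces(faces):
    """Outer sorting by neighbor rank, inner sorting by global face ID."""
    return sorted(faces, key=lambda f: (f.neighbor_rank, f.global_id))

def ids_shared_with(faces, rank):
    """Global IDs of the faces shared with `rank`, in communication order."""
    return [f.global_id for f in order_mpi_faces(faces) if f.neighbor_rank == rank]
```

Since both processes sort their shared faces by the same global ID, each side can send and receive a contiguous block of data without exchanging any face identifiers.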
A similar strategy will be employed for the sliding mesh interface.
In order to achieve high compute throughputs without having to wait for communications to finish, latency hiding
is employed in FLEXI: The operations necessary in preparation of a communication step are always carried out first
at the earliest possible instance. Also, the communication is initiated as early as possible in a non-blocking manner.
The communication window is then filled with local arithmetic operations to give the message passing as much time
as possible to complete.
4.2. Sliding Mesh: Implementation Basics
We define the sliding mesh interfaces when building the mesh in the pre-processing stage of the simulation. In
principle, every slice in the mesh with the equi-spaced structured topology described in Section 3.3 can be defined as a
sliding mesh interface with little computational overhead. However, in our envisaged applications, only one or a few
interfaces are needed.
For each sliding mesh interface, there is an adjacent static and a moving mesh sub-domain. The MPI domain
decomposition occurs in two steps: First, each process is assigned to either the static or the moving domain, so that
no process handles elements on both sides of a sliding mesh interface. This choice eases implementation, but also
increases efficiency, as a process sub-domain spanning a sliding mesh interface would get torn apart and lose its
compact shape due to the movement. Second, within each of these sub-domains, elements are already sorted along a
space filling curve during mesh generation and the elements handled by each process are assigned accordingly.
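A minimal sketch of such a two-step decomposition is given below. The proportional distribution of processes to the two sub-domains and all names are assumptions made for this illustration; the text above does not specify how the number of processes per sub-domain is chosen.

```python
def split_contiguous(n_elems, n_parts):
    """Cut the range [0, n_elems) into n_parts contiguous, nearly equal chunks."""
    base, extra = divmod(n_elems, n_parts)
    chunks, start = [], 0
    for part in range(n_parts):
        stop = start + base + (1 if part < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks

def decompose(n_static, n_moving, n_procs):
    """Return one (side, first, stop) tuple per rank; requires n_procs >= 2."""
    # Step 1: assign every rank to exactly one side (assumed: proportional to size).
    share = n_procs * n_static / (n_static + n_moving)
    p_static = max(1, min(n_procs - 1, round(share)))
    # Step 2: contiguous chunks along the space filling curve within each side.
    static = [("static", a, b) for a, b in split_contiguous(n_static, p_static)]
    moving = [("moving", a, b) for a, b in split_contiguous(n_moving, n_procs - p_static)]
    return static + moving
```

Element indices are local to each sub-domain here, following the ordering along the space filling curve within that sub-domain.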
The elements belonging to the static domain are defined to be the primary elements for the sliding mesh interface.
The solution $U$ is first interpolated from the element faces to the mortars on both sides of the interface. The
additional mortar arrays $U^{sm}_{primary}$ and $U^{sm}_{replica}$ exist for this purpose, alongside the additional
arrays $\mathcal{F}^{sm}_{primary}$ and $\mathcal{F}^{sm}_{replica}$. The communication procedure is similar to the
one for conforming faces: The solution $U^{sm}_{replica}$ is passed from the moving (replica) to the static (primary)
domain, where the Riemann flux is evaluated. The flux $\mathcal{F}^{sm}_{replica}$ is passed back to the moving
(replica) process and on both sides of the interface, the two mortar fluxes are projected onto the DG basis for
the respective element face. Communication hiding is employed in the same manner as for the standard conforming
element interfaces.
Figure 5: Mortar structure and coordinate definitions for a straight interface with periodic boundaries. The left
(red) sub-domain is moving vertically, resulting in a displacement $\Delta$. The right (blue) sub-domain is static.
As shown, $\tilde{\eta}_\parallel = \eta_\parallel - \Delta$ for the parallel coordinates.
Figure 6: Index definition example for a straight interface. Left (red) sub-domain is moving, right (blue) is static.
Plane view: the normal coordinate and the corresponding index are omitted for clarity. Note that the mortars use the
index $i_\parallel$ of the static sub-domain.
4.3. Parallelization: Index Definitions
In the following, some index definitions are introduced to ease the description of index mapping and sorting.
Particularly, it will be necessary to uniquely address each mortar, each static and each moving face by a set of indices.
The following definitions are also illustrated in Figures 5 and 6. Variables on the static side (and in a static undisplaced frame of reference) are noted without an accent, while variables on the moving side (and in a frame of reference displaced with the moving sub-domain) are marked with a tilde $\tilde{\cdot}$. Let us first consider the static and undisplaced frame of reference: We define the coordinates $\eta_\parallel$ and $\eta_\perp$ at the sliding mesh interface, where the subscripts $\parallel$ and $\perp$ indicate their direction relative to the mesh movement (cf. Figure 5). The faces at the interface are placed in a structured grid, so two indices $i_\parallel$ and $i_\perp$ can be assigned to each static element face at the interface, numbering them along $\eta_\parallel$ and $\eta_\perp$.
On the moving side, a parallel coordinate $\tilde{\eta}_\parallel$ displaced with the moving sub-domain and a corresponding index $\tilde{i}_\parallel$ are introduced, such that $\tilde{\eta}_\parallel = \eta_\parallel - \Delta$, where $\Delta$ is the displacement of the moving domain. This is illustrated in Figure 5. Since there is no displacement in the perpendicular direction, the coordinate $\eta_\perp$ and the index $i_\perp$ can be used for the moving domain, too, and no additional corresponding variables for the moving side have to be introduced. Faces on the moving side are uniquely defined by the two indices $\tilde{i}_\parallel$ and $i_\perp$. The displacement $\Delta$ can be expressed in terms of the number of surpassed faces $n$ and the fraction of the currently surpassed face $s$, i.e.
$$\Delta = (n + s)\, l_\parallel, \quad n \in \mathbb{Z}, \tag{19}$$
where $l_\parallel$ is the face length along the direction of sub-domain movement.
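The decomposition in Eq. (19) can be sketched in a few lines of Python. This is only an illustration: the function name `split_displacement` is hypothetical, and the convention that $s$ lies in $[0, 1)$ (so that $n$ is obtained by rounding down) is an assumption that the text does not state explicitly.

```python
import math


def split_displacement(delta, face_length):
    """Split a displacement into n whole faces and a fraction s, cf. Eq. (19).

    Assumption (not stated in the text): s lies in [0, 1), so n is the
    floor of delta / face_length and may be negative.
    """
    ratio = delta / face_length
    n = math.floor(ratio)
    s = ratio - n
    return n, s


# Example: a displacement of 0.35 with faces of length 0.25.
n, s = split_displacement(0.35, 0.25)
print(n, round(s, 6))
```

With this convention, a negative displacement such as -0.1 yields n = -1 and s = 0.6, so the mortar layout stays well-defined for either direction of motion.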
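The mortars inherit the index $i_\parallel$ from the static domain (as well as $i_\perp$). In order to uniquely define each mortar, a third index $i_\mathrm{sub} \in \{0, 1\}$ is introduced to distinguish between the two mortars adjacent to an element face. It is defined as
$$i_\mathrm{sub} = \begin{cases} 0 & \text{for } \eta_\parallel - i_\parallel l_\parallel < s\, l_\parallel, \\ 1 & \text{else.} \end{cases} \tag{20}$$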
Following this definition, for each face on the static side, the mortar with the smaller $\eta_\parallel$ (i.e. the "lower" mortar in Figure 6) has index $i_\mathrm{sub} = 0$ and the "upper" one has $i_\mathrm{sub} = 1$, while the order is inverse on the moving side. The indices $\tilde{i}_\parallel$, $i_\parallel$ and $i_\mathrm{sub}$ of a mortar and the adjacent faces are finally linked via the relation
$$\tilde{i}_\parallel = i_\parallel - n + i_\mathrm{sub} - 1, \tag{21}$$
which can be verified in Figure 6 for the example $n = 1$. The normal index $i_\perp$ is of course the same for a mortar and its adjacent element faces.
4.4. Parallelization: Mortar Sorting and Index Mapping
Communication in the presence of changing mesh topology and changing communication partners poses unique
challenges. In order to avoid additional communication, several requirements have to be met:
1. The communication partners’ ranks as well as the size of the communicated data sets need to be known a priori
by both sides.
2. The data communicated from one process to another should be contiguous in memory on both processes.
3. The data communicated from one process to another has to be sorted by universal criteria, such that the receiving
process knows beforehand how the data is sorted.
We start by addressing the first requirement. To this end, the ranks handling the elements belonging to each static face $(i_\parallel, i_\perp)$ and each moving face $(\tilde{i}_\parallel, i_\perp)$ are communicated globally during the initialization phase prior to the actual simulation and are stored as two mapping arrays $r(i_\parallel, i_\perp)$ and $\tilde{r}(\tilde{i}_\parallel, i_\perp)$. This is the only global communication procedure in the proposed implementation. Subsequently, all dynamic information regarding the configuration of the interface and the partners in a communication step across the interface can be deduced without the need for further message passing.
We now have all necessary ingredients in place to meet the above three criteria:
1. For each mortar, the ranks handling both adjacent faces are known via $r$ and $\tilde{r}$ (we use Eq. (21) to translate between $\tilde{i}_\parallel$ and $i_\parallel$).
2. Sorting the mortars on each process by their communication partner yields contiguous data chunks to be passed.
3. The index triple $(i_\parallel, i_\perp, i_\mathrm{sub})$ for each mortar is a globally unique criterion for the inner sorting of the communicated data sets.
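The first criterion can be illustrated with a small Python sketch of how a static process might look up the moving-side partner of a mortar without any message passing. The function names, the dictionary layout of the rank table and its values are hypothetical. The periodic wrap of the index (1-based indices on a periodic interface) is an assumption and is not part of Eq. (21) itself.

```python
def moving_face_index(i_par, n, i_sub):
    """Eq. (21): parallel index of the moving face adjacent to a mortar."""
    return i_par - n + i_sub - 1


def partner_rank(r_tilde, i_par, i_perp, i_sub, n, n_faces_par):
    """Rank on the moving side that shares the mortar (i_par, i_perp, i_sub)."""
    i_tilde = moving_face_index(i_par, n, i_sub)
    # Assumption: periodic interface with 1-based indices 1..n_faces_par.
    i_tilde = (i_tilde - 1) % n_faces_par + 1
    return r_tilde[(i_tilde, i_perp)]


# Hypothetical rank table for four moving faces in a single row (i_perp = 1),
# keyed by (i_tilde_par, i_perp).
r_tilde = {(1, 1): 7, (2, 1): 7, (3, 1): 4, (4, 1): 4}

# Partners of the two mortars of static face i_par = 4 after n = 1 passed faces.
partners = [partner_rank(r_tilde, 4, 1, i_sub, n=1, n_faces_par=4)
            for i_sub in (0, 1)]
print(partners)
```

Because the table is filled once during initialization, only $n$ has to be tracked at run time to identify the partner of every mortar.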
The sorting procedure now works as follows: An index array $A$ is set up, which contains a tuple of five entries for each mortar adjacent to the faces of the own rank. On each process of the static domain, these entries are: the FLEXI face index $i_\mathrm{face}$, the rank $\tilde{r}$ of the opposing process in the moving domain, the movement-parallel index $i_\parallel$, the normal index $i_\perp$, and the mortar index $i_\mathrm{sub}$. The only difference on the moving domain is that here, the ranks $r$ of the static processes are stored instead of $\tilde{r}$. These index arrays are then sorted in a nested fashion by the four indices $r$ (or $\tilde{r}$, respectively), $i_\parallel$, $i_\perp$ and $i_\mathrm{sub}$ (from outer to inner in that order) using a quicksort algorithm. The FLEXI face index $i_\mathrm{face}$ is carried along passively. The resulting order determines the sorting of mortar data on both the static and the moving side. The order of the entries in $A$ defines a process-local index $i_\mathrm{mortar}$, which determines the data sorting in the arrays $U^\mathrm{sm}_\mathrm{primary}$, $U^\mathrm{sm}_\mathrm{replica}$, $F^\mathrm{sm}_\mathrm{primary}$ and $F^\mathrm{sm}_\mathrm{replica}$.
An index array $m$ is set up, where $i_\mathrm{mortar}$ is given for each $i_\mathrm{face}$ and $i_\mathrm{sub}$. It is used to store data in the correct order during the interpolation and the projection steps of the mortar procedure.
The sorting and mapping process is illustrated with the help of an example in Appendix A.
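The nested sort and the inverse mapping can be sketched in Python as follows. The ten tuples are the mortars of the static process from the example in Appendix A, listed here in arbitrary order. The sketch relies on Python's built-in `sorted` (not the quicksort used in the implementation), and the names `sort_mortars` and `build_mapping` as well as the dictionary representation of $m$ are illustrative assumptions.

```python
# One tuple per mortar: (r_tilde, i_par, i_perp, i_sub, i_face),
# values taken from the example in Appendix A, in arbitrary order.
mortars = [
    (7, 3, 2, 0, 8), (4, 3, 2, 1, 8), (4, 4, 2, 0, 6), (4, 4, 2, 1, 6),
    (7, 3, 1, 0, 9), (7, 3, 1, 1, 9), (7, 4, 1, 0, 5), (4, 4, 1, 1, 5),
    (4, 5, 1, 0, 7), (4, 5, 1, 1, 7),
]


def sort_mortars(mortars):
    """Nested sort by partner rank, i_par, i_perp, i_sub (outer to inner).

    The face index is not a sorting criterion and is carried along passively.
    """
    return sorted(mortars, key=lambda t: t[:4])


def build_mapping(sorted_mortars):
    """Map (i_sub, i_face) to the 1-based process-local index i_mortar."""
    return {(t[3], t[4]): k for k, t in enumerate(sorted_mortars, start=1)}


A = sort_mortars(mortars)
m = build_mapping(A)

print([t[4] for t in A])                  # face indices in sorted order
print([m[(0, f)] for f in range(5, 10)])  # i_mortar for i_sub = 0, faces 5..9
```

Since both sides sort by the same globally unique criteria, the receiving process can interpret a message from its position in the buffer alone, without any transmitted metadata.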
Updates of the index arrays $A$ and $m$ are only carried out whenever the communication structure changes, i.e. whenever the moving sub-domain displacement surpasses a full face length. This is, in fact, the only necessary procedure to account for the changed mesh topology. At all other time stages, $m$ remains the same and only $s$ changes, so only the interpolation and projection operators have to be updated.
For extension to multiple sliding mesh interfaces, this strategy can be performed individually for each interface. The mortar sorting can then be kept unique by introducing a new (outermost) sorting index to distinguish between interfaces.
5. Results
In this section, the high-order accuracy of the implemented sliding mesh method is verified by convergence tests
for a curved interface in Section 5.1 and a straight interface in Section 5.2. We then investigate the parallel perfor-
mance of the novel method and compare it against the static baseline scheme in Section 5.3, before the method is
applied to a large scale LES test case in Section 5.4.
5.1. Isentropic Vortex
To verify that the implemented method is indeed high-order accurate for DGSEM and especially for curved interfaces which are approximated by high-order polynomials, we follow Zhang and Liang [35] and investigate the order of accuracy of the method using the transport of an isentropic Euler vortex. For this two-dimensional test case, an isentropic vortex is superimposed on a constant freestream. For details on the exact solution and the notation we refer the reader to [35]. The two vortex parameters, which can be interpreted as the vortex intensity and the vortex size $r_c$, are both set to 1. At the beginning, $t = 0$, the vortex is located in the center of the domain. The freestream is initialized with $\rho = 1$, $v = 1$, $\theta = \arctan\frac{1}{2}$ and $Ma = 0.3$ as the freestream density, velocity magnitude, flow angle and Mach number, respectively. The ratio of specific heats is set to $\gamma = 1.4$ and the freestream pressure is set consistent with the Mach number via the ideal gas relation.
Figure 7: Left: The coarsest mesh with 1854 elements at $t = 4.0$, the sliding mesh interface is highlighted red. Right: Exact solution of the isentropic vortex for the density $\rho$ at $t = 4.0$.
Three different unstructured meshes with between 1854 and 15003 elements are used, with the coarsest one depicted in Figure 7 together with the exact solution. The computational domain is square with a side length of $L = 20$ and periodic boundary conditions. The inner sub-domain rotates with an angular velocity of $\omega = 0.1$. All errors in Table 1 are reported at $t = 4.0$, when the center of the vortex reaches the sliding mesh interface. For all cases, the explicit time step is reduced artificially to highlight the behavior of the spatial discretization error. The method indeed shows the expected convergence behavior for all investigated orders.
          N=2               N=3               N=4               N=5
#Elem     L2-Error  Order   L2-Error  Order   L2-Error  Order   L2-Error  Order
1854      8.58e-05  -       8.60e-06  -       1.09e-06  -       1.51e-07  -
7094      1.52e-05  2.58    8.40e-07  3.47    4.94e-08  4.61    3.13e-09  5.78
15003     4.35e-06  3.34    1.65e-07  4.35    7.71e-09  4.96    3.85e-10  5.60
Table 1: L2-Errors of density $\rho$ for the isentropic vortex at $t = 4.0$ and the resultant orders of accuracy for several polynomial degrees $N$.
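As an illustration of how the orders in Table 1 can be obtained from the errors, the following Python sketch computes the observed order of accuracy between two meshes. It assumes that the mesh size scales as $h \propto N_\mathrm{elem}^{-1/d}$ with the spatial dimension $d$, which is not stated in the text but appears to reproduce the tabulated values for the $N = 2$ column. The function name is hypothetical.

```python
import math


def observed_order(err_coarse, err_fine, n_coarse, n_fine, dim):
    """Observed order of accuracy between two meshes.

    Assumption: the characteristic mesh size scales as n_elem**(-1/dim).
    """
    h_ratio = (n_fine / n_coarse) ** (1.0 / dim)
    return math.log(err_coarse / err_fine) / math.log(h_ratio)


# Table 1, column N = 2 (two-dimensional unstructured meshes).
elems = [1854, 7094, 15003]
errors = [8.58e-05, 1.52e-05, 4.35e-06]

orders = [
    observed_order(errors[k], errors[k + 1], elems[k], elems[k + 1], dim=2)
    for k in range(len(elems) - 1)
]
print([round(p, 2) for p in orders])
```

The same function applies to a sequence of uniformly refined three-dimensional meshes by passing `dim=3`.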
5.2. Manufactured Solution
To verify the high-order accuracy of the implemented method for fully three-dimensional flows, a smooth manufactured solution from [58] is investigated. The corresponding parameters are set to $\omega = 1$, $\alpha = 0.1$, $R = 287.058$, $\mu = 0.001$. This manufactured solution describes an oblique sine wave advected with a constant speed, as shown in Figure 8. The computational domain is set to $x \in [0, 2]^3$ with a Cartesian mesh and periodic boundary conditions, while the central sub-domain (i.e. the sub-domain $x_2 \in [\frac{2}{3}, \frac{4}{3}]$) moves with a velocity of $v_g = [1, 0, 0]^T$, as depicted in Figure 8. Starting from a cube with $3^3$ elements, the number of elements in each spatial direction is doubled in every refinement step.
Figure 8: Left: Computational mesh with $6^3$ elements at $t = 0.3$, the sliding mesh interfaces are highlighted in red. Right: Exact solution of the manufactured solution for the density $\rho$ at $t = 0.3$.
The time step is chosen small enough to inhibit any influence of the time integration scheme on the observed errors.
The L2-errors for the density ρat t=1.0 and the resultant orders of accuracy are reported in Table 2. As before, the
sliding mesh method retains the high-order accuracy of the DGSEM scheme as expected.
          N=2               N=3               N=4               N=5
#Elem     L2-Error  Order   L2-Error  Order   L2-Error  Order   L2-Error  Order
3^3       4.21e-02  -       5.08e-03  -       3.16e-04  -       2.98e-05  -
6^3       3.82e-03  3.46    1.60e-04  4.99    9.80e-06  5.01    6.86e-07  5.44
12^3      4.93e-04  2.95    1.02e-05  3.97    3.51e-07  4.81    1.07e-08  6.00
24^3      6.93e-05  2.83    6.79e-07  3.91    1.13e-08  4.95    1.64e-10  6.03
48^3      8.47e-06  3.03    4.44e-08  3.93    3.05e-10  5.21    2.36e-12  6.12
Table 2: L2-Errors of density $\rho$ for the manufactured solution at $t = 1.0$ and the resultant orders of accuracy for several polynomial degrees $N$.
5.3. Scaling Tests
Having established that the sliding mesh method produces accurate results and the interface treatment does not introduce spurious errors, we now investigate the results of the implementation and parallelization strategy described in Section 4. Favourable scaling behavior is essential to exploit today's massively parallel hardware resources, which in turn are necessary for LES of complex applications found in industry. The baseline open source code FLEXI has been shown to scale efficiently up to over 100,000 cores [14, 59], which allows us to investigate the impact of the implemented sliding mesh method on its scaling efficiency. To this end, scaling tests were performed on the supercomputer Hazel Hen at the High-Performance Computing Center Stuttgart (HLRS). The Cray XC40 system consists of 7712 nodes, each equipped with two Intel Xeon E5-2680 v3 processors and 128 GB of main memory. The computational meshes for the tests are based on a cubical Cartesian mesh as shown in Figure 8 with $x \in [0, 1]^3$ and sliding mesh interfaces parallel to the $x_1 x_3$-plane. For the refined meshes, the number of elements of the baseline mesh with $6 \times 6 \times 6$ elements is successively doubled in the $x_1$-, $x_2$- and $x_3$-direction in turn, up to the finest mesh with $96 \times 48 \times 48$ elements. The simulation is initialized with a constant freestream and the mesh velocity is set to $v_g = [0, 1, 0]^T$. Every simulation
Figure 9: Strong and weak scaling of the sliding mesh implementation for different quantities of elements and computing cores on the HLRS supercomputer Hazel Hen. In all plots, the mean over all five runs is given with the minimum and maximum as error bars. Left: the performance index (PID) over the number of degrees of freedom (DOF) per core. Middle: strong scaling as parallel efficiency over the total number of cores for the different meshes. The results for the baseline open source code (OS-Flexi) without sliding mesh for the finest grid with $6^3 \cdot 2^{10} = 221{,}184$ elements are shown dashed for comparison. Right: weak scaling as parallel efficiency for loads ranging between 36 and 288 elements per core.
is run for exactly 100 time steps using a five-stage Runge-Kutta method, while only the computational time excluding I/O and initialization is considered for the performance analysis. The communication partners at the sliding mesh interface change at most three times during the computation. Starting from 1 computing node (24 cores), the number of nodes for each mesh is then doubled until the maximum of 512 nodes (12,288 cores) is reached. However, only cases with at least 1944 DOF per core are considered, corresponding to 9 elements per core for the polynomial degree $N = 5$, which is used for all scaling tests. Each run is repeated five times to account for potential statistical influences like the overall network load caused by other jobs on the system. As a consistent performance measure, the performance index PID is used, which is defined as
$$\mathrm{PID} = \frac{\text{wall-clock-time} \cdot \#\text{cores}}{\#\text{DOF} \cdot \#\text{time steps} \cdot \#\text{RK-stages}}, \tag{22}$$
and indicates the averaged computation wall time per spatial degree of freedom on each core for the computation of one Runge-Kutta stage. The results of the scaling tests are shown in three different plots in Figure 9. While the left plot shows the PID over the computational load per core, the plot in the center shows the parallel efficiency over the total number of cores as strong scaling. The parallel efficiency is defined as the ratio between the PID of the respective number of cores and the PID of the baseline simulation using the minimum of 24 cores. The results for the baseline code without sliding mesh for the finest mesh with $96 \times 48 \times 48$ elements are also plotted dashed for comparison. The right plot shows the weak scaling for four distinct loads per core.
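The performance measures can be written down in a few lines of Python. This is a minimal sketch: the wall-clock times below are hypothetical placeholders and not measured values, and the orientation of the efficiency ratio (baseline PID divided by current PID, so that 1.0 corresponds to ideal scaling) is an assumption about the definition above.

```python
def performance_index(wall_time, n_cores, n_dof, n_steps, n_rk_stages):
    """PID of Eq. (22): core time per DOF and Runge-Kutta stage, in seconds."""
    return wall_time * n_cores / (n_dof * n_steps * n_rk_stages)


def parallel_efficiency(pid_baseline, pid):
    """Assumed orientation: baseline PID over current PID (1.0 is ideal)."""
    return pid_baseline / pid


# Finest scaling mesh: 96 x 48 x 48 elements with N = 5,
# i.e. (N + 1)**3 degrees of freedom per element.
n_dof = 96 * 48 * 48 * (5 + 1) ** 3

# Hypothetical wall-clock times in seconds for 100 steps of a 5-stage scheme.
pid_24 = performance_index(2650.0, 24, n_dof, 100, 5)
pid_48 = performance_index(1400.0, 48, n_dof, 100, 5)

efficiency = parallel_efficiency(pid_24, pid_48)
print(n_dof, efficiency)
```

Because the PID is normalized by the number of cores and DOF, it stays constant under ideal strong and weak scaling, which makes runs on different meshes and core counts directly comparable.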
For more than $10^4$ DOF per core, the sliding mesh implementation shows a decrease in performance of about 15% in comparison to the baseline code. This is mainly due to the increased workload of the implemented ALE formulation and the additional overhead introduced by the sliding mesh method. As the load per core decreases, the memory consumption per core decreases as well, which allows a greater share of the data to be placed in the CPU cache. The baseline code can exploit these caching effects and shows significant performance increases for small loads per core. In contrast, the overall performance of the sliding mesh code decreases significantly for low loads and large numbers of cores. The first reason for this is the load imbalance introduced by the sliding mesh interface. With an increasing number of cores, the additional work at the interface is distributed to a smaller share of the cores, which leads to higher work imbalances and to a decrease in parallel efficiency. The other reason is the increased communication effort at the sliding mesh interfaces, since the number of communication partners at the interface potentially doubles in comparison to a conforming mesh. As described in Section 4, FLEXI relies on non-blocking communication which is hidden by local work. For very small loads, however, the processors at the interface do not have enough local work to hide the additional communication effectively. This leads to poor scaling performance for very small loads per core. For practical applications, however, the authors have usually encountered loads of more than
                    x+                  y+                 z+
                    max   mean  std     max  mean  std     max   mean  std
Rotor     T=0.0     36.6  23.8  8.1     2.2  1.4   0.4     26.6  16.3  4.4
          T=0.5     38.2  24.2  8.1     2.5  1.4   0.4     29.4  16.7  4.5
Stator 2  T=0.0     42.8  24.1  9.5     2.2  1.3   0.4     29.1  17.6  5.1
          T=0.5     44.5  25.4  9.3     2.2  1.4   0.4     29.2  18.7  5.3
Table 3: The dimensionless wall distances for the wall-nearest grid cells of the rotor blade and the second stator vane in viscous wall units. The wall
distances are normalized with the factor (N+1) to account for the multiple DOF in each direction inside a DG element. Given are the respective
maximum, mean and the standard deviation of the phase-averaged wall distances at two distinct phase angles T=0.0 and T=0.5.
10,000 DOF per core, for which the proposed method shows excellent weak and strong scaling. The performance
loss of about 15% compared to the baseline code is well within acceptable limits, and could likely be further reduced
by an a priori static load balancing.
5.4. LES of the Aachen Turbine
To demonstrate the suitability of the proposed sliding mesh implementation for large-scale simulations with the
presented high-order DG scheme, it is applied to a wall-resolved implicit LES of the 1-1/2 stage Aachen turbine test
case [42, 43, 44], which is investigated extensively in literature, e.g. [60, 61, 62, 63].
Test Case Definition
The 1-1/2 stage Aachen turbine consists of stator vanes with a modified VKI design and rotor blades with a Traupel profile [64] in a stator-rotor-stator setup. All blades are untwisted and the inner and outer diameter of the turbine are constant. The leading edges of the two stators are not in line, but rotated circumferentially by 3°. Following [62], the original blade count of 36-41-36 is modified to a uniform blade count, which allows the computational domain to be reduced to a periodic sector with one single blade pitch per cascade. However, in contrast to [62], the blade count is modified to 38-38-38 while retaining the blades' original profile geometry. To further reduce the computational cost, the turbine is approximated by a planar cascade, which neglects any influence of the casing and the effects of the rotor's tip clearance. The modified geometry is obtained by extruding the two-dimensional profile geometries in $x_3$-direction to a length of 6 mm, which corresponds to 10% of the rotor's chord length and approximately 10.9% of the rotor's original blade span. The remaining geometric quantities are obtained with respect to the turbine's mean radius of $r = 272.5$ mm and considering the modified blade count. For the investigated operation point with a rotational speed of 3500 rpm, this leads to a planar velocity of 99.88 m/s. The inflow Mach number is $Ma \approx 0.1$ and the Reynolds number with respect to the outflow velocity and the chord length of the stator vane is $Re \approx 800{,}000$. More details on the test case and the turbine geometry can be found in e.g. [60].
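The planar velocity of the rotor follows from the rotational speed and the mean radius quoted above. The short Python check below is only a worked example of this conversion; the function name is hypothetical.

```python
import math


def blade_speed(rpm, radius_m):
    """Circumferential speed in m/s at a given radius for a speed in rpm."""
    omega = 2.0 * math.pi * rpm / 60.0  # angular velocity in rad/s
    return omega * radius_m


# Operating point from the text: 3500 rpm at the mean radius of 272.5 mm.
v_rotor = blade_speed(3500.0, 0.2725)
print(round(v_rotor, 2))
```

In the planar cascade, this circumferential speed becomes the translational velocity of the moving rotor sub-domain.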
Mesh Generation
The mesh is based on a two-dimensional unstructured mesh with structured O-type meshes around the blades,
which is extruded in x3-direction with 24 equi-spaced elements, resulting in 966,288 hexahedral elements for the
entire mesh. The sliding mesh interfaces are centered between the trailing and leading edge of consecutive blades.
The resolution at the walls is comparable to wall-resolved LES in literature [65, 66] with the dimensionless wall
distances for the present simulation given in Table 3 in viscous wall units. Our in-house high-order pre-processor HOPR [57] is then used to generate a fifth-order approximation of the curved blade geometry, and an RBF method is used to extend the curving into the surrounding volume, as described in more detail in [14].
The boundary conditions are set periodic in x2- and x3-direction and the blade walls are modelled as adiabatic no-slip
walls. The measured inflow state at the mean diameter of the turbine from the test case’s experimental data is imposed
as Dirichlet boundary condition at the inflow position. Similarly, the measured pressure behind the second stator is
used to impose a pressure outflow condition [67]. Both states are given in Table 4. A sponge zone with ramping
function is employed before the outflow to avoid spurious reflections at the outflow, as depicted in Figure 10. More
details on the used sponge zone and ramping function can be found in [68]. While the rotor blade moves continuously
Figure 10: Cross section of the computational mesh at two distinct time instants. The upper figure shows the initial conforming configuration (referred to as $T = 0$) and the lower figure shows the mesh with a relative displacement of half a period ($T = \frac{1}{2}$). The sliding mesh interfaces are highlighted in red and the sponge zone at the outflow is shaded blue. The magnified section exhibits details of the transition between the structured O-grids around the blades, the equidistant mesh at the sliding mesh interface and the unstructured mesh in the remaining domain.
          ρ               p              v1           v2            v3
Inflow    1.7765 kg/m^3   157305.88 Pa   37.200 m/s   2.010 m/s     0.0 m/s
Outflow   1.3651 kg/m^3   110357.08 Pa   55.992 m/s   160.019 m/s   0.0 m/s
Table 4: The measured inflow and outflow conditions at the mean diameter of the turbine from the test case’s experimental data. The inflow was
measured 143 mm in front of the leading edge of the first stator and the outflow state is given 8.8 mm behind the trailing edge of the second stator.
The velocity v3is set to zero at the inflow and outflow on account of the planar cascade assumption.
upwards in this planar simulation, the periodic boundary conditions cause the blade to reappear from beneath, as is
indicated in Figure 10.
Computational Setup
For the simulation, a sixth-order scheme is employed, which results in approximately 208 million spatial degrees of freedom. A fourth-order Runge-Kutta method by Niegemann et al. [69] is used, since its optimized stability region allows for larger time steps. The specific gas constant is set to $R = 287.058\ \mathrm{J/(kg \cdot K)}$, the viscosity to $\mu = 1.8 \cdot 10^{-5}\ \mathrm{kg/(m \cdot s)}$, the ratio of specific heats to $\gamma = 1.4$ and the Prandtl number is chosen as $Pr = 0.72$. The LES is filtered implicitly by the discretization with the discretization error also acting as an implicit subgrid scale model, following e.g. [15].
The computation was carried out on Hazel Hen at the HLRS with a varying number of up to 4800 cores, resulting in a minimum of 41,300 DOF per core. The PID was approx. 1.3 µs per DOF for all considered cases and shows good agreement with the results of the scaling test. After attaining a quasi-periodic solution, 10 periods of the turbine flow were computed with a computational cost of about 27,000 CPU-hours per period.
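The quoted number of spatial degrees of freedom can be checked with a short Python calculation. The sketch assumes that the sixth-order scheme corresponds to the polynomial degree $N = 5$, i.e. $(N+1)^3$ degrees of freedom per hexahedral element, in line with the $N = 5$ used in the scaling tests. The function name is hypothetical.

```python
def spatial_dof(n_elements, poly_degree, dim=3):
    """Number of spatial DOF for tensor-product elements of a given degree."""
    return n_elements * (poly_degree + 1) ** dim


# Mesh of the turbine case: 966,288 hexahedral elements.
# Assumption: sixth-order scheme means polynomial degree N = 5.
n_dof_les = spatial_dof(966_288, 5)
print(n_dof_les)
```

This gives roughly 208.7 million degrees of freedom, consistent with the approximate figure stated above.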
Results
The time-averaged surface pressures of the vanes and the rotor blade are given in Figure 11 together with the results of Yao et al. [62] for comparison, who conducted a URANS simulation of a three-dimensional blade passage based on a 36-36-36 blade configuration. The results show good qualitative agreement even though the LES predicts an overall higher pressure level than the URANS simulation. Furthermore, the LES shows excellent agreement with the available experimental data at the blades' trailing edges.
[Figure 11 plot panels: (a) Upstream vane, (b) Rotor, (c) Downstream vane. Horizontal axis: dimensionless surface position; vertical axis: normalized pressure. Legend: URANS [62], LES, Experiment [42, 43, 44].]
Figure 11: Comparison of static pressure. The results of the LES (black dashed) are compared with the unsteady pressure envelopes by Yao et
al. [62] (blue) and the pressure at the trailing edges from experimental data [42, 43, 44] (black squares).
The instantaneous flow fields for two distinct phase angles $T = 0$ and $T = \frac{1}{2}$ are given in Figure 12, with $T$ as the relative rotor position and $T = 1$ corresponding to one completed period of the rotor. The flow around the first stator vane remains laminar with transition just before the trailing edge on the suction side, due to the strong favourable pressure gradient. The induced vortex shedding causes aeroacoustic noise, which also propagates upstream, as exhibited by the numerical pseudo-schlieren. The vane's wake impinges on the rotor and wraps around the rotor's leading edge. The wake's axis is then rotated counter-clockwise before it is passively advected by the freestream, as detailed in [70]. The wake acts as a perturbation in the rotor passage, creating unsteady pressure distributions on the rotor surface, e.g. [71]. To quantify these effects and their impact on the rotor, the averaged lift force acting on the rotor blade as well as its spectrum is given in Figure 13. The data is obtained by evaluating the lift force and its Fourier transform on intervals of 2 periods each and averaging the obtained results. The minimum lift force is reached at around $T \approx 0.8$ when the wake contacts the suction side of the rotor. In contrast, the rotor lift force increases as the point of impact shifts towards the rotor's pressure side, reaching the maximum lift force at $T \approx 0.3$. Interestingly, only the third harmonic of the blade passing frequency (BPF) is distinguishable, while the second and fourth harmonic show no considerable contribution to the lift force. The increasing amplitudes for frequencies at around 10 BPFs are caused by the vortex shedding of the first stator vane, as their frequencies coincide.
As this initial analysis of the resulting flow field reveals, a number of non-linear interactions are triggered by the complex stator/rotor interactions, which warrant further detailed analysis. Of particular interest will be the boundary layer state and its interaction with the wakes, as well as the resulting load transients. As the focus of this paper is on the methodology and the performance of the described sliding mesh implementation, a more thorough analysis of the flow physics is the subject of a separate publication [73]. However, this test case already highlights the potential of the presented high-order sliding mesh method for scale-resolving simulations in turbomachinery and related applications.
6. Conclusion
In this work, we have proposed an efficient implementation and parallelization strategy of a mortar-based sliding mesh method for high-order discontinuous Galerkin methods. The method retains the high-order accuracy of the DG method as well as the excellent strong and weak scaling properties of the baseline code, and is thus well-prepared to tackle problems at an industrial scale.
The challenge in designing a parallelization strategy lies in the dynamic communication structure. At the heart of
our proposed approach lies the avoidance of additional global communication as well as the passing of metadata.
Instead, only essential solution data is communicated, while process-local mapping arrays handle the identification
of communication partners at each timestep. The presented, globally unique sorting strategy keeps the message
data contiguous in memory and avoids local rearranging of the data on arrival. After validating the accuracy and
convergence property of our approach, we demonstrate that due to the careful design of the parallelization scheme,
[Figure 12 panels. Left: $T = 0$, configuration with low rotor lift. Right: $T = \frac{1}{2}$, configuration with high rotor lift.]
Figure 12: Instantaneous flow field visualized with iso-surfaces of the $\lambda_2$-criterion [72] colored by the Mach number in front of pseudo-schlieren computed at $x_3 = 0$ mm with an offset of half a period between the left and right figures. The lower figures show a close-up of the passage between two rotor blades at the respective time instants.
[Figure 13 plot panels. Left: rotor lift [N] over periods. Right: amplitude [N] over normalized frequency, with the BPF, its third harmonic and the vortex shedding of the first stator marked.]
Figure 13: The lift force acting on the rotor blade. The results are averaged by dividing the obtained lift force into intervals of two rotor periods.
These intervals are then averaged to obtain the temporal evolution of the lift force on the left. Shown on the right is the lift force in the frequency
domain as average of the discrete Fourier transforms of the individual time intervals. Highlighted are the blade passing frequency (BPF), its third
harmonic and the frequency of the first stator vane’s vortex shedding. The frequency is normalized with the BPF.
the sliding mesh implementation achieves excellent strong and weak scaling. We note that the optimum load per core
is shifted slightly towards higher loads, and that the performance deteriorates for very low loads per core. This is to
be expected and will be improved in the future with additional load balancing. Compared to the baseline scheme, a
performance loss of only about 15% is incurred by the novel method, which is acceptable and allows us to conduct
large scale simulations with sliding mesh interfaces on the available supercomputers. We present an example of such
an application for the case of a turbine flow with stator-rotor-stator interaction. To the authors’ best knowledge, this is
the first time a high-order sliding mesh method for DG has been applied to large scale problems of industrial relevance.
In the future, we plan to apply this framework to a range of interesting cases, where high unsteadiness and non-linear
interactions pose challenging problems for traditional models like the RANS equations. A typical example of such
applications can be found in turbomachinery components, where our simulation framework can contribute to the
understanding of complex 3D flows. Beyond the pure fluid phase however, coupling the sliding mesh approach with a
particle tracking method as developed in [74] will establish simulation capabilities that can investigate particle-laden
flows in rotating geometry, which are rarely tractable with an experimental approach.
Acknowledgment
The research presented in this paper was funded by Deutsche Forschungsgemeinschaft (DFG, German Research
Foundation) under Germany’s Excellence Strategy - EXC 2075 - 390740016 and by Friedrich und Elisabeth Boysen-
Stiftung as part of the project BOY-143.
The authors gratefully acknowledge the support and the computing time on ”Hazel Hen” provided by the HLRS
through the project ”hpcdg”.
The measurements on the test case ”Aachen Turbine” were carried out at the Institute of Jet Propulsion and
Turbomachinery at RWTH Aachen University, Germany.
Appendix A. Illustrative Mortar Sorting Example
Here, the construction of the mapping arrays $\tilde{r}$, $A$ and $m$ laid out in Section 4.4 is illustrated with an example. The example setup is shown in Figure A.14 and described in the following: The considered process on the static subdomain handles five sliding mesh sides (thick black lines). Each of them contains two sliding mesh mortars (fine black lines). The moving subdomain moves from left to right. At the considered time instant, the adjacent sides in the moving subdomain are handled by the processes with ranks 4 and 7; their indices are given by $\tilde{r}$ (Sub-Figure a). The moving sides are outlined with red dotted lines. They are not aligned with the static sides. Furthermore, in this example, the five sliding mesh sides handled by the considered static process have global static indices $i_\parallel$ ranging from 3 to 5 (Sub-Figure b) and $i_\perp$ ranging from 1 to 2 (Sub-Figure c). Within each side, the mortar sub-index is ascending along the direction of movement as stated in Eq. (20) (Sub-Figure d). The side indices assigned to each side by FLEXI are process-local and arbitrary, but contiguous (Sub-Figure e).
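Values per mortar, with static sides separated by |:
(a) $\tilde{r}$
    upper row:  7 4 | 4 4
    lower row:  7 7 | 7 4 | 4 4
(b) $i_\parallel$
    upper row:  3 3 | 4 4
    lower row:  3 3 | 4 4 | 5 5
(c) $i_\perp$
    upper row:  2 2 | 2 2
    lower row:  1 1 | 1 1 | 1 1
(d) $i_\mathrm{sub}$
    upper row:  0 1 | 0 1
    lower row:  0 1 | 0 1 | 0 1
(e) $i_\mathrm{face}$
    upper row:  8 8 | 6 6
    lower row:  9 9 | 5 5 | 7 7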
Figure A.14: Example indices of sliding mesh sides of a process in the static subdomain. The movement of the neighboring moving domain is
from left to right.
The ranks of all processes on the moving domain adjacent to the interface are stored in the array $\tilde{r}$, which
is sent to every static process adjacent to the interface. The column index of $\tilde{r}$ is $\tilde{i}_k$ and the row index is $i$. For each
static side, the static process knows the static global indices $i_k$ and $i$. It calculates the column indices $\tilde{i}_k$ from $i_k$ using
Eq. (21) and reads the ranks shown in Sub-Figure a from the corresponding entries of $\tilde{r}$, which are
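\[
\tilde{r} =
\begin{pmatrix}
\cdots & 7 & 7 & 4 & 4 & \cdots \\
\cdots & 7 & 4 & 4 & \dots & \cdots \\
\ddots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}.
\tag{A.1}
\]
Note that compared to the figure, the orientation of $i$ is flipped.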
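The indices of all mortars handled by one process are stored in $A$. The columns of $A$ contain the different mortars,
the rows contain the different types of indices. For the considered example, the array $A$ after sorting looks as follows:
\[
A =
\begin{pmatrix}
4 & 4 & 4 & 4 & 4 & 4 & 7 & 7 & 7 & 7 \\
3 & 4 & 4 & 4 & 5 & 5 & 3 & 3 & 3 & 4 \\
2 & 1 & 2 & 2 & 1 & 1 & 1 & 1 & 2 & 1 \\
1 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
8 & 5 & 6 & 6 & 7 & 7 & 9 & 9 & 8 & 5
\end{pmatrix}
\begin{matrix}
\tilde{r} \\ i_k \\ i \\ i_\mathrm{sub} \\ i_\mathrm{face}
\end{matrix}
\tag{A.2}
\]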
This is the result of sorting the mortars (that is, the columns) hierarchically by the entries of the upper four rows: The
upper row $\tilde{r}$ determines the highest (outermost) sorting criterion and the fourth row $i_\mathrm{sub}$ the lowest. The entries of the
last row $i_\mathrm{face}$ do not contain a sorting criterion and are transported passively.
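The hierarchical sort described above can be sketched in a few lines of Python. This is only an illustration, not the FLEXI implementation: the tuple layout, the function name `sort_mortars` and the unsorted input order (by local side index, then sub-index) are assumptions made for this example; the index values themselves are those of Eq. (A.2).

```python
# Each mortar is a tuple (r_tilde, i_k, i, i_sub, i_face) with the values of Eq. (A.2).
# The input order below (by i_face, then i_sub) is an assumption for illustration.
mortars_unsorted = [
    (7, 4, 1, 0, 5), (4, 4, 1, 1, 5),
    (4, 4, 2, 0, 6), (4, 4, 2, 1, 6),
    (4, 5, 1, 0, 7), (4, 5, 1, 1, 7),
    (7, 3, 2, 0, 8), (4, 3, 2, 1, 8),
    (7, 3, 1, 0, 9), (7, 3, 1, 1, 9),
]


def sort_mortars(mortars):
    """Sort mortars hierarchically by (r_tilde, i_k, i, i_sub).

    The fifth entry, i_face, is not part of the key and is carried along passively.
    """
    return sorted(mortars, key=lambda col: col[:4])


# Columns of A in sorted order; column j (1-based) corresponds to i_mortar = j.
A_columns = sort_mortars(mortars_unsorted)

# Transpose to obtain the five rows of A as written in Eq. (A.2).
A_rows = [list(row) for row in zip(*A_columns)]

for name, row in zip(["r_tilde", "i_k", "i", "i_sub", "i_face"], A_rows):
    print(f"{name:8s}", row)
```

Since tuples compare lexicographically, slicing off the first four entries as the sort key reproduces the outermost-to-innermost ordering of the criteria.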
The columns of $A$ are now indexed from left to right, that is from 1 to 10. This index is called $i_\mathrm{mortar}$. It is depicted
in Figure A.15. For example, the upper left mortar belongs to the ninth column of $A$. As can be verified in the different
sub-figures of Figure A.14, the column entries match the indices of the upper left mortar.
Figure A.15: Mortar index $i_\mathrm{mortar}$ as a result of sorting the columns of $A$.
While the upper four rows of $A$ are needed to create a globally unique sorting, the lower two rows are needed
locally for the mapping from local sides to global sorting. To this end, the mapping array $m$ is filled. It represents the
inverse mapping of the last two rows of $A$. Its rows correspond to $i_\mathrm{sub}$ (ranging from 0 to 1) and its columns to $i_\mathrm{face}$.
Note that the column index does not necessarily start at 1, but in our example ranges from 5 to 9. The entries of $m$ are
the mortar indices $i_\mathrm{mortar}$, such that in the considered example,
\[
m =
\begin{pmatrix}
10 & 3 & 5 & 9 & 7 \\
2 & 4 & 6 & 1 & 8
\end{pmatrix}.
\tag{A.3}
\]
As a check, the upper left entry of $m$ with row index 0 and column index 5 has the value $i_\mathrm{mortar} = 10$,
while in the tenth column of $A$, $i_\mathrm{sub} = 0$ and $i_\mathrm{face} = 5$.
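The construction of the mapping array can be illustrated with a short Python sketch. It is not taken from FLEXI: the names `build_mapping` and `lookup` are hypothetical, and the offset handling for the column index is one possible way to deal with a local side index that starts at 5 rather than 1. The input is the sorted array of Eq. (A.2), stored column by column.

```python
# Sorted columns of A from Eq. (A.2): (r_tilde, i_k, i, i_sub, i_face).
A_columns = [
    (4, 3, 2, 1, 8), (4, 4, 1, 1, 5), (4, 4, 2, 0, 6), (4, 4, 2, 1, 6),
    (4, 5, 1, 0, 7), (4, 5, 1, 1, 7), (7, 3, 1, 0, 9), (7, 3, 1, 1, 9),
    (7, 3, 2, 0, 8), (7, 4, 1, 0, 5),
]


def build_mapping(columns, n_sub=2):
    """Invert the last two rows of A: (i_sub, i_face) -> i_mortar (1-based)."""
    face_min = min(col[4] for col in columns)
    face_max = max(col[4] for col in columns)
    m = [[None] * (face_max - face_min + 1) for _ in range(n_sub)]
    for i_mortar, col in enumerate(columns, start=1):
        i_sub, i_face = col[3], col[4]
        m[i_sub][i_face - face_min] = i_mortar
    return m, face_min


m, face_min = build_mapping(A_columns)


def lookup(i_sub, i_face):
    """Return the mortar index of a local mortar; shifts i_face by its lower bound."""
    return m[i_sub][i_face - face_min]


print(m)             # rows: i_sub = 0, 1; columns: i_face = 5, ..., 9
print(lookup(0, 5))  # mortar index of the mortar with i_sub = 0, i_face = 5
```

With this array, a local mortar can be placed in the globally sorted storage by a single table lookup, without searching through $A$.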
As described in Section 4.4, the array $m$ is now used to store the solution and flux on the mortars in their own arrays
$U^{\mathrm{sm}}_{\mathrm{primary}}$, $U^{\mathrm{sm}}_{\mathrm{replica}}$, $F^{\mathrm{sm}}_{\mathrm{primary}}$ and $F^{\mathrm{sm}}_{\mathrm{replica}}$
using only one index $i_\mathrm{mortar}$ to distinguish between the mortars. It has the desired
properties described in Section 4.4 in that contiguous chunks of data are sent and received during communication and
the order of the data is globally defined and thus identical for the sending and the receiving process. To this end, a similar
sorting procedure is carried out on the moving subdomain, too.
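To see why the sorted order yields contiguous chunks per communication partner, the following Python sketch derives the range of $i_\mathrm{mortar}$ belonging to each neighboring rank from the first row of $A$ in Eq. (A.2). The function name `contiguous_ranges` is hypothetical, and the actual MPI send and receive calls are deliberately left out; the sketch only shows the bookkeeping under the assumption that the mortars are already sorted by rank.

```python
from itertools import groupby

# First row of A from Eq. (A.2): rank of the adjacent moving process per mortar.
r_tilde_row = [4, 4, 4, 4, 4, 4, 7, 7, 7, 7]


def contiguous_ranges(ranks):
    """Map each rank to the inclusive 1-based range of mortar indices it owns.

    Assumes `ranks` is sorted, so that every rank forms exactly one run.
    """
    ranges = {}
    first = 1
    for rank, run in groupby(ranks):
        count = len(list(run))
        ranges[rank] = (first, first + count - 1)
        first += count
    return ranges


chunks = contiguous_ranges(r_tilde_row)
for rank, (lo, hi) in chunks.items():
    print(f"rank {rank}: mortars {lo} to {hi}")
```

In this example, all data exchanged with rank 4 occupies mortars 1 to 6 and all data exchanged with rank 7 occupies mortars 7 to 10, so each partner can be served from one contiguous slice of the mortar arrays.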
References
[1] Z. Wang, K. Fidkowski, R. Abgrall, F. Bassi, D. Caraeni, A. Cary, H. Deconinck, R. Hartmann, K. Hillewaert, H. Huynh, N. Kroll, G. May,
P.-O. Persson, B. van Leer, M. Visbal, High-order CFD methods: current status and perspective, International Journal for Numerical Methods
in Fluids 72 (8) (2013) 811–845.
19
[2] X.-D. Liu, S. Osher, T. Chan, Weighted essentially non-oscillatory schemes, Journal of Computational Physics 115 (1) (1994) 200–212.
[3] D. S. Balsara, C.-W. Shu, Monotonicity preserving weighted essentially non-oscillatory schemes with increasingly high order of accuracy,
Journal of Computational Physics 160 (2) (2000) 405 – 452.
[4] J. W. L. Paul F. Fischer, S. G. Kerkemeier, nek5000 Web page, http://nek5000.mcs.anl.gov (2008).
[5] C. Cantwell, D. Moxey, A. Comerford, A. Bolis, G. Rocco, G. Mengaldo, D. De Grazia, S. Yakovlev, J.-E. Lombard, D. Ekelschot, B. Jordi,
H. Xu, Y. Mohamied, C. Eskilsson, B. Nelson, P. Vos, C. Biotto, R. Kirby, S. Sherwin, Nektar++: An open-source spectral/hp element
framework, Computer Physics Communications 192 (2015) 205–219.
[6] H. T. Huynh, A flux reconstruction approach to high-order schemes including discontinuous Galerkin methods, in: 18th AIAA Computational
Fluid Dynamics Conference, 2007, p. 4079.
[7] Y. Liu, M. Vinokur, Z. Wang, Spectral dierence method for unstructured grids I: Basic formulation, Journal of Computational Physics
216 (2) (2006) 780 – 801.
[8] Z. J. Wang, Y. Liu, G. May, A. Jameson, Spectral dierence method for unstructured grids II: Extension to the Euler equations, Journal of
Scientific Computing 32 (1) (2007) 45–71.
[9] G. May, On the connection between the spectral dierence method and the discontinuous Galerkin method, Communications in Computa-
tional Physics 9 (4) (2011) 10711080.
[10] D. De Grazia, G. Mengaldo, D. Moxey, P. E. Vincent, S. J. Sherwin, Connections between the discontinuous Galerkin method and high-order
flux reconstruction schemes, International Journal for Numerical Methods in Fluids 75 (12) (2014) 860–877.
[11] J. S. Hesthaven, T. Warburton, Nodal discontinuous Galerkin methods: algorithms, analysis, and applications, Springer Science & Business
Media, 2007.
[12] F. Hindenlang, G. J. Gassner, C. Altmann, A. Beck, M. Staudenmaier, C.-D. Munz, Explicit discontinuous Galerkin methods for unsteady
problems, Computers & Fluids 61 (2012) 86 – 93, high Fidelity Flow Simulations Onera Scientific Day.
[13] M. Atak, A. Beck, T. Bolemann, D. Flad, H. Frank, C.-D. Munz, High fidelity scale-resolving computational fluid dynamics using the high
order discontinuous Galerkin spectral element method, in: W. E. Nagel, D. H. Kr¨
oner, M. M. Resch (Eds.), High Performance Computing in
Science and Engineering ’15, Springer International Publishing, Cham, 2016, pp. 511–530.
[14] N. Krais, A. Beck, T. Bolemann, H. Frank, D. Flad, G. Gassner, F. Hindenlang, M. Homann, T. Kuhn, M. Sonntag, C.-D. Munz, FLEXI:
A high order discontinuous Galerkin framework for hyperbolic-parabolic conservation laws, Computers & Mathematics with Applications
(2020).
[15] A. D. Beck, T. Bolemann, D. Flad, H. Frank, G. J. Gassner, F. Hindenlang, C.-D. Munz, High-order discontinuous Galerkin spectral element
methods for transitional and turbulent flow simulations, International Journal for Numerical Methods in Fluids 76 (8) (2014) 522–548.
[16] T. Bolemann, A. Beck, D. Flad, H. Frank, V. Mayer, C.-D. Munz, High-order discontinuous Galerkin schemes for large-eddy simulations
of moderate Reynolds number flows, in: IDIHOM: Industrialization of High-Order Methods-A Top-Down Approach, Springer, 2015, pp.
435–456.
[17] C. A. A. Minoli, D. A. Kopriva, Discontinuous Galerkin spectral element approximations on moving meshes, Journal of Computational
Physics 230 (5) (2011) 1876 – 1902.
[18] X. Liu, N. R. Morgan, D. E. Burton, A Lagrangian discontinuous Galerkin hydrodynamic method, Computers & Fluids 163 (2018) 68–85.
[19] V. A. Dobrev, T. V. Kolev, R. N. Rieben, High-Order Curvilinear Finite Element Methods for Lagrangian Hydrodynamics, SIAM Journal on
Scientific Computing (Sep 2012).
[20] R. W. Anderson, V. A. Dobrev, T. V. Kolev, R. N. Rieben, V. Z. Tomov, High-Order Multi-Material ALE Hydrodynamics, SIAM Journal on
Scientific Computing (Jan 2018).
[21] L. Wang, P.-O. Persson, A high-order discontinuous Galerkin method with unstructured space–time meshes for two-dimensional compressible
flows on domains with large deformations, Computers & Fluids 118 (2015) 53–68.
[22] E. Gaburro, W. Boscheri, S. Chiocchetti, C. Klingenberg, V. Springel, M. Dumbser, High order direct Arbitrary-Lagrangian-Eulerian schemes
on moving Voronoi meshes with topology changes, Journal of Computational Physics 407 (2020) 109167.
[23] E. J. Caramana, The implementation of slide lines as a combined force and velocity boundary condition, Journal of Computational Physics
228 (11) (2009) 3911–3916.
[24] E. Gaburro, A unified framework for the solution of hyperbolic PDE systems using high order direct Aarbitrary-Lagrangian-Eulerian schemes
on moving unstructured meshes with topology change, Archives of Computational Methods in Engineering (2020) 1–73.
[25] G. Wang, F. Duchaine, D. Papadogiannis, I. Duran, S. Moreau, L. Y. Gicquel, An overset grid method for large eddy simulation of turboma-
chinery stages, Journal of Computational Physics 274 (2014) 333–355.
[26] J. Ahmad, E. P. Duque, Helicopter rotor blade computation in unsteady flows using moving overset grids, Journal of Aircraft 33 (1) (1996)
54–60.
[27] H. Pomin, S. Wagner, Navier-Stokes analysis of helicopter rotor aerodynamics in hover and forward flight, Journal of Aircraft 39 (5) (2002)
813–821.
[28] V. Sankaran, A. Wissink, A. Datta, J. Sitaraman, M. Potsdam, B. Jayaraman, A. Katz, S. Kamkar, B. Roget, D. Mavriplis, H. Saberi, W.-B.
Chen, W. Johnson, R. Strawn, Overview of the Helios Version 2.0 Computational Platform for Rotorcraft Simulations.
[29] F. Zahle, N. N. Sørensen, J. Johansen, Wind turbine rotor-tower interaction using an incompressible overset grid method, Wind Energy: An
International Journal for Progress and Applications in Wind Power Conversion Technology 12 (6) (2009) 594–619.
[30] E. van der Weide, G. Kalitzin, J. Schluter, J. Alonso, Unsteady turbomachinery computations using massively parallel platforms, in: 44th
AIAA Aerospace Sciences Meeting and Exhibit, 2006, p. 421.
[31] A. Bakker, R. D. LaRoche, M.-H. Wang, R. V. Calabrese, Sliding mesh simulation of laminar flow in stirred reactors, Chemical Engineering
Research and Design 75 (1) (1997) 42–44.
[32] Z. Jaworski, M. Wyszynski, I. Moore, A. Nienow, Sliding mesh computational fluid dynamics-a predictive tool in stirred tank design, Pro-
ceedings of the Institution of Mechanical Engineers, Part E: Journal of Process Mechanical Engineering 211 (3) (1997) 149–156.
[33] K. Ng, N. Fentiman, K. Lee, M. Yianneskis, Assessment of sliding mesh CFD predictions and LDA measurements of the flow in a tank stirred
by a rushton impeller, Chemical Engineering Research and Design 76 (6) (1998) 737–747.
20
[34] J. McNaughton, I. Afgan, D. Apsley, S. Rolfo, T. Stallard, P. Stansby, A simple sliding-mesh interface procedure and its application to the
CFD simulation of a tidal-stream turbine, International Journal for Numerical Methods in Fluids 74 (4) (2014) 250–269.
[35] B. Zhang, C. Liang, A simple, ecient, and high-order accurate curved sliding-mesh interface approach to spectral dierence method on
coupled rotating and stationary domains, Journal of Computational Physics 295 (2015) 147–160.
[36] B. Zhang, C. Liang, J. Yang, Y. Rong, A 2d parallel high-order sliding and deforming spectral dierence method, Computers & Fluids 139
(2016) 184 – 196, 13th USNCCM International Symposium of High-Order Methods for Computational Fluid Dynamics - A special issue
dedicated to the 60th birthday of Professor David Kopriva.
[37] Z. Qiu, B. Zhang, C. Liang, M. Xu, A high-order solver for simulating vortex-induced vibrations using the sliding-mesh spectral dierence
method and hybrid grids, International Journal for Numerical Methods in Fluids 90 (4) (2019) 171–194.
[38] E. Ferrer, R. H. Willden, A high order discontinuous Galerkin-Fourier incompressible 3d Navier-Stokes solver with rotating sliding meshes,
Journal of Computational Physics 231 (21) (2012) 7037 – 7056.
[39] L. Ramrez, C. Foulqui, X. Nogueira, S. Khelladi, J.-C. Chassaing, I. Colominas, New high-resolution-preserving sliding mesh techniques for
higher-order finite volume schemes, Computers & Fluids 118 (2015) 114 – 130.
[40] M. Wurst, M. Keßler, E. Kr¨
amer, A high-order discontinuous Galerkin Chimera method for laminar and turbulent flows, Computers & Fluids
121 (2015) 102 – 113.
[41] M. J. Brazell, J. Sitaraman, D. J. Mavriplis, An overset mesh approach for 3d mixed element high-order discretizations, Journal of Computa-
tional Physics 322 (2016) 33 – 51.
[42] H. Gallus, ERCOFTAC test case 6: axial flow turbine stage, in: Seminar and Workshop on 3D Turbomachinery flow prediction III, Les Arcs,
France, 1995.
[43] R. Walraevens, H. Gallus, Testcase 6–1-1/2 stage axial flow turbine, ERCOFTAC Testcase 6 (1997) 201–212.
[44] T. Volmar, B. Brouillet, H. Benetschik, H. Gallus, Test case 6: 1-1/2 stage axial flow turbine-unsteady computation, in: ERCOFTAC Turbo-
machinery Seminar and Workshop, 1998.
[45] C. Hirt, A. Amsden, J. Cook, An arbitrary Lagrangian-Eulerian computing method for all flow speeds, Journal of Computational Physics
14 (3) (1974) 227–253.
[46] D. A. Kopriva, Implementing spectral methods for partial dierential equations: Algorithms for scientists and engineers (2009).
[47] F. Bassi, S. Rebay, A high-order accurate discontinuous finite element method for the numerical solution of the compressible Navier-Stokes
equations, Journal of Computational Physics 131 (2) (1997) 267–279.
[48] S. Pirozzoli, Numerical methods for high-speed flows, Annual Review of Fluid Mechanics 43 (2011) 163–194.
[49] P. L. Roe, Approximate Riemann solvers, parameter vectors, and dierence schemes, Journal of Computational Physics 43 (2) (1981) 357–
372.
[50] A. Harten, J. M. Hyman, Self adjusting grid methods for one-dimensional hyperbolic conservation laws, Journal of Computational Physics
50 (2) (1983) 235 – 269.
[51] M. H. Carpenter, C. A. Kennedy, Fourth-order 2n-storage Runge-Kutta schemes, Tech. Rep. NASA-TM-109112, NASA Langley Research
Center (June 1994).
[52] C. Mavriplis, Nonconforming discretizations and a posteriori error estimators for adaptive spectral element techniques, Ph.D. thesis, Mas-
sachusetts Institute of Technology (1989).
[53] D. A. Kopriva, A conservative staggered-grid Chebyshev multidomain method for compressible flows. II. A semi-structured method, Journal
of Computational Physics 128 (2) (1996) 475–488.
[54] D. A. Kopriva, A staggered-grid multidomain spectral method for the compressible Navier-Stokes equations, Journal of Computational
Physics 143 (1) (1998) 125–158.
[55] E. Laughton, G. Tabor, D. Moxey, A comparison of interpolation techniques for non-conformal high-order discontinuous Galerkin methods
(2020). arXiv:2007.15534.
[56] N. Krais, G. Schncke, T. Bolemann, G. J. Gassner, Split form ALE discontinuous Galerkin methods with applications to under-resolved
turbulent low-Mach number flows, Journal of Computational Physics 421 (2020) 109726. doi:https://doi.org/10.1016/j.jcp.
2020.109726.
URL http://www.sciencedirect.com/science/article/pii/S0021999120305003
[57] F. Hindenlang, T. Bolemann, C.-D. Munz, Mesh curving techniques for high order discontinuous Galerkin simulations, in: IDIHOM: Indus-
trialization of high-order methods-a top-down approach, Springer, 2015, pp. 133–152.
[58] G. J. Gassner, F. L¨
orcher, C.-D. Munz, J. S. Hesthaven, Polymorphic nodal elements and their application in discontinuous Galerkin methods,
Journal of Computational Physics 228 (5) (2009) 1573–1590.
[59] C. Altmann, A. D. Beck, F. Hindenlang, M. Staudenmaier, G. J. Gassner, C.-D. Munz, An ecient high performance parallelization of a
discontinuous Galerkin spectral element method, in: Facing the Multicore-Challenge III, Springer, 2013, pp. 37–47.
[60] R. E. Walraevens, H. E. Gallus, A. R. Jung, J. F. Mayer, H. Stetter, Experimental and computational study of the unsteady flow in a 1.5 stage
axial turbine with emphasis on the secondary flow in the second stator, in: ASME 1998 International Gas Turbine and Aeroengine Congress
and Exhibition, American Society of Mechanical Engineers, 1998, pp. V001T01A069–V001T01A069.
[61] T. W. Volmar, B. Brouillet, H. E. Gallus, H. Benetschik, Time-accurate three-dimensional Navier-Stokes analysis of one-and-one-half stage
axial-flow turbine, Journal of Propulsion and Power 16 (2) (2000) 327–335.
[62] J. Yao, R. L. Davis, J. J. Alonso, A. Jameson, Massively parallel simulation of the unsteady flow in an axial turbine stage, Journal of Propulsion
and Power 18 (2) (2002) 465–471.
[63] Unsteady Simulation of a 1.5 Stage Turbine Using an Implicitly Coupled Nonlinear Harmonic Balance Method, Vol. Volume 8: Turboma-
chinery, Parts A, B, and C of Turbo Expo: Power for Land, Sea, and Air.
[64] C. Utz, Experimentelle Untersuchung der Str¨
omungsverluste in einer mehrstufigen Axialturbine, Ph.D. thesis, ETH Z ¨
urich, Switzerland
(1972). doi:10.3929/ETHZ-A- 000088584.
[65] N. Gourdain, Prediction of the unsteady turbulent flow in an axial compressor stage. part 1: Comparison of unsteady RANS and LES with
experiments, Computers & Fluids 106 (2015) 119 – 129.
21
[66] W. Rodi, DNS and LES of some engineering flows, Fluid Dynamics Research 38 (2-3) (2006) 145–173.
[67] J.-R. Carlson, Inflow/outflow boundary conditions with application to fun3d, Tech. Rep. NASA-TM2011-217181, NASA Langley Research
Center (October 2011).
[68] D. Flad, A. D. Beck, G. Gassner, C.-D. Munz, A discontinuous Galerkin spectral element method for the direct numerical simulation of
aeroacoustics, in: 20th AIAA/CEAS Aeroacoustics Conference, 2014, p. 2740.
[69] J. Niegemann, R. Diehl, K. Busch, Ecient low-storage Runge-Kutta schemes with optimized stability regions, Journal of Computational
Physics 231 (2) (2012) 364 – 372.
[70] X. Wu, P. A. Durbin, Evidence of longitudinal vortices evolved from distorted wakes in a turbine passage, Journal of Fluid Mechanics 446
(2001) 199228.
[71] J. D. Coull, H. P. Hodson, Unsteady boundary-layer transition in low-pressure turbines, Journal of Fluid Mechanics 681 (2011) 370410.
[72] J. Jeong, F. Hussain, On the identification of a vortex, Journal of Fluid Mechanics 285 (1995) 6994.
[73] P. Kopper, M. Kurz, C. Wenzel, J. D ¨
urrw¨
achter, C. Koch, A. Beck, Boundary layer dynamics in wall-resolved LES across multiple turbine
stages, manuscript submitted for publication (2020).
[74] A. Beck, P. Ortwein, P. Kopper, N. Krais, D. Kempf, C. Koch, Towards high-fidelity erosion prediction: On time-accurate particle tracking in
turbomachinery, International Journal of Heat and Fluid Flow 79 (2019) 108457.
22
... T sliding mesh method is commonly employed in turbomachinery applications to simulate the unsteady interaction between adjacent rows, in conjunction with arbitrary Lagrangian-Eulerian [1] (ALE) formulations of the Navier-Stokes equations [2] or employing the formulation of those equations in a relative frame of reference [3]. It is a method that allows relative displacements between domains keeping the mesh of each domain unchanged, which results in a very efficient method to tackle the simulation of the interaction between neighbor turbomachinery rows. ...
... One method that guarantees both excellent conservation properties and order preservation is the mortar element method [6,7]. There are many examples of its application to the study of configurations with rotating domains [2,8,9]. In this paper we present an implementation of a sliding-mesh based on the mortar element method for multiple GPUs. ...
... Even though the method works for any kind of elements and projections, in this work we are restricting ourselves to the much simpler case where both non-conformal planes contain only quadrilateral elements with the same polynomial order and with a uniform mesh spacing, as depicted in figure 1. This approach, also followed by Dürrwächter et al. [2], loses flexibility in defining the meshes at non-conformal planes in exchange for an improved computational efficiency. ...
Conference Paper
Sliding meshes are one of the most common methods to simulate the interaction between fluid domains in relative movement. A usual method to exchange data in the resulting non-conformal surface meshes is the mortar element method, which is high order accurate. We present an efficient algorithm to implement this method into a flux reconstruction compressible Navier-Stokes equations solver that runs on multiple GPUs. It is designed to handle non-conformal quadrilateral meshes that must be identical at both sliding surfaces. Moreover, the mesh spacing must be uniform in the sliding direction. We present an implementation of the GPU kernels to exchange data between the surface meshes and the mortar elements that minimizes the computational cost. We also provide details of the parallel implementation, which is based on MPI. All possible parallel connectivities are pre-computed to ease the treatment of the changing connectivity. Even though that generates a storage overhead, the improved parallel efficiency justifies this approach. The use of non-blocking asynchronous GPU to GPU parallel data transfers and the maximization of the overlap between computation and communication allow hiding the extra cost of parallel communications for large enough cases. The proposed algorithm adds roughly 1% computational overhead in serial executions, and its parallel scaling is good up to 10 6 DOFs per parallel domain on ITP Aero's GPU cluster, with an extra cost around 8% in the most demanding cases. The method is used to perform a wall-resolved implicit Large Eddy Simulation of the interaction between upstream passing bars and a Low Pressure Turbine airfoil. The comparison with experimental data shows excellent predicting capabilities.
... For example, only small progress [5,6] has been made in obtaining a general filter function that commutes with different spatial discretization operators for non-uniform unstructured meshes. In ILES, an implicit filter is determined by computational grids and numerical schemes, which makes it more flexible and universal in practical applications [7,8,9,10,11,12,13,14,15,16,17,18]. Nevertheless, the lack of an explicit filter function often makes ILES extremely complicated to conduct a rigorous analysis similar to ELES, which in turn prevents ILES from being a more widespread approach. ...
... For different numerical schemes, their discretization errors and resulting behaviors in ILES vary from each other, and some methods cannot even produce a reasonable result without further treatment [19]. In recent years, a high order discontinuous Galerkin spectral element method (DGSEM) [20,21] and its stable variants [22,23,24,13,16] have been particularly favored in ILES due to their desirable dissipation/dispersion properties and high computational efficiency in large-scale computations [25,26,27,28,16,17,18]. Apart from the success of DGSEM in practical ILES applications, several previous numerical experiments in the aspects of solution nodes choices [29], volume integration accuracy [30,24] and numerical advection/viscous flux choices [31,13] have been done to provide a deep insight into the ILES working mechanism behind. ...
... Flad and Gassner [13] implemented the split-form DGSEM into the open source code FLEXI later on. The split-form DGSEM has been widely applied to a series of practical application [16,17,18,26,27,28], but it is found to be of a low solution accuracy while a highly coarse mesh is used [13,39,40]. Within this test, we will check whether our proposed SGS model is applicable to a split-form DGSEM. ...
Preprint
Full-text available
Although implicit large eddy simulations (ILES) based on the high-order discontinuous Galerkin spectral element methods (DGSEM) have been successfully applied to many complex turbulent flows, the implicit filtering and closure mechanisms behind are still not fully clear. One major obstacle is the lack of a complete numerical analysis framework similar to a conventional explicit large eddy simulation (ELES). In this work, with an observation that the best possible numerical solution within a mesh cell is a local L 2 projection of the exact solution, we derive the intrinsic filter of DGSEM for ILES. Before deriving a subgrid-scale (SGS) term, an equivalent differential equation of DGSEM is explicitly derived. With the developed ILES analysis framework, DGSEM for ILES can be associated with a specific DGSEM for ELES, and a discretization-consistent SGS model based on the hypothesis of scale similarity is proposed. The novel SGS model is easy to implement, and most importantly, it can significantly improve the accuracy of DGSEM ILES. For validation, a commonly-used decaying homogeneous isotropic turbulence (DHIT) problem is solved numerically in our paper. The proposed SGS model is applied with two types of DGSEM to DHIT test cases with different Reynolds numbers in order to demonstrate its wide application capability. As a preliminary step to continuously develop the SGS model, we propose to improve its flexibility and accuracy through multiplying by a constant factor.
... For example, only small progress [5,6] has been made in obtaining a general filter function that commutes with different spatial discretization operators for non-uniform unstructured meshes. In ILES, an implicit filter is determined by computational grids and numerical schemes, which makes it more flexible and universal in practical applications [7,8,9,10,11,12,13,14,15,16,17,18,19]. Nevertheless, the lack of an explicit filter function often makes ILES extremely complicated to conduct a rigorous analysis similar to ELES, which in turn prevents ILES from being a more widespread approach. ...
... However, without further treatment such as imposing extra artificial viscosity, a standard high-order DGSEM tends to be unstable in under-resolved flows [40,41]. In order to stabilize under-resolved flows without adding excessive numerical dissipation, Gassner et al. [42] introduced a split-form DGSEM, which has been widely applied to a series of ILES applications [16,18,19,43,44,45]. Hence, the split-form DGSEM is chosen as the baseline of our study in the present paper. ...
Preprint
Full-text available
Although implicit large eddy simulations (ILES) based on the high order discontinuous Galerkin spectral element methods (DGSEM) have been successfully applied to many complex turbulent flows, the implicit filtering and closing (or modeling) mechanisms behind are still not fully clear. One major obstacle is the lack of a complete numerical analysis framework similar to a conventional explicit large eddy simulation. In this work, based on an equivalent way to construct DGSEM, we try to investigate the underlying LES filtering and closing procedures associate with DGSEM discretizations. With the identification of an intrinsic filter of DGSEM for LES, the resulting filtered differential equation (FDE) and subgrid-scale (SGS) term are derived accordingly. Our analysis shows that, in previous ILES based on split-form DGSEM, the SGS term is modeled solely relying on the discretization error without taking the effect of the intrinsic filter into consideration. A potential improvement for ILES can be therefore achieved by accounting for the effect of the intrinsic filter on the SGS term and employing a physical modeling approach to reduces the demand of DGSEM discretizations. In the present paper, ground on the derived FDE, an intrinsic filter-based scale similarity (IF-SS) model is proposed to approximate the SGS term. The proposed IF-SS model can be easily implemented into an existing split-form DGSEM solver, and most importantly, it improves the accuracy of split-form DGSEM for LES, especially on a highly coarse mesh. For validation, a commonly-used decaying homogeneous isotropic turbulence problem is solved numerically in our paper. Several a priori tests are conducted to investigate the properties of the intrinsic filter and the IF-SS model. Through comparing the correlation coefficient between the IF-SS model and the target SGS term in an a priori way, a free parameter of the second filtering in the IF-SS model is determined. 
Afterwards, several a posteriori tests are followed to further demonstrate the effectiveness of the IF-SS model in LES. To illustrate the superiority of the IF-SS model over classical eddy-viscosity models for high order DGSEM, a popular dynamic Smagorinsky model is included for comparison. At last, a viscous Taylor-Green vortex test case is solved to show the capability of the IF-SS model in the prediction of laminar-to-turbulence transition.
... In this setting, the solution from each side is first projected to shared mortar elements, followed by evaluating flux terms on the mortar space, and finally projected back to each side. This approach can also be applied to DG, as shown in [30,31,32,33], where L 2 projection is required to retain spectral convergence. In all previous publications, the projection in the mortar element method is performed between the solution degree of freedom, so the inverse mass matrix is explicitly presented in the formulation, which is expensive to compute. ...
Preprint
High-order discontinuous Galerkin spectral element methods (DGSEM) have received growing attention and development, especially in the regime of computational fluid dynamics in recent years. The inherent flexibility of the discontinuous Galerkin approach in handling non-conforming interfaces, such as those encountered in moving geometries or hp-refinement, presents a significant advantage for real-world simulations. Despite the well-established mathematical framework of DG methods, practical implementation challenges persist to boost performance and capability. Most previous studies only focus on certain choices of element shape or basis type in a structured mesh, although they have demonstrated the capability of DGSEM in complex flow simulations. This work discusses the low-cost and unified interface flux evaluation approaches for general spectral elements in unstructured meshes, alongside their implementations in the open-source spectral element framework, Nektar++. The initial motivation arises from the discretization of Helmholtz equations by the symmetric interior penalty method, in which the system matrix can easily become non-symmetric if the flux is not properly evaluated on non-conforming interfaces. We focus on the polynomial non-conforming case in this work but extending to the geometric non-conforming case is theoretically possible. Comparisons of different approaches, trade-offs, and performance of benchmark of our initial matrix-free implementation are also included, contributing to the broader discourse on high-performance spectral element method implementations.
... It employs the discontinuous Galerkin spectral element method (DGSEM), which facilitates high-order accuracy and supports fully unstructured hexahedral meshes. Developed by the Numerics Research Group (NRG) at the University of Stuttgart's Institute of Aerodynamics and Gasdynamics, FLEXI has demonstrated exceptional scalability -efficiently operating on large-scale applications on over 500 000 computing cores [4,5,9] and recently has also been adapted for GPU systems [24]. Note that unlike other studies, our sampled environments start from different points on the vortex shedding cycle, thus making the learned policy more robust to starting conditions, and providing ample variety of state-action-reward tuples for training. ...
Preprint
Full-text available
Reinforcement learning (RL) has recently gained traction for active flow control tasks, with initial applications exploring drag mitigation via flow field augmentation around a two-dimensional cylinder. RL has since been extended to more complex turbulent flows and has shown significant potential in learning complex control strategies. However, such applications remain computation-ally challenging owing to its sample inefficiency and associated simulation costs. This fact is worsened by the lack of generalization capabilities of these trained policy networks, often being implicitly tied to the input configurations of their training conditions. In this work, we propose the use of graph neural networks (GNNs) to address this particular limitation, effectively increasing the range of applicability and getting more value out of the upfront RL training cost. GNNs can naturally process unstructured, three-dimensional flow data, preserving spatial relationships without the constraints of a Cartesian grid. Additionally, they incorporate rotational, reflectional, and permutation invariance into the learned control policies, thus improving generalization and thereby removing the shortcomings of commonly used convolutional neural networks (CNNs) or multilayer perceptron (MLP) architectures. To demonstrate the effectiveness of this approach, we revisit the well-established two-dimensional cylinder benchmark problem for active flow control. The RL training is implemented using Relexi, a high-performance RL framework, with flow simulations conducted in parallel using the high-order discontinuous Galerkin framework FLEXI. Our results show that GNN-based control policies achieve comparable performance to existing methods while benefiting from improved generalization properties. This work establishes GNNs as a promising architecture for RL-based flow control and highlights the capabilities of Relexi and FLEXI for large-scale RL applications in fluid dynamics.
... This special variation of the DG method is based on a nodal tensor-product basis with collocated integration and interpolation points on unstructured and even non-conforming hexahedral elements, allowing for very efficient dimension-by-dimension element-wise operations. A split-form DGSEM [29] is particularly favored due to its high robustness in the simulation of under-resolved turbulent flows [30][31][32][33][34][35]. Recently, WMLES approaches based on an equilibrium wall-stress model were implemented and validated in the context of the split-form DGSEM [36]. ...
Preprint
The use of tensor-product basis functions restricts a standard discontinuous Galerkin spectral element method (DGSEM) to hexahedral meshes in three dimensions. In addition, a DGSEM poses strict requirements on the mesh quality, e.g. the Jacobian determinant of the element mapping must be positive. In a wall-modeled large eddy simulation (WMLES), the preference for an approximately isotropic mesh near the wall makes a tet-to-hex meshing technique (splitting one tetrahedron into four hexahedra) a good fit for a DGSEM. In the present study, a tet-to-hex meshing approach is employed to increase the geometric adaptability of a DGSEM for WMLES. Moreover, an accelerated implicit time integration scheme is used to reduce the computational cost. After validation on a benchmark turbulent channel flow problem, the NASA High-Lift Common Research Model (CRM-HL) at an angle of attack of 19.57° from the 4th AIAA High-Lift Prediction Workshop is computed. The results are encouraging: the third-order WMLES for the CRM-HL costs around 2.0 million CPU hours and shows favorable agreement with the experiment in terms of integrated loads and flow-separation patterns. In addition, a clear superiority of a high-order WMLES over a low-order WMLES in accuracy and efficiency can be seen.
... Its robustness and accuracy in LES were demonstrated by Flad and Gassner [36] based on a canonical decaying homogeneous isotropic turbulence test case. Later on, the split-form DGSEM was successfully applied to a series of practical LES applications [37,38,39,40,41,42,43,44]. Hence, the split-form DGSEM is chosen as the baseline of our study in the present paper. ...
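The tet-to-hex split mentioned in the abstract above (one tetrahedron into four hexahedra) can be sketched in a few lines of Python. This is only an illustrative construction under stated assumptions, not the meshing code of the cited study: it uses the common subdivision built from the tetrahedron's vertices, edge midpoints, face centroids, and cell centroid, and the function names `tet_to_hex` and `hex_volume` are hypothetical. Corner orientation is not normalized here, so a solver that requires a positive Jacobian would still need to reorder corners.

```python
def _mean(*pts):
    """Arithmetic mean of 3D points given as tuples."""
    n = len(pts)
    return tuple(sum(p[d] for p in pts) / n for d in range(3))


def _signed_tet_volume(a, b, c, d):
    """Signed volume of the tetrahedron (a, b, c, d)."""
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    w = [d[k] - a[k] for k in range(3)]
    det = (u[0] * (v[1] * w[2] - v[2] * w[1])
           - u[1] * (v[0] * w[2] - v[2] * w[0])
           + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return det / 6.0


def tet_to_hex(verts):
    """Split a tetrahedron (4 vertices) into 4 hexahedra (8 corners each).

    Each hexahedron is attached to one tetrahedron vertex and uses the
    three adjacent edge midpoints, the three adjacent face centroids,
    and the cell centroid as its remaining corners.
    """
    centroid = _mean(*verts)
    hexes = []
    for i in range(4):
        j, k, l = [n for n in range(4) if n != i]
        vi, vj, vk, vl = verts[i], verts[j], verts[k], verts[l]
        hexes.append([
            vi, _mean(vi, vj), _mean(vi, vj, vk), _mean(vi, vk),   # "bottom" quad
            _mean(vi, vl), _mean(vi, vj, vl), centroid, _mean(vi, vk, vl),  # "top" quad
        ])
    return hexes


def hex_volume(h):
    """Volume of a convex hexahedron via six tetrahedra around diagonal 0-6."""
    ring = [(1, 2), (2, 3), (3, 7), (7, 4), (4, 5), (5, 1)]
    return sum(abs(_signed_tet_volume(h[0], h[a], h[b], h[6])) for a, b in ring)


unit_tet = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
hexes = tet_to_hex(unit_tet)
tet_volume = abs(_signed_tet_volume(*unit_tet))
```

For this subdivision the four hexahedra have planar faces and equal volume, so `sum(hex_volume(h) for h in hexes)` should match `tet_volume` up to round-off.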
Preprint
According to past numerical experiments, with a regular LES mesh, implicit large eddy simulations (ILES) based on the split-form discontinuous Galerkin spectral element method (DGSEM) are capable of ensuring good accuracy in wall-bounded turbulent flows. However, as mesh resolution is reduced, the solution quality of ILES declines rapidly. A common practice is to use an eddy-viscosity subgrid-scale (SGS) model. Apart from its capability of modeling the effect of unresolved small scales, an eddy-viscosity model is beneficial for computational stability due to its dissipative character. Nevertheless, numerous numerical experiments in wall-bounded turbulent flows show that LES using DGSEM with an eddy-viscosity model fails to produce a satisfying result, especially on a highly coarse mesh. In this work, on the basis of an identified intrinsic filter of split-form DGSEM for LES [1], we propose an intrinsic filter-based mixed (IF-M) model to improve LES modeling accuracy. Within an IF-M model, an intrinsic filter-based scale-similarity (IF-SS) model [1] is used in combination with an eddy-viscosity model; as an example, the popular Vreman model is used. The central idea is to exploit their complementary characteristics in SGS modeling. In addition, a theoretical analysis framework for the IF-M model is developed to investigate its properties and underlying working mechanisms. Finally, based on a canonical incompressible turbulent channel flow, a series of a priori and a posteriori tests are conducted to show the modeling accuracy of the IF-M model.
Article
A CFD investigation is conducted in the present study to analyse the effect of wind lenses on the aerodynamic performance of a vertical-axis wind turbine (VAWT). The case of a 2D Savonius bi-blade is considered, where different configurations of wind lenses are studied and compared. To simulate the fluid flow over a rotating wind turbine, unsteady 2D numerical computations are carried out to solve the Unsteady Reynolds Averaged Navier-Stokes (URANS) equations, where the turbulence is modeled by the SST k-ω model. The results demonstrate an increase in the power coefficient across various tip speed ratios when wind lenses are utilised. Additionally, the study analyses the wake patterns generated by both conventional open turbines and turbines integrated with wind lenses.
Article
The aim of this study is to evaluate the aerodynamic efficiency of a Savonius vertical-axis wind turbine. The approach used relies on resolving the Unsteady Reynolds Averaged Navier-Stokes equations (URANS), the turbulence being modeled by the k-ω SST model. The flow around the wind turbine is simulated using the arbitrary sliding interfaces technique. First, the study investigates the impact of blade shape on wind turbine efficiency by examining seven Savonius rotors constructed with distinct blade configurations. The results indicate that the highest aerodynamic performance is provided by the rotor with the elliptical blades, with a notable increase in the power coefficient of about 80% in comparison to the classic semi-circular profile. To further enhance the efficiency of the Savonius wind turbine, a twin-rotor configuration using the elliptical blades was studied. The results indicate a further enhancement in the power coefficient, reaching 110% compared to a single rotor with semicircular blades.
Thesis
With the arrival of the digital era of manufacturing, numerical simulation technology plays an important role as a powerful research tool in all aspects of the optimization design of internal combustion engines. The working process of internal combustion engines is a complex process coupling flow, heat transfer and chemical reaction, and there are also the effects of multiple moving components such as valves and pistons, making the three-dimensional thermal performance simulation of internal combustion engines more complicated. In the context of localization and self-reliance of CAE software, the development of CFD numerical simulation software and its key generic algorithms for internal combustion engines is of strategic significance for aerospace, aeronautics, and marine engineering. In this study, a numerical calculation method for in-cylinder flow and spray combustion under the effect of inlet and exhaust gas flow and its full-flow general parallel code are developed based on the in-house platform GTEA (General Transport Equation Analyzer), a general equation solver independently developed by the laboratory.
Article
The capability to incorporate moving geometric features within models for complex simulations is a common requirement in many fields. Fluid mechanics within aeronautical applications, for example, routinely feature rotating (e.g. turbines, wheels and fan blades) or sliding components (e.g. in compressor or turbine cascade simulations). With an increasing trend towards the high-fidelity modelling of these cases, in particular combined with the use of high-order discontinuous Galerkin methods, there is therefore a requirement to understand how different numerical treatments of the interfaces between the static mesh and the sliding/rotating part impact on overall solution quality. In this article, we compare two different approaches to handle this non-conformal interface. The first is the so-called mortar approach, where flux integrals along edges are split according to the positioning of the non-conformal grid. The second is a less-documented point-to-point interpolation method, where the interior and exterior quantities for flux evaluations are interpolated from elements lying on the opposing side of the interface. Although the mortar approach has significant advantages in terms of its numerical properties, in that it preserves the local conservation properties of DG methods, in the context of complex 3D meshes it poses notable implementation difficulties which the point-to-point method handles more readily. In this paper we examine the numerical properties of each method, focusing not only on observing convergence orders for smooth solutions, but also how each method performs in under-resolved simulations of linear and nonlinear hyperbolic problems, to inform the use of these methods in implicit large-eddy simulations.
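The mortar idea described in the abstract above, where flux integrals along an interface are split according to the position of the non-conforming grid, can be illustrated with a small Python sketch. It is a simplified 1D picture under stated assumptions, not the implementation of the cited article: both sides of the interface are given as sorted lists of face boundaries covering the same interval, and the function name `mortar_segments` is hypothetical.

```python
def mortar_segments(left, right, tol=1e-12):
    """Split a shared 1D interface into mortar segments.

    left, right: sorted face boundaries on each side of the interface,
    covering the same interval. Returns a list of tuples
    (start, end, left_face_index, right_face_index); on each segment
    exactly one left face meets exactly one right face, so a numerical
    flux can be integrated there segment by segment.
    """
    if abs(left[0] - right[0]) > tol or abs(left[-1] - right[-1]) > tol:
        raise ValueError("both sides must cover the same interval")

    segments = []
    i = j = 0          # current face index on each side
    start = left[0]
    while i < len(left) - 1 and j < len(right) - 1:
        end = min(left[i + 1], right[j + 1])
        if end - start > tol:
            segments.append((start, end, i, j))
        # Decide which side(s) reached a face boundary before advancing.
        advance_left = left[i + 1] - end <= tol
        advance_right = right[j + 1] - end <= tol
        if advance_left:
            i += 1
        if advance_right:
            j += 1
        start = end
    return segments


# Two faces on the static side, three faces on the shifted (sliding) side.
static_side = [0.0, 1.0, 2.0]
sliding_side = [0.0, 0.5, 1.5, 2.0]
segments = mortar_segments(static_side, sliding_side)
# segments == [(0.0, 0.5, 0, 0), (0.5, 1.0, 0, 1), (1.0, 1.5, 1, 1), (1.5, 2.0, 1, 2)]
```

Because the segments tile the interface without gaps or overlaps, summing a flux over the segments belonging to one face recovers the integral over that face, which is the property behind the local conservation mentioned in the abstract.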
Article
In this work, we review the family of direct Arbitrary-Lagrangian–Eulerian (ALE) finite volume (FV) and discontinuous Galerkin (DG) schemes on moving meshes that at each time step are rearranged by explicitly allowing topology changes, in order to guarantee a robust mesh evolution even for high shear flow and very long evolution times. Two different techniques are presented: a local nonconforming approach for dealing with sliding lines, and a global regeneration of Voronoi tessellations for treating general unpredicted movements. Corresponding elements at consecutive times are connected in space-time to construct closed space-time control volumes, whose bottom and top faces may be polygons with a different number of nodes, with different neighbors, and even degenerate space-time sliver elements. Our final ALE FV-DG scheme is obtained by integrating, over these arbitrarily shaped space-time control volumes, the space-time conservation formulation of the governing hyperbolic PDE system. We thus directly evolve the solution in time, avoiding any remapping stage, while being conservative and satisfying the GCL by construction. Arbitrary high order of accuracy in space and time is achieved through a fully discrete one-step predictor–corrector ADER approach, also integrated with well-balancing techniques to further improve the accuracy and to maintain many physical invariants of the studied system exactly, even at the discrete level. A large set of different numerical tests has been carried out in order to check the accuracy and the robustness of our methods for both smooth and discontinuous problems, in particular in the case of vortical flows.
Article
We present a new family of very high order accurate direct Arbitrary-Lagrangian-Eulerian (ALE) Finite Volume (FV) and Discontinuous Galerkin (DG) schemes for the solution of nonlinear hyperbolic PDE systems on moving two-dimensional Voronoi meshes that are regenerated at each time step and which explicitly allow topology changes in time. The Voronoi tessellations are obtained from a set of generator points that move with the local fluid velocity. We employ an AREPO-type approach (Springel, MNRAS 2010), which rapidly rebuilds a new high quality mesh exploiting the previous one, but rearranging the element shapes and neighbors in order to guarantee that the mesh evolution is robust even for vortex flows and for very long simulation times. The old and new Voronoi elements associated to the same generator point are connected in space--time to construct closed space--time control volumes, whose bottom and top faces may be polygons with a different number of sides. We also have to incorporate some degenerate space--time sliver elements, which are needed in order to fill the space--time holes that arise because of the topology changes in the mesh between time t^n and time t^(n+1). The final ALE FV-DG scheme is obtained by a novel redesign of the high order accurate fully discrete direct ALE schemes of Boscheri and Dumbser, which have been extended here to general moving Voronoi meshes and space--time sliver elements. Our new numerical scheme is based on the integration over arbitrary shaped closed space--time control volumes combined with a fully-discrete space--time conservation formulation of the governing hyperbolic PDE system. In this way the discrete solution is conservative and satisfies the geometric conservation law (GCL) by construction. Numerical convergence studies as well as a large set of benchmark problems for hydrodynamics and magnetohydrodynamics (MHD) demonstrate the accuracy and robustness of the proposed method. 
Our numerical results clearly show that the new combination of very high order schemes with regenerated meshes that allow topology changes in each time step leads to substantial improvements compared to direct ALE methods on moving conforming meshes without topology change.
Article
We present a high-order solver for simulating vortex-induced vibrations (VIVs) in very challenging situations, for example VIVs of a row of very closely placed objects with large relative displacements. This solver works on unstructured hybrid grids by employing the high-order tensor-product spectral difference (SD) method for quadrilateral grids and the Raviart-Thomas SD method for triangular grids. To deal with challenging situations in which a traditional conforming moving mesh fails, we split a computational domain into non-overlapping subdomains, where each interior subdomain encloses an object and moves freely with respect to its neighbors. A nonuniform sliding-mesh method that ensures high-order accuracy is developed to deal with sliding interfaces between subdomains. A monolithic approach is adopted to seamlessly couple the fluid and the solid vibration equations. Moreover, the solver is parallelized to further improve its efficiency on distributed-memory computers. Through a series of numerical tests, we demonstrate that this solver is high-order accurate for both inviscid and viscous flows and has good parallel efficiency, making it ideal for VIV studies.
Article
Modern turbomachinery relies on accurate prediction of the flow and especially the state of turbulence to achieve the required level of performance. Transition, relaminarization, wake interactions, and interrow influence form complex, highly unsteady flow patterns. Large eddy simulation (LES) emerges as a promising method to deliver improved accuracy over Reynolds-averaged Navier-Stokes (RANS) approaches as the major energy-carrying scales are fully resolved. However, for wall-bounded flows, modeling of (parts of) the boundary layer might still be inevitable to keep the computational costs manageable. In this paper, we aim to characterize and analyze the boundary-layer state in such a scenario. We employ the high-order discontinuous Galerkin spectral element method to perform a wall-resolved LES of a stator-rotor-stator cascade (Ma up to 0.65, Re up to 8.0×10^5). The interfaces between the blade rows are treated with a high-order accurate sliding mesh approach. Time-averaged performance characteristics are compared against experimental and numerical data. The temporal evolution of the solution is first assessed through the phase-averaged flow field at different stator-rotor positions. Subsequently, special emphasis is placed on the spatiotemporal evolution of turbulence near the blade surface. Boundary-layer profile and energy spectra analysis are used to give insight into the turbulence development. This investigation not only reveals the complexities of the boundary-layer dynamics but can also serve as benchmark and reference for evaluation and development of wall-modeled LES approaches for turbomachinery applications.
Article
The construction of discontinuous Galerkin (DG) methods for the compressible Euler or Navier-Stokes equations (NSE) includes the approximation of non-linear flux terms in the volume integrals. The terms can lead to aliasing and stability issues in turbulence simulations with moderate Mach numbers (Ma≲0.3), e.g. due to under-resolution of vortical dominated structures typical in large eddy simulations (LES). The kinetic energy or entropy are elevated in smooth, but under-resolved parts of the solution which are affected by aliasing. It is known that the kinetic energy is not a conserved quantity for compressible flows, but for small Mach numbers minor deviations from a conserved evolution can be expected. While it is formally possible to construct kinetic energy preserving (KEP) and entropy conserving (EC) DG methods for the Euler equations, due to the viscous terms in case of the NSE, we aim to construct kinetic energy dissipative (KED) or entropy stable (ES) DG methods on moving curved hexahedral meshes. The Arbitrary Lagrangian-Eulerian (ALE) approach is used to include the effect of mesh motion in the split form DG methods. First, we use the three dimensional Taylor-Green vortex to investigate and analyze our theoretical findings and the behavior of the novel split form ALE DG schemes for a turbulent vortical dominated flow. Second, we apply the framework to a complex aerodynamics application. An implicit LES split form ALE DG approach is used to simulate the transitional flow around a plunging SD7003 airfoil at Reynolds number Re=40,000 and Mach number Ma=0.1. We compare the standard nodal ALE DG scheme, the ALE DG variant with consistent overintegration of the non-linear terms and the novel KED and ES split form ALE DG methods in terms of robustness, accuracy and computational efficiency.
Article
High order (HO) schemes are attractive candidates for the numerical solution of multiscale problems occurring in fluid dynamics and related disciplines. Among the HO discretization variants, discontinuous Galerkin schemes offer a collection of advantageous features which have led to a strong increase in interest in them and related formulations in the last decade. The methods have matured sufficiently to be of practical use for a range of problems, for example in direct numerical and large eddy simulation of turbulence. However, in order to take full advantage of the potential benefits of these methods, all steps in the simulation chain must be designed and executed with HO in mind. Especially in this area, many commercially available closed-source solutions fall short. In this work, we therefore present the FLEXI framework, a HO consistent, open-source simulation tool chain for solving the compressible Navier–Stokes equations on CPU clusters. We describe the numerical algorithms and implementation details and give an overview of the features and capabilities of all parts of the framework. Beyond these technical details, we also discuss the important but often overlooked issues of code stability, reproducibility and user-friendliness. The benefits gained by developing an open-source framework are discussed, with a particular focus on usability for the open-source community. We close with sample applications that demonstrate the wide range of use cases and the expandability of FLEXI and an overview of current and future developments.
Article
Erosion and fouling caused by ingested particles cause performance degradation and safety issues in turbomachinery components. Simulating these processes is a complex multiphysics and multiscale problem which has not reached a satisfactory level of maturity yet. The current state-of-the-art approach is based on RANS solutions, which provide an averaged carrier phase on which the particles are advanced in an a posteriori manner. Upon wall impact, the particle quantities are then fed to the erosion and rebound models. In this work, we present as an alternative to this approach an Eulerian/Lagrangian simulation framework of high-order accuracy in space and time for the time-resolved prediction of particle motion in complex flows. We apply it to the LES of a particle-laden flow over a T106C low pressure turbine linear cascade. We then contrast time-averaged and time-accurate flow fields as carrier phases for the particles and highlight how the associated modeling assumptions influence the solution. Based on the particle Stokes number, we identify characteristic regimes and their interaction with the flow phase. By a detailed comparison of the particle statistics, we highlight the effects of turbulent small scale behavior and define the modeling challenges associated with finding accurate particle closure models for time-averaged simulations. This framework constitutes a first step towards high-fidelity erosion prediction for turbomachinery applications.
Article
We present a new approach for multi-material arbitrary Lagrangian–Eulerian (ALE) hydrodynamics simulations based on high-order finite elements posed on high-order curvilinear meshes. The method builds on and extends our previous work in the Lagrangian [V. A. Dobrev, T. V. Kolev, and R. N. Rieben, SIAM J. Sci. Comput., 34 (2012), pp. B606–B641] and remap [R. W. Anderson et al., Internat. J. Numer. Methods Fluids, 77 (2015), pp. 249–273] phases of ALE, and depends critically on a functional perspective that enables subzonal physics and material modeling [V. A. Dobrev et al., Internat. J. Numer. Methods Fluids, 82 (2016), pp. 689–706]. Curvilinear mesh relaxation is based on node movement, which is determined through the solution of an elliptic equation. The remap phase is posed in terms of advecting state variables between two meshes over a fictitious time interval. The resulting advection equation is solved by a discontinuous Galerkin (DG) formulation, combined with a customized Flux Corrected Transport (FCT) type algorithm. Because conservative fields are remapped, additional synchronization steps are introduced to preserve bounds with respect to primal fields. These steps include modification of the low-order FCT solutions, definition of conservative FCT fluxes based on primal field bounds, and monotone transitions between primal and conservative fields. This paper describes the mathematical formulation and properties of our approach and reports a number of numerical results from its implementation in the BLAST code [BLAST: High-order finite element Lagrangian hydrocode, http://www.llnl.gov/CASC/blast]. Additional details can be found in [R. W. Anderson et al., High-Order Multi-Material ALE Hydrodynamics (Extended Version), Tech. report LLNL-JRNL-706339, Lawrence Livermore National Laboratory, Livermore, CA, 2016].
Article
We present a new Lagrangian discontinuous Galerkin (DG) hydrodynamic method for solving the two-dimensional gas dynamic equations on unstructured hybrid meshes. The physical conservation laws for the momentum and total energy are discretized using a DG method based on linear Taylor expansions. Three different approaches are investigated for calculating the density variation over the element. The first approach evolves a Taylor expansion of the specific volume field. The second approach follows certain finite element methods and uses the strong mass conservation to calculate the density field at a location inside the element or on the element surface. The third approach evolves a Taylor expansion of the density field. The nodal velocity, and the corresponding forces, are explicitly calculated by solving a multidirectional approximate Riemann problem. An effective limiting strategy is presented that ensures monotonicity of the primitive variables. This new Lagrangian DG hydrodynamic method conserves mass, momentum, and total energy. Results from a suite of test problems are presented to demonstrate the robustness and expected second-order accuracy of this new method.