Journal of Scientific Computing manuscript No.
(will be inserted by the editor)
Fast Parallel Solver for the Space-Time IgA-DG Discretization
of the Diffusion Equation
Pietro Benedusi ·Paola Ferrari ·Carlo Garoni ·
Rolf Krause ·Stefano Serra-Capizzano
Received: date / Accepted: date

P. Benedusi, R. Krause
University of Italian Switzerland (USI), Euler Institute, Lugano, Switzerland
E-mail: pietro.benedusi@usi.ch, rolf.krause@usi.ch

P. Ferrari
University of Insubria, Department of Science and High Technology, Como, Italy
E-mail: pferrari@uninsubria.it

C. Garoni
University of Rome Tor Vergata, Department of Mathematics, Rome, Italy
E-mail: garoni@mat.uniroma2.it

S. Serra-Capizzano
University of Insubria, Department of Humanities and Innovation, Como, Italy, and Uppsala University, Department of Information Technology, Division of Scientific Computing, Uppsala, Sweden
E-mail: stefano.serrac@uninsubria.it, stefano.serra@it.uu.se

Abstract We consider the space-time discretization of the diffusion equation, using an isogeometric analysis (IgA) approximation in space and a discontinuous Galerkin (DG) approximation in time. Drawing inspiration from a former spectral analysis, we propose for the resulting space-time linear system a multigrid preconditioned GMRES method, which combines a preconditioned GMRES with a standard multigrid acting only in space. The performance of the proposed solver is illustrated through numerical experiments, which show its competitiveness in terms of iteration count, run-time and parallel scaling.

Keywords isogeometric analysis · discontinuous Galerkin · preconditioned GMRES · multigrid · parallel solver · spectral distribution · diffusion equation

Mathematics Subject Classification (2010) 65M60 · 65F08 · 65M55 · 65Y05 · 47B06 · 35Q79

1 Introduction

In recent years, with ever increasing computational capacities, space-time methods have received fast growing attention from the scientific community. Space-time approximations of dynamic problems, in contrast to standard time-stepping techniques, enable full space-time parallelism on modern massively parallel architectures [27]. Moreover, they can naturally
deal with moving domains [38,57,58,59,63] and allow for space-time adaptivity [1,24,28,
39,47,49,61]. The main idea of space-time formulations is to consider the temporal dimen-
sion as an additional spatial one and assemble a large space-time system to be solved in
parallel as in [25]. Space-time methods have been used in combination with various numer-
ical techniques, including finite differences [2,11, 35], finite elements [4,26,37,40], isoge-
ometric analysis [34,41], and discontinuous Galerkin methods [1,16,32, 37, 38, 48,57,63].
Moreover, they have been considered for a variety of applications, such as mechanics [15],
fluid dynamics [11,38,54], fluid-structure interaction [60], and many others. When dealing
with space-time finite elements, the time direction needs special care. To ensure that the in-
formation flows in the positive time direction, a particular choice of the basis in time is often
used. The discontinuous Galerkin formulation with an “upwind” flow is a common choice
in this context; see, for example, [38,51, 57,62].
Specialized parallel solvers have been recently developed for the large linear systems
arising from space-time discretizations. We mention in particular the space-time parallel
multigrid proposed by Gander and Neumüller [29], the parallel preconditioners for space-
time isogeometric analysis proposed by Hofer et al. [34], the fast diagonalization techniques
proposed by Langer and Zank [42] and Loli et al. [44], and the parallel proposal by Mc-
Donald and Wathen [46]. We also refer the reader to [56] for a recent review on space-time
methods for parabolic evolution equations, and to [55] for algebraic multigrid methods.
In the present paper, we focus on the diffusion equation
    ∂_t u(t,x) − ∇ · (K(x)∇u(t,x)) = f(t,x),   (t,x) ∈ (0,T) × (0,1)^d,
    u(t,x) = 0,                                 (t,x) ∈ (0,T) × ∂((0,1)^d),        (1.1)
    u(t,x) = 0,                                 (t,x) ∈ {0} × (0,1)^d,

where K(x) ∈ ℝ^{d×d} is the matrix of diffusion coefficients and f(t,x) is a source term. It is assumed that K(x) is symmetric positive definite at every point x ∈ (0,1)^d and each component of K(x) is a continuous bounded function on (0,1)^d. We impose homogeneous Dirichlet initial/boundary conditions both for simplicity and because the inhomogeneous case reduces to the homogeneous case by considering a lifting of the boundary data [50]. We consider for (1.1) the same space-time approximation as in [10], involving a p-degree C^k isogeometric analysis (IgA) discretization in space and a q-degree discontinuous Galerkin (DG) discretization in time. Here, p = (p_1,...,p_d) and k = (k_1,...,k_d), where 0 ≤ k ≤ p − 1 (i.e., 0 ≤ k_i ≤ p_i − 1 for all i = 1,...,d) and the parameters p_i and k_i represent, respectively, the polynomial degree and the smoothness of the IgA basis functions in direction x_i.
The overall discretization process leads to solving a large space-time linear system. We propose a fast solver for this system in the case of maximal smoothness k = p − 1, i.e., the case corresponding to the classical IgA paradigm [3,9,17,36]. The solver is a preconditioned GMRES (PGMRES) method whose preconditioner P̃ is obtained as an approximation of another preconditioner P inspired by the spectral analysis carried out in [10]. Informally speaking, the preconditioner P̃ is a standard multigrid, which is applied only in space and not in time, and which involves, at all levels, a single symmetric Gauss–Seidel post-smoothing step and standard bisection for the interpolation and restriction operators (following the Galerkin assembly). The proposed solver is then a multigrid preconditioned GMRES (MG-GMRES). Its performance is illustrated through numerical experiments and turns out to be satisfactory in terms of iteration count and run-time. In addition, the solver is suited for parallel computation, as it shows remarkable scaling properties with respect to the number of cores. Comparisons with other benchmark solvers are also presented and reveal the actual competitiveness of our proposal.
The paper is organized as follows. In Section 2, we briefly recall the space-time IgA-
DG discretization of (1.1) and we report the main result of [10] concerning the spectral
distribution of the associated discretization matrix C. In Section 3, we present a PGMRES
method for the matrix C, which is the root from which the proposed solver originated. In
Section 4, we describe the proposed solver. In Section 5, we describe its parallel version. In
Section 6, we illustrate its performance in terms of iteration count, run-time and scaling. In
Section 7, we test it on a generalization of problem (1.1) where (0,1)^d is replaced by a non-rectangular domain and the considered IgA discretization involves a non-trivial geometry.
In Section 8, we draw conclusions. In order to keep this paper as concise as possible, we
borrow notation and terminology from [10]. It is therefore recommended that the reader
takes a look at Sections 1 and 2 of [10].
2 Space-Time IgA-DG Discretization of the Diffusion Equation
Let N ∈ ℕ and n = (n_1,...,n_d) ∈ ℕ^d, and define the following uniform partitions in time and space:

    t_i = iΔt,   i = 0,...,N,   Δt = T/N,
    x_i = iΔx = (i_1Δx_1,...,i_dΔx_d),   i = 0,...,n,   Δx = (Δx_1,...,Δx_d) = (1/n_1,...,1/n_d).

We consider for the differential problem (1.1) the same space-time discretization as in [10], i.e., we use a p-degree C^k IgA approximation in space based on the uniform mesh {x_i, i = 0,...,n} and a q-degree DG approximation in time based on the uniform mesh {t_i, i = 0,...,N}. Here, p = (p_1,...,p_d) and k = (k_1,...,k_d) are multi-indices, with p_i and 0 ≤ k_i ≤ p_i − 1 representing, respectively, the polynomial degree and the smoothness of the IgA basis functions in direction x_i. As explained in [10, Section 3], the overall discretization process leads to a linear system

    C^{[q,p,k]}_{N,n}(K) u = f,                                                (2.1)
where:
– C^{[q,p,k]}_{N,n}(K) is the N × N block matrix given by

      C^{[q,p,k]}_{N,n}(K) =
          [ A              ]
          [ B   A          ]
          [     ⋱    ⋱     ]
          [         B   A  ],                                                  (2.2)

  where, for brevity, A = A^{[q,p,k]}_{n}(K) and B = B^{[q,p,k]}_{n};

– the blocks A^{[q,p,k]}_{n}(K) and B^{[q,p,k]}_{n} are (q+1)n̄ × (q+1)n̄ matrices given by

      A^{[q,p,k]}_{n}(K) = K^{[q]} ⊗ M_{n,[p,k]} + (Δt/2) M^{[q]} ⊗ K_{n,[p,k]}(K),        (2.3)
      B^{[q,p,k]}_{n} = −J^{[q]} ⊗ M_{n,[p,k]},                                            (2.4)

  where n̄ = ∏_{i=1}^d (n_i(p_i − k_i) + k_i − 1) is the number of degrees of freedom (DoFs) in space (the total number of DoFs is equal to the size N(q+1)n̄ of the matrix C^{[q,p,k]}_{N,n}(K)); each block row in the block partition of C^{[q,p,k]}_{N,n}(K) given by (2.2) is referred to as a time slab (an illustrative assembly sketch of (2.2)–(2.4) is given after this list);
– M_{n,[p,k]} and K_{n,[p,k]}(K) are the n̄ × n̄ mass and stiffness matrices in space, which are given by

      M_{n,[p,k]} = [ ∫_{[0,1]^d} B_{j+1,[p,k]}(x) B_{i+1,[p,k]}(x) dx ]_{i,j=1}^{n(p−k)+k−1},                        (2.5)
      K_{n,[p,k]}(K) = [ ∫_{[0,1]^d} K(x) ∇B_{j+1,[p,k]}(x) · ∇B_{i+1,[p,k]}(x) dx ]_{i,j=1}^{n(p−k)+k−1},            (2.6)

  where B_{1,[p,k]},...,B_{n(p−k)+k+1,[p,k]} are the tensor-product B-splines defined by

      B_{i,[p,k]}(x) = ∏_{r=1}^d B_{i_r,[p_r,k_r]}(x_r),   i = 1,...,n(p−k)+k+1,

  and B_{1,[p_r,k_r]},...,B_{n_r(p_r−k_r)+k_r+1,[p_r,k_r]} are the B-splines of degree p_r and smoothness C^{k_r} defined on the knot sequence

      { 0,...,0 (p_r+1 times), 1/n_r,...,1/n_r (p_r−k_r times), 2/n_r,...,2/n_r (p_r−k_r times), ..., (n_r−1)/n_r,...,(n_r−1)/n_r (p_r−k_r times), 1,...,1 (p_r+1 times) };
– M^{[q]}, K^{[q]}, J^{[q]} are the (q+1) × (q+1) blocks given by

      M^{[q]} = [ ∫_{−1}^{1} ℓ_{j,[q]}(τ) ℓ_{i,[q]}(τ) dτ ]_{i,j=1}^{q+1},                                  (2.7)
      K^{[q]} = [ ℓ_{j,[q]}(1) ℓ_{i,[q]}(1) − ∫_{−1}^{1} ℓ_{j,[q]}(τ) ℓ'_{i,[q]}(τ) dτ ]_{i,j=1}^{q+1},      (2.8)
      J^{[q]} = [ ℓ_{j,[q]}(1) ℓ_{i,[q]}(−1) ]_{i,j=1}^{q+1},                                               (2.9)

  where {ℓ_{1,[q]},...,ℓ_{q+1,[q]}} is a fixed basis for the space of polynomials of degree q. In the context of (nodal) DG methods [33], ℓ_{1,[q]},...,ℓ_{q+1,[q]} are often chosen as the Lagrange polynomials associated with q+1 fixed points {τ_1,...,τ_{q+1}} ⊆ [−1,1], such as, for example, the Gauss–Lobatto or the right Gauss–Radau nodes in [−1,1].
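The Kronecker structure (2.2)–(2.4) is straightforward to reproduce in code once the spatial and temporal factors are available. The following sketch is not taken from the paper (the authors' implementation uses C++ with PETSc and PetIGA); it only illustrates the assembly with SciPy for given factors, and the spatial matrices in the usage example are toy placeholders rather than an actual IgA discretization.

```python
# Illustrative sketch (not the authors' code): assemble the space-time matrix of (2.2)
# from A = K_q (x) M_s + (dt/2) M_q (x) K_s and B = -J_q (x) M_s, cf. (2.3)-(2.4).
import numpy as np
import scipy.sparse as sp

def assemble_space_time(M_s, K_s, M_q, K_q, J_q, N, T):
    """M_s, K_s: spatial mass/stiffness (n_bar x n_bar); M_q, K_q, J_q: temporal
    (q+1) x (q+1) blocks of (2.7)-(2.9); N: number of time slabs; T: final time."""
    dt = T / N
    A = sp.kron(K_q, M_s) + 0.5 * dt * sp.kron(M_q, K_s)   # eq. (2.3)
    B = -sp.kron(J_q, M_s)                                  # eq. (2.4)
    I_N = sp.identity(N, format="csr")
    sub = sp.diags([np.ones(N - 1)], [-1], format="csr")    # subdiagonal pattern of (2.2)
    C = sp.kron(I_N, A) + sp.kron(sub, B)                   # block lower bidiagonal matrix
    return C.tocsr(), A.tocsr(), B.tocsr()

if __name__ == "__main__":
    # Toy placeholder factors (NOT an IgA discretization): a 1D stiffness/mass stand-in
    # and the q = 0 temporal blocks, which follow from (2.7)-(2.9) with the constant
    # basis l_1 = 1 on [-1,1]: M_q = 2, K_q = 1, J_q = 1.
    n_bar = 8
    K_s = sp.diags([2*np.ones(n_bar), -np.ones(n_bar-1), -np.ones(n_bar-1)], [0, 1, -1])
    M_s = sp.identity(n_bar) / n_bar
    M_q = np.array([[2.0]]); K_q = np.array([[1.0]]); J_q = np.array([[1.0]])
    C, A, B = assemble_space_time(M_s, K_s, M_q, K_q, J_q, N=4, T=1.0)
    print(C.shape)   # (32, 32) = (N*(q+1)*n_bar, N*(q+1)*n_bar)
```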
The solution of system (2.1) yields the approximate solution of problem (1.1); see [10] for
details. The main result of [10] is reported in Theorem 2.1 below; see also [8, Section 6.2] for
a more recent and lucid proof. Before stating Theorem 2.1, let us recall the notion of spectral
distribution for a given sequence of matrices. In what follows, we say that a matrix-valued
function f : D → ℂ^{s×s}, defined on a measurable set D ⊆ ℝ^ℓ, is measurable if its components f_{ij} : D → ℂ, i, j = 1,...,s, are (Lebesgue) measurable.
Definition 2.1 Let {X_m}_m be a sequence of matrices, with X_m of size d_m tending to infinity, and let f : D → ℂ^{s×s} be a measurable matrix-valued function defined on a set D ⊂ ℝ^ℓ with 0 < measure(D) < ∞. We say that {X_m}_m has an (asymptotic) spectral distribution described by f, and we write {X_m}_m ∼_λ f, if

    lim_{m→∞} (1/d_m) ∑_{j=1}^{d_m} F(λ_j(X_m)) = (1/measure(D)) ∫_D (1/s) ∑_{i=1}^{s} F(λ_i(f(y))) dy

for all continuous functions F : ℂ → ℂ with compact support. In this case, f is called the spectral symbol of {X_m}_m.
Remark 2.1 The informal meaning behind Definition 2.1 is the following: assuming that f possesses s Riemann-integrable eigenvalue functions λ_i(f(y)), i = 1,...,s, the eigenvalues of X_m, except possibly for o(d_m) outliers, can be subdivided into s different subsets of approximately the same cardinality; and the eigenvalues belonging to the ith subset are approximately equal to the samples of the ith eigenvalue function λ_i(f(y)) over a uniform grid in the domain D. For instance, if ℓ = 1, d_m = ms, and D = [a,b], then, assuming we have no outliers, the eigenvalues of X_m are approximately equal to

    λ_i(f(a + j(b − a)/m)),   j = 1,...,m,   i = 1,...,s,

for m large enough; similarly, if ℓ = 2, d_m = m²s, and D = [a_1,b_1] × [a_2,b_2], then, assuming we have no outliers, the eigenvalues of X_m are approximately equal to

    λ_i(f(a_1 + j_1(b_1 − a_1)/m, a_2 + j_2(b_2 − a_2)/m)),   j_1, j_2 = 1,...,m,   i = 1,...,s,

for m large enough; and so on for ℓ ≥ 3.
Theorem 2.1 Let q ≥ 0 be an integer, let p ∈ ℕ^d and 0 ≤ k ≤ p − 1. Assume that K(x) is symmetric positive definite at every point x ∈ (0,1)^d and each component of K(x) is a continuous bounded function on (0,1)^d. Suppose the following two conditions are met:
– n = α n (componentwise, n_i = α_i n), where α = (α_1,...,α_d) is a vector with positive components in ℚ^d and n varies in some infinite subset of ℕ such that n = α n ∈ ℕ^d;
– N = N(n) is such that N → ∞ and N/n² → 0 as n → ∞.
Then, for the sequence of normalized space-time matrices {2Nn^{d−2} C^{[q,p,k]}_{N,n}(K)}_n we have the spectral distribution relation

    {2Nn^{d−2} C^{[q,p,k]}_{N,n}(K)}_n ∼_λ f^{[α,K]}_{[q,p,k]},

where:
– the spectral symbol f^{[α,K]}_{[q,p,k]} : [0,1]^d × [−π,π]^d → ℂ^{(q+1)∏_{i=1}^d(p_i−k_i) × (q+1)∏_{i=1}^d(p_i−k_i)} is defined as

      f^{[α,K]}_{[q,p,k]}(x,θ) = f^{[α,K]}_{[p,k]}(x,θ) ⊗ T M^{[q]};                        (2.10)

– f^{[α,K]}_{[p,k]} : [0,1]^d × [−π,π]^d → ℂ^{∏_{i=1}^d(p_i−k_i) × ∏_{i=1}^d(p_i−k_i)} is defined as

      f^{[α,K]}_{[p,k]}(x,θ) = (1/∏_{i=1}^d α_i) ∑_{i,j=1}^d α_i α_j K_{ij}(x) (H_{[p,k]})_{ij}(θ);    (2.11)

– H_{[p,k]} is a d × d block matrix whose (i,j) entry is a ∏_{i=1}^d(p_i−k_i) × ∏_{i=1}^d(p_i−k_i) block defined as in [10, eq. (5.12)];
– T is the final time in (1.1) and M^{[q]} is given in (2.7).
With the same argument used for proving Theorem 2.1, it is not difficult to prove the following result.

Theorem 2.2 Suppose the hypotheses of Theorem 2.1 are satisfied, and let

    Q^{[q,p,k]}_{N,n}(K) = (Δt/2) I_N ⊗ M^{[q]} ⊗ K_{n,[p,k]}(K).

Then,

    {2Nn^{d−2} (I_N ⊗ A^{[q,p,k]}_{n}(K))}_n ∼_λ f^{[α,K]}_{[q,p,k]},   {2Nn^{d−2} Q^{[q,p,k]}_{N,n}(K)}_n ∼_λ f^{[α,K]}_{[q,p,k]}.
Table 3.1: Number of iterations GM[p] and PGM[p] needed by, respectively, the GMRES and the PGMRES with preconditioner P^{[q,p,k]}_{N,n}(K), for solving the linear system (2.1), up to a precision ε = 10^{−8}, in the case where d = 2, K(x) = I_2, f(t,x) = 1, T = 1, q = 0, n = (n,n), p = (p,p), k = (p−1,p−1), N = n. The total size of the space-time system (number of DoFs) is given by n n̄ = n(n+p−2)².

n = N   GM[3]  PGM[3]  GM[4]  PGM[4]  GM[5]  PGM[5]
 20       66     21      85     21     170     21
 40      168     40     178     40     235     40
 60      295     59     314     59     360     59
 80      443     77     473     77     506     77
100      609     94     652     94     699     94
120      790    111     847    111     909    111

n = N   GM[6]  PGM[6]  GM[7]  PGM[7]  GM[8]  PGM[8]
 20      269     21     532     21     674     21
 40      380     40     572     40     656     40
 60      477     59     611     59     690     59
 80      621     77     720     77     791     77
100      780     94     879     94     963     94
120      971    111    1025    111    1114    111
3 PGMRES for the Space-Time IgA-DG System
Suppose the hypotheses of Theorem 2.1 are satisfied. Then, on the basis of Theorem 2.2 and the theory of (block) generalized locally Toeplitz (GLT) sequences [7,8,30,31,52,53], we expect that the sequence of preconditioned matrices

    (I_N ⊗ A^{[q,p,k]}_{n}(K))^{−1} C^{[q,p,k]}_{N,n}(K),                                   (3.1)

as well as the sequence of preconditioned matrices

    (Q^{[q,p,k]}_{N,n}(K))^{−1} C^{[q,p,k]}_{N,n}(K) = (2/Δt) (I_N ⊗ M^{[q]} ⊗ K_{n,[p,k]}(K))^{−1} C^{[q,p,k]}_{N,n}(K),    (3.2)

has an asymptotic spectral distribution described by the preconditioned symbol

    (f^{[α,K]}_{[q,p,k]})^{−1} f^{[α,K]}_{[q,p,k]} = I_{(q+1)∏_{i=1}^d(p_i−k_i)}.
This means that the eigenvalues of the two sequences of matrices (3.1) and (3.2) are (weakly) clustered at 1; see [7, Section 2.4.2]. Therefore, in view of the convergence properties of the GMRES method [13] (see in particular [13, Theorem 2.13] and the original research paper by Bertaccini and Ng [14]), we may expect that the PGMRES with preconditioner I_N ⊗ A^{[q,p,k]}_{n}(K) or Q^{[q,p,k]}_{N,n}(K) for solving a linear system with coefficient matrix C^{[q,p,k]}_{N,n}(K) has an optimal convergence rate, i.e., the number of iterations for reaching a preassigned accuracy ε is independent of (or only weakly dependent on) the matrix size. We may also expect that the same is true for the PGMRES with preconditioner

    P^{[q,p,k]}_{N,n}(K) = I_N ⊗ I_{q+1} ⊗ K_{n,[p,k]}(K) = I_{N(q+1)} ⊗ K_{n,[p,k]}(K),                (3.3)

because (up to a negligible normalization factor Δt/2) P^{[q,p,k]}_{N,n}(K) is spectrally equivalent to Q^{[q,p,k]}_{N,n}(K). Indeed, the spectrum of (P^{[q,p,k]}_{N,n}(K))^{−1} (I_N ⊗ M^{[q]} ⊗ K_{n,[p,k]}(K)) is contained in [c_q, C_q] for some positive constants c_q, C_q > 0 depending only on q. For instance, one can take c_q = λ_min(M^{[q]}) and C_q = λ_max(M^{[q]}), which are both positive as M^{[q]} is symmetric positive definite (see (2.7)).

Table 3.2: Number of iterations GM[p,k] and PGM[p,k] needed by, respectively, the GMRES and the PGMRES with preconditioner P^{[q,p,k]}_{N,n}(K), for solving the linear system (2.1), up to a precision ε = 10^{−8}, in the case where d = 2, K(x_1,x_2) = diag(cos(x_1) + x_2, x_1 + sin(x_2)), f(t,x) = 1, T = 1, q = 1, n = (n,n), p = (p,p), k = (k,k), N = 20. The number of DoFs is given by 40 n̄ = 40(n(p−k) + k − 1)². Note that K(x_1,x_2) is singular at (x_1,x_2) = (0,0).

 n    GM[1,0]  PGM[1,0]  GM[2,0]  PGM[2,0]  GM[2,1]  PGM[2,1]  GM[3,1]  PGM[3,1]
 20     244       42       383       42       156       42       276       42
 40     502       42       778       42       314       42       560       42
 60     763       42      1174       42       474       42       842       42
 80    1026       42      1570       42       635       42      1146       42
100    1275       42      1966       42       796       42      1894       42
120    1608       42      2374       42       954       42      1898       42

 n    GM[4,1]  PGM[4,1]  GM[4,2]  PGM[4,2]  GM[5,2]  PGM[5,2]  GM[5,3]  PGM[5,3]
 20     444       42       390       42       522       42       514       42
 40     759       42       565       42       721       42       643       42
 60    1148       42       771       42       953       42       831       42
 80    1536       42      1035       42      1337       42      1026       42
100    1909       42      1299       42      2232       42      1226       42
120    2329       42      1564       42      2390       42      1831       42
To show that our expectation is realized, we solve system (2.1) in two space dimensions (d = 2), up to a precision ε = 10^{−8}, by means of the GMRES and the PGMRES with preconditioner P^{[q,p,k]}_{N,n}(K), using f(t,x) = 1, T = 1, α = (1,1), n = α n = (n,n), p = (p,p), k = (k,k), and varying K(x), N, n, q, p, k. The resulting numbers of iterations are collected in Tables 3.1–3.3. We see from the tables that the GMRES solver rapidly deteriorates with increasing n, and it is not robust with respect to p, k. On the other hand, the convergence rate of the proposed PGMRES is robust with respect to all spatial parameters n, p, k, though its performance is clearly better in the case where N is fixed (Tables 3.2–3.3) than in the case where N increases (Table 3.1). An explanation of this phenomenon based on Theorem 2.1 is the following. In the case where N is fixed, the ratio N/n² converges to 0 much more quickly than in the case where N = n. Consequently, when N is fixed, the spectrum of both 2Nn^{d−2} C^{[q,p,k]}_{N,n}(K) and 2Nn^{d−2} Q^{[q,p,k]}_{N,n}(K) is better described by the symbol f^{[α,K]}_{[q,p,k]} than when N = n. Similarly, the spectrum of the preconditioned matrix (Q^{[q,p,k]}_{N,n}(K))^{−1} C^{[q,p,k]}_{N,n}(K) is better described by the preconditioned symbol I_{(q+1)∏_{i=1}^d(p_i−k_i)}. In conclusion, the eigenvalues of the preconditioned matrix are expected to be more clustered when N is fixed than when N = n.
In order to investigate the influence of q on the number of PGMRES iterations, we performed a further numerical experiment in Table 3.4. We observe that the considered PGMRES is not robust with respect to q, but the number of PGMRES iterations grows linearly with q. By comparing Tables 3.1 and 3.4, we note that the PGMRES convergence is linear with respect to both N and q. In practice, increasing q is the most convenient way to improve the temporal accuracy of the discrete solution u; see, e.g., [12]. This is due to the superconvergence property, according to which the order of convergence in time of a q-degree DG method is 2q + 1 [19,43]. Tables 3.1 and 3.4 show that the strategy of keeping N fixed and increasing q is more convenient even in terms of performance of the proposed PGMRES.

Table 3.3: Number of iterations GM[p,k] and PGM[p,k] needed by, respectively, the GMRES and the PGMRES with preconditioner P^{[q,p,k]}_{N,n}(K), for solving the linear system (2.1), up to a precision ε = 10^{−8}, in the case where d = 2,

    K(x_1,x_2) = [ (2 + cos x_1)(1 + x_2)          cos(x_1 + x_2) sin(x_1 + x_2)
                   cos(x_1 + x_2) sin(x_1 + x_2)   (2 + sin x_2)(1 + x_1) ],

f(t,x) = 1, T = 1, q = 2, n = (n,n), p = (p,p), k = (k,k), N = 20. The number of DoFs is given by 60 n̄ = 60(n(p−k) + k − 1)².

 n    GM[2,0]  PGM[2,0]  GM[2,1]  PGM[2,1]  GM[3,0]  PGM[3,0]  GM[3,2]  PGM[3,2]
 20     286       40       112       40       400       40       123       40
 40     579       40       228       40       809       40       224       40
 60     874       40       345       40      1218       40       339       40
 80    1170       40       463       40      1716       40       456       40
100    1466       40       580       40      2204       40       573       40
120    1757       40       697       40      2487       40       690       40

 n    GM[4,0]  PGM[4,0]  GM[4,3]  PGM[4,3]  GM[5,0]  PGM[5,0]  GM[5,4]  PGM[5,4]
 20     779       40       208       40      1460       40       396       40
 40    1070       40       270       40      1982       40       419       40
 60    1580       40       361       40      2376       40       466       40
 80    2176       40       487       40      2733       40       531       40
100    2668       40       613       40      3559       40       657       40
120    3284       40       738       40      4565       40       791       40

Table 3.4: Same setting as in Table 3.1 with n = N = 20 and q = 0, 1, 2, 3, 4.

 q   GM[3]  PGM[3]  GM[4]  PGM[4]  GM[5]  PGM[5]
 0     66     21      85     21     170     21
 1    122     42     154     42     280     42
 2    175     64     225     64     391     64
 3    222     95     289     95     464     95
 4    247    115     351    115     602    116

 q   GM[6]  PGM[6]  GM[7]  PGM[7]  GM[8]  PGM[8]
 0    269     21     532     21     674     21
 1    446     42     688     42     834     42
 2    491     64     580     64     672     64
 3    616     95     916     95    1103     95
 4   1031    116    1927    116    5468    116
As is known, each PGMRES iteration requires solving a linear system with coefficient matrix given by the preconditioner P^{[q,p,k]}_{N,n}(K), and this is not required in a GMRES iteration. Thus, if we want to prove that the proposed PGMRES is fast, we have to show that we are able to solve efficiently a linear system with matrix P^{[q,p,k]}_{N,n}(K). However, for the reasons explained in Section 4, this is not exactly the path we will follow.
Before moving on to Section 4, we remark that, thanks to the tensor structure (3.3), the solution of a linear system with coefficient matrix P^{[q,p,k]}_{N,n}(K) reduces to the solution of N(q+1) linear systems with coefficient matrix K_{n,[p,k]}(K). Indeed, the solution of the system P^{[q,p,k]}_{N,n}(K) x = y is given by

    x = (P^{[q,p,k]}_{N,n}(K))^{−1} y = (I_{N(q+1)} ⊗ K_{n,[p,k]}(K)^{−1}) y = [ K_{n,[p,k]}(K)^{−1} y_1 ; ... ; K_{n,[p,k]}(K)^{−1} y_{N(q+1)} ],      (3.4)

where y^T = [y_1^T, ..., y_{N(q+1)}^T] and each y_i has length n̄. It is then clear that the computation of the solution x is equivalent to solving the N(q+1) linear systems K_{n,[p,k]}(K) x_i = y_i, i = 1,...,N(q+1). Note that the various x_i can be computed in parallel, as the computation of x_i is independent of the computation of x_j whenever i ≠ j.
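As an illustration of (3.4), the following sketch (ours, not the paper's code) applies (P^{[q,p,k]}_{N,n}(K))^{−1} block by block: the right-hand side is split into N(q+1) chunks of length n̄ and each chunk is solved against the same spatial matrix; the solves are mutually independent and could be distributed among processors. The exact sparse factorization used here is only a stand-in for whatever spatial solver is employed.

```python
# Sketch of (3.4): apply P^{-1} = I_{N(q+1)} (x) K_s^{-1} block by block.
# In the actual solver of Section 4 the exact solves below are replaced by
# a few multigrid iterations on the spatial matrix.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def apply_P_inverse(spatial_solver, y, n_blocks):
    """spatial_solver: factorization of the spatial matrix (e.g. from splu);
    y: right-hand side of length n_blocks * n_bar."""
    n_bar = y.size // n_blocks
    x = np.empty_like(y)
    for i in range(n_blocks):          # the n_blocks solves are mutually independent
        yi = y[i * n_bar:(i + 1) * n_bar]
        x[i * n_bar:(i + 1) * n_bar] = spatial_solver.solve(yi)
    return x

if __name__ == "__main__":
    n_bar, N, q = 50, 4, 1             # toy sizes, not an actual IgA discretization
    K_s = sp.diags([2*np.ones(n_bar), -np.ones(n_bar-1), -np.ones(n_bar-1)],
                   [0, 1, -1], format="csc")
    lu = spla.splu(K_s)                # exact solver used here only for illustration
    y = np.random.rand(N * (q + 1) * n_bar)
    x = apply_P_inverse(lu, y, N * (q + 1))
```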
4 Fast Solver for the Space-Time IgA-DG System
From here on, we focus on the maximal smoothness case k = p − 1, that is, the case corresponding to the classical IgA approach. For notational simplicity, we drop the subscript/superscript k = p − 1, so that, for instance, the matrices C^{[q,p,p−1]}_{N,n}(K), P^{[q,p,p−1]}_{N,n}(K), K_{n,[p,p−1]}(K) will be denoted by C^{[q,p]}_{N,n}(K), P^{[q,p]}_{N,n}(K), K_{n,[p]}(K), respectively.
The solver suggested in Section 3 for a linear system with matrix C^{[q,p]}_{N,n}(K) is a PGMRES with preconditioner P^{[q,p]}_{N,n}(K). According to (3.4), the solution of a linear system with matrix P^{[q,p]}_{N,n}(K), which is required at each PGMRES iteration, is equivalent to solving N(q+1) linear systems with matrix K_{n,[p]}(K). Fast solvers for K_{n,[p]}(K) that have been proposed in recent papers (see [20,21,22] and references therein) might be employed here. However, using an exact solver for K_{n,[p]}(K) is not what we have in mind. Indeed, it was discovered experimentally that the PGMRES method converges faster if the linear system with matrix P^{[q,p]}_{N,n}(K) occurring at each PGMRES iteration is solved inexactly. More precisely, when solving the N(q+1) linear systems with matrix K_{n,[p]}(K) occurring at each PGMRES iteration, it is enough to approximate their solutions by performing only a few standard multigrid iterations in order to achieve an excellent PGMRES run-time; and, in fact, only one standard multigrid iteration is sufficient. In view of these experimental discoveries, we propose to solve a linear system with matrix C^{[q,p]}_{N,n}(K) in the following way.
Algorithm 4.1
1. Apply to the given system the PGMRES algorithm with preconditioner P^{[q,p]}_{N,n}(K).
2. The exact solution of the linear system with matrix P^{[q,p]}_{N,n}(K) occurring at each PGMRES iteration would require solving N(q+1) linear systems with matrix K_{n,[p]}(K) as per eq. (3.4).
3. Instead of solving exactly these N(q+1) systems, apply to each of them, starting from the zero vector as initial guess, µ multigrid (V-cycle) iterations involving, at all levels, a single symmetric Gauss–Seidel post-smoothing step and standard bisection for the interpolation and restriction operators (following the Galerkin assembly in which the interpolation operator is the transpose of the restriction operator).
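To make step 3 concrete, here is a minimal sketch (our own illustration, not the PETSc multigrid actually employed) of a V-cycle with a single symmetric Gauss–Seidel post-smoothing step per level, prolongation by linear interpolation on a bisected grid, restriction equal to the transpose of the prolongation, and Galerkin coarse operators. The 1D Laplacian in the usage example is a stand-in for the IgA stiffness matrix K_{n,[p]}(K), and dense linear algebra is used only to keep the sketch short.

```python
# Minimal V-cycle sketch: one symmetric Gauss-Seidel post-smoothing step per level,
# prolongation by linear interpolation on a bisected grid, restriction = P^T,
# Galerkin coarse operators A_c = P^T A P.
import numpy as np

def sym_gauss_seidel(A, x, b):
    L = np.tril(A)                                  # lower triangle incl. diagonal
    U = np.triu(A)                                  # upper triangle incl. diagonal
    x = np.linalg.solve(L, b - (A - L) @ x)         # forward sweep
    x = np.linalg.solve(U, b - (A - U) @ x)         # backward sweep
    return x

def prolongation(n_coarse):
    """Linear interpolation from n_coarse to 2*n_coarse + 1 interior points (1D)."""
    P = np.zeros((2 * n_coarse + 1, n_coarse))
    for j in range(n_coarse):
        P[2*j + 1, j] = 1.0
        P[2*j, j] += 0.5
        P[2*j + 2, j] += 0.5
    return P

def v_cycle(A, b, x, levels):
    if levels == 1 or A.shape[0] < 3:
        return np.linalg.solve(A, b)                # coarsest level: direct solve
    n_coarse = (A.shape[0] - 1) // 2
    P = prolongation(n_coarse)
    A_c = P.T @ A @ P                               # Galerkin coarse operator
    r_c = P.T @ (b - A @ x)                         # restricted residual
    x = x + P @ v_cycle(A_c, r_c, np.zeros(n_coarse), levels - 1)  # coarse correction
    return sym_gauss_seidel(A, x, b)                # single post-smoothing step

if __name__ == "__main__":
    n = 2**6 - 1                                    # interior points on the finest grid
    A = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian stand-in
    b = np.random.rand(n)
    x = np.zeros(n)
    for _ in range(10):
        x = v_cycle(A, b, x, levels=5)
    print(np.linalg.norm(b - A @ x))                # residual should decrease steadily
```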
As we shall see in the numerics of Section 6, the choice µ = 1 yields the best performance of Algorithm 4.1. The proposed solver is not the PGMRES with preconditioner P^{[q,p]}_{N,n}(K), because, at each iteration, the linear system associated with P^{[q,p]}_{N,n}(K) is not solved exactly. However, the solver is still a PGMRES with a different preconditioner P̃^{[q,p]}_{N,n}(K). To see this, let MG be the iteration matrix of the multigrid method used in step 3 of Algorithm 4.1 for solving a linear system with matrix K_{n,[p]}(K). Recall that MG depends only on K_{n,[p]}(K) and not on the specific right-hand side of the system to solve. If the system to solve is K_{n,[p]}(K) x_i = y_i, the approximate solution x̃_i obtained after µ multigrid iterations starting from the zero initial guess is given by

    x̃_i = (I_n̄ − MG^µ) K_{n,[p]}(K)^{−1} y_i.

Hence, the approximation x̃ computed by our solver for the exact solution (3.4) of the system P^{[q,p]}_{N,n}(K) x = y is given by

    x̃ = [ (I_n̄ − MG^µ) K_{n,[p]}(K)^{−1} y_1 ; ... ; (I_n̄ − MG^µ) K_{n,[p]}(K)^{−1} y_{N(q+1)} ]
       = (I_{N(q+1)} ⊗ (I_n̄ − MG^µ) K_{n,[p]}(K)^{−1}) y
       = P̃^{[q,p]}_{N,n}(K)^{−1} y,

where

    P̃^{[q,p]}_{N,n}(K) = I_{N(q+1)} ⊗ K_{n,[p]}(K) (I_n̄ − MG^µ)^{−1}.                    (4.1)
In conclusion, the proposed solver is the PGMRES with preconditioner P̃^{[q,p]}_{N,n}(K). From the expression of P̃^{[q,p]}_{N,n}(K), we can also say that the proposed solver is a MG-GMRES, that is, a PGMRES with preconditioner given by a standard multigrid applied only in space. A more precise notation for this solver could be MG_space-GMRES, but for simplicity we just write MG-GMRES.
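Schematically, the resulting MG-GMRES can be reproduced with any GMRES implementation that accepts an operator preconditioner. The sketch below (illustrative only; the paper's solver is built on PETSc and Utopia) wires a block preconditioner into SciPy's GMRES as a LinearOperator. For brevity the per-block solve is an exact factorization of the diagonal time-slab block; in Algorithm 4.1 it would instead be a single spatial V-cycle applied to K_{n,[p]}(K), which yields the preconditioner P̃ of (4.1). The sizes and matrices below are toy placeholders.

```python
# Sketch of the PGMRES structure: a block preconditioner is passed to GMRES as a
# LinearOperator whose matvec performs one (approximate) solve per time-slab block.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n_bar, n_blocks, dt = 100, 8, 0.5          # toy: spatial DoFs and N(q+1) blocks
K_s = sp.diags([2*np.ones(n_bar), -np.ones(n_bar-1), -np.ones(n_bar-1)], [0, 1, -1], format="csc")
M_s = sp.identity(n_bar, format="csc") / n_bar
A_blk = M_s + dt * K_s                     # q = 0 analogue of (2.3): K^[0]=1, M^[0]=2
B_blk = -M_s                               # q = 0 analogue of (2.4): J^[0]=1
C = sp.kron(sp.identity(n_blocks), A_blk) + sp.kron(sp.diags([np.ones(n_blocks-1)], [-1]), B_blk)

block_solver = spla.splu(A_blk.tocsc())    # stand-in for the approximate block solve

def apply_preconditioner(y):
    x = np.empty_like(y)
    for i in range(n_blocks):              # independent block solves, cf. (3.4)
        x[i*n_bar:(i+1)*n_bar] = block_solver.solve(y[i*n_bar:(i+1)*n_bar])
    return x

M = spla.LinearOperator(C.shape, matvec=apply_preconditioner)
b = np.ones(C.shape[0])
x, info = spla.gmres(C.tocsc(), b, M=M, restart=30)
print("GMRES convergence flag:", info)     # 0 means converged
```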
5 Fast Parallel Solver for the Space-Time IgA-DG System
In Section 4, we have described the sequential version of the proposed solver. The same version is used also in the case where ρ ≤ N(q+1) processors are available, with the only difference that step 3 of Algorithm 4.1 is performed in parallel. In practice, s_i linear systems are assigned to the ith processor for i = 1,...,ρ, with s_1 + ... + s_ρ = N(q+1) and s_1,...,s_ρ approximately equal to each other according to a load balancing principle. This is illustrated in Figure 5.1 (left), which shows the row-wise partition of P^{[q,p]}_{N,n}(K) = I_{N(q+1)} ⊗ K_{n,[p]}(K) corresponding to the distribution of the N(q+1) systems among ρ = N(q+1) − 1 processors.
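For instance, a load-balanced assignment of the N(q+1) spatial systems to ρ ≤ N(q+1) processors can be computed as in the following snippet (an illustration of the load balancing principle only, not the PETSc partitioning code actually used).

```python
# Illustrative load balancing: distribute n_sys = N*(q+1) spatial systems among
# rho <= n_sys processors so that the counts s_1,...,s_rho differ by at most one.
def distribute_systems(n_sys, rho):
    base, extra = divmod(n_sys, rho)
    sizes = [base + 1 if i < extra else base for i in range(rho)]   # s_1,...,s_rho
    owners = [i for i, s in enumerate(sizes) for _ in range(s)]     # owner of each system
    return sizes, owners

sizes, owners = distribute_systems(n_sys=4, rho=3)   # N(q+1) = 4, rho = 3 as in Fig. 5.1 (left)
print(sizes)    # [2, 1, 1]
print(owners)   # [0, 0, 1, 2]
```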
Fig. 5.1: Row-wise partitions of the preconditioner P^{[q,p]}_{N,n}(K) = I_{N(q+1)} ⊗ K̃ using ρ = N(q+1) − 1 processors (left) and ρ = N(q+1) + 1 processors (right), with N(q+1) = 4. For simplicity, we write "K̃" instead of "K_{n,[p]}(K)".

If ρ > N(q+1) processors are available, we use a slight modification of the solver, which is suited for parallel computation. As before, the modification only concerns step 3 of Algorithm 4.1. Since we now have more processors than systems to be solved, after assigning one processor to each system, we still have ρ − N(q+1) unused processors. Following again a load balancing principle, we distribute the unused processors among the N(q+1) systems, so that now one system can be shared between two or more different processors; see Figure 5.1 (right). Suppose that the system K_{n,[p]}(K) x = y is shared between σ processors. The symmetric Gauss–Seidel post-smoothing iteration in step 3 of Algorithm 4.1 cannot be performed in parallel. Therefore, we replace it with its block-wise version. To be precise, we recall that the symmetric Gauss–Seidel iteration for a system with matrix E = L + U − D, where L, U and D denote, respectively, the lower triangular part of E (including the diagonal), the upper triangular part of E (including the diagonal), and the diagonal part of E, is just the preconditioned Richardson iteration with preconditioner M = LD^{−1}U. Its block-wise version, in the case where we consider σ diagonal blocks E_1,...,E_σ of E, is simply the preconditioned Richardson iteration with preconditioner M_1 ⊕ ··· ⊕ M_σ, where M_i is the symmetric Gauss–Seidel preconditioner for E_i and M_1 ⊕ ··· ⊕ M_σ is the block diagonal matrix whose diagonal blocks are M_1,...,M_σ. This block-wise version is suited for parallel computation in the case where σ processors are available.
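The block-wise smoother can be sketched as follows (our own illustration): each of the σ diagonal blocks E_i of E gets its own symmetric Gauss–Seidel preconditioner M_i = L_i D_i^{−1} U_i, and the preconditioned Richardson step x ← x + (M_1 ⊕ ··· ⊕ M_σ)^{−1}(b − Ex) is applied block by block, so that each block can be handled by a different processor. The matrix and block sizes below are arbitrary test values.

```python
# Sketch of the block-wise symmetric Gauss-Seidel preconditioner M_1 (+) ... (+) M_sigma:
# each diagonal block E_i gets M_i = L_i D_i^{-1} U_i, and one preconditioned
# Richardson step x <- x + M^{-1}(b - E x) is applied blockwise.
import numpy as np

def sgs_apply(E_i, r_i):
    """Apply M_i^{-1} r_i with M_i = L_i D_i^{-1} U_i (L_i, U_i include the diagonal)."""
    L = np.tril(E_i)
    U = np.triu(E_i)
    D = np.diag(np.diag(E_i))
    return np.linalg.solve(U, D @ np.linalg.solve(L, r_i))

def block_sgs_richardson(E, b, x, block_sizes):
    r = b - E @ x
    start = 0
    for size in block_sizes:               # each block could live on its own processor
        sl = slice(start, start + size)
        x[sl] += sgs_apply(E[sl, sl], r[sl])
        start += size
    return x

if __name__ == "__main__":
    n = 12
    E = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # SPD test matrix
    b = np.ones(n); x = np.zeros(n)
    for _ in range(50):
        x = block_sgs_richardson(E, b, x, block_sizes=[6, 6])   # sigma = 2 blocks
    print(np.linalg.norm(b - E @ x))
```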
6 Numerical Experiments: Iteration Count, Timing and Scaling
In this section, we illustrate through numerical experiments the performance of the proposed
solver and we compare it to the performance of other benchmark parallel solvers, such as
the PGMRES with block-wise ILU(0) preconditioner.
6.1 Implementation Details
For the numerics of this section, as well as throughout this paper, we used the C++ framework PETSc [5,6] and the domain specific language Utopia [64] for the parallel linear algebra and solvers, and the Cray-MPICH compiler. For the assembly of high-order finite elements, we used the PetIGA package [18]. A parallel tensor-product routine was implemented to assemble space-time matrices. Numerical experiments have been performed on the Cray XC40 nodes of the Piz Daint supercomputer of the Swiss national supercomputing centre (CSCS, https://www.cscs.ch/computers/piz-daint/). The partition used features 1813 compute nodes, each of which holds two 18-core Intel Xeon E5-2695v4 (2.10 GHz) processors. We stress that the PETSc default row-wise partition follows a load balancing principle and, except in the trivial case ρ = N, does not correspond to the row-wise partition described in Section 5; see Figure 6.1. Therefore, the partition must be adjusted by the user. Alternatively, one can use a PETSc built-in class for sparse block matrices and specify the block size (q+1)n̄.
Fig. 6.1: The PETSc default row-wise partition does not account for the structure of the space-time problem; compare with Figure 5.1.
6.2 Experimental Setting
In the numerics of this section, we solve the linear system (2.1) arising from the choices d = 2, f(t,x) = 1, T = 1, n = (n,n), p = (p,p), k = (p−1,p−1). The basis functions ℓ_{1,[q]},...,ℓ_{q+1,[q]} are chosen as the Lagrange polynomials associated with the right Gauss–Radau nodes in [−1,1]. The values of K(x), N, n, q, p are specified in each example. For each solver considered herein, we use the tolerance ε = 10^{−8} and the PETSc default stopping criterion based on the preconditioned relative residual. Moreover, the PGMRES method is always applied with restart after 30 iterations, as per PETSc default. Whenever we report the run-time of a solver, the time spent in I/O operations and matrix assembly is ignored. Run-times are always expressed in seconds. In all the tables below, the number of iterations needed by a given solver to converge within the tolerance ε = 10^{−8} is reported in square brackets next to the corresponding run-time. Throughout this section, we use the following abbreviations for the solvers.
ILU(0)-GMRES
  PGMRES with preconditioner given by an ILU(0) factorization (ILU factorization with no fill-in) of the system matrix.
MG^L_{µ,ν}-GMRES
  The proposed solver, as described in Section 4, with µ multigrid (V-cycle) iterations applied to K_{n,[p]}(K). Each multigrid iteration involves ν symmetric Gauss–Seidel post-smoothing steps at the finest level and 1 symmetric Gauss–Seidel post-smoothing step at the coarse levels. The choice ν = 1 corresponds to our solver proposal. Different values of ν are considered for comparison purposes. The superscript L denotes the number of multigrid levels.
TMG^L_{µ,ν}-GMRES
  The same as MG^L_{µ,ν}-GMRES, with the only difference that the multigrid iterations are performed with the telescopic option, thus giving rise to the telescopic multigrid (TMG) [23,45]. This technique consists in reducing the number of processors used on the coarse levels and can be beneficial for the parallel multigrid performance. In the numerics of this section, we only reduced the number of processors used on the coarsest level, to one fourth of the number of processors used at all other levels.
Table 6.1: PGMRES iterations and run-time (using 64 cores) to solve the linear system (2.1) up to a precision of 10^{−8}, according to the experimental setting described in Section 6.2. We used K(x) = I_2, q = 0, N = 32, n = 259 − p. The total size of the space-time system (number of DoFs) is given by 32 · 257².

p                  1          2          3          4
ILU(0)-GMRES    3.7 [579]  4.3 [367]  5.2 [269]  6.7 [226]
MG^5_{3,2}-GMRES  1.4 [33]   2.9 [33]   4.7 [33]   7.2 [33]
MG^5_{1,2}-GMRES  0.8 [33]   1.6 [33]   2.5 [33]   4.0 [35]
MG^5_{3,1}-GMRES  1.1 [33]   2.2 [33]   3.3 [33]   5.0 [34]
MG^5_{1,1}-GMRES  0.6 [33]   1.2 [33]   1.8 [34]   3.1 [39]

p                  5          6           7           8           9
ILU(0)-GMRES    8.2 [193]  10.1 [174]  11.9 [156]  22.5 [234]  44.9 [383]
MG^5_{3,2}-GMRES 10.5 [35]  14.7 [36]   21.1 [41]   34.6 [53]   57.6 [73]
MG^5_{1,2}-GMRES  6.6 [42]  11.0 [52]   16.0 [60]   26.2 [77]   47.0 [90]
MG^5_{3,1}-GMRES  7.1 [36]  11.4 [43]   17.0 [51]   28.5 [67]   42.1 [83]
MG^5_{1,1}-GMRES  5.3 [50]   9.1 [63]   13.5 [75]   19.8 [87]   30.7 [112]
Table 6.2: PGMRES iterations and run-time (using 64 cores) to solve the linear system (2.1) up to a precision of 10^{−8}, according to the experimental setting described in Section 6.2. We used K(x_1,x_2) = diag(cos(x_1) + x_2, x_1 + sin(x_2)), q = 1, N = 20, n = 131 − p. The total size of the space-time system (number of DoFs) is given by 40 · 129². Note that K(x_1,x_2) is singular at (x_1,x_2) = (0,0).

p                  1          2          3          4
ILU(0)-GMRES    1.3 [449]  1.7 [283]  2.2 [219]  2.9 [183]
MG^5_{2,3}-GMRES  0.6 [55]   1.3 [55]   2.4 [55]   4.1 [58]
MG^5_{1,3}-GMRES  0.5 [57]   1.0 [56]   1.8 [56]   3.5 [68]
MG^5_{2,1}-GMRES  0.5 [57]   1.0 [57]   1.6 [58]   3.1 [77]
MG^5_{1,1}-GMRES  0.5 [67]   0.8 [65]   1.3 [68]   2.8 [90]

p                  5          6           7           8           9
ILU(0)-GMRES    3.6 [158]  4.4 [141]   6.0 [148]   9.5 [186]  24.8 [397]
MG^5_{2,3}-GMRES  7.6 [64]  12.7 [90]  18.5 [101]  32.2 [139]  48.9 [173]
MG^5_{1,3}-GMRES  6.2 [85]  10.4 [103] 15.0 [116]  26.5 [161]  38.0 [189]
MG^5_{2,1}-GMRES  5.2 [91]   8.6 [112] 12.6 [128]  22.0 [179]  30.7 [205]
MG^5_{1,1}-GMRES  4.6 [110]  7.2 [125] 11.0 [150]  19.4 [210]  30.2 [269]
6.3 Iteration Count and Timing
Tables 6.1–6.3 illustrate the performance of the proposed solver in terms of number of iterations and run-time. It is clear from the tables that the best performance of the solver is obtained when applying to K_{n,[p]}(K) a single multigrid iteration (µ = 1) with only one smoothing step at the finest level (ν = 1). Moreover, the solver is competitive with respect to the ILU(0)-GMRES. The worst performance of the solver with respect to the ILU(0)-GMRES is attained in Table 6.2, where the diffusion matrix K(x_1,x_2) is singular at (x_1,x_2) = (0,0).
Table 6.3: PGMRES iterations and run-time (using 64 cores) to solve the linear system (2.1) up to a precision of 10^{−8}, according to the experimental setting described in Section 6.2. We used

    K(x_1,x_2) = [ (2 + cos x_1)(1 + x_2)          cos(x_1 + x_2) sin(x_1 + x_2)
                   cos(x_1 + x_2) sin(x_1 + x_2)   (2 + sin x_2)(1 + x_1) ],

q = 0, N = 20, n = 259 − p. The total size of the space-time system (number of DoFs) is given by 20 · 257².

p                  1          2          3          4
ILU(0)-GMRES    1.9 [450]  2.2 [284]  2.6 [205]  3.4 [170]
MG^5_{2,2}-GMRES  0.2 [11]   0.5 [11]   0.8 [11]   1.5 [13]
MG^5_{1,2}-GMRES  0.2 [12]   0.4 [11]   0.6 [12]   1.2 [15]
MG^5_{2,1}-GMRES  0.2 [11]   0.4 [11]   0.6 [12]   1.1 [15]
MG^5_{1,1}-GMRES  0.2 [12]   0.3 [11]   0.5 [14]   1.0 [19]

p                  5          6          7          8           9
ILU(0)-GMRES    4.4 [154]  5.2 [135]  6.4 [125]  12.6 [195]  22.8 [289]
MG^5_{2,2}-GMRES  2.6 [17]   4.1 [20]   5.9 [23]   8.8 [27]   11.9 [30]
MG^5_{1,2}-GMRES  2.1 [20]   3.3 [23]   4.6 [26]   7.2 [31]   10.1 [36]
MG^5_{2,1}-GMRES  2.0 [20]   3.1 [23]   4.6 [27]   6.2 [31]    8.4 [35]
MG^5_{1,1}-GMRES  1.7 [23]   2.5 [26]   3.6 [30]   5.5 [36]    7.4 [40]
Table 6.4: Strong scaling: PGMRES iterations and run-time to solve the linear system (2.1) up to a precision of 10^{−8}, according to the experimental setting described in Section 6.2. We used K(x) = I_2, q = 0, p = 3, N = 64, n = 384. The total size of the space-time system (number of DoFs) is given by 64 · 385².

Cores                 1             2            4            8
ILU(0)-GMRES      1385.0 [414]  682.1 [415]  336.7 [415]  181.9 [415]
MG^7_{1,1}-GMRES   335.1 [64]   179.7 [64]    92.5 [64]    51.9 [64]
TMG^7_{1,1}-GMRES  335.1 [64]   179.7 [64]    92.5 [64]    51.9 [64]

Cores                16           32           64           128
ILU(0)-GMRES      103.3 [415]  49.7 [416]   21.2 [417]   12.8 [500]
MG^7_{1,1}-GMRES   31.8 [64]   16.6 [64]     8.3 [64]     4.4 [64]
TMG^7_{1,1}-GMRES  31.3 [64]   16.5 [64]     8.0 [64]     4.2 [64]

Cores               256          512         1024         2048         4096
ILU(0)-GMRES       6.8 [519]   4.0 [550]    2.5 [619]    1.7 [753]    1.7 [1013]
MG^7_{1,1}-GMRES   2.5 [65]    1.8 [65]     1.9 [65]     5.1 [65]    14.7 [66]
TMG^7_{1,1}-GMRES  2.2 [64]    1.3 [63]     0.8 [64]     0.5 [64]     0.4 [64]
Table 6.5: Space-time weak scaling: PGMRES iterations and run-time to solve the linear system (2.1) up to a precision of 10^{−8}, according to the experimental setting described in Section 6.2. We used K(x) = I_2, q = 0, p = 2 and (N,n) = (8,65), (16,129), (32,256), (64,512). The ratio DoFs/Cores is constant in the table.

[Cores, n, N, L]    [1,65,8,4]  [8,129,16,5]  [64,257,32,6]  [512,513,64,7]
ILU(0)-GMRES        0.25 [50]   0.86 [121]    2.80 [367]     7.6 [989]
TMG^L_{1,1}-GMRES   0.11 [10]   0.27 [17]     0.67 [33]      1.4 [64]
Fig. 6.2: Graphical representation of the run-times reported in Table 6.4 (run-time in seconds versus number of cores, for ILU, MG and TMG).

Fig. 6.3: Graphical representation of the run-times reported in Table 6.5 (run-time in seconds versus number of DoFs, for ILU and TMG).
6.4 Scaling
In the scaling experiments, besides the multigrid already considered above, we also employ a TMG for performance reasons. To avoid memory bounds, we use at most 16 cores per node. From Table 6.4 and Figure 6.2 we see that the proposed solver, especially when using the TMG option, shows a nearly optimal strong scaling with respect to the number of cores. (We observe a slight reduction in the ideal scaling as the number of cores grows from 2 to 16; this is due to the fact that these runs are performed on a single node with its own limited memory. For more than 16 cores, the memory bound is no longer present, since computations are performed on multiple nodes with increasing memory. When the number of cores exceeds two thousand, communication takes over and scaling is no longer observable.) Table 6.5 and Figure 6.3 illustrate the weak scaling properties of the proposed solver, which possesses a superior parallel efficiency with respect to the standard ILU(0) approach in terms of iteration count and run-time. For both solvers, however, the weak scaling is not ideal (ideal meaning constant run-time). This is due to the fact that N grows from 8 to 64 and both solvers are not robust with respect to N.
7 Non-Rectangular Domain and Non-Trivial Geometry
So far, the performance of the proposed solver has been illustrated for the diffusion problem (1.1) over the hypersquare (0,1)^d. However, no special difficulty arises if (0,1)^d is replaced by a non-rectangular domain Ω described (exactly) by a geometry map G : [0,1]^d → Ω, as per the IgA paradigm. Indeed, as long as a tensor-product structure between space and time is maintained, the geometry map G acts as a reparameterization of Ω through (0,1)^d, and the resulting discretization matrix is still given by (2.2)–(2.9), with the only difference that:
– a factor |det(J_G(x))| should be included in the integrand of (2.5), where J_G(x) is the Jacobian matrix of G(x);
– the matrix K(x) in (2.6) should be replaced by J_G(x)^{−1} K(G(x)) J_G(x)^{−T} |det(J_G(x))|.
In short, a change of domain from (0,1)^d to Ω essentially amounts to a mere change of diffusion matrix from K to J_G^{−1} K(G) J_G^{−T} |det(J_G)|, which does not affect the performance of the proposed solver.
In Table 7.1, we validate the previous claim by testing the solver on the linear system arising from the space-time IgA-DG discretization of (1.1) in the case where (0,1)^d is replaced by a non-rectangular domain Ω described by a non-trivial geometry map G : [0,1]^d → Ω. The experimental setting is the same as in Section 6.2, with the only difference that (0,1)² is now replaced by a quarter of an annulus

    Ω = {x ∈ ℝ² : r² < x_1² + x_2² < R², x_1 > 0, x_2 > 0},   r = 1,   R = 2,        (7.1)

described by the geometry map G : [0,1]² → Ω,

    G(x̂) = ( [r + x̂_1(R − r)] cos((π/2) x̂_2), [r + x̂_1(R − r)] sin((π/2) x̂_2) ),   x̂ ∈ [0,1]².        (7.2)

We remark that the geometry map G is a common benchmark example in IgA; see, e.g., [20,21].

Table 7.1: PGMRES iterations and run-time (using 64 cores) to solve, up to a precision of 10^{−8}, the linear system arising from the space-time IgA-DG discretization of (1.1) in the case where (0,1)^d is replaced by the domain (7.1) described by the geometry map (7.2). The experimental setting is the same as in Section 6.2. We used K(x) = I_2, q = 1, N = 20, n = 131 − p. The total size of the space-time system (number of DoFs) is given by 40 · 129².

p                  1          2          3          4
ILU(0)-GMRES    1.6 [412]  2.2 [354]  4.0 [296]  4.3 [268]
MG^5_{1,1}-GMRES  0.7 [99]   1.3 [91]   2.2 [103]  4.0 [140]

p                  5          6           7           8           9
ILU(0)-GMRES    6.0 [266]  8.0 [257]  16.6 [415]  31.3 [622]  48.2 [775]
MG^5_{1,1}-GMRES  7.2 [178] 11.8 [219] 16.5 [241]  29.6 [348]  39.9 [386]
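For the map (7.2), the Jacobian J_G and the transformed diffusion matrix J_G^{−1} K(G) J_G^{−T} |det J_G| can be written down explicitly; the short sketch below (our own illustration, with K = I_2 as in Table 7.1, not part of the paper's code) evaluates them numerically at a given parametric point.

```python
# Illustrative evaluation of the pulled-back diffusion matrix for the quarter
# annulus map (7.2): K_hat(x_hat) = J_G^{-1} K(G(x_hat)) J_G^{-T} |det J_G(x_hat)|.
import numpy as np

r, R = 1.0, 2.0

def G(x_hat):
    rad = r + x_hat[0] * (R - r)
    ang = 0.5 * np.pi * x_hat[1]
    return np.array([rad * np.cos(ang), rad * np.sin(ang)])

def J_G(x_hat):
    rad = r + x_hat[0] * (R - r)
    ang = 0.5 * np.pi * x_hat[1]
    return np.array([[(R - r) * np.cos(ang), -rad * 0.5 * np.pi * np.sin(ang)],
                     [(R - r) * np.sin(ang),  rad * 0.5 * np.pi * np.cos(ang)]])

def K_pulled_back(x_hat, K=np.eye(2)):
    J = J_G(x_hat)
    Jinv = np.linalg.inv(J)
    return Jinv @ K @ Jinv.T * abs(np.linalg.det(J))

x_hat = np.array([0.3, 0.7])
print(G(x_hat))               # point in the quarter annulus
print(K_pulled_back(x_hat))   # effective diffusion matrix seen on (0,1)^2
```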
8 Conclusions
We have proposed a MG-GMRES solver for the space-time IgA-DG discretization of the
diffusion problem (1.1). Through numerical experiments, we have illustrated the competi-
tiveness of our proposal in terms of iteration count, run-time and parallel scaling. We have
also shown its applicability to more general problems than (1.1) involving a non-rectangular
domain and a non-trivial geometry map G. To conclude, we remark that the proposed
solver is highly flexible as it does not depend on the domain or the space-time discretization.
It could therefore be applied to other space-time discretizations, as long as a tensor-product
structure is maintained between space and time.
Data Availability Statement If requested by the handling editor or the reviewers, the codes used for pro-
ducing the numerical results of this paper will be made publicly available.
Acknowledgements Paola Ferrari is partially financed by the GNCS 2019 Project “Metodi Numerici per
Problemi Mal Posti”. Paola Ferrari, Carlo Garoni and Stefano Serra-Capizzano are grateful to the Ital-
ian INdAM-GNCS for the scientific support. Carlo Garoni acknowledges the MIUR Excellence Depart-
ment Project awarded to the Department of Mathematics of the University of Rome Tor Vergata (CUP
E83C18000100006) and the support obtained by the Beyond Borders Programme of the University of Rome
Tor Vergata through the Project ASTRID (CUP E84I19002250005). Rolf Krause acknowledges the funding
obtained from the European High-Performance Computing Joint Undertaking (JU) under Grant Agreement
N. 955701 (Project TIME-X); the JU receives support from the European Union’s Horizon 2020 Research
and Innovation Programme and from Belgium, France, Germany and Switzerland. Finally, the authors ac-
knowledge the Deutsche Forschungsgemeinschaft (DFG) as part of the “ExaSolvers” Project in the Priority
Programme 1648 “Software for Exascale Computing” (SPPEXA) and the Swiss National Science Foundation
(SNSF) under the lead agency grant agreement SNSF-162199.
References
1. ABEDI R., PETRACOV IC I B., HA BER R.B. A space-time discontinuous Galerkin method for lin-
earized elastodynamics with element-wise momentum balance. Comput. Methods Appl. Mech. Engrg.
195 (2006) 3247–3273.
2. ARBENZ P., HU PP D., OBRIST D. A parallel solver for the time-periodic Navier–Stokes equations. In
“PPAM 2013: Parallel Processing and Applied Mathematics”, Springer (2014), pp. 291–300.
3. AURICCHIO F., BE IR ÃO DA VEI GA L., HUGHES T.J.R., REALI A., SANGALLI G. Isogeometric col-
location methods. Math. Models Methods Appl. Sci. 20 (2010) 2075–2107.
4. AZIZ A.K., MONK P. Continuous finite elements in space and time for the heat equation. Math. Comput.
52 (1989) 255–274.
5. BAL AY S., ABH YANK AR S., ADAM S M. F., BROWN J., BRU NE P., BUSCHELMAN K ., DALCIN L.,
DENER A., EIJKHOUT V., GROP P W.D., KARPEYEV D., KAUS HI K D., KN EP LE Y M.G., MAY D.A.,
CURFMAN MCINNES L., MILL S R. T., MUNSON T., RUPP K., SANA N P., SMITH B.F., ZAMPINI S.,
ZHANG H., ZHAN G H. PETSc web page. https://www.mcs.anl.gov/petsc (2019).
6. BAL AY S., ABH YANK AR S., ADAM S M. F., BROWN J., BRU NE P., BUSCHELMAN K ., DALCIN L.,
DENER A., EIJKHOUT V., GROP P W.D., KARPEYEV D., KAUS HI K D., KN EP LE Y M.G., MAY D.A.,
CURFMAN MCINNES L., MILL S R. T., MUNSON T., RUPP K., SANA N P., SMITH B.F., ZAMPINI S.,
ZHANG H., ZHAN G H. PETSc users manual. Technical Report ANL-95/11 - Revision 3.11, Argonne
National Laboratory (2019).
7. BARBARINO G., G ARONI C., SERRA-CAPIZZANO S. Block generalized locally Toeplitz sequences:
theory and applications in the unidimensional case. Electron. Trans. Numer. Anal. 53 (2020) 28–112.
8. BAR BAR IN O G., GARONI C., SERRA-CAPIZZANO S. Block generalized locally Toeplitz sequences:
theory and applications in the multidimensional case. Electron. Trans. Numer. Anal. 53 (2020) 113–
216.
9. BEIRÃO DA VEI GA L., BUFFA A., SANGALLI G., VÁZQ UE Z R. Mathematical analysis of variational
isogeometric methods. Acta Numerica 23 (2014) 157–287.
10. BENEDUSI P., GA RONI C., KRAUSE R., LIX., SERRA-CAPIZZANO S. Space-time FE-DG discretiza-
tion of the anisotropic diffusion equation in any dimension: the spectral symbol. SIAM J. Matrix Anal.
Appl. 39 (2018) 1383–1420.
11. BENEDUSI P., HU PP D., ARBENZ P., KR AUS E R. A parallel multigrid solver for time-periodic in-
compressible Navier–Stokes equations in 3D. In “Numerical Mathematics and Advanced Applications
ENUMATH 2015”, Springer (2016), pp. 265–273.
12. BENEDUSI P., MINION M., KRAUSE R. An experimental comparison of a space-time multigrid method
with PFASST for a reaction-diffusion problem. In revision.
13. BERTACCI NI D ., DURASTANTE F. Iterative Methods and Preconditioning for Large and Sparse Linear
Systems with Applications. Taylor & Francis, Boca Raton (2018).
14. BERTACCI NI D ., N GM.K. Band-Toeplitz preconditioned GMRES iterations for time-dependent PDEs.
BIT Numer. Math. 43 (2003) 901–914.
15. BETSCH P., STEINMANN P. Conservation properties of a time FE method—part II: time-stepping
schemes for non-linear elastodynamics. Inter. J. Numer. Methods Engrg. 50 (2001) 1931–1955.
16. ČESENEK J., FEISTAUER M. Theory of the space-time discontinuous Galerkin method for nonstationary parabolic problems with nonlinear convection and diffusion. SIAM J. Numer. Anal. 50 (2012) 1181–1206.
17. COT TR EL L J.A., HUGHES T.J.R., BAZILEVS Y. Isogeometric Analysis: Toward Integration of CAD
and FEA. Wiley, Chicester (2009).
18. DALCIN L., COLLIER N., VIGNAL P., CÔRTES A.M.A., CALO V.M. PetIGA: a framework for high performance isogeometric analysis. Comput. Methods Appl. Mech. Engrg. 308 (2016) 151–181.
19. DELFOUR M., H AGER W., TROC HU F. Discontinuous Galerkin methods for ordinary differential equa-
tions. Math. Comput. 36 (1981) 455–473.
20. DONATEL LI M., GARO NI C ., M AN NI C., SER RA -CAPIZZANO S., SP EL EE RS H. Robust and optimal
multi-iterative techniques for IgA Galerkin linear systems. Comput. Methods Appl. Mech. Engrg. 284
(2015) 230–264.
21. DONATEL LI M., GARONI C., M AN NI C ., S ER RA -CAPIZZANO S., SP EL EE RS H. Robust and optimal
multi-iterative techniques for IgA collocation linear systems. Comput. Methods Appl. Mech. Engrg. 284
(2015) 1120–1146.
22. DONATEL LI M., GARONI C., M AN NI C ., S ER RA -CAPIZZANO S., SP EL EE RS H. Symbol-based multi-
grid methods for Galerkin B-spline isogeometric analysis. SIAM J. Numer. Anal. 55 (2017) 31–62.
23. DOUGLAS C.C. A review of numerous parallel multigrid methods. In “Applications on Advanced Ar-
chitecture Computers”, SIAM (1996), pp. 187–202.
24. ERIKSSON K., JOHNSO N C. , LO GG A. Adaptive computational methods for parabolic problems: Part
1. Fundamentals. Encyclop. Comput. Mech. (2004).
25. FALG OU T R.D., FRI ED HO FF S., KOL EV TZ. V., MACLACHL AN S .P., SCHRODER J.B., VAND E-
WALLE S. Multigrid methods with space-time concurrency. Comput. Visual. Sci. 18 (2007) 123–143.
26. FRENCH D.A. A space-time finite element method for the wave equation. Comput. Methods Appl. Mech.
Engrg. 107 (1993) 145–157.
27. GANDER M.J. 50 years of time parallel time integration. Article 3 in “Multiple Shooting and Time
Domain Decomposition Methods”, Springer (2015).
28. GANDER M.J., HALPERN L . Techniques for locally adaptive time stepping developed over the last two
decades. Domain Decomp. Methods Sci. Engrg. XX (2013) 377–385.
29. GANDER M.J., NEUMÜLLER M. Analysis of a new space-time parallel multigrid algorithm for parabolic problems. SIAM J. Sci. Comput. 38 (2016) A2173–A2208.
30. GARON I C., SE RRA-CAPIZZANO S. Generalized Locally Toeplitz Sequences: Theory and Applications.
Volume I, Springer, Cham (2017).
31. GARON I C., SE RRA-CAPIZZANO S. Generalized Locally Toeplitz Sequences: Theory and Applications.
Volume II, Springer, Cham (2018).
32. GRIEBEL M., O ELT Z D. A sparse grid space-time discretization scheme for parabolic problems. Com-
puting 81 (2007) 1–34.
33. HESTHAVEN J.S., WARB URTON T. Nodal Discontinuous Galerkin Methods: Algorithms, Analysis, and
Applications. Springer, New York (2008).
34. HOFER C., LANGER U., NEUMÜLLER M. Parallel and robust preconditioning for space-time isogeometric analysis of parabolic evolution problems. SIAM J. Sci. Comput. 41 (2019) A1793–A1821.
35. HORTON G., VANDEWALLE S. A space-time multigrid method for parabolic partial differential equa-
tions. SIAM J. Sci. Comput. 16 (1995) 848–864.
36. HUGHES T.J.R., COTTR EL L J.A., BAZILEVS Y. Isogeometric analysis: CAD, finite elements, NURBS,
exact geometry and mesh refinement. Comput. Methods Appl. Mech. Engrg. 194 (2005) 4135–4195.
37. HUGHES T.J.R., HULB ERT G.M. Space-time finite element methods for elastodynamics: formulations
and error estimates. Comput. Methods Appl. Mech. Engrg. 66 (1988) 339–363.
38. KLAIJ C.M., VAN DER VE GT J.J.W., VAN DER VE N H. Space-time discontinuous Galerkin method for
the compressible Navier–Stokes equations. J. Comput. Phys. 217 (2006) 589–611.
39. KRAUSE D., KRAUSE R. Enabling local time stepping in the parallel implicit solution of reaction-
diffusion equations via space-time finite elements on shallow tree meshes. Appl. Math. Comput. 277
(2016) 164–179.
40. LADYŽENSKAJA O.A., SOLONNIKOV V.A., URALCEVA N.N. Linear and Quasi-Linear Equations of Parabolic Type. Amer. Math. Soc. (1968).
41. LANGER U., MOORE S.E., NEUMÜLLER M. Space-time isogeometric analysis of parabolic evolution problems. Comput. Methods Appl. Mech. Engrg. 306 (2016) 342–363.
42. LANGER U., ZANK M. Efficient direct space-time finite element solvers for parabolic initial-boundary
value problems in anisotropic Sobolev spaces. arXiv:2008.01996 (2020).
43. LASAINT P., RAVI ART P.A. On a finite element method for solving the neutron transport equation. In
“Mathematical Aspects of Finite Elements in Partial Differential Equations”, Academic Press (1974),
pp. 89–123.
44. LOLI G., MONTARDINI M., SANGALLI G., TAN I M. An efficient solver for space-time isogeometric
Galerkin methods for parabolic problems. Comput. Math. Appl. 80 (2020) 2586–2603.
45. MAY D.A., SANA N P., RUP P K., KN EP LE Y M.G., SMITH B.F. Extreme-scale multigrid components
within PETSc. Article 5 in “Proceedings of the Platform for Advanced Scientific Computing Confer-
ence”, ACM (2016).
46. MCDONAL D E., WATHEN A. A simple proposal for parallel computation over time of an evolutionary
process with implicit time stepping. In “Numerical Mathematics and Advanced Applications ENUMATH
2015”, Springer (2016), pp. 285–293.
47. MEIDNER D., V EX LE R B. Adaptive space-time finite element methods for parabolic optimization prob-
lems. SIAM J. Control Optim. 46 (2007) 116–142.
48. MILLER S.T., HAB ER R.B. A spacetime discontinuous Galerkin method for hyperbolic heat conduction.
Comput. Methods Appl. Mech. Engrg. 198 (2008) 194–209.
49. NEUMÜLLER M., STEINBACH O. Refinement of flexible space-time finite element meshes and discontinuous Galerkin methods. Comput. Visual. Sci. 14 (2011) 189–205.
50. QUARTERON I A. Numerical Models for Differential Problems. Springer, Milan (2009).
51. SCHÖTZAU D., SCHWAB C. An hp a priori error analysis of the DG time-stepping method for initial value problems. Calcolo 37 (2000) 207–232.
52. SERRA-CAPIZZANO S. Generalized locally Toeplitz sequences: spectral analysis and applications to
discretized partial differential equations. Linear Algebra Appl. 366 (2003) 371–402.
53. SERRA-CAPIZZANO S. The GLT class as a generalized Fourier analysis and applications. Linear Al-
gebra Appl. 419 (2006) 180–233.
54. SHAKIB F., HUGHES T.J.R., ZDENĚK J. A new finite element formulation for computational fluid dynamics: X. The compressible Euler and Navier–Stokes equations. Comput. Methods Appl. Mech. Engrg. 89 (1991) 141–219.
55. STEINBAC H O., YANG H. Comparison of algebraic multigrid methods for an adaptive space-time finite
element discretization of the heat equation in 3D and 4D. Numer. Linear Algebra Appl. 25(2018) e2143.
56. STEINBAC H O., YANG H. Space-time finite element methods for parabolic evolution equations: dis-
cretization, a posteriori error estimation, adaptivity and solution. In “Space-Time Methods: Applica-
tions to Partial Differential Equations”, Radon Series on Computational and Applied Mathematics 25
(2019), pp. 207–248.
57. SUDIRHAM J.J., VAN DE R VEG T J.J.W., VAN DAMME R.M.J. Space-time discontinuous Galerkin
method for advection-diffusion problems on time-dependent domains. Appl. Numer. Math. 56 (2006)
1491–1518.
58. TEZDUYAR T.E., BEH R M., LIOU J. A new strategy for finite element computations involving moving
boundaries and interfaces—The deforming-spatial-domain/space-time procedure: I. The concept and the
preliminary numerical tests. Comput. Methods Appl. Mech. Engrg. 94 (1992) 339–351.
59. TEZDUYAR T.E., BEH R M., MI TTAL S ., L IOU J. A new strategy for finite element computations involv-
ing moving boundaries and interfaces—The deforming-spatial-domain/space-time procedure: II. Com-
putation of free-surface flows, two-liquid flows, and flows with drifting cylinders. Comput. Methods
Appl. Mech. Engrg. 94 (1992) 353–371.
60. TEZDUYAR T.E., SATHE S., KEE DY R., ST EI N K. Space-time finite element techniques for computation
of fluid-structure interactions. Comput. Methods Appl. Mech. Engrg. 195 (2006) 2002–2027.
61. THITE S. Adaptive spacetime meshing for discontinuous Galerkin methods. Comput. Geom. 42 (2009)
20–44.
62. THOMÉE V. Galerkin Finite Element Methods for Parabolic Problems. Springer, New York (2006).
63. VAN DER VEGT J.J.W., VAN DER VEN H. Space-time discontinuous Galerkin finite element method with dynamic grid motion for inviscid compressible flows: I. General formulation. J. Comput. Phys. 182 (2002) 546–585.
64. ZULIAN P., KOPANIČÁKOVÁ A., NESTOLA M.C.G., FINK A., FADEL N., MAGRI V., SCHNEIDER T., BOTTER E., MANKAU J. Utopia: a C++ embedded domain specific language for scientific computing. Git repository. https://bitbucket.org/zulianp/utopia (2016).
Article
The aim of this work is to compare algebraic multigrid (AMG) preconditioned GMRES methods for solving the nonsymmetric and positive definite linear systems of algebraic equations that arise from a space–time finite-element discretization of the heat equation in 3D and 4D space–time domains. The finite-element discretization is based on a Galerkin–Petrov variational formulation employing piecewise linear finite elements simultaneously in space and time. We focus on a performance comparison of conventional and modern AMG methods for such finite-element equations, as well as robustness with respect to the mesh discretization and the heat capacity constant. We discuss different coarsening and interpolation strategies in the AMG methods for coarse-grid selection and coarse-grid matrix construction. Further, we compare AMG performance for the space–time finite-element discretization on both uniform and adaptive meshes consisting of tetrahedra and pentachora in 3D and 4D, respectively. The mesh adaptivity occurring in space and time is guided by a residual-based a posteriori error estimation.