On Bernoulli matrix polynomials and matrix
exponential approximation
E. Defez^a, J. Ibáñez^b, P. Alonso-Jordá^c, José M. Alonso^b, J. Peinado^c
Universitat Politècnica de València, Camino de Vera s/n, 46022, Valencia, Spain
^a Instituto de Matemática Multidisciplinar
^b Instituto de Instrumentación para Imagen Molecular
^c Department of Information Systems and Computation
Abstract
We present in this paper a new method, based on Bernoulli matrix polynomials, to approximate the exponential of a matrix. The method has given rise to two new algorithms whose efficiency and accuracy are compared with those of the most efficient implementations currently available. To that end, a state-of-the-art battery of test matrices, which allows the strengths and weaknesses of each method to be explored in depth, has been used. Since the new algorithms proposed here make intensive use of matrix products, we also provide a GPU-based implementation that achieves high performance thanks to the optimized matrix multiplication routines available on these devices.
Keywords:
Bernoulli matrix approximation, Matrix exponential function, GPU computing
1. Introduction
The computation of matrix functions has received considerable attention in recent years because of its many applications in different areas of science and technology. Among all matrix functions, the matrix exponential $e^A$, $A \in \mathbb{C}^{r\times r}$, stands out, due both to its applications in the solution of systems of differential equations and in graph theory, and to the difficulties involved in its computation; see [1–5].

Among the methods proposed for the approximate computation of the matrix exponential, two fundamental families stand out: those based on rational Padé approximations [6–10], and those based on polynomial approximations, which are either Taylor series expansions [11–13] or series expansions of Hermite matrix polynomials [14, 15]. In general, polynomial approximations have proved to
*Corresponding author
Email addresses: edefez@imm.upv.es (E. Defez), jjibanez@dsic.upv.es (J. Ibáñez), palonso@upv.es (P. Alonso-Jordá), jmalonso@dsic.upv.es (José M. Alonso), jpeinado@dsic.upv.es (J. Peinado)
Preprint submitted to Journal of Computational and Applied Mathematics, November 11, 2021
be more efficient than the Padé algorithm in tests because they are more accurate, despite a slightly higher cost in some cases. All these methods use the basic scaling-and-squaring property, based on the relationship
$$e^A = \left(e^{A/2^s}\right)^{2^s}.$$
Thus, if $P_m(A)$ is a matrix polynomial approximation of $e^A$, then, given a matrix $A$ and a scaling factor $s$, $P_m(A/2^s)$ is an approximation to $e^{A/2^s}$ and
$$e^A \approx \left(P_m(A/2^s)\right)^{2^s}. \quad (1)$$
Bernoulli polynomials and Bernoulli numbers have been extensively used in several areas of mathematics, such as number theory, and they appear in many mathematical formulas, such as the remainder term of the Euler–Maclaurin quadrature rule [16, p. 63], the Taylor series expansions of the trigonometric functions $\tan(x)$, $\csc(x)$ and $\cot(x)$ [16, pp. 116–117], and the Taylor series expansion of the hyperbolic function $\tanh(x)$ [16, p. 125]. They are also employed in the well-known exact expression for the even values of the Riemann zeta function:
$$\zeta(2k) = \sum_{n\ge 1} \frac{1}{n^{2k}} = \frac{(-1)^{k-1}(2\pi)^{2k} B_{2k}}{2\,(2k)!}, \qquad k \ge 1.$$
Moreover, they are even used for solving initial value problems [17], boundary value problems [18, 19], high-order linear and nonlinear Fredholm and Volterra integro-differential equations [20, 21], complex differential equations [22] and partial differential equations [23–25]. An excellent survey of Bernoulli polynomials and their applications can be found in [26]. Expansions of functions in series of Bernoulli polynomials have been studied in [27, 28].
In this paper, we present a new series expansion of the matrix exponential in terms of Bernoulli matrix polynomials, and we show that the resulting polynomial approximations of the matrix exponential are, in most cases, more accurate and less computationally expensive than those based on Padé approximants. We also verify that this new method based on Bernoulli matrix polynomials is a competitive method for approximating the matrix exponential, with a computational cost similar to that of the Taylor method but, generally, higher accuracy.
The organization of the paper is as follows. Section 2 is devoted to Bernoulli polynomials. We show how to obtain a series expansion of the matrix exponential in terms of Bernoulli matrix polynomials and how to approximate the exponential of a matrix. The following section describes the proposed algorithms. Tests and comparisons are presented in Section 4. We close the document with some concluding remarks.
Notation
Throughout this paper, we denote by $\mathbb{C}^{r\times r}$ the set of all complex square matrices of size $r$ and by $I$ the identity matrix. A polynomial of degree $m$ means an expression of the form $P_m(t) = a_m t^m + a_{m-1} t^{m-1} + \cdots + a_1 t + a_0$, where $t$ is a real variable and $a_j$, for $0 \le j \le m$, are complex numbers. In this way, we can define the matrix polynomial $P_m(B)$ for $B \in \mathbb{C}^{r\times r}$ as $P_m(B) = a_m B^m + a_{m-1} B^{m-1} + \cdots + a_1 B + a_0 I$. Throughout this paper, we denote by $I_n$ (or $I$) and $0_{n\times n}$ the identity matrix and the null matrix of order $n$, respectively. With $\lceil x \rceil$ we denote the result of rounding $x$ to the nearest integer greater than or equal to $x$, and $\lfloor x \rfloor$ denotes the result of rounding $x$ to the nearest integer less than or equal to $x$. As usual, the matrix norm $\|\cdot\|$ denotes any subordinate matrix norm; in particular, $\|\cdot\|_1$ is the usual 1-norm. Finally, if $A(k,m)$ are matrices in $\mathbb{C}^{n\times n}$ for $m \ge 0$, $k \ge 0$, from [29] it follows that
$$\sum_{m\ge 0} \sum_{k\ge 0} A(k, m) = \sum_{m\ge 0} \sum_{k=0}^{m} A(k, m-k). \quad (2)$$
2. On Bernoulli matrix polynomials
The Bernoulli polynomials $B_n(x)$ are defined in [16, p. 588] as the coefficients of the generating function
$$g(x,t) = \frac{t\,e^{tx}}{e^t - 1} = \sum_{n\ge 0} \frac{B_n(x)}{n!}\, t^n, \qquad |t| < 2\pi, \quad (3)$$
where $g(x,t)$ is a holomorphic function of the variable $t$ in $\mathbb{C}$ (it has a removable singularity at $t = 0$). The Bernoulli polynomials $B_n(x)$ have the explicit expression
$$B_n(x) = \sum_{k=0}^{n} \binom{n}{k} B_k\, x^{n-k}, \quad (4)$$
where the Bernoulli numbers are defined by $B_n = B_n(0)$. Therefore, it follows that the Bernoulli numbers satisfy
$$B_0 = 1, \qquad B_k = -\sum_{i=0}^{k-1} \binom{k}{i} \frac{B_i}{k+1-i}, \quad k \ge 1. \quad (5)$$
Note that $B_3 = B_5 = \cdots = B_{2k+1} = 0$ for $k \ge 1$. For a matrix $A \in \mathbb{C}^{r\times r}$, we define the $m$th Bernoulli matrix polynomial by the expression
$$B_m(A) = \sum_{k=0}^{m} \binom{m}{k} B_k\, A^{m-k}. \quad (6)$$
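Recurrence (5) translates directly into exact rational arithmetic; the following Python sketch (an illustration, not part of the authors' software) reproduces the first Bernoulli numbers and the vanishing of the odd-index ones:

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(kmax):
    # B_0 = 1;  B_k = -sum_{i=0}^{k-1} C(k,i) B_i / (k+1-i)   (recurrence (5))
    B = [Fraction(1)]
    for k in range(1, kmax + 1):
        B.append(-sum(Fraction(comb(k, i), k + 1 - i) * B[i] for i in range(k)))
    return B

B = bernoulli_numbers(8)
# B_1 = -1/2, B_2 = 1/6, B_4 = -1/30, B_6 = 1/42; B_3 = B_5 = B_7 = 0
```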
Thus, we can now calculate the exact value of $e^{At}\,\frac{t}{e^t-1}$, where $A \in \mathbb{C}^{r\times r}$. By using (3) and (6), one gets
$$e^{At}\,\frac{t}{e^t-1} = \left(\sum_{n\ge 0} \frac{A^n}{n!}\, t^n\right)\left(\sum_{k\ge 0} \frac{B_k}{k!}\, t^k\right) = \sum_{n\ge 0}\sum_{k\ge 0} \frac{A^n B_k}{n!\,k!}\, t^n t^k.$$
Taking $A(k,n) = \frac{A^n B_k\, t^n t^k}{n!\,k!}$ in (2), we have
$$e^{At}\,\frac{t}{e^t-1} = \sum_{n\ge 0} \sum_{k=0}^{n} \frac{B_k}{k!}\, t^k\, \frac{A^{n-k}}{(n-k)!}\, t^{n-k} = \sum_{n\ge 0} \left( \sum_{k=0}^{n} \binom{n}{k} B_k\, A^{n-k} \right) \frac{t^n}{n!} = \sum_{n\ge 0} B_n(A)\, \frac{t^n}{n!},$$
where $B_n(A)$ is the $n$th Bernoulli matrix polynomial defined in (6). In this way, we can use the series expansion
$$e^{At} = \frac{e^t - 1}{t} \sum_{n\ge 0} B_n(A)\, \frac{t^n}{n!}, \qquad |t| < 2\pi, \quad (7)$$
to obtain approximations of the matrix exponential. To do this, let $s$ be the scaling parameter (to be determined) of the matrix $A$ and take $t = 1$ in (7). We use the matrix exponential approximation
$$P_m(A/2^s) = (e-1) \sum_{n=0}^{m} \frac{B_n(A/2^s)}{n!}. \quad (8)$$
Approximation (8) has the drawback that it is not expressed explicitly in terms of powers of the matrix $A/2^s$. This explicit relationship is provided below.

Lemma 1. Given expression (8), we get
$$\frac{1}{e-1}\, P_m(A/2^s) = \sum_{n=0}^{m} \frac{B_n(A/2^s)}{n!} = \sum_{i=0}^{m} \alpha_i^{(m)} (A/2^s)^i, \quad (9)$$
where $\alpha_i^{(m)} = \displaystyle\sum_{k=i}^{m} \binom{k}{k-i} \frac{B_{k-i}}{k!}$.
Proof: For $m = 0$ and $m = 1$, formula (9) trivially holds. From (8) we have that, for $m = 2$ and using (6), one gets
$$\frac{1}{e-1}\, P_2(A/2^s) = \sum_{n=0}^{2} \frac{B_n(A/2^s)}{n!} = \frac{1}{0!} B_0(A/2^s) + \frac{1}{1!} B_1(A/2^s) + \frac{1}{2!} B_2(A/2^s)$$
$$= \frac{1}{0!} \left( \sum_{k=0}^{0} \binom{0}{k} B_k (A/2^s)^{0-k} \right) + \frac{1}{1!} \left( \sum_{k=0}^{1} \binom{1}{k} B_k (A/2^s)^{1-k} \right) + \frac{1}{2!} \left( \sum_{k=0}^{2} \binom{2}{k} B_k (A/2^s)^{2-k} \right)$$
$$= \left( \sum_{k=0}^{2} \frac{1}{k!} \binom{k}{k} B_k \right) (A/2^s)^0 + \left( \sum_{k=1}^{2} \frac{1}{k!} \binom{k}{k-1} B_{k-1} \right) (A/2^s)^1 + \left( \sum_{k=2}^{2} \frac{1}{k!} \binom{k}{k-2} B_{k-2} \right) (A/2^s)^2$$
$$= \alpha_0^{(2)} (A/2^s)^0 + \alpha_1^{(2)} (A/2^s)^1 + \alpha_2^{(2)} (A/2^s)^2 = \sum_{i=0}^{2} \alpha_i^{(2)} (A/2^s)^i,$$
thus formula (9) is true for $m = 2$. The case $m = 3$ is obtained in exactly the same way: expanding the four terms $\frac{1}{n!} B_n(A/2^s)$, $n = 0, \ldots, 3$, with (6) and regrouping by powers of $A/2^s$ yields $\sum_{i=0}^{3} \alpha_i^{(3)} (A/2^s)^i$, so formula (9) is also true for $m = 3$.

We now use an induction argument. Suppose that formula (9) is valid for $m$ and let us prove it for $m+1$. Using (6) together with the induction hypothesis, one gets
$$\frac{1}{e-1}\, P_{m+1}(A/2^s) = \sum_{n=0}^{m+1} \frac{B_n(A/2^s)}{n!} = \sum_{n=0}^{m} \frac{B_n(A/2^s)}{n!} + \frac{1}{(m+1)!}\, B_{m+1}(A/2^s)$$
$$= \sum_{i=0}^{m} \alpha_i^{(m)} (A/2^s)^i + \frac{1}{(m+1)!} \sum_{k=0}^{m+1} \binom{m+1}{k} B_k (A/2^s)^{m+1-k}$$
$$= \left( \alpha_0^{(m)} + \binom{m+1}{m+1} \frac{B_{m+1}}{(m+1)!} \right) (A/2^s)^0 + \cdots + \left( \alpha_m^{(m)} + \binom{m+1}{1} \frac{B_1}{(m+1)!} \right) (A/2^s)^m + \binom{m+1}{0} \frac{B_0}{(m+1)!} (A/2^s)^{m+1}$$
$$= \left( \sum_{k=0}^{m+1} \binom{k}{k} \frac{B_k}{k!} \right) (A/2^s)^0 + \left( \sum_{k=1}^{m+1} \binom{k}{k-1} \frac{B_{k-1}}{k!} \right) (A/2^s)^1 + \cdots + \left( \sum_{k=m+1}^{m+1} \binom{k}{k-(m+1)} \frac{B_{k-(m+1)}}{k!} \right) (A/2^s)^{m+1}$$
$$= \sum_{i=0}^{m+1} \alpha_i^{(m+1)} (A/2^s)^i. \qquad \square$$
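Lemma 1 can be checked numerically in the scalar case; the following Python sketch (exact arithmetic, illustrative only) verifies that the explicit power form with the coefficients $\alpha_i^{(m)}$ agrees with the sum of Bernoulli polynomials in (9):

```python
from fractions import Fraction
from math import comb, factorial

def bernoulli_numbers(kmax):
    # Recurrence (5): B_0 = 1; B_k = -sum_{i=0}^{k-1} C(k,i) B_i / (k+1-i)
    B = [Fraction(1)]
    for k in range(1, kmax + 1):
        B.append(-sum(Fraction(comb(k, i), k + 1 - i) * B[i] for i in range(k)))
    return B

m = 6
x = Fraction(3, 7)          # an arbitrary rational evaluation point
B = bernoulli_numbers(m)

# Explicit power form of (9): coefficients alpha_i^(m) of Lemma 1
alpha = [sum(Fraction(comb(k, k - i)) * B[k - i] / factorial(k)
             for k in range(i, m + 1)) for i in range(m + 1)]
lhs = sum(alpha[i] * x**i for i in range(m + 1))

# Direct form: sum of scalar Bernoulli polynomials B_n(x)/n!, using (4)
def bernoulli_poly(n):
    return sum(comb(n, k) * B[k] * x**(n - k) for k in range(n + 1))

rhs = sum(bernoulli_poly(n) / factorial(n) for n in range(m + 1))
# lhs == rhs exactly, as Lemma 1 asserts
```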
3. The proposed algorithms
The matrix polynomial $P_m(A/2^s)$ from (8) can be computed efficiently in terms of matrix products, using values of $m$ in the set
$$m_k \in \{2, 4, 6, 9, 12, 16, 20, 25, 30, \ldots\}, \qquad k = 1, 2, \ldots,$$
by means of the Paterson–Stockmeyer method [30]. If we consider $P_{m_k}(A)$, then:
$$P_{m_k}(A) = \Big( \big( \big( p_{m_k} A^q + p_{m_k-1} A^{q-1} + p_{m_k-2} A^{q-2} + \cdots + p_{m_k-q+1} A + p_{m_k-q} I \big) A^q$$
$$+\; p_{m_k-q-1} A^{q-1} + p_{m_k-q-2} A^{q-2} + \cdots + p_{m_k-2q+1} A + p_{m_k-2q} I \big) A^q$$
$$+\; p_{m_k-2q-1} A^{q-1} + p_{m_k-2q-2} A^{q-2} + \cdots + p_{m_k-3q+1} A + p_{m_k-3q} I \Big) A^q \cdots$$
$$+\; p_{q-1} A^{q-1} + p_{q-2} A^{q-2} + \cdots + p_1 A + p_0 I, \quad (10)$$
where $q = \lceil \sqrt{m_k} \rceil$ or $q = \lfloor \sqrt{m_k} \rfloor$. Taking into account Table 4.1 from [4, pp. 74], the computational cost of (10) in terms of matrix products is $k$. To obtain the exponential of matrix $A$ with enough precision and efficiency, it is necessary to determine the values of $m$ and $s$ in expression (8). Once these values have been determined, approximation (1) is used to compute $e^A$ by means of Bernoulli matrix polynomials.
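The regrouping in (10) can be made concrete; the following Python sketch (naive matrix products in place of optimized BLAS calls, generic coefficients `p`) evaluates a matrix polynomial by the Paterson–Stockmeyer scheme:

```python
import math

def matmul(X, Y):
    # naive product of two square matrices given as lists of rows
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def ps_eval(p, A):
    # Evaluate P(A) = sum_i p[i] A^i by the Paterson-Stockmeyer regrouping (10)
    n, m = len(A), len(p) - 1
    q = math.ceil(math.sqrt(m))
    I = [[float(i == j) for j in range(n)] for i in range(n)]
    # Precompute A^0 .. A^q once; every coefficient group reuses these powers
    pow_A = [I, A]
    for _ in range(2, q + 1):
        pow_A.append(matmul(pow_A[-1], A))
    # Horner scheme in A^q over coefficient groups of size q
    B = [[0.0] * n for _ in range(n)]
    for i in range(m, -1, -1):
        # accumulate p[i] * A^(i mod q)
        for r in range(n):
            for c in range(n):
                B[r][c] += p[i] * pow_A[i % q][r][c]
        if i % q == 0 and i > 0:
            B = matmul(B, pow_A[q])      # one extra product per group
    return B
```

Only the powers $A^2, \ldots, A^q$ plus one product per coefficient group are needed, which is the source of the cost $k$ quoted above.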
For obtaining the optimal order of the series expansion $m_k$ and the scaling parameter $s$, we have used an error analysis similar to that in [12]. Let
$$A_m = \left\{ X \in \mathbb{C}^{n\times n} : \rho\left(e^{-X}\, T_m(X) - I\right) < 1 \right\},$$
where $\rho(\cdot)$ is the spectral radius of a matrix and $T_m(X)$ is the Taylor approximation of order $m$ of the matrix exponential.

The backward error for computing $e^A$ can be defined as the matrix $\Delta A$ such that $e^{A+\Delta A} = T_m(A)$. It can be verified that
$$\Delta A \approx \sum_{k\ge m+1} c_k^{(m)} A^k,$$
where $\sum_{k\ge m+1} c_k^{(m)} A^k$ is the backward error of the matrix exponential for the Taylor approximation. The absolute backward error can be bounded as follows:
$$E_{ab}(A) = \|\Delta A\| \approx \left\| \sum_{k\ge m+1} c_k^{(m)} A^k \right\| \le \sum_{k\ge m+1} \left|c_k^{(m)}\right| \left( \|A^k\|^{1/k} \right)^k \le \sum_{k\ge m+1} \left|c_k^{(m)}\right| \beta_m^k, \quad (11)$$
where $\beta_m = \max\left\{ \|A^k\|^{1/k},\ k \ge m+1 \right\}$. Theorem 2 from [12] shows that $\beta_m = \|A^{k_0}\|^{1/k_0}$, where $m+1 \le k_0 \le 2m+1$, and that this value can be approximated by means of
$$\beta_m \approx \max\left\{ a_{m+1}^{1/(m+1)},\ a_{m+2}^{1/(m+2)} \right\},$$
where $a_{m+1}$ and $a_{m+2}$ are the 1-norm estimations of $\|A^{m+1}\|$ and $\|A^{m+2}\|$, respectively, obtained using the block 1-norm estimation algorithm of Higham and Tisseur [31]. Let
$$\Theta_m^{(ab)} = \max\left\{ \theta \ge 0 : \sum_{k\ge m+1} \left|c_k^{(m)}\right| \theta^k \le u \right\}, \quad (12)$$
where $u = 2^{-53}$ is the unit roundoff in IEEE double precision floating-point arithmetic. If an integer $s \ge 0$ verifies $2^{-s}\beta_m < \Theta_m^{(ab)}$, then
$$E_{ab}(A/2^s) \le \sum_{k\ge m+1} \left|c_k^{(m)}\right| \left(\Theta_m^{(ab)}\right)^k < u,$$
and $m$ and $s$ will be taken, respectively, as the adequate order of the polynomial and the scaling parameter.
Using the above reasoning, the relative backward error can be bounded in the following way:
$$E_{rb}(A) = \frac{\|\Delta A\|}{\|A\|} \approx \frac{\left\| \sum_{k\ge m+1} c_k^{(m)} A^k \right\|}{\|A\|} \le \sum_{k\ge m+1} \left|c_k^{(m)}\right| \frac{\|A^k\|}{\|A\|} = \sum_{k\ge m} \left|c_{k+1}^{(m)}\right| \left( \|A^k\|^{1/k} \right)^k \le \sum_{k\ge m} \left|c_{k+1}^{(m)}\right| \beta_m^k. \quad (13)$$
Let
$$\Theta_m^{(rb)} = \max\left\{ \theta \ge 0 : \sum_{k\ge m} \left|c_{k+1}^{(m)}\right| \theta^k \le u \right\}. \quad (14)$$
Therefore, if an integer $s \ge 0$ satisfies $2^{-s}\beta_m < \Theta_m^{(rb)}$, then
$$E_{rb}(A/2^s) \le \sum_{k\ge m} \left|c_{k+1}^{(m)}\right| \left(\Theta_m^{(rb)}\right)^k < u,$$
and the polynomial order $m$ and the scaling parameter $s$ will have been obtained.
The $\Theta_m^{(ab)}$ and $\Theta_m^{(rb)}$ parameters were worked out, with the required precision, by using symbolic computations from $m = 2$ to $m = 30$. Then, the maximum value $\Theta_m$ between $\Theta_m^{(ab)}$ and $\Theta_m^{(rb)}$ was computed for each $m$, i.e. $\Theta_m = \max(\Theta_m^{(ab)}, \Theta_m^{(rb)})$, giving rise to the $\Theta_m$ parameter included in the second column of Table 2 from [12]. As a result, $\Theta_m$ equals $\Theta_m^{(ab)}$ when $m \le 16$ and $\Theta_m$ matches $\Theta_m^{(rb)}$ otherwise. In other words, the absolute backward error was considered when the polynomial order is less than or equal to 16, and the relative backward error was taken into account for greater values.
Algorithm 1 computes the matrix exponential function based on the Bernoulli series and the Paterson–Stockmeyer method. Step 1 of this algorithm uses the procedure previously described to obtain the values of $m$ and $s$ (a more detailed description can be found in [12]). In step 2, the Bernoulli approximation is employed, depending on the value of $m$ calculated in step 1. Finally, in steps 3–5, the matrix exponential is recovered.
Algorithm 1 Scaling and squaring Bernoulli algorithm for computing $B = e^A$, where $A \in \mathbb{C}^{r\times r}$ and $m_M$ is the maximum approximation order allowed.
1: Choose adequate order $m_k \le m_M$ and scaling parameter $s \in \mathbb{N} \cup \{0\}$
2: $B = P_{m_k}(A/2^s)$ by using (10) ($P_{m_k}(\cdot)$ Bernoulli matrix polynomial)
3: for $i = 1:s$ do   ▷ Recovering the matrix exponential
4:   $B = B^2$
5: end for
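The overall flow of Algorithm 1 can be sketched as follows (a simplified Python illustration: an order-$m$ Taylor-style polynomial stands in for $P_{m_k}$ in (10), and the order/scaling choice of step 1 is reduced to a 1-norm test against a placeholder `theta`, instead of the $\Theta_m$ analysis of [12]):

```python
import math

def expm_sketch(A, m=12, theta=1.0):
    # A is a square matrix given as a list of rows.
    # Step 1 (simplified): choose s so that ||A/2^s||_1 <= theta
    n = len(A)
    norm1 = max(sum(abs(A[i][j]) for i in range(n)) for j in range(n))
    s = max(0, math.ceil(math.log2(norm1 / theta))) if norm1 > theta else 0
    As = [[A[i][j] / 2**s for j in range(n)] for i in range(n)]
    # Step 2 (stand-in): order-m polynomial of the scaled matrix
    B = [[float(i == j) for j in range(n)] for i in range(n)]   # identity
    T = [row[:] for row in B]
    for k in range(1, m + 1):
        T = [[sum(T[i][l] * As[l][j] for l in range(n)) / k for j in range(n)]
             for i in range(n)]                                  # T = As^k / k!
        B = [[B[i][j] + T[i][j] for j in range(n)] for i in range(n)]
    # Steps 3-5: recover e^A by repeated squaring
    for _ in range(s):
        B = [[sum(B[i][l] * B[l][j] for l in range(n)) for j in range(n)]
             for i in range(n)]
    return B
```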
Bearing in mind (9), if $P_m(A/2^s)$ represents the matrix polynomial of order $m$ corresponding to the Bernoulli approximation of the exponential of the matrix $A/2^s$, and $T_m(A/2^s)$ denotes the same matrix polynomial but according to the Taylor approach, then:
$$P_m(A/2^s) = (e-1) \sum_{i=0}^{m} \alpha_i^{(m)} (A/2^s)^i = \sum_{i=0}^{m} b_i^{(m)} (A/2^s)^i, \quad (15)$$
$$T_m(A/2^s) = \sum_{i=0}^{m} \frac{1}{i!} (A/2^s)^i = \sum_{i=0}^{m} t_i (A/2^s)^i.$$
Table 1: Approximation polynomial coefficient vector differences between Bernoulli (b) and
Taylor (t) methods.
m ||b-t|| ||b-t||/||t||
2 5.023311e-01 2.009324e-01
4 5.695696e-02 2.103026e-02
6 2.741618e-03 1.008669e-03
9 1.293850e-05 4.759808e-06
12 4.657888e-08 1.713541e-08
16 2.819122e-11 1.037097e-11
20 1.445479e-14 5.317621e-15
25 2.735502e-19 1.006335e-19
30 4.901565e-22 1.803185e-22
The coefficients $b_i^{(m)}$ of the Bernoulli approximation polynomial differ significantly from those of the Taylor one, $t_i$, $i = 0, \ldots, m$, when $m \in \{2, 4, 6, 9, 12, 16, 20\}$. However, they are practically identical for $m \in \{25, 30, \ldots\}$. This is so because, as the degree $m$ of the Bernoulli polynomial increases, all its coefficients $b_i^{(m)}$ vary, approaching the corresponding Taylor ones. Table 1 collects the 1-norm of the absolute and relative differences between the coefficient vectors of the Bernoulli and Taylor polynomial approximations for different values of $m$. As can be seen, the 1-norm of these differences is less than the unit roundoff ($u = 2^{-53} \approx 1.11 \times 10^{-16}$) when $m = 25$ or $m = 30$. As an example, the 1-norm of the relative difference between these coefficients when $m = 25$ is equal to $1.006335 \times 10^{-19}$. As expected, the differences become smaller as $m$ increases.
Therefore, these experimental results show that the backward error bounds expressed in (11) and (13) hold for the Bernoulli approximation only for values of $m$ greater than or equal to 25, and not for lower values. As a consequence of this analysis, Algorithm 2 has been developed. It computes the matrix exponential by means of the Taylor series when $m$ is less than or equal to 20, and by means of the Bernoulli series when $m$ is equal to 25 or 30.
Algorithm 2 Scaling and squaring Bernoulli algorithm for computing $B = e^A$, where $A \in \mathbb{C}^{r\times r}$ and $m_M$ is the maximum approximation order allowed.
1: Choose adequate order $m_k \le m_M$ and scaling parameter $s \in \mathbb{N} \cup \{0\}$
2: if $m_k \le 20$ then
3:   $B = P_{m_k}(A/2^s)$ using (10) ($P_{m_k}(\cdot)$ Taylor matrix polynomial)
4: else
5:   $B = P_{m_k}(A/2^s)$ using (10) ($P_{m_k}(\cdot)$ Bernoulli matrix polynomial)
6: end if
7: for $i = 1:s$ do   ▷ Recovering the matrix exponential
8:   $B = B^2$
9: end for
4. Numerical experiments
In this section, we first compare expmber, the MATLAB implementation corresponding to Algorithm 1, based on the Bernoulli approximation, with the functions exptaynsv3 [12], which computes the matrix exponential using Taylor matrix polynomials, and expm_new [8], which implements a scaling and squaring Padé-based algorithm to work out the mentioned matrix function. Next, we compare expmbertay, the function that combines the Taylor and Bernoulli approximations in accordance with Algorithm 2, with expmber, exptaynsv3 and expm_new.

Algorithm 3 computes the "exact" matrix exponential function thanks to MATLAB's Symbolic Math Toolbox with 256 digits of precision. This algorithm provides the exact solution when it finds that the relative error between $T^{(j)}_{m_k}(n)$ and $T^{(i)}_{m_{k-1}}(n)$ is less than $u = 2^{-53}$ (see (18)), where $T^{(s)}_m(n)$ is the Taylor matrix approximation of order $m$ of the scaled matrix $A/2^s$, computed with $n$ digits of precision by using the vpa (variable-precision arithmetic) MATLAB function in Algorithm 4. Previously, $T^{(j)}_{m_k}(n)$ and $T^{(i)}_{m_{k-1}}(n)$ have been calculated so that (16) and (17) are fulfilled.
Algorithm 3 Computes the "exact" matrix exponential $T = e^A$, where $A \in \mathbb{C}^{r\times r}$, by means of Taylor expansion.
1: if there exist two consecutive orders $m_{k-1}, m_k \in \{30, 36, 42, 49, 56, 64\}$ and integers $1 \le i, j \le 15$ for $s$ such that
$$\frac{\left\| T^{(i)}_{m_{k-1}}(n) - T^{(i-1)}_{m_{k-1}}(n) \right\|_1}{\left\| T^{(i)}_{m_{k-1}}(n) \right\|_1} < u, \quad (16)$$
and
$$\frac{\left\| T^{(j)}_{m_k}(n) - T^{(j-1)}_{m_k}(n) \right\|_1}{\left\| T^{(j)}_{m_k}(n) \right\|_1} < u, \quad (17)$$
and
$$\frac{\left\| T^{(j)}_{m_k}(n) - T^{(i)}_{m_{k-1}}(n) \right\|_1}{\left\| T^{(j)}_{m_k}(n) \right\|_1} < u \quad (18)$$
by using Algorithm 4, then
   return $T = T^{(j)}_{m_k}(n)$
2: else
   return error
3: end if
Algorithm 4 Computes $T^{(s)}_m(n) \approx e^A$, where $A \in \mathbb{C}^{r\times r}$, by Taylor expansion of order $m$ and scaling parameter $s$, using the vpa MATLAB function with $n$ digits of precision.
1: Compute $T^{(s)}_m(n) = P_m(A/2^s)$ using the Taylor expansion of order $m$ with $n$ digits of precision
2: for $i = 1:s$ do
3:   $T^{(s)}_m(n) = [T^{(s)}_m(n)]^2$
4: end for
4.1. Experiments description
The following test battery, composed of three types of different and representative matrices, has been chosen to compare the numerical performance of the codes described above:
a) One hundred diagonalizable 128×128 real matrices with 1-norms varying from 2.18 to 207.52. These matrices have the form $A = VDV^T$, where $D$ is a diagonal matrix with real and complex eigenvalues and $V$ is an orthogonal matrix obtained as $V = H/\sqrt{128}$, $H$ being the Hadamard matrix. The "exact" matrix exponential was computed as $\exp(A) = V \exp(D) V^T$ (see [4, pp. 10]).

b) One hundred non-diagonalizable 128×128 complex matrices with 1-norms ranging from 84 to 98. These matrices have the form $A = VJV^T$, where $J$ is a Jordan matrix with complex eigenvalues of modulus less than 10 and random algebraic multiplicity varying from 1 to 5. $V$ is an orthogonal matrix obtained as $V = H/\sqrt{128}$, where $H$ is the Hadamard matrix. The "exact" matrix exponential was worked out as $\exp(A) = V \exp(J) V^T$.
c) State-of-the-art matrices:
   - Forty 128×128 matrices from the Matrix Computation Toolbox (MCT) [32].
   - Sixteen matrices from the Eigtool MATLAB package (EMP) [33], with sizes 127×127 and 128×128.
   The "exact" matrix exponential for these matrices was computed by using Taylor approximations of orders 30, 36, 42, 49, 56 and 64, changing their scaling parameter (see Algorithm 3).
   Although the MCT and the EMP initially comprise fifty-two and twenty matrices, respectively, twelve matrices from the MCT and four from the EMP were discarded for different reasons. For example, matrices 5, 10, 16, 17, 21, 25, 26, 42, 43, 44 and 49 from the MCT and matrices 5 and 6 from the EMP were not taken into account since the exact exponential solution could not be computed. Besides, matrix 2 from the MCT and matrices 3 and 10 from the EMP were not
Table 2: Matrix products (P) for Tests 1, 2 and 3 using the expmber, exptaynsv3 and expm_new MATLAB codes.

         P(expmber)  P(exptaynsv3)  P(expm_new)
Test 1      1131         1131         1178.33
Test 2      1100         1100         1227.33
Test 3       617          617          654.67
considered because of the excessively high relative errors provided by all the methods under comparison.
An experiment, called Test, is performed for each of the three sets of matrices described above; each test evaluates the computational cost and the numerical accuracy of the methods under comparison. The three tests have been executed using MATLAB (R2018b) running on an HP Pavilion dv8 Notebook PC with an Intel Core i7 CPU Q720 @ 1.60 GHz processor and 6 GB of RAM.
4.2. Experimental results
Table 2 shows the computational costs of each method represented in terms
of the number of matrix products (P), taking into account that the cost of
the rest of the operations is negligible compared to it for big enough matrices.
As it can be seen, expmber and exptaynsv3 achieved an identical number of
matrix multiplications, since the same algorithm was used by both of them to
calculate the degree of the polynomial (m) and the value of the scaling (s).
This number of products was lower than that required by expm new, which gave
rise to the highest computational cost. In addition to the matrix products,
expm new solves a system of linear equations with nright-hand side vectors
where nrepresents the size of the square coefficient matrix, whose computational
cost was approximated as 4/3 matrix products.
Table 3, on the other hand, shows the percentage of cases in which the
relative errors of expmber are lower, greater or equal than those of exptaynsv3
and expm new. More in detail, the relative error was computed as
E = kexp(A)˜exp(A)k1
kexp(A)k1
where ˜exp(A) is the approximate solution and exp(A) is the exact one.
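In code, this 1-norm relative error amounts to the following (a minimal Python helper, illustrative only, with matrices stored as lists of rows):

```python
def norm1(M):
    # matrix 1-norm: maximum absolute column sum
    return max(sum(abs(row[j]) for row in M) for j in range(len(M[0])))

def rel_error(exact, approx):
    # E = ||exact - approx||_1 / ||exact||_1
    diff = [[e - a for e, a in zip(re, ra)] for re, ra in zip(exact, approx)]
    return norm1(diff) / norm1(exact)
```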
With the exception of Test 3, the Bernoulli approach resulted in relative errors lower than those of the Taylor one. With regard to Padé, the Bernoulli algorithm always offered considerably more accurate results, reaching 100% of the matrices for Test 2.

For each of the three tests, the normwise relative errors (a), the performance profiles (b), the ratio of the relative errors (c) and the ratio of the matrix products (d) among the compared methods are plotted in Figures 1, 2 and 3.
Table 3: Relative error comparison between expmber vs exptaynsv3 and expmber vs expm_new for the three tests.

                              Test 1  Test 2  Test 3
E(expmber) < E(exptaynsv3)      56%     91%   30.36%
E(expmber) > E(exptaynsv3)      43%      9%   62.5%
E(expmber) = E(exptaynsv3)       1%      0%    7.14%
E(expmber) < E(expm_new)        97%    100%   69.64%
E(expmber) > E(expm_new)         3%      0%   30.36%
E(expmber) = E(expm_new)         0%      0%      0%
Regarding the normwise relative errors presented in Figures 1a, 2a and 3a, the solid line represents the function $\kappa_{\exp} u$, where $\kappa_{\exp}$ (or cond) is the condition number of the matrix exponential function [4, Chapter 3] and $u$ is the unit roundoff. In general, expmber exhibited very good numerical stability. This can be appreciated by observing the distance from each matrix's normwise relative error to the cond·u line. In Figures 1a and 2a, the numerical stability is even better because these errors lie below this line. Because $\kappa_{\exp}$ was infinite or enormously high for matrices 6, 7, 12, 15, 23, 36, 39, 50 and 51 from the MCT and for matrices 1, 4, 8 and 15 from the EMP, all of them were excluded from the Figure 3a visualisation but considered in the other ones.
In the performance profile figures (1b, 2b and 3b), the $\alpha$ coordinate on the x-axis varies from 1 to 5 in steps of 0.1. For a given value of $\alpha$, the $p$ coordinate on the y-axis is the probability that the considered algorithm has a relative error lower than or equal to $\alpha$ times the smallest relative error over all the methods on the given test. For the first two tests (Figures 1b, 2b), the performance profiles show that the accuracy of the Bernoulli and Taylor methods was similar; both were considerably more accurate than the Padé method. Notwithstanding, Figure 3b reveals that the exptaynsv3 code improved on the accuracy of the expmber function for Test 3.
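The performance-profile construction just described can be sketched in a few lines of Python (the error values below are made up for illustration; `performance_profile` is not part of the paper's software):

```python
def performance_profile(errors, alphas):
    # errors: {method: list of relative errors, one per test matrix}
    # p(method, alpha) = fraction of matrices whose error is within a factor
    # alpha of the smallest error obtained by any method on that matrix
    n = len(next(iter(errors.values())))
    best = [min(errors[m][i] for m in errors) for i in range(n)]
    return {m: [sum(errors[m][i] <= a * best[i] for i in range(n)) / n
                for a in alphas]
            for m in errors}

E = {"expmber":    [1e-15, 3e-15, 2e-14],
     "exptaynsv3": [2e-15, 2e-15, 1e-14]}   # hypothetical errors
prof = performance_profile(E, [1.0, 2.0, 5.0])
```

A method whose curve reaches high $p$ values at small $\alpha$ is, with high probability, close to the best method on every matrix.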
In Figures 1c, 2c and 3c, the ratios of relative errors are presented in decreasing order with respect to E(expmber)/E(exptaynsv3). They confirm the data shown in Table 3, where it was seen that expmber provides more accurate results than exptaynsv3 for Tests 1 and 2, but not for Test 3. It is also worth noting that Padé offered the worst performance in most cases.

In our opinion, this is clearly due to the distinctive numerical characteristics of the three sets of matrices analysed and the degree of the polynomial ($m$) required in each case. According to our experience, expmber provides results with very appropriate accuracy for values of $m$ equal to 25 or 30. However, for significantly lower values, expmber is less competitive than other codes, such as exptaynsv3. The minimum, maximum and average values of $m$ required for Tests 1, 2 and 3 are collected in Table 4. In more detail, Figure 4 shows the approximation polynomial order employed in the calculation of the exponential function by means of expmber (or exptaynsv3) for each of the matrices in the test battery.
As was presented in Table 2, the expmber and exptaynsv3 functions performed a lower number of matrix operations than the expm_new one. This statement can also be corroborated by the results displayed in Figures 1d, 2d and 3d, where the ratio between the number of expm_new and expmber matrix products ranged from 1.03 to 1.22 for Test 1, from 1.03 to 1.12 for Test 2, and from 0.67 to 2.87 for Test 3.

[Figure 1: Experimental results for Test 1. (a) Normwise relative errors; (b) performance profile; (c) ratio of relative errors; (d) ratio of matrix products.]

Table 4: Minimum, maximum and average polynomial degree (m) required for Tests 1, 2 and 3 using the expmber or exptaynsv3 functions.

         Minimum  Maximum  Average
Test 1     16       30      27.51
Test 2     30       30      30
Test 3     12       30      25.70
Next, we will analyse the idea of using the Bernoulli and Taylor methods together, giving rise to a novel approach to compute the matrix exponential function. For that, we start from the advantages of the exptaynsv3 function over the expm_new one. As Table 5 shows, the percentage of cases in which the Taylor relative error is lower than the Padé one reaches 100% for Tests 1 and 2, and 89.29% for Test 3. Evidently, these error percentages improve on those offered by the Bernoulli approximation with respect to expm_new, as described in Table 3.

[Figure 2: Experimental results for Test 2. (a) Normwise relative errors; (b) performance profile; (c) ratio of relative errors; (d) ratio of matrix products.]

Table 5: Relative error comparison between exptaynsv3 and expm_new for the three tests.

                               Test 1  Test 2  Test 3
E(exptaynsv3) < E(expm_new)     100%    100%   89.29%
E(exptaynsv3) > E(expm_new)       0%      0%   10.71%
E(exptaynsv3) = E(expm_new)       0%      0%       0%
From these excellent results, we therefore considered the possibility of combining the Bernoulli and Taylor methods, giving rise to the expmbertay code. In this new function, and according to the comparison between the coefficients of their polynomials carried out previously, we use the Taylor approach (exptaynsv3) for values of $m$ below 25 and the Bernoulli approximation (expmber) when $m$ equals 25 or 30. In this way, the number of matrix products needed by expmbertay is obviously identical to that of expmber or exptaynsv3.

[Figure 3: Experimental results for Test 3. (a) Normwise relative errors; (b) performance profile; (c) ratio of relative errors; (d) ratio of matrix products.]
Table 6 thus collects the percentage of matrices in which the relative errors of expmbertay are lower than, greater than or equal to those of exptaynsv3, expmber and expm_new. For the vast majority of matrices, expmbertay provided an accuracy practically identical to that of expmber, even improving on the latter for 23.21% of the matrices of Test 3. With respect to exptaynsv3, expmbertay also enhanced the results achieved by expmber, so that this combined method is now better than or equal to exptaynsv3 in 55.36% of the cases for Test 3. Moreover, expmbertay was better than expm_new for 100% of the matrices for Tests 1 and 2, and for 91.07% for Test 3, which is higher than the percentages individually offered by expmber (69.64%) and exptaynsv3 (89.29%).

The numerical features of expmbertay are finally shown in Figures 5, 6 and 7 for the three tests by means of the normwise relative errors (a), the performance profiles (b) and the ratio of the relative errors (c). As can be seen, the method presents excellent precision in the results, with very low relative errors and a very high probability in the performance profile plots.
[Figure 4: Polynomial order (m) used by expmber and exptaynsv3 for each matrix. (a) Test 1; (b) Test 2; (c) Test 3.]
Table 6: Relative error comparison among expmbertay vs exptaynsv3, expmbertay vs expmber, and expmbertay vs expm_new for the three tests.

                                  Test 1  Test 2  Test 3
E(expmbertay) < E(exptaynsv3)       57%     91%   44.64%
E(expmbertay) > E(exptaynsv3)       42%      9%   44.64%
E(expmbertay) = E(exptaynsv3)        1%      0%   10.72%
E(expmbertay) < E(expmber)           3%      0%   23.21%
E(expmbertay) > E(expmber)           0%      0%       0%
E(expmbertay) = E(expmber)          97%    100%   76.79%
E(expmbertay) < E(expm_new)        100%    100%   91.07%
E(expmbertay) > E(expm_new)          0%      0%    8.93%
E(expmbertay) = E(expm_new)          0%      0%       0%
[Figure 5: Experimental results for Test 1. (a) Normwise relative errors; (b) performance profile; (c) ratio of relative errors.]

We have also included in the developed software an "accelerated" version of the expmber function that computes the matrix exponential on an NVIDIA GPU. Matrix multiplication is an operation very rich in intrinsic parallelism that
can be optimized for GPUs. Algorithms that rely on many matrix multiplications, like the one proposed here, can take full advantage of these devices through the use of the cuBLAS [34] package. Our "GPU version" uses the regular MATLAB scripting language in the same way as the other algorithms used so far but, at some points in the code, a function implemented in a MEX file is called. This function is implemented in CUDA [35] and dispatches the operation described in the function to the GPU. In this way, all the matrix products are computed by the GPU present in the computing platform. The exact details of how these MEX files are implemented can be found in [36].

The experimental results corresponding to this part of the work were obtained on a computer equipped with two Intel Xeon CPU E5-2698 v4 @ 2.20 GHz processors (Broadwell architecture), featuring 20 cores each. The regular MATLAB files, i.e. all those that do not make use of the GPU through a MEX file,
0 20 40 60 80 100
Matrix
10-15
10-14
Er
cond*u
expmbertay
exptaynsv3
expmber
expm_newm
(a) Normwise relative errors.
12345
0
0.2
0.4
0.6
0.8
1
p
expmbertay
exptaynsv3
expmber
expm_newm
(b) Performance profile.
0 20 40 60 80 100
Matrix
0.2
0.3
0.4
0.5
0.6
Relative error ratio
E(expmbertay)/E(exptaynsv3)
E(expmbertay)/E(expmber)
E(expmbertay)/E(expm_newm)
(c) Ratio of relative errors.
Figure 6: Experimental results for Test 2.
use the 40 cores available in the target computer by default1. We denote this
implementation as the “CPU version” when compared with the “GPU version”
described above. To measure the algorithm performance on the GPU, we used an NVIDIA Tesla P100-SXM2 (Pascal architecture) with 3584 CUDA cores and 16 GB of memory.
1 “Linear algebra and numerical functions such as fft, \ (mldivide), eig, svd, and sort are multithreaded in MATLAB. Multithreaded computations have been on by default in MATLAB since Release 2008a.” In particular, MATLAB uses the Intel MKL, where matrix multiplication is threaded, i.e., it is a parallel implementation with OpenMP.

Figure 7: Experimental results for Test 3. (a) Normwise relative errors. (b) Performance profile. (c) Ratio of relative errors.

Figure 8 shows the execution time in seconds on the left and the speed up achieved with the GPU version with respect to its CPU counterpart on the right. The plots also compare the performance of the former algorithm based on Taylor series (exptaynsv3) with the new one based on Bernoulli series (expmber) presented here. In light of the figure, it can be concluded that both algorithms, exptaynsv3 and expmber, behave very similarly. The reduction in time
obtained with the GPU with respect to the CPU starts approximately at matrices of size n = 1000 and increases with the problem size. The bulk of the work in both algorithms falls on the same basic computational kernel (matrix multiplication), and both of them require an identical number of products. The computational performance of the routine expmbertay would be very similar to that of expmber, since it uses once again the same number of matrix products.
5. Conclusions
The starting point of this work is a new expression of the matrix exponential
function cast in terms of Bernoulli matrix polynomials. Using this series expan-
sion, a new method for calculating the exponential of a matrix (implemented as
expmber code) has been developed. The proposed algorithm has been tested us-
ing a state-of-the-art matrix test battery with different features (diagonalizable
and non-diagonalizable, with particular eigenvalue spectra) that covers a wide range of cases. The developed code has been compared with the best implementations available, i.e., the Padé-based algorithm (expm new) and the Taylor-based one (exptaynsv3), outperforming the Padé-based algorithm and giving results at the level of the Taylor-based solutions in both accuracy and computational cost.

Figure 8: Execution time (a) and speed up (b) of the algorithms to compute the matrix exponential using the Taylor series (exptaynsv3) and the Bernoulli series (expmber) on CPU and on GPU for large randomly generated matrices.
Preliminary results with the Bernoulli version for the matrix exponential function motivated us to develop a hybrid code (called expmbertay) that combines the best of both the Taylor and Bernoulli solutions, yielding excellent results. Therefore, the expmbertay code is clearly competitive and highly recommended for the matrix exponential calculation, regardless of the type of matrix involved. Finally, we showed that the two algorithms developed in this contribution retain the advantages of other ones based on matrix polynomial expansions. Since they are all based on matrix multiplications, the GPU version implemented has turned out to be a strong tool to compute the matrix exponential approximation when the numerical methods employed are stressed with large-dimension matrices.
Acknowledgements
This work has been partially supported by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF) under grant TIN2017-89314-P, and by the Programa de Apoyo a la Investigación y Desarrollo 2018 of the Universitat Politècnica de València (PAID-06-18), grant SP20180016.
References
[1] C. F. Van Loan, A study of the matrix exponential, numerical analysis report, Tech. Rep., Manchester Institute for Mathematical Sciences, The University of Manchester (2006).
[2] C. B. Moler, C. F. Van Loan, Nineteen dubious ways to compute the exponential of a matrix, SIAM Rev. 20 (4) (1978) 801–836.
[3] C. B. Moler, C. F. Van Loan, Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Rev. 45 (2003) 3–49.
[4] N. J. Higham, Functions of Matrices: Theory and Computation, SIAM,
Philadelphia, PA, USA, 2008.
[5] M. Benzi, E. Estrada, C. Klymko, Ranking hubs and authorities using
matrix functions, Linear Algebra and its Applications 438 (2013) 2447–
2474.
[6] G. A. Baker, P. Graves-Morris, Padé Approximants, Encyclopedia of Mathematics and its Applications, Cambridge University Press, 1996.
[7] L. Dieci, A. Papini, Padé approximation for the exponential of a block
triangular matrix, Linear Algebra Appl. 308 (2000) 183–202.
[8] A. H. Al-Mohy, N. J. Higham, A new scaling and squaring algorithm for
the matrix exponential, SIAM J. Matrix Anal. Appl. 31 (3) (2009) 970–989.
[9] N. J. Higham, The scaling and squaring method for the matrix exponential
revisited, Tech. Rep. 452, Manchester Centre for Computational Mathe-
matics (2004).
[10] R. B. Sidje, Expokit: A software package for computing matrix exponen-
tials, ACM Trans. Math. Softw. 24 (1) (1998) 130–156.
[11] J. Sastre, J. Ibáñez, E. Defez, P. Ruiz, New scaling-squaring Taylor algorithms for computing the matrix exponential, SIAM Journal on Scientific Computing 37 (1) (2015) A439–A455.
[12] P. Ruiz, J. Sastre, J. Ibáñez, E. Defez, High performance computing of the
matrix exponential, Journal of Computational and Applied Mathematics
291 (2016) 370–379.
[13] J. Sastre, J. Ibáñez, E. Defez, Boosting the computation of the matrix
exponential, Applied Mathematics and Computation 340 (2019) 206–220.
[14] E. Defez, L. Jódar, Some applications of the Hermite matrix polynomials series expansions, Journal of Computational and Applied Mathematics 99 (1) (1998) 105–117.
[15] J. Sastre, J. Ibáñez, E. Defez, P. Ruiz, Efficient orthogonal matrix polynomial based method for computing matrix exponential, Applied Mathematics and Computation 217 (14) (2011) 6451–6463.
[16] F. W. Olver, D. W. Lozier, R. F. Boisvert, C. W. Clark, NIST Handbook of Mathematical Functions, Cambridge University Press, 2010.
[17] E. Tohidi, K. Erfani, M. Gachpazan, S. Shateyi, A new Tau method for
solving nonlinear Lane-Emden type equations via Bernoulli operational ma-
trix of differentiation, Journal of Applied Mathematics 2013 (2013).
[18] A. W. Islam, M. A. Sharif, E. S. Carlson, Numerical investigation of double diffusive natural convection of CO2 in a brine saturated geothermal reservoir, Geothermics 48 (2013) 101–111.
[19] E. Tohidi, A. Bhrawy, K. Erfani, A collocation method based on Bernoulli
operational matrix for numerical solution of generalized pantograph equa-
tion, Applied Mathematical Modelling 37 (6) (2013) 4283–4294.
[20] A. Bhrawy, E. Tohidi, F. Soleymani, A new Bernoulli matrix method
for solving high-order linear and nonlinear Fredholm integro-differential
equations with piecewise intervals, Applied Mathematics and Computation
219 (2) (2012) 482–497.
[21] E. Tohidi, M. Ezadkhah, S. Shateyi, Numerical solution of nonlinear frac-
tional Volterra integro-differential equations via Bernoulli polynomials, Ab-
stract and Applied Analysis 2014 (2014).
[22] F. Toutounian, E. Tohidi, S. Shateyi, A collocation method based on the
Bernoulli operational matrix for solving high-order linear complex differ-
ential equations in a rectangular domain, Abstract and Applied Analysis
2013 (2013).
[23] E. Tohidi, F. Toutounian, Convergence analysis of Bernoulli matrix ap-
proach for one-dimensional matrix hyperbolic equations of the first order,
Computers & Mathematics with Applications 68 (1-2) (2014) 1–12.
[24] E. Tohidi, M. K. Zak, A new matrix approach for solving second-order lin-
ear matrix partial differential equations, Mediterranean Journal of Mathe-
matics 13 (3) (2016) 1353–1376.
[25] F. Toutounian, E. Tohidi, A new Bernoulli matrix method for solving sec-
ond order linear partial differential equations with the convergence analysis,
Applied Mathematics and Computation 223 (2013) 298–310.
[26] O. Kouba, Lecture Notes, Bernoulli Polynomials and Applications, arXiv
preprint arXiv:1309.7560 (2013).
[27] F. Costabile, F. Dell’Accio, Expansion over a rectangle of real functions in
Bernoulli polynomials and applications, BIT Numerical Mathematics 41 (3)
(2001) 451–464.
[28] F. Costabile, F. Dell’Accio, Expansions over a simplex of real functions
by means of Bernoulli polynomials, Numerical Algorithms 28 (1-4) (2001)
63–86.
[29] E. D. Rainville, Special functions, Vol. 442, New York, 1960.
[30] M. S. Paterson, L. J. Stockmeyer, On the number of nonscalar multiplica-
tions necessary to evaluate polynomials, SIAM Journal on Computing 2 (1)
(1973) 60–66.
[31] N. J. Higham, F. Tisseur, A block algorithm for matrix 1-norm estimation,
with an application to 1-norm pseudospectra, SIAM J. Matrix Anal. Appl.
21 (2000) 1185–1201.
[32] N. J. Higham, The Test Matrix Toolbox for MATLAB (Version 3.0), University of Manchester, 1995.
[33] T. Wright, Eigtool, version 2.1 (2009).
URL web.comlab.ox.ac.uk/pseudospectra/eigtool.
[34] NVIDIA, cuBLAS (2020).
URL https://docs.nvidia.com/cuda/cublas
[35] NVIDIA, CUDA Toolkit Documentation v11.0.3 (2020).
URL https://docs.nvidia.com/cuda
[36] P. Alonso, J. Peinado, J. Ibáñez, J. Sastre, E. Defez, Computing matrix
trigonometric functions with GPUs through Matlab, The Journal of Super-
computing 75 (3) (2019) 1227–1240.