Boosting the computation of the matrix exponential
J. Sastre^a, J. Ibáñez^b, E. Defez^c
Universitat Politècnica de València, Spain
^a Instituto de Telecomunicaciones y Aplicaciones Multimedia
^b Instituto de Instrumentación para Imagen Molecular
^c Instituto de Matemática Multidisciplinar
Abstract

This paper presents new Taylor algorithms for the computation of the matrix exponential, based on recent matrix polynomial evaluation methods that are more efficient than the well known Paterson–Stockmeyer method. The cost of the proposed algorithms is reduced with respect to previous algorithms based on Taylor approximations. Tests comparing MATLAB implementations of the new algorithms to a state-of-the-art Padé algorithm for the computation of the matrix exponential show higher accuracy and better cost performance.
Keywords: matrix exponential, scaling and squaring, Taylor series, efficient matrix polynomial evaluation.
1. Introduction
The computation of matrix functions has received remarkable attention in the last decades because of its numerous applications in science and engineering [1]. Among all matrix functions, the matrix exponential is the most studied one, and a large number of methods for its computation have been proposed [2, 1].
In 2009 the authors submitted their first work with Taylor based algorithms for computing the matrix exponential [3]. Until then, Padé approximants for the matrix exponential were preferred to Taylor approximations because Padé algorithms were more efficient than the existing Taylor algorithms for similar accuracy [1]. Applying and improving the algorithms for Padé approximants from [15] to Taylor approximations, the Taylor algorithms from [3] proved to be generally more accurate than the Padé algorithm from [15] in tests, at a slightly higher cost.

(Corresponding author: jsastrem@upv.es. This work has been supported by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF) grant TIN2014-59294-P.)
In [4] the authors presented a scaling and squaring Taylor algorithm for computing the matrix exponential based on an improved mixed backward and forward error analysis. It was more accurate than the state-of-the-art Padé algorithm from [14] in the majority of tests, at a slightly higher cost. Subsequently, [5] provided a formula for the forward relative error of the matrix exponential Taylor approximation, and proposed to increase the allowed error bounds depending on the matrix size and the Taylor approximation order. This algorithm reduced the computational cost in exchange for a small impact on accuracy. The method proposed in [6] simplified the algorithm of [4], preserving its accuracy; it proved to be more accurate than the Padé algorithm from [14] in the majority of tests, and also more efficient in some cases. Finally, [8] used Taylor approximations combined with spline techniques to increase accuracy, also increasing the cost. In this work, we present new Taylor algorithms based on the efficient matrix polynomial evaluation methods from [9], which significantly increase the efficiency of the previous Taylor methods. We will show that the new algorithms are generally both more accurate and more efficient than the state-of-the-art Padé algorithm from [14].
Throughout this paper C^{n×n} denotes the set of complex matrices of size n×n, I denotes the identity matrix for this set, ρ(A) is the spectral radius of matrix A, and N denotes the set of positive integers. The matrix norm ‖·‖ denotes any subordinate matrix norm; in particular ‖·‖_1 is the 1-norm. The symbols ⌈·⌉ and ⌊·⌋ denote the smallest following and the largest previous integer, respectively. The cost of the Taylor algorithms will be given in terms of the number of evaluations of matrix products, denoting the cost of one matrix product by M. Note that the multiplication by the matrix inverse in Padé approximations is calculated as the solution of a multiple right-hand side linear system. The cost of the solution of a multiple right-hand side linear system AX = B, where matrices A and B are n×n, will be denoted by D. Taking into account that (see [10, App. C])

D ≈ 4/3 M,    (1)
the cost of evaluating rational approximations will also be given in terms of M. All the given algorithms are intended for IEEE double precision arithmetic; their extension to other precision arithmetics is straightforward.
This paper is organized as follows: Section 2 presents a general scaling and
squaring Taylor algorithm. Section 3 introduces efficient evaluation formulas
for the Taylor matrix polynomial approximation of the matrix exponential
based on [9]. Section 4 presents the scaling and squaring error analysis. The
new algorithm is given in Section 5. Section 6 shows numerical results and
Section 7 gives some conclusions. The next theorem, from [5], will be used in Section 4 to bound the norm of matrix power series.
Theorem 1. Let h_l(x) = Σ_{k≥l} b_k x^k be a power series with radius of convergence R, and let h̃_l(x) = Σ_{k≥l} |b_k| x^k. For any matrix A ∈ C^{n×n} with ρ(A) < R, if a_k is an upper bound for ‖A^k‖ (‖A^k‖ ≤ a_k), p ∈ N, 1 ≤ p ≤ l, p_0 ∈ N is the multiple of p with l ≤ p_0 ≤ l+p−1, and

α_p = max{a_k^{1/k} : k = p, l, l+1, l+2, ..., p_0−1, p_0+1, p_0+2, ..., l+p−1},    (2)

then ‖h_l(A)‖ ≤ h̃_l(α_p).
2. General Taylor algorithm
The Taylor approximation of order m of the matrix exponential of A ∈ C^{n×n}, denoted by T_m(A), is defined by the expression

T_m(A) = Σ_{k=0}^{m} A^k/k!.    (3)

The scaling and squaring algorithms with Taylor approximation (3) are based on the approximation e^A = (e^{2^{-s}A})^{2^s} ≈ (T_m(2^{-s}A))^{2^s} [2], where the nonnegative integers m and s are chosen to achieve full machine accuracy at a minimum cost.
A general scaling and squaring Taylor algorithm for computing the matrix exponential is presented in Algorithm 1, where m_M is the maximum allowed value of m.
Table 1: Values of q_k depending on the selection of m_M in [6].

k          0  1  2  3  4  5  6  7  8  9
m_k        1  2  4  6  9 12 16 20 25 30
m_M = 20   1  2  2  3  3  4  4  4
m_M = 25   1  2  2  3  3  4  4  5  5
m_M = 30   1  2  2  3  3  4  4  5  5  5
Algorithm 1 General scaling and squaring Taylor algorithm for computing B ≈ e^A, where A ∈ C^{n×n} and m_M is the maximum approximation order allowed.
1: Preprocessing of matrix A.
2: Choose m_k ≤ m_M and an adequate scaling parameter s ∈ N ∪ {0} for the Taylor approximation with scaling.
3: Compute the matrix polynomial B = T_{m_k}(A/2^s).
4: for i = 1:s do
5:   B = B^2
6: end for
7: Postprocessing of matrix B.
In this paper the evaluation of the Taylor matrix polynomial in Step 3 is improved. The preprocessing and postprocessing steps (1 and 7) are based on applying transformations to reduce the norm of matrix A, see [1], and will not be discussed in this paper. In Step 2, the optimal order of the Taylor approximation m_k ≤ m_M and the scaling parameter s will be chosen, improving the algorithm from [6].
In [6] the matrix polynomial T_m(2^{-s}A) was evaluated using the Paterson–Stockmeyer method, see evaluation formula (7) of [6] and [11]. The optimal Taylor orders m for that method are in the set m_k ∈ {1, 2, 4, 6, 9, 12, 16, 20, 25, 30, ...}, k = 0, 1, ..., where the matrix powers A^2, A^3, ..., A^q are evaluated and stored to be reused in all the computations. Table 1, see [6, Table 1], shows some optimal values of q, denoted by q_k, used in [6] for the orders m_k, k = 0, 1, 2, ..., M, and m_M = 20, 25 or 30. In this work T_m(2^{-s}A) will be computed using new evaluation methods based on [9], which are more efficient than the Paterson–Stockmeyer method.
Finally, after the evaluation of T_m(2^{-s}A), s repeated squarings are applied in Steps 4–6, and the postprocessing is applied in Step 7 to obtain the matrix exponential approximation of the original matrix A. The computational cost of [6, Alg. 1] in terms of matrix products is

Cost(m_k, s) = k + s.    (4)

Note that if s = 0 then Cost(m_k, 0) = k; for instance, evaluating the maximum order m_9 = 30 with the Paterson–Stockmeyer method costs 9 matrix products. Using matrix polynomial evaluation methods based on [9], the costs from (4) will be reduced.
3. Efficient evaluation of the matrix exponential Taylor approximation

In this section, formulas for evaluating Taylor based approximations of orders m = 8, 15, 21, 24 and 30 are given based on the results from [9]. MATLAB R2017a Symbolic Math Toolbox with 200 decimal digit arithmetic was used in all the calculations. Note that for T_m(A) with orders m = 1, 2, 4 the same evaluation formulas as in [6] will be used, i.e.

T_1(A) = A + I,    (5)
T_2(A) = A^2/2 + A + I,    (6)
T_4(A) = ((A^2/4 + A)/3 + I)A^2/2 + A + I.    (7)
3.1. Evaluation of T_8(A)

Following [9, Ex. 3.1], we can evaluate T_8(A) with the evaluation formulas

y_02(A) = A^2(c_1 A^2 + c_2 A),    (8)
T_8(A) = (y_02(A) + c_3 A^2 + c_4 A)(y_02(A) + c_5 A^2) + c_6 y_02(A) + A^2/2 + A + I,    (9)

where the coefficients c_i, i = 1, 2, ..., 6, numbered correlatively, are given in IEEE double precision arithmetic in Table 2, see [9, Table 4]. A^2 is computed once and stored to be reused in all the computations, and T_8(A) can be evaluated at a cost of 3M. Note that, for a cost of 3M, taking s = 0 in (4) and using Table 1, the maximum Taylor order available with the Paterson–Stockmeyer method is m = 6.
c_1  4.980119205559973×10^{-3}
c_2  1.992047682223989×10^{-2}
c_3  7.665265321119147×10^{-2}
c_4  8.765009801785554×10^{-1}
c_5  1.225521150112075×10^{-1}
c_6  2.974307204847627×10^{0}

Table 2: Coefficients for computing the matrix exponential Taylor approximation of order m = 8 using (8) and (9), see [9, Table 4].
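As an illustration, the following MATLAB sketch (ours, not taken from the reference implementation expmpol) evaluates (8) and (9) with the Table 2 coefficients using exactly three matrix products:

  function T8 = T8taylor(A)
  % Evaluation of T8(A) via (8)-(9) at a cost of 3 matrix products.
  c = [4.980119205559973e-3; 1.992047682223989e-2; 7.665265321119147e-2;
       8.765009801785554e-1; 1.225521150112075e-1; 2.974307204847627e0];
  I  = eye(size(A,1));
  A2 = A*A;                                     % product 1
  y02 = A2*(c(1)*A2 + c(2)*A);                  % product 2, formula (8)
  T8 = (y02 + c(3)*A2 + c(4)*A)*(y02 + c(5)*A2) ...
       + c(6)*y02 + A2/2 + A + I;               % product 3, formula (9)
  end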
3.2. Evaluation of a Taylor based approximation of order 15

Following [9, Ex. 5.1], we can evaluate a Taylor based approximation of order m = 15 of the matrix exponential with the evaluation formulas

y_02(A) = A^2(c_1 A^2 + c_2 A),    (10)
y_12(A) = (y_02(A) + c_3 A^2 + c_4 A)(y_02(A) + c_5 A^2) + c_6 y_02(A) + c_7 A^2,    (11)
y_22(A) = (y_12(A) + c_8 A^2 + c_9 A)(y_12(A) + c_10 y_02(A) + c_11 A) + c_12 y_12(A) + c_13 y_02(A) + c_14 A^2 + c_15 A + c_16 I,    (12)

where, analogously to [9, Ex. 5.1], A^2 is computed once and stored to be reused in all the computations. The degree of the polynomial y_22(A) is m = 16 and it can be evaluated at a cost of 4M. Note that with the Paterson–Stockmeyer method and cost 4M, using (4) and Table 1, the maximum available order for T_m(A) is m = 9.
Similarly to [9, Ex. 5.1], if we rewrite y_22(A) as a matrix polynomial of degree m = 16 and equate the coefficients of the matrix powers A^i, i = 0, 1, ..., 15, of the rewritten y_22(A) to the corresponding coefficients of T_15(A), we obtain a nonlinear system of 16 equations with 16 unknown coefficients c_i, i = 1, 2, ..., 16.

Note that the MATLAB Symbolic Math Toolbox solve function could not give the general solution of the expressions (57)-(59) proposed in [9, Ex. 5.1], returning one numerical approximation of the coefficients instead. Moreover, we had to use a special ordering of the coefficients so that solve obtained this numerical approximation, see [12, Sec. 4.2]. In the following we describe a method to obtain more solutions for the coefficients c_i. The stability recommendations from [9, p. 243] propose to select the solution giving the smallest error in the system of equations when substituting the coefficients c_i in IEEE double precision arithmetic.
y_12(A) from (11) can be written as a polynomial of degree 8,

y_12(A) = Σ_{i=2}^{8} a_i A^i.    (13)

Taking into account the four existing solutions from [9, Ex. 3.1] for evaluating polynomials of degree 8 with cost 3M, see (25) and (26) from [9], it follows that

y_02(A) = ±A^2(√a_8 A^2 + a_7/(2√a_8) A).    (14)
If we write y_22(A) as a polynomial of degree m = 16 and equate the coefficients of the matrix powers A^i to the Taylor coefficients 1/i!, for i = 0, 1, ..., 15, a system of 16 equations in the 16 variables a_i, i = 2, 3, ..., 8, and c_i, i = 8, 9, ..., 16, arises. This system can be solved using the solve function from the MATLAB Symbolic Math Toolbox, obtaining the coefficients a_i, i = 2, 3, ..., 8, and c_i, i = 8, 9, ..., 16. Then, [9, Ex. 3.1] gives four solutions for the coefficients c_i, i = 1, 2, ..., 7, from (10) and (11) that allow evaluating (13) with those coefficients a_i. From now on, we will refer to these different solutions of the coefficients c_i based on the coefficients a_i as nested solutions. From the four solutions we selected the one giving the smallest error in the original system of equations when rounding the coefficients to IEEE double precision arithmetic, see Table 3. For that solution, if we write y_22(A) as a polynomial of degree m = 16, the coefficient of A^16, denoted by b_16, and its relative error with respect to the corresponding Taylor polynomial coefficient are

b_16 = 2.608368698098254×10^{-14},    |b_16 − 1/16!| · 16! ≈ 0.454,    (15)

where the coefficient b_16 is presented in IEEE double precision arithmetic. Note that there is another real solution that gives |b_16 − 1/16!| · 16! ≈ 2.510 > 0.454, and we discarded it. Note also that the relative error in (15) is the same as in [9, Ex. 5.1], see (60) from [9].
The fact that

y_22(A) = T_15(A) + b_16 A^16    (16)

will have interesting implications in the error analysis of the algorithm. The evaluation formulas (11) and (12) are slightly different from (58) and (59) of [9, Ex. 5.1], so that the coefficient of the last matrix A in (12) is 1, instead of c_2 from [9, Table 9].
c_1   4.018761610201036×10^{-4}    c_9   2.224209172496374×10^{0}
c_2   2.945531440279683×10^{-3}    c_10  -5.792361707073261×10^{0}
c_3   -8.709066576837676×10^{-3}   c_11  -4.130276365929783×10^{-2}
c_4   4.017568440673568×10^{-1}    c_12  1.040801735231354×10^{1}
c_5   3.230762888122312×10^{-2}    c_13  -6.331712455883370×10^{1}
c_6   5.768988513026145×10^{0}     c_14  3.484665863364574×10^{-1}
c_7   2.338576034271299×10^{-2}    c_15  1
c_8   2.381070373870987×10^{-1}    c_16  1

Table 3: Coefficients of y_02, y_12, y_22 from (10)-(12) for computing a Taylor based approximation of the matrix exponential of order m = 15.
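A minimal MATLAB sketch of this evaluation (ours; in the reference code it corresponds to the nested function EFFEVPOL of expmpol), using four matrix products and the Table 3 coefficients with c_15 = c_16 = 1:

  function y22 = y22taylor(A)
  % Evaluation of y22(A) = T15(A) + b16*A^16 via (10)-(12): 4 products.
  c = [ 4.018761610201036e-4;  2.945531440279683e-3; -8.709066576837676e-3;
        4.017568440673568e-1;  3.230762888122312e-2;  5.768988513026145e0;
        2.338576034271299e-2;  2.381070373870987e-1;  2.224209172496374e0;
       -5.792361707073261e0;  -4.130276365929783e-2;  1.040801735231354e1;
       -6.331712455883370e1;   3.484665863364574e-1];
  I  = eye(size(A,1));
  A2 = A*A;                                                 % product 1
  y02 = A2*(c(1)*A2 + c(2)*A);                              % product 2, (10)
  y12 = (y02 + c(3)*A2 + c(4)*A)*(y02 + c(5)*A2) ...
        + c(6)*y02 + c(7)*A2;                               % product 3, (11)
  y22 = (y12 + c(8)*A2 + c(9)*A)*(y12 + c(10)*y02 + c(11)*A) ...
        + c(12)*y12 + c(13)*y02 + c(14)*A2 + A + I;         % product 4, (12)
  end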
3.3. Evaluation of a Taylor based approximation of order 21

In a similar way to [9, Ex. 5.1], we developed the following formulas to evaluate a Taylor based approximation of order m = 21 of the matrix exponential:

y_03(A) = A^3(c_1 A^3 + c_2 A^2 + c_3 A),    (17)
y_13(A) = (y_03(A) + c_4 A^3 + c_5 A^2 + c_6 A)(y_03(A) + c_7 A^3 + c_8 A^2) + c_9 y_03(A) + c_10 A^3 + c_11 A^2,    (18)
y_23(A) = (y_13(A) + c_12 A^3 + c_13 A^2 + c_14 A)(y_13(A) + c_15 y_03(A) + c_16 A) + c_17 y_13(A) + c_18 y_03(A) + c_19 A^3 + c_20 A^2 + A + I,    (19)

where A^2 and A^3 are computed once and reused in all the computations. The degree of the polynomial y_23(A) is m = 24, and it can be evaluated at a cost of 5M. Note that for a cost of 5M the maximum available order for T_m(A) using the Paterson–Stockmeyer method is m = 12, see (4) and Table 1.

Proceeding analogously to Section 3.2, writing

y_13(A) = Σ_{i=2}^{12} a_i A^i,    (20)

and taking into account the solutions of (38) from [9], it follows that

y_03(A) = ±A^3(√a_12 A^3 + a_11/(2√a_12) A^2 + (4a_10 a_12 − a_11^2)/(8a_12^{3/2}) A).    (21)

Then, similarly to Section 3.2, we use the function solve from the MATLAB Symbolic Math Toolbox to obtain a_i, i = 2, 3, ..., 12, and c_i, i = 12, 13, ..., 20, so that y_23(x) = T_21(x) + O(x^{22}) for scalar x.
c_1   1.161658834444880×10^{-6}    c_11  1.392249143769798×10^{-1}
c_2   4.500852739573010×10^{-6}    c_12  -2.269101241269351×10^{-3}
c_3   5.374708803114821×10^{-5}    c_13  -5.394098846866402×10^{-2}
c_4   2.005403977292901×10^{-3}    c_14  3.112216227982407×10^{-1}
c_5   6.974348269544424×10^{-2}    c_15  9.343851261938047×10^{0}
c_6   9.418613214806352×10^{-1}    c_16  6.865706355662834×10^{-1}
c_7   2.852960512714315×10^{-3}    c_17  3.233370163085380×10^{0}
c_8   -7.544837153586671×10^{-3}   c_18  -5.726379787260966×10^{0}
c_9   1.829773504500424×10^{0}     c_19  -1.413550099309667×10^{-2}
c_10  3.151382711608315×10^{-2}    c_20  -1.638413114712016×10^{-1}

Table 4: Coefficients of y_03, y_13, y_23 from (17)-(19) for computing a Taylor based matrix exponential approximation of order m = 21.
Then, we use the solutions from [9, pp. 237–240] to evaluate (20) with (17) and (18), obtaining different nested solutions for the coefficients c_i, i = 1, 2, ..., 11. Again, according to the stability recommendations from [9, p. 243], we selected the solution given in Table 4. For that solution, if we write y_23(A) from (19) as a polynomial of degree m = 24, the coefficients of A^22, A^23 and A^24, denoted by b_22, b_23 and b_24, respectively, and their relative errors with respect to the corresponding Taylor polynomial coefficients are

b_22 = 5.010366348377648×10^{-22},    |b_22 − 1/22!| · 22! ≈ 0.437,    (22)
b_23 = 2.822218236752230×10^{-23},    |b_23 − 1/23!| · 23! ≈ 0.270,    (23)
b_24 = 1.821018669767511×10^{-24},    |b_24 − 1/24!| · 24! ≈ 0.130,    (24)

where the coefficients b_22, b_23 and b_24 are presented in IEEE double precision arithmetic. Again, the fact that

y_23(A) = T_21(A) + b_22 A^22 + b_23 A^23 + b_24 A^24    (26)

will have interesting implications in the error analysis of the algorithm.
3.4. Evaluation of the Taylor approximations T_24(A) and T_30(A)

The coefficients for the evaluation of T_24(A) corresponding to

y_04(A) = A^4(c_1 A^4 + c_2 A^3 + c_3 A^2 + c_4 A),    (27)
y_14(A) = (y_04(A) + c_5 A^4 + c_6 A^3 + c_7 A^2 + c_8 A)(y_04(A) + c_9 A^4 + c_10 A^3 + c_11 A^2) + c_12 y_04(A) + c_13 A^4 + c_14 A^3 + c_15 A^2 + c_16 A,    (28)
T_24(A) = y_14(A)(y_04(A) + c_17 A^4 + c_18 A^3 + c_19 A^2 + c_20 A) + c_21 A^4 + c_22 A^3 + c_23 A^2 + A + I,    (29)

can be obtained in a similar way to Sections 3.2 and 3.3. For the values of the coefficients in IEEE double precision arithmetic see case m = 24 of the nested function EFFEVPOL in the MATLAB implementation of the proposed algorithm expmpol, available at

http://personales.upv.es/jorsasma/software/expmpol.m

Note that using (27)–(29), T_24(A) can be evaluated at a cost of 6M, whereas for 6M the maximum available order with the Paterson–Stockmeyer method is 16, see Table 1.

Similarly, the coefficients for the evaluation of T_30(A) corresponding to

y_05(A) = A^5(c_1 A^5 + c_2 A^4 + c_3 A^3 + c_4 A^2 + c_5 A),    (30)
y_15(A) = (y_05(A) + c_6 A^5 + c_7 A^4 + c_8 A^3 + c_9 A^2 + c_10 A)(y_05(A) + c_11 A^5 + c_12 A^4 + c_13 A^3 + c_14 A^2) + c_15 y_05(A) + c_16 A^5 + c_17 A^4 + c_18 A^3 + c_19 A^2 + c_20 A,    (31)
T_30(A) = y_15(A)(y_05(A) + c_21 A^5 + c_22 A^4 + c_23 A^3 + c_24 A^2 + c_25 A) + c_26 A^5 + c_27 A^4 + c_28 A^3 + c_29 A^2 + A + I,    (32)

can be obtained. For the coefficient values see case m = 30 of the nested function EFFEVPOL of expmpol. Note that using (30)–(32), T_30(A) can be evaluated at a cost of 7M, whereas for that cost the maximum available order with the Paterson–Stockmeyer method is 20, see Table 1.

Finally, note that in the evaluation of the Padé approximants from [14] many of the products of the numerator can be reused for the denominator. In general, this is not so advantageous if we compute the numerator and denominator with the evaluation algorithms from [9]. For instance, to evaluate r_{8,8} using the method of Section 3.1 for both the numerator and denominator polynomials of degree 8, we need one matrix product to evaluate A^2, two more products to evaluate the numerator and two more for the denominator. Therefore, the cost of evaluating r_{8,8} is 5M + 1D ≈ 6.33M by (1), giving an approximation of order 16. Note that at a lower cost of 5M, y_23 from Section 3.3 gives an approximation of order 21.
4. Error analysis

For completeness of the exposition we summarize some results from the error analysis of [6]. Denoting the remainder of the Taylor series by R_m(A) = Σ_{k≥m+1} A^k/k!, for a scaled matrix 2^{-s}A, s ∈ N ∪ {0}, we can write

(T_m(2^{-s}A))^{2^s} = e^A (I + g_{m+1}(2^{-s}A))^{2^s} = e^{A + 2^s h_{m+1}(2^{-s}A)},    (33)

g_{m+1}(2^{-s}A) = −e^{−2^{-s}A} R_m(2^{-s}A),    h_{m+1}(2^{-s}A) = log(I + g_{m+1}(2^{-s}A)),    (34)

where log denotes the principal logarithm, h_{m+1}(X) is defined on the set Ω_m = {X ∈ C^{n×n} : ρ(e^{−X}T_m(X) − I) < 1}, and both g_{m+1}(2^{-s}A) and h_{m+1}(2^{-s}A) are holomorphic functions of A in Ω_m and therefore commute with A. As shown in [4], ΔA = 2^s h_{m+1}(2^{-s}A) is the backward absolute error, and ΔE = g_{m+1}(2^{-s}A) is the forward relative error, arising from the approximation of e^A by the Taylor series with scaling and squaring in exact arithmetic.
Using the scalar Taylor series in (34) one gets

g_{m+1}(x) = Σ_{k≥m+1} b_k^{(m)} x^k,    h_{m+1}(x) = Σ_{k≥1} (−1)^{k+1} (g_{m+1}(x))^k / k = Σ_{k≥m+1} c_k^{(m)} x^k,    (35)

where b_k^{(m)} and c_k^{(m)} depend on the order m. Moreover, b_k^{(m)} = c_k^{(m)}, k = m+1, m+2, ..., 2m+1, and if ‖h_{m+1}(2^{-s}A)‖ ≪ 1 or ‖g_{m+1}(2^{-s}A)‖ ≪ 1, then

ΔA = 2^s h_{m+1}(2^{-s}A) ≈ 2^s g_{m+1}(2^{-s}A) = 2^s ΔE,    (36)

see [5].
Choosing s so that

‖h_{m+1}(2^{-s}A)‖ ≤ max{1, ‖2^{-s}A‖} u,    (37)

where u = 2^{−53} is the unit roundoff in IEEE double precision arithmetic, then:

- If 2^{-s}‖A‖ ≥ 1, then ‖ΔA‖ ≤ ‖A‖u and, using (33), one gets (T_m(2^{-s}A))^{2^s} = e^{A+ΔA} ≈ e^A.
- If 2^{-s}‖A‖ < 1, using (33)-(36) and the Taylor series, if (37) holds one gets

‖R_m(2^{-s}A)‖ = ‖e^{2^{-s}A} g_{m+1}(2^{-s}A)‖ ≈ ‖T_m(2^{-s}A) h_{m+1}(2^{-s}A)‖ ≤ ‖T_m(2^{-s}A)‖ u.    (38)

Hence, by (38), in IEEE double precision arithmetic one gets that T_m(2^{-s}A) + R_m(2^{-s}A) ≈ T_m(2^{-s}A).
Using the MATLAB Symbolic Math Toolbox, high precision arithmetic, 200 series terms and a zero finder, we obtained the maximal values Θ_m of Θ = ‖2^{-s}A‖, see Table 5, such that, using the notation of Theorem 1,

‖h_{m+1}(2^{-s}A)‖ ≤ h̃_{m+1}(Θ) = Σ_{k≥m+1} |c_k^{(m)}| Θ^k ≤ max{1, Θ} u.    (39)
Hence, if ‖2^{-s}A‖ ≤ Θ_m then (37) holds. For the orders m = 20, 25, 30 of [6] where Θ_m > 1, by (14) of [6] the following bound holds:

‖h_{m+1}(2^{-s}A)‖ ≤ h̃_{m+1}(‖2^{-s}A‖) = h̃_{m+1}(Θ) ≤ Θu,    0 ≤ Θ ≤ Θ_m.    (40)
The previous analysis is valid for the orders m_k used in [6], and also for the new evaluation formulas for T_8(A), T_24(A) and T_30(A) from Section 3. For y_22(A) from (16) it is easy to show that all the previous error analysis remains valid after substituting R_m(2^{-s}A) in (34) by

R̃_15+(2^{-s}A) = exp(2^{-s}A) − y_22(2^{-s}A) = R_15(2^{-s}A) − b_16 (2^{-s}A)^16.    (41)

Similarly, for y_23(A) from (26) all the previous error analysis is also valid after substituting R_m(2^{-s}A) in (34) by

R̃_21+(2^{-s}A) = exp(2^{-s}A) − y_23(2^{-s}A) = R_21(2^{-s}A) − Σ_{i=22}^{24} b_i (2^{-s}A)^i.    (42)

Therefore, the new Θ_15+ and Θ_21+ values are obtained in a similar way to the other values obtained in [6], where the suffix "+" has been added. The corresponding values are given in Table 5. Comparing those values to the actual values obtained for T_15(A), i.e. Θ_15 ≈ 0.658, and for T_21(A), i.e. Θ_21 ≈ 1.624, we see that both are lower than the corresponding values Θ_15+ and Θ_21+. That is the reason for adding the suffix "+" in m = 15+ and m = 21+, a notation that will be used from now on.
Table 5: Maximal values Θ_m = ‖2^{-s}A‖ such that h̃_{m+1}(Θ_m) ≤ max{1, Θ_m}u, coefficient ratios |c_{m+1}^{(m)}/c_{m+2}^{(m)}| and values u/|c_{m+2}^{(m)}|. Note that m = 15+ and m = 21+ correspond to the approximations (16) and (26), which are not exactly T_15(A) and T_21(A).

m     Θ_m                         |c_{m+1}^{(m)}/c_{m+2}^{(m)}|   u/|c_{m+2}^{(m)}|
1     1.490116111983279×10^{-8}   3/2                             3.33×10^{-16}
2     8.733457513635361×10^{-6}   4/3                             8.88×10^{-16}
4     1.678018844321752×10^{-3}   6/5                             1.60×10^{-14}
8     1.773082199654024×10^{-2}   10/9                            4.48×10^{-11}
15+   6.950240768069781×10^{-1}   1.15                            5.87×10^{-3}
21+   1.682715644786316           1.03                            2.93×10^{5}
24    2.219048869365090           26/25                           1.79×10^{9}
30    3.539666348743690           32/31                           9.42×10^{17}
The higher values Θ_15+ and Θ_21+ will imply a lower order or scaling for certain matrices; for instance, this may be the case for matrices with Θ_15 ≤ ‖A‖ ≤ Θ_15+ or Θ_21 ≤ ‖A‖ ≤ Θ_21+, see Section 5. We have checked that bound (40) also holds for the approximation y_23(A) from (26) and for T_24(A). Hence, the optimal orders m to be used are in the set

m_k* ∈ {1, 2, 4, 8, 15+, 21+, 24, 30}.    (43)
5. New Taylor algorithms

This section summarizes the results of the scaling algorithm from [6] and gives the new algorithms. The scaling algorithm from [6] can be applied directly to the Taylor approximations T_8(A), T_24(A), y_22(A) from (16) and y_23(A) from (26) of Section 3, using the new values for all the parameters from Table 5, see [6, Table 2]. For the values corresponding to m = 15+ and 21+, the remainders (41) and (42) were used in (34), and the coefficients c_{m+1}^{(m)} and c_{m+2}^{(m)} from (35) were obtained symbolically.
For all norms appearing in the scaling algorithm we will use the 1-norm, and m_M will be the maximum allowed Taylor order. In [6, Sec. 4.1] the minimum value of α_p from Theorem 1, denoted by α_min, was used to determine an initial scaling parameter. It was taken as, see (16) from [6],

α_min = max{a_{m+1}^{1/(m+1)}, a_{m+2}^{1/(m+2)}},    (44)

where a_{m+1} and a_{m+2} are the 1-norm estimations of ‖A^{m+1}‖ and ‖A^{m+2}‖ given by the block 1-norm estimation algorithm of [13]. This algorithm has a cost O(n^2), negligible compared to the O(n^3) cost of a matrix product if the matrix A is big enough. Alternatively, we can also use bounds for ‖A^{m+1}‖ and ‖A^{m+2}‖ based on products of norms of the matrix powers computed at each step of the algorithm, similarly to [7, Alg. 3] and [8, Alg. 3]; then no norm estimations are used. For instance, if m = 4 and A^2 is known, then ‖A^{m+1}‖ = ‖A^5‖ ≤ ‖A^2‖^2 ‖A‖.
In numerical tests with 128×128 matrices, see Section 6, we detected that the cost of the 1-norm estimation algorithm of [13] is not negligible. Hence, a MATLAB implementation of Algorithm 1 with no norm estimations is given. Moreover, an implementation that reduces the number of norm estimations compared to that of [6, Alg. 2] is also given. Both options are available in the function expmpol(A,kmax,NormEst) from

http://personales.upv.es/jorsasma/software/expmpol.m

This function can use m_M = 24 (kmax=6) and m_M = 30 (kmax=7), selecting either the use of the norm estimation algorithm from [13] (NormEst=1) or not (NormEst=0). expmpol does not consider the preprocessing and postprocessing of matrix A indicated in Section 2.
Algorithm 2, called ms_selectNoNormEst, presents the proposed algorithm for the selection of the order m and the scaling parameter s to compute the matrix exponential using no norm estimations and m_M = 24. It corresponds to Step 2 of Algorithm 1. The selection of m and s is analogous to that of [6, Alg. 2], but using bounds for ‖A^{m+1}‖ and ‖A^{m+2}‖ based on products of norms of matrix powers. The version of Algorithm 2 allowing both m_M = 24 and m_M = 30 is straightforward, and a MATLAB implementation allowing both values of m_M is given as the nested function ms_selectNoNormEst of expmpol.
In Steps 1–18, Algorithm 2 checks whether any of the Taylor optimal orders m_k = 1, 2, 4, 8, 15+, 21+ and m_M = 24 satisfies (37) without scaling, i.e. with s = 0, using the bounds provided in Section 4 and considering the first two terms of the series of h_{m+1}(x) from (35), in the same way as (19) from [6]:

‖h_{m+1}(2^{-s}A)‖ ≲ |c_{m+2}^{(m)}| ( |c_{m+1}^{(m)}/c_{m+2}^{(m)}| a_{m+1}/2^{s(m+1)} + a_{m+2}/2^{s(m+2)} ) ≤ max{1, ‖2^{-s}A‖} u,    (45)

so that the tests of Algorithm 2 check the equivalent inequality obtained by dividing both sides by |c_{m+2}^{(m)}|. As mentioned above, we obtain bounds for ‖A^{m+1}‖ and ‖A^{m+2}‖ using products of norms of previously computed matrix powers, see Steps 6, 8, etc.
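For illustration, the estimation-free test (45) for m = 4 with s = 0 (cf. Steps 8–9 of Algorithm 2 below) can be coded as follows, where the constants 6/5 = |c_5^{(4)}/c_6^{(4)}| and 1.60×10^{-14} = u/|c_6^{(4)}| come from Table 5 (a sketch of ours, assuming A is given):

  % Illustrative fragment: test of (45) for m = 4, s = 0.
  a1 = norm(A,1);  A2 = A*A;  a2 = norm(A2,1);
  a5 = a2^2*a1;                 % bound: ||A^5|| <= ||A^2||^2 ||A||
  a6 = a2^3;                    % bound: ||A^6|| <= ||A^2||^3
  if (6/5)*a5 + a6 <= max(1,a1)*1.60e-14
      m = 4;  s = 0;            % T4(A) reaches full accuracy without scaling
  end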
If no value of m_k ≤ m_M satisfies (45) with s = 0, the algorithm computes α_min using only the bounds for ‖A^{m+1}‖ and ‖A^{m+2}‖, and determines the initial scaling parameter s in Steps 19–20. Then, if s > 0, the algorithm checks in Steps 22–26 whether the initial scaling parameter can be reduced, checking if (45) holds with s_red = s − 1. Finally, in Steps 27–32 we check whether m_{M−1} = 21+ can be used with the same scaling s, reducing the cost, similarly to [6, Alg. 2].

After selecting m and s, the matrix polynomial evaluation formulas from Section 3 are used. The implementation of this part is straightforward using the formulas and coefficients from Section 3; a MATLAB implementation can be seen in the function EFFEVPOL of expmpol. Finally, the s squarings from Steps 4–6 are done. See the function expmpol for a complete implementation of the algorithm.

For simplicity, the special case of nilpotent matrices, i.e. A^i = 0 for i = 2, 3, 4 or 5, is not included in Algorithm 2, but it is considered in the MATLAB implementation ms_selectNoNormEst in expmpol.
Algorithm 2 Algorithm ms_selectNoNormEst: Given a matrix A ∈ C^{n×n}, this algorithm obtains the order m and the scaling parameter s for the Taylor approximation of order m ≤ m_M = 24, using no norm estimations of matrix powers.
Input: A ∈ C^{n×n}.
Output: Order m, scaling parameter s, and some matrix powers A^i, 1 ≤ i ≤ 3.
1: Set the Θ_m, |c_{m+1}^{(m)}/c_{m+2}^{(m)}| and u/|c_{m+2}^{(m)}| values from Table 5
2: a_1 = ‖A‖
3: s = 0    ▷ Null scaling tests
4: if a_1 < Θ_1, m = 1, quit    ▷ Test m = 1 with s = 0
5: A_2 = A^2, a_2 = ‖A_2‖
6: a_3 = a_2 a_1, a_4 = a_2^2, b = max{1, ‖A‖} · u/|c_4^{(2)}|
7: if |c_3^{(2)}/c_4^{(2)}| · a_3 + a_4 ≤ b, m = 2, quit    ▷ Test m = 2 with s = 0
8: a_5 = a_2^2 a_1, a_6 = a_2^3, b = max{1, ‖A‖} · u/|c_6^{(4)}|
9: if |c_5^{(4)}/c_6^{(4)}| · a_5 + a_6 ≤ b, m = 4, quit    ▷ Test m = 4 with s = 0
10: a_9 = a_2^4 a_1, a_10 = a_2^5, b = max{1, ‖A‖} · u/|c_10^{(8)}|
11: if |c_9^{(8)}/c_10^{(8)}| · a_9 + a_10 ≤ b, m = 8, quit    ▷ Test m = 8 with s = 0
12: a_16 = a_2^8, a_17 = a_2^8 a_1, b = max{1, ‖A‖} · u/|c_17^{(15)}|
13: if |c_16^{(15)}/c_17^{(15)}| · a_16 + a_17 ≤ b, m = 15, quit    ▷ Test m = 15+ with s = 0
14: A_3 = A_2 A, a_3 = ‖A_3‖
15: a_22 = min{a_2^{11}, a_3^6 a_2^2, a_3^7 a_1}, a_23 = min{a_2^{10} a_3, a_3^7 a_2}, b = max{1, ‖A‖} · u/|c_23^{(21)}|
16: if |c_22^{(21)}/c_23^{(21)}| · a_22 + a_23 ≤ b, m = 21, quit    ▷ Test m = 21+ with s = 0
17: a_25 = min{a_2^{11} a_3, a_3^7 a_2^2, a_3^8 a_1}, a_26 = min{a_2^{13}, a_3^8 a_2}, b = max{1, ‖A‖} · u/|c_26^{(24)}|
18: if |c_25^{(24)}/c_26^{(24)}| · a_25 + a_26 ≤ b, m = 24, quit    ▷ Test m = 24 with s = 0
19: α_min = max{a_25^{1/25}, a_26^{1/26}}
20: s = ⌈log_2(α_min/Θ_24)⌉    ▷ Calculate the initial scaling s
21: if s > 0 then
22:   s_red = s − 1    ▷ Check if (45) holds reducing the scaling
23:   b = max{1, ‖A‖/2^{s_red}} · u/|c_26^{(24)}|
24:   if |c_25^{(24)}/c_26^{(24)}| · a_25/2^{25 s_red} + a_26/2^{26 s_red} ≤ b then
25:     s = s_red    ▷ (45) holds, then s = s_red
26:   end if
27:   b = max{1, ‖A‖/2^s} · u/|c_23^{(21)}|    ▷ Test if the scaled matrix allows using m_{M−1} = 21+
28:   if |c_22^{(21)}/c_23^{(21)}| · a_22/2^{22s} + a_23/2^{23s} ≤ b then
29:     m = 21, quit
30:   else
31:     m = 24, quit    ▷ The scaled matrix does not allow using m_{M−1}
32:   end if
33: end if
The selection of m and s in Algorithm 2 is analogous to that of [6, Alg. 2], with the new optimal orders m from Section 3 and using no norm estimations for ‖A^{m+1}‖ and ‖A^{m+2}‖.

With respect to algorithms for the selection of m and s using norm estimations of matrix powers [13], it is easy to implement a direct adaptation of [6, Alg. 2] that selects the optimal m and s for the new values m_k* from (43) using the corresponding parameter values from Table 5. We will denote this algorithm by expmpol_orig. However, [6, Alg. 2] computes norm estimations of matrix powers that can be avoided. The MATLAB function ms_selectNormEst has been developed to reduce the number of norm estimations of matrix powers with respect to expmpol_orig, see expmpol. It is similar to the function ms_selectNormEst from [7] for selecting m and s for computing the matrix cosine with a Taylor algorithm. The reduction of estimations increases the efficiency for small matrices, see Section 6. The selection of m and s in ms_selectNormEst is based essentially on testing whether the order m_k* satisfies (45) using products of known norms of matrix powers and, if so, testing whether m_{k−1}* also satisfies (45) using estimations of ‖A^{m_{k−1}*+1}‖ and ‖A^{m_{k−1}*+2}‖. The general steps of ms_selectNormEst with m_M = 24 are:
1. Set s = 0.
2. Test whether one of the orders m = m_k* from (43), with k = 2, 3, 4, i.e. m ∈ {4, 8, 15+}, satisfies (45) with s = 0, using bounds of matrix powers based on products of ‖A‖ and ‖A^2‖. If so, then: if (45) holds with m = m_{k−1}* using the estimations of ‖A^{m+1}‖ and ‖A^{m+2}‖, then m = m_{k−1}*, else m = m_k*, and the algorithm quits. Note that, similarly to [6, Alg. 2], we will not estimate ‖A^{m+2}‖ if

|c_{m+1}^{(m)}/c_{m+2}^{(m)}| ‖A^{m+1}‖/2^{s(m+1)} > max{1, ‖2^{-s}A‖} u/|c_{m+2}^{(m)}|,    (46)

see (45).
3. If m = 15+ satisfies (45) with s = 0 using estimations of ‖A^16‖ and ‖A^17‖, then: if (45) holds with m = 8 using estimations of ‖A^9‖ and ‖A^10‖, then m = 8, else m = 15+, and the algorithm quits. Testing m = 15+ with norm estimations is necessary in this step because A^3 is computed in the following step, and A^3 is not used in the evaluation of the approximation y_22(A) of order m = 15+ from (10)-(12).
4. A^3 is computed.
5. If m = 21+ satisfies (45) with s = 0 using products of known norms of matrix powers, including the estimations from Step 3, then m = 21+ is selected and the algorithm quits. For instance, a bound for ‖A^{m+1}‖ is ‖A^22‖ ≤ ‖A^16‖ min{‖A^3‖^2, ‖A^2‖^3}, where ‖A^16‖ was estimated in Step 3.
6. If m = 24 satisfies (45) using products of known norms of matrix powers, then: if (45) holds with m = 21+ and estimations of ‖A^22‖ and ‖A^23‖, then m = 21+, else m = 24, and the algorithm quits.
7. If m = 24 satisfies (45) using estimations of ‖A^25‖ and ‖A^26‖, then: if (45) holds with m = 21+ and estimations of ‖A^22‖ and ‖A^23‖, then m = 21+, else m = 24, and the algorithm quits.
8. The rest is similar to Steps 22–32 of Algorithm 2, combining norm estimations of matrix powers and bounds based on products of known norms of matrix powers; a sketch of how such estimations can be obtained is given below.
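As a sketch (ours) of how such estimations can be obtained, MATLAB's normest1, which implements the block 1-norm estimator of [13], can be called with a function handle so that ‖A^p‖_1 is estimated without forming A^p explicitly; the wrapper name normAp is illustrative:

  function est = normAp(A, p)
  % 1-norm estimation of ||A^p||_1 without forming A^p, via normest1 [13].
  est = normest1(@afun);
      function Z = afun(flag, X)
          switch flag
              case 'dim',      Z = size(A,1);
              case 'real',     Z = isreal(A);
              case 'notransp', Z = X; for i = 1:p, Z = A*Z;  end
              case 'transp',   Z = X; for i = 1:p, Z = A'*Z; end
          end
      end
  end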
The extension of the previous algorithm to m_M = 30 is straightforward, and a complete MATLAB implementation can be seen in the MATLAB function ms_selectNormEst of expmpol, allowing both m_M = 24 and m_M = 30.
6. Numerical experiments and conclusions

In this section we compare a MATLAB implementation of the new algorithm, expmpol, available at

http://personales.upv.es/jorsasma/software/expmpol.m

with the following three legacy MATLAB implementations for the computation of the matrix exponential:

- expmspl: MATLAB implementation based on matrix splines from [8], available at http://personales.upv.es/jorsasma/software/expmspl.m. This function uses Taylor approximations with a fixed maximum order m_M = 30.
- exptayns: MATLAB implementation of exptaynsv3 from [6], available at http://personales.upv.es/jorsasma/software/exptaynsv3.m. For this function the argument denoted by kmax determines the maximum Taylor approximation order used.
- expm_new: MATLAB implementation based on the Padé approximation from [14]. For this function the argument denoted by schur_fact is not used, i.e. an initial transformation to complex or real Schur form is not applied. This kind of transformation can also be applied to our algorithms; see Test 4 of [7, Sec. 3] for the application of a Schur form to a Taylor algorithm for the matrix cosine, where the accuracy results were similar to those of a state-of-the-art Padé algorithm using the same Schur form, but with a lower cost.

We use the following notation for the different versions of the algorithms proposed in this paper:

- expmpol_orig: Direct MATLAB implementation based on [6, Alg. 2], i.e. evaluating the Taylor approximations with the methods from Section 3, changing the optimal orders m_k to m_k* from (43) in [6, Alg. 2] and using norm estimations.
- expmpol: MATLAB implementation using ms_selectNormEst from Section 5 to select m and s (argument NormEst = 1 in function expmpol), which reduces the number of norm estimations with respect to expmpol_orig.
- expmpoln: MATLAB implementation using ms_selectNoNormEst from Section 5 to select m and s (Algorithm 2, with no norm estimations; argument NormEst = 0 in function expmpol).
The experiments have been carried out on an Intel i7-6700HQ @2.60GHz with 32GB RAM, using MATLAB R2017a. The accuracy was tested by computing the relative error

E = ‖e^A − X̃‖_1 / ‖e^A‖_1,

where X̃ is the computed approximation, and the cost is given in terms of matrix products. We used the following sets of matrices for testing:

1. One hundred diagonalizable 128×128 matrices. These matrices have the form V^T D V, where D is a diagonal matrix whose diagonal elements are random values between −k and k, with different integer values of k, and V is an orthogonal matrix obtained as V = H/16, where H is the Hadamard matrix.
2. Eighty 128×128 matrices with multiple eigenvalues. These matrices have the form V^T D V, where D is a block diagonal matrix whose diagonal blocks are Jordan blocks with random dimension and random eigenvalues between −50 and 50, and V is an orthogonal matrix obtained as V = H/16, where H is the Hadamard matrix.
3. 128×128 matrices from the function matrix of the Matrix Computation Toolbox [17] and matrices from the MATLAB Eigtool package available at http://www.cs.ox.ac.uk/pseudospectra/eigtool/. These matrices appear in the state of the art of matrix exponential computation [14, 15]. Matrices whose exponential cannot be represented in double precision arithmetic due to overflow were excluded from all the test matrices. One matrix from the Eigtool package for which MATLAB gave the warning "Matrix is singular to working precision" with function expm_new was also excluded; note that in Taylor methods there are no matrix inversions. Finally, the test consisted of 31 matrices from the Matrix Computation Toolbox and 7 matrices from the Eigtool package.
The "exact" value of the matrix exponential for matrix sets 1 and 2 was computed by using the transformation e^A = V^T e^D V, where V^T e^D V was computed using the vpa function from the MATLAB Symbolic Math Toolbox with 256 decimal digit precision. For matrix set 3, we used a quadruple precision Taylor algorithm in Fortran with different orders and scaling parameters for each matrix to check the correctness of the results.
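For matrix sets 1 and 2 this computation can be sketched as follows (our illustration, assuming D and V are built as described above and V is orthogonal):

  digits(256);                  % 256 decimal digit arithmetic
  Ed   = expm(vpa(D));          % e^D in high precision
  Eref = double(V.' * Ed * V);  % e^{V^T D V} = V^T e^D V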
Table 6 shows the total cost in terms of matrix products for function expmpol, and the cost comparison in % between this function and expmpoln, expmspl, exptayns and expm_new for Tests 1–3, taking m_M = 30 in all the Taylor based functions. Table 7 shows the same comparison for the execution times.

Table 8 shows the total cost in terms of matrix products for function expmpol with kmax=6 (m_M = 24), and the cost comparison in % between this function and expmpoln with m_M = 24, expmspl (m_M = 30), exptayns (m_M = 25) and expm_new for Tests 1–3. Table 9 shows the same comparison for the execution times. For the execution times in MATLAB to be accurate, the experiments were repeated 100 times and the mean values were used; the standard deviation of the values was in all cases less than or equal to 6.56%.
Figures 1, 2 and 3 show the results obtained in the three case studies. Subfigures 1a, 2a and 3a show the normwise relative errors of the functions expmpol, expmpoln, expmspl, exptayns and expm_new. These graphics show the numerical stability of the functions: the relative errors of all implementations are plotted together with a solid line representing the unit roundoff multiplied by the relative condition number of the exponential function at X [1, p. 55]. The relative condition number was computed using the MATLAB function funm_condest1 from the Matrix Function Toolbox [1, Appendix D]. For a method to perform in a backward and forward stable manner, its error should lie not far above this line on the graph [15, p. 1188]. Subfigures 1b, 2b and 3b show the performance profiles of the functions [16]: the value α varies between 1 and 5 with a step size of 0.1, and the value p is the probability that the considered algorithm has a relative error lower than or equal to α times the smallest error over all the methods. Subfigures 1c, 2c and 3c show the ratios of the errors of expmpol with respect to expmpoln, expmspl, exptayns and expm_new, ordering the matrices according to the ratio of relative errors E(expmpol)/E(expm_new). Subfigures 1d, 2d and 3d show the ratios of the matrix products of expmpol with respect to expmpoln, expmspl, exptayns and expm_new.

Figure 4 shows the results obtained in case study 3 when kmax=6 (m_M = 24) is considered for expmpol and expmpoln, and kmax=8 for exptayns, giving a similar maximum order (m_M = 25). As mentioned above, expmspl has a fixed maximum order m_M = 30.
According to the results shown in the above tables and figures, we can outline the following conclusions:

- All the functions performed in a numerically stable way in the three case studies.
- The implementations based on Taylor series are in general more accurate than the implementation based on Padé approximants. expmpol and expmpoln were the most accurate functions in Test 2, behaving similarly. In Tests 1 and 3 the most accurate functions were expmspl and exptayns, also behaving similarly.
- The execution times of function expm_new for Tests 1–3 (128×128 matrices) were between 204.12% and 244.65% of the expmpol times, and higher than those of all the Taylor functions.
- expmpol with m_M = 24 gives the lowest cost in terms of matrix products of all the functions, the cost of expm_new being between 123.51% and 135.89% of the cost of expmpol, and the cost of the Paterson–Stockmeyer based functions exptayns and expmspl between 127.62% and 140.18% of the cost of expmpol.
Table 6: Cost of expmpol in terms of matrix products, denoted by C_0 = C(expmpol), and relative cost comparison R = C(function)/C_0 (%) between expmpol and the functions expmpoln, expmpol_orig, expmspl, exptayns and expm_new. All the Taylor approximations use maximum order m_M = 30.

        C_0   R(expmpoln)  R(expmpol_orig)  R(expmspl)  R(exptayns)  R(expm_new)
Test 1  948   104.01       100              125.84      119.41       127.14
Test 2  703   104.84       100              132.01      121.48       116.31
Test 3  270   108.52       100              132.84      128.15       121.73
- The cost of expmpoln (without estimations) in terms of matrix products is between 104.01% and 109.27% of the cost of expmpol (with estimations), but its execution times are between 51.08% and 65.68% of those of expmpol. Therefore, the cost of the estimation algorithm is not negligible for 128×128 matrices, and it is important to reduce the number of estimations.
- The cost of expmpol_orig in terms of matrix products is equal to the cost of expmpol, but its execution times are between 122.08% and 154.13% of those of expmpol. Therefore, the reduction of the number of estimations in expmpol is noticeable with respect to the original algorithm expmpol_orig, with no increase in the number of matrix products.
- In Test 3 with expmpol and expmpoln with m_M = 24, exptayns with m_M = 25 and expmspl with m_M = 30, expmspl was the most accurate function, at a higher cost, and the second most accurate was exptayns with m_M = 25.
- According to Tables 6–9, in expmpol the maximum order m_M = 24 is recommended for maximum efficiency. Similarly to [4] and [3], we checked that using m_M = 30 provides a higher accuracy at a slightly higher cost.
Table 7: Execution times of expmpol in seconds, denoted by t_0 = t(expmpol), and relative execution time comparison R = t(function)/t_0 (%) between expmpol and the functions expmpoln, expmpol_orig, expmspl, exptayns and expm_new. All the Taylor approximations use maximum order m_M = 30.

        t_0     R(expmpoln)  R(expmpol_orig)  R(expmspl)  R(exptayns)  R(expm_new)
Test 1  0.2244  60.39        141.10           156.80      123.88       232.08
Test 2  0.1798  57.62        146.25           161.32      127.49       229.63
Test 3  0.0825  65.68        154.13           184.78      143.71       206.19

Table 8: Cost of expmpol with m_M = 24 in terms of matrix products, denoted by C_0 = C(expmpol), and relative cost comparison R = C(function)/C_0 (%) between expmpol (m_M = 24) and the functions expmpoln (m_M = 24), expmpol_orig (m_M = 24), expmspl (m_M = 30), exptayns (m_M = 25) and expm_new.

        C_0   R(expmpoln)  R(expmpol_orig)  R(expmspl)  R(exptayns)  R(expm_new)
Test 1  887   107.22       100              134.50      127.62       135.89
Test 2  662   104.38       100              140.18      129.00       123.51
Test 3  259   109.27       100              138.48      133.59       126.90

Table 9: Execution times of expmpol with m_M = 24 in seconds, denoted by t_0 = t(expmpol), and relative execution time comparison R = t(function)/t_0 (%) between expmpol (m_M = 24) and the functions expmpoln (m_M = 24), expmpol_orig (m_M = 24), expmspl (m_M = 30), exptayns (m_M = 25) and expm_new.

        t_0     R(expmpoln)  R(expmpol_orig)  R(expmspl)  R(exptayns)  R(expm_new)
Test 1  0.2112  56.55        122.08           165.38      119.29       244.65
Test 2  0.1730  51.08        126.39           166.27      120.43       236.35
Test 3  0.0820  60.44        137.85           183.76      135.12       204.12

7. Conclusions

In this paper we have given two Taylor algorithms for the computation of the matrix exponential. They are based on the matrix polynomial evaluation methods from [9] and on an improved version of the Taylor scaling algorithm from [4], simplified in [6]. These algorithms achieve maximum efficiency using the maximum order m_M = 24, and maximum accuracy with m_M = 30.
In recent years Taylor algorithms have been shown to be significantly more accurate than Padé algorithms, being also more efficient in some cases. With the matrix polynomial evaluation methods from [9] they are now considerably more efficient, and the proposed Taylor algorithms were superior in tests to the state-of-the-art Padé algorithm from [14] in both accuracy and efficiency.
Figure 1: Experimental results with Test 1 (all the Taylor functions use maximum order m_M = 30): (a) normwise relative errors; (b) performance profiles; (c) ratio of relative errors; (d) ratio of matrix products. [Plots omitted.]

Taylor methods based on the Paterson–Stockmeyer matrix polynomial evaluation method seem to be more accurate in certain cases than the proposed algorithms based on the methods from [9]. Future work is addressed to:
- Increasing the accuracy of the methods based on [9].
- Increasing the efficiency by searching for Taylor approximations based on the evaluation formulas (62)–(65) from [9], of the type y_kj(A) with k ≥ 2 and j ≥ 2, with higher orders of approximation than the ones given in this paper for the same cost.
Figure 2: Experimental results with Test 2 (all the Taylor functions use maximum order m_M = 30): (a) normwise relative errors; (b) performance profiles; (c) ratio of relative errors; (d) ratio of matrix products. [Plots omitted.]

Figure 3: Experimental results with Test 3 (all the Taylor functions use maximum order m_M = 30): (a) normwise relative errors; (b) performance profiles; (c) ratio of relative errors; (d) ratio of matrix products. [Plots omitted.]

Figure 4: Experimental results with Test 3 for expmpol and expmpoln with m_M = 24, exptayns with m_M = 25 and expmspl with m_M = 30: (a) normwise relative errors; (b) performance profiles; (c) ratio of relative errors; (d) ratio of matrix products. [Plots omitted.]
8. References

[1] N.J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.
[2] C.B. Moler, C.F. Van Loan, Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Rev. 45 (2003) 3–49.
[3] J. Sastre, J. Ibáñez, E. Defez, P. Ruiz, Efficient scaling-squaring Taylor method for computing matrix exponential, SIAM J. Sci. Comput. 37 (1) (2015) A439–A455.
[4] J. Sastre, J. Ibáñez, E. Defez, P. Ruiz, Accurate matrix exponential computation to solve coupled differential models in engineering, Math. Comput. Model. 54 (2011) 1835–1840.
[5] J. Sastre, J. Ibáñez, E. Defez, P. Ruiz, Accurate and efficient matrix exponential computation, Int. J. Comput. Math. 91 (1) (2014) 97–112.
[6] P. Ruiz, J. Sastre, J. Ibáñez, E. Defez, High performance computing of the matrix exponential, J. Comput. Appl. Math. 291 (2016) 370–379.
[7] J. Sastre, J. Ibáñez, P. Alonso, J. Peinado, E. Defez, Two algorithms for computing the matrix cosine function, Appl. Math. Comput. 312 (2017) 66–77.
[8] E. Defez, J. Ibáñez, J. Sastre, J. Peinado, P. Alonso, A new efficient and accurate spline algorithm for the matrix exponential computation, J. Comput. Appl. Math. (2017), in press, https://doi.org/10.1016/j.cam.2017.11.029.
[9] J. Sastre, Efficient evaluation of matrix polynomials, Linear Algebra Appl. 539 (2018) 229–250.
[10] S. Blackford, J. Dongarra, Installation guide for LAPACK, LAPACK Working Note 41, Department of Computer Science, University of Tennessee, 1999.
[11] M.S. Paterson, L.J. Stockmeyer, On the number of nonscalar multiplications necessary to evaluate polynomials, SIAM J. Comput. 2 (1) (1973) 60–66.
[12] J. Sastre, On the polynomial approximation of matrix functions, early unpublished version of [9] submitted to AMC, Feb. 19, 2016, available at http://personales.upv.es/jorsasma/AMC-S-16-00951.pdf.
[13] N.J. Higham, Fortran codes for estimating the one-norm of a real or complex matrix, with applications to condition estimation, ACM Trans. Math. Softw. 14 (4) (1988) 381–396.
[14] A.H. Al-Mohy, N.J. Higham, A new scaling and squaring algorithm for the matrix exponential, SIAM J. Matrix Anal. Appl. 31 (3) (2009) 970–989.
[15] N.J. Higham, The scaling and squaring method for the matrix exponential revisited, SIAM J. Matrix Anal. Appl. 26 (4) (2005) 1179–1193.
[16] E.D. Dolan, J.J. Moré, Benchmarking optimization software with performance profiles, Math. Programming 91 (2002) 201–213.
[17] N.J. Higham, The Matrix Computation Toolbox, http://www.ma.man.ac.uk/~higham/mctoolbox.
... In the last years, the backward error analysis has become a fundamental tool in the development of algorithms based on Padé [1,2,3,4,5], Taylor [6,7,8,9,10,11] and other approximations [12] for computing the matrix exponential, logarithm, sine, cosine and other matrix functions. In many of the previous references the backward error analysis of the approximations is based on using the compositional inverse function of f , denoted by (f ) −1 , where the backward error of a certain approximation p(x) of function f (x) is defined by a quantity ∆x such that ...
... In this case ∆A cannot be obtained by using the method from [8] since formula (9) can be applied to cos(A + ∆A), but it cannot be applied to cos( √ A + ∆A). Instead, we use Theorem 2.1 to obtain enough terms of the backward error series to calculate new Θ b (m) values for function cos( √ A), denoted byΘ b (m), in a similar way to that in Example 3.3. ...
... With those formulas, T 12 (A) can be evaluated with 4 matrix products, two for evaluating A 2 and A 3 and two for evaluating (72) and (73). The matrix evaluation methods from [17] have been successfully used for computing the matrix exponential [9] and the matrix cosine [10]. Note that using the Paterson-Stockmeyer formula [18] T 12 (A) = (((a 12 A 3 + a 11 A 2 + a 10 A + a 9 I)A 3 + a 8 A 2 + a 7 A + a 6 I)A 3 + a 5 A 2 + a 4 A + a 3 I)A 3 + a 2 A 2 + a 1 A + a 0 I, where a i = (−1) i (2i)! , 5 matrix products evaluations are needed to evaluate the same approximation. ...
Article
Full-text available
In this paper we give a new formula to write the forward error of Taylor approximations of analytical functions in terms of the backward error of those approximations, considering exact arithmetic in both errors. Using this formula, a method to compute a backward error given by the power series centered in the same expansion point as the Taylor approximation is provided. The application of this method for Padé approximations is also shown. Based on the previous method, a MATLAB implementation for computing the first power series terms of the backward error for Taylor and Padé approximations of a given analytical function is provided, and examples of its use are given. Applications to the computation of matrix functions are given that overcome limitations of other backward error analyses which uses inverse compositional functions in the literature.
... EI were initially considered unpractical because of the computational challenges related to the matrix exponential [20]. However, novel numerical methods to compute the matrix exponential [21], [22], [23] have recently unlocked the potential of EI. This has already been used in computer graphics for simulating deformable objects, modeled as systems of particles [24], [25], [26]. ...
... Using (29) we have transformed the problem of computing (16) into a matrix exponential evaluation. Computing the matrix exponential is a challenging but well-understood numerical problem [21], [30], [22], [23]. We have used as starting point the scaling&squaring method, as revisited by Higham [21], a widely used method for computing the exponential of small-medium size dense matrices. ...
... We believe that computation times could be improved, especially for the first two operations. In computeIntegrals we could test novel techniques [23] to compute the matrix exponential, exploit the sparse structure of the matrix A, and warm-start the computation using quantities computed at the previous cycle. In prepareExpLDS, the inverse contact-space inertia matrix Υ could be computed faster using a customized algorithm, rather than with products between J, M −1 and J [34]. ...
Article
Full-text available
The simulation of multibody systems with frictional contacts is a fundamental tool for many fields, such as robotics, computer graphics, and mechanics. Hard frictional contacts are particularly troublesome to simulate because they make differential equations stiff, calling for computationally demanding implicit integration schemes. We suggest to tackle this issue by using exponential integrators, a long-standing class of integration schemes (first introduced in the 1960s) that in recent years has enjoyed a resurgence of interest. This scheme can be applied to multibody systems subject to stiff viscoelastic contacts, leading to integration errors similar to implicit Euler, but at much lower computational costs (between 2 to 100 times faster). In our tests with quadruped and biped robots, our method demonstrated a stable behavior with large time steps (10 ms) and stiff contacts (105\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$10^{5}$\end{document} N/m). Its excellent properties, especially for fast and coarse simulations, make it a valuable candidate for many applications in robotics, such as simulation, model predictive control, reinforcement learning, and controller design.
... The efficient computation of an important number of functions of matrices of moderate size is of great interest in many different fields [1,3,4,5,7,8,9,20,22,23,24,25,30,31,34,35,36,37,38]. Frequently, it suffices to compute their action on a vector [2,17,25,26,27,32] allowing to solve problems of large dimensions, or using an appropriate filtering technique the previous methods can also be used to compute functions of large sparse matrices [40]. ...
... Then, high order rational Chebyshev or Padé approximants, or polynomial approximations are frequently considered to approximate the formal solution following some tricks that allow to carry their computations with a reduced number of operations [1,8,18,35]. For example, a Taylor polynomial of degree 18 can be computed with only 5 matrix-matrix products [8] or a diagonal Padé approximation, that approximates e x up to order x 26 , can be computed with only 6 matrix-matrix products and one inverse [1]. ...
Preprint
Full-text available
We present a novel class of methods to compute functions of matrices or their action on vectors that are suitable for parallel programming. Solving appropriate simple linear systems of equations in parallel (or computing the inverse of several matrices) and with a proper linear combination of the results, allows us to obtain new high order approximations to the desired functions of matrices. An error analysis to obtain forward and backward error bounds is presented. The coefficients of each method, which depends on the number of processors, can be adjusted to improve the accuracy, the stability or to reduce round off errors of the methods. We illustrate this procedure by explicitly constructing some methods which are then tested on several numerical examples.
... In [1], the overscaling problem of the SSA was alleviated by using \| H k \| 1/k \leq \| H\| . In [3,22,24,25], the SSA was combined with new algorithms based on the Taylor polynomial to improve the computational efficiency. In [9], an accurate and efficient SSA was developed with matrix splines. ...
Article
Many numerical methods for evaluating matrix functions can be naturally viewed as computational graphs. Rephrasing these methods as directed acyclic graphs (DAGs) is a particularly effective approach to study existing techniques, improve them, and eventually derive new ones. The accuracy of these matrix techniques can be characterized by the accuracy of their scalar counterparts, thus designing algorithms for matrix functions can be regarded as a scalar-valued optimization problem. The derivatives needed during the optimization can be calculated automatically by exploiting the structure of the DAG, in a fashion analogous to backpropagation. This paper describes GraphMatFun.jl , a Julia package that offers the means to generate and manipulate computational graphs, optimize their coefficients, and generate Julia, MATLAB, and C code to evaluate them efficiently at a matrix argument. The software also provides tools to estimate the accuracy of a graph-based algorithm and thus obtain numerically reliable methods. For the exponential, for example, using a particular form (degree-optimal) of polynomials produces implementations that in many cases are cheaper, in terms of computational cost, than the Padé-based techniques typically used in mathematical software. The optimized graphs and the corresponding generated code are available online.
Article
In this paper, we develop an efficient and accurate evaluation of the Lyapunov operator function $\varphi_l(\mathcal{L}_A)[Q]$, where $\varphi_l(\cdot)$ is the function related to the exponential, $\mathcal{L}_A$ is a Lyapunov operator and $Q$ is a symmetric full-rank matrix. An important application of the algorithm is to matrix-valued exponential integrators for matrix differential equations such as differential Lyapunov equations and differential Riccati equations. The method combines the modified scaling and squaring procedure with a truncated Taylor series. A quasi-backward error analysis is presented to determine the value of the scaling parameter and the degree of the Taylor approximation. Numerical experiments show that the algorithm performs well in both accuracy and efficiency.
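As a small-scale reference for what $\varphi_l(\mathcal{L}_A)[Q]$ means, the sketch below evaluates $\varphi_1$ by vectorizing the Lyapunov operator, assuming the usual definition L_A[Q] = A Q + Q A^T; this O(n^6) approach is only a correctness check, not the paper's scaling-and-squaring algorithm:

import numpy as np
from scipy.linalg import expm

def phi1_lyapunov(A, Q):
    """Reference evaluation of phi_1(L_A)[Q] by vectorization,
    assuming the usual Lyapunov operator L_A[Q] = A Q + Q A^T:
    vec(A Q + Q A^T) = (I kron A + A kron I) vec(Q), and
    phi_1(M) q = M^{-1}(e^M q - q) (M assumed nonsingular here).
    O(n^6) cost -- only sensible as a small-scale check."""
    n = A.shape[0]
    I = np.eye(n)
    M = np.kron(I, A) + np.kron(A, I)      # matrix of the operator L_A
    q = Q.flatten(order='F')               # column-major vec(Q)
    x = np.linalg.solve(M, expm(M) @ q - q)
    return x.reshape((n, n), order='F')

rng = np.random.default_rng(1)
A = -2 * np.eye(3) + 0.3 * rng.standard_normal((3, 3))
Q = rng.standard_normal((3, 3)); Q = Q + Q.T    # symmetric Q
print(phi1_lyapunov(A, Q))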
Article
In this paper, we propose a μ-mode integrator for computing the solution of stiff evolution equations. The integrator is based on a d-dimensional splitting approach and uses exact (usually precomputed) one-dimensional matrix exponentials. We show that the action of the exponentials, i.e. the corresponding batched matrix-vector products, can be implemented efficiently on modern computer systems. We further explain how μ-mode products can be used to compute spectral transforms efficiently even if no fast transform is available. We illustrate the performance of the new integrator by solving, among others, three-dimensional linear and nonlinear Schrödinger equations, and we show that the μ-mode integrator can significantly outperform numerical methods well established in the field. We also discuss how to implement this integrator efficiently on both multi-core CPUs and GPUs. Finally, the numerical experiments show that using GPUs results in performance improvements between a factor of 10 and 20, depending on the problem.
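A minimal sketch of a μ-mode product (applying a matrix along one axis of a multi-dimensional array) and of its use to take one exact step of a 2-D heat-type equation U' = A1 U + U A2^T with precomputed one-dimensional exponentials; the Laplacian operators, grid sizes and step size are illustrative assumptions, not the paper's Schrödinger examples:

import numpy as np
from scipy.linalg import expm

def mu_mode_product(U, L, mu):
    """Apply the matrix L along axis mu of the nd-array U
    (the 'mu-mode' product of tensor algebra)."""
    return np.moveaxis(np.tensordot(L, U, axes=(1, mu)), 0, mu)

# 2-D example: one exact step of U' = A1 U + U A2^T using only
# small one-dimensional matrix exponentials (precomputable).
n1, n2, h = 50, 60, 0.01
# 1-D Dirichlet Laplacians, scaled by 1/dx^2 -- illustrative stiff operators
A1 = -2 * np.eye(n1) + np.eye(n1, k=1) + np.eye(n1, k=-1)
A2 = -2 * np.eye(n2) + np.eye(n2, k=1) + np.eye(n2, k=-1)
E1, E2 = expm(h * n1**2 * A1), expm(h * n2**2 * A2)

U = np.random.default_rng(2).standard_normal((n1, n2))
U = mu_mode_product(U, E1, 0)   # exponential along mode 0: E1 @ U
U = mu_mode_product(U, E2, 1)   # exponential along mode 1: U @ E2.T

Because the update factorizes into one small exponential per dimension, the same two (or, in d dimensions, d) precomputed factors are reused at every step, which is what makes the batched products cheap.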
Article
We present a practical algorithm to approximate the exponential of skew-Hermitian matrices up to round-off error based on an efficient computation of Chebyshev polynomials of matrices and the corresponding error analysis. It is based on Chebyshev polynomials of degrees 2, 4, 8, 12 and 18 which are computed with only 1, 2, 3, 4 and 5 matrix-matrix products, respectively. For problems of the form exp(−iA), with A a real and symmetric matrix, an improved version is presented that computes the sine and cosine of A with a reduced computational cost. The theoretical analysis, supported by numerical experiments, indicates that the new methods are more efficient than schemes based on rational Padé approximants and Taylor polynomials for all tolerances and time interval lengths. The new procedure is particularly recommended to be used in conjunction with exponential integrators for the numerical time integration of the Schrödinger equation.
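For intuition, a textbook version of the underlying idea: the Jacobi–Anger expansion gives exp(-itx) = J_0(t) + 2 Σ_{k≥1} (-i)^k J_k(t) T_k(x) on [-1,1], so exp(-itA) for a symmetric A with suitably bounded spectrum follows from the Chebyshev three-term recurrence. This plain implementation is not the paper's optimized 1/2/3/4/5-product scheme:

import numpy as np
from scipy.special import jv
from scipy.linalg import expm

def expm_minus_itA(A, t, m=18):
    """Approximate exp(-i t A) for symmetric A with spectrum in
    [-1, 1] via the classical Chebyshev-Bessel expansion
    exp(-i t x) = J_0(t) + 2 sum_k (-i)^k J_k(t) T_k(x)."""
    n = A.shape[0]
    Tkm1, Tk = np.eye(n), A.astype(complex)          # T_0(A), T_1(A)
    S = jv(0, t) * np.eye(n) + 2 * (-1j) * jv(1, t) * Tk
    for k in range(2, m + 1):
        Tkm1, Tk = Tk, 2 * A @ Tk - Tkm1             # Chebyshev recurrence
        S = S + 2 * (-1j) ** k * jv(k, t) * Tk
    return S

rng = np.random.default_rng(3)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 10        # spectrum typically well inside [-1, 1]
print(np.linalg.norm(expm_minus_itA(A, 1.0) - expm(-1j * A)))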
Article
Full-text available
In this work an accurate and efficient method based on matrix splines for computing the matrix exponential is given. An algorithm and a MATLAB implementation have been developed and compared with the state-of-the-art algorithms for computing the matrix exponential. We also developed a parallel implementation for large-scale problems, which allowed us to obtain much better performance when working with this kind of problem.
Article
Full-text available
This paper presents a new family of methods for evaluating matrix polynomials more efficiently than the state-of-the-art Paterson–Stockmeyer method. Examples of the application of the methods to the Taylor polynomial approximation of matrix functions like the matrix exponential and matrix cosine are given. Their efficiency is compared with that of the best existing evaluation schemes for general polynomial and rational approximations, and also with a recent method based on mixed rational and polynomial approximants. For many years, the Paterson–Stockmeyer method has been considered the most efficient general method for the evaluation of matrix polynomials. In this paper we show that this statement is no longer true. Moreover, for many years rational approximations have been considered more efficient than polynomial approximations, although recently it has been shown that often this is not the case in the computation of the matrix exponential and matrix cosine. In this paper we show that in fact polynomial approximations provide a higher order of approximation than the state-of-the-art computational methods for rational approximations for the same cost in terms of matrix products.
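For reference, a minimal implementation of the Paterson–Stockmeyer baseline that the new family improves on; for the degree-18 Taylor polynomial it uses 7 matrix products (4 to form the powers, 3 in the block Horner recurrence), compared with the 5 products achieved by the methods of this paper:

import numpy as np
from math import ceil, sqrt, factorial
from scipy.linalg import expm

def paterson_stockmeyer(coeffs, A):
    """Evaluate p(A) = sum_i coeffs[i] * A^i with the classical
    Paterson-Stockmeyer scheme: precompute A^2, ..., A^s, then
    run Horner over blocks in A^s -- roughly 2*sqrt(m) matrix
    products instead of the m-1 of plain Horner."""
    m = len(coeffs) - 1
    n = A.shape[0]
    s = max(1, ceil(sqrt(m + 1)))
    pw = [np.eye(n), A]                         # A^0, A^1
    for _ in range(s - 1):
        pw.append(pw[-1] @ A)                   # ... up to A^s
    As, r = pw[s], ceil((m + 1) / s)            # r blocks of width s
    P = sum(coeffs[(r - 1) * s + i] * pw[i]
            for i in range(s) if (r - 1) * s + i <= m)
    for j in range(r - 2, -1, -1):              # block Horner in A^s
        P = P @ As + sum(coeffs[j * s + i] * pw[i] for i in range(s))
    return P

# degree-18 Taylor polynomial of the exponential, as in the text above
c = [1.0 / factorial(i) for i in range(19)]
A = 0.5 * np.random.default_rng(4).standard_normal((6, 6))
print(np.linalg.norm(paterson_stockmeyer(c, A) - expm(A)))  # truncation error only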
Article
Full-text available
The computation of matrix trigonometric functions has received remarkable attention in the last decades due to its usefulness in the solution of systems of second order linear differential equations. Several state-of-the-art algorithms have been provided recently for computing these matrix functions. In this work, we present two efficient algorithms based on Taylor series with forward and backward error analysis for computing the matrix cosine. A MATLAB implementation of the algorithms is compared to state-of-the-art algorithms, with excellent performance in both accuracy and cost.
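A bare-bones version of the Taylor approach for the matrix cosine: a truncated series in B = X^2 plus the double-angle identity cos(2X) = 2 cos(X)^2 - I to undo the scaling. The degree and scaling threshold below are illustrative guesses, not the values fixed by the paper's forward and backward error analysis:

import numpy as np
from math import ceil, log2, factorial
from scipy.linalg import cosm

def cosm_taylor(A, m=8, theta=1.0):
    """Matrix cosine by truncated Taylor series,
    cos(X) = sum_k (-1)^k X^{2k} / (2k)!, evaluated by Horner in
    B = X^2, followed by double-angle steps to undo the scaling."""
    n = A.shape[0]
    normA = np.linalg.norm(A, 1)
    s = max(0, ceil(log2(normA / theta))) if normA > theta else 0
    B = (A / 2**s) @ (A / 2**s)             # only even powers are needed
    C = np.zeros((n, n))
    for k in range(m, -1, -1):              # Horner in B
        C = C @ B + (-1) ** k / factorial(2 * k) * np.eye(n)
    for _ in range(s):                       # cos(2X) = 2 cos(X)^2 - I
        C = 2 * C @ C - np.eye(n)
    return C

A = np.random.default_rng(5).standard_normal((5, 5))
print(np.linalg.norm(cosm_taylor(A) - cosm(A)))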
Article
Full-text available
The matrix exponential plays a fundamental role in linear differential equations arising in engineering, mechanics, and control theory. The most widely used, and the most generally efficient, technique for calculating the matrix exponential is a combination of "scaling and squaring" with a Padé approximation. For alternative scaling and squaring methods based on Taylor series, we present two modifications that provably reduce the number of matrix multiplications needed to satisfy the required accuracy bounds, and a detailed comparison of the several algorithmic variants is provided.
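In its most basic form, scaling and squaring with a Taylor series reads as follows; the degree m and threshold theta here are placeholders, whereas the algorithms discussed on this page select (m, s) from rigorous error bounds and evaluate the polynomial with far fewer products than plain Horner:

import numpy as np
from math import ceil, log2
from scipy.linalg import expm

def expm_taylor(A, m=16, theta=1.0):
    """Basic scaling-and-squaring Taylor method: scale so that
    ||A / 2^s|| <= theta, evaluate the degree-m Taylor polynomial
    of exp (here by plain Horner), then square s times."""
    n = A.shape[0]
    normA = np.linalg.norm(A, 1)
    s = max(0, ceil(log2(normA / theta))) if normA > theta else 0
    X = A / 2**s
    E = np.eye(n)
    for k in range(m, 0, -1):               # Horner: sum_k X^k / k!
        E = E @ X / k + np.eye(n)
    for _ in range(s):                       # e^A = (e^{A/2^s})^{2^s}
        E = E @ E
    return E

A = np.random.default_rng(6).standard_normal((8, 8))
err = np.linalg.norm(expm_taylor(A) - expm(A)) / np.linalg.norm(expm(A))
print(err)

The modifications described in the abstract concern exactly the two tunable parts of this skeleton: how many products the polynomial evaluation costs, and how the pair (m, s) is chosen.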
Article
Full-text available
This work presents a new algorithm for matrix exponential computation that significantly simplifies a Taylor scaling and squaring algorithm presented previously by the authors, preserving accuracy. A Matlab version of the new simplified algorithm has been compared with the original algorithm, providing similar results in terms of accuracy, but reducing processing time. It has also been compared with two state-of-the-art implementations based on Padé approximations, one commercial and the other implemented in Matlab, getting better accuracy and processing time results in the majority of cases.
Article
Full-text available
This working note describes how to install, test, and time version 3.0 of LAPACK, a linear algebra package for high-performance computers. Separate instructions are provided for the Unix and non-Unix versions of the test package. Further details are also given on the design of the test and timing programs.
Article
Full-text available
This work gives a new formula for the forward relative error of matrix exponential Taylor approximation and proposes new bounds for it depending on the matrix size and the Taylor approximation order, providing a new efficient scaling and squaring Taylor algorithm for the matrix exponential. A Matlab version of the new algorithm is provided and compared with Padé state-of-the-art algorithms obtaining higher accuracy in the majority of tests at similar or even lower cost.
Article
Full-text available
In principle, the exponential of a matrix could be computed in many ways. Methods involving approximation theory, differential equations, the matrix eigenvalues, and the matrix characteristic polynomial have been proposed. In practice, consideration of computational stability and efficiency indicates that some of the methods are preferable to others, but that none are completely satisfactory. Most of this paper was originally published in 1978. An update, with a separate bibliography, describes a few recent developments.
Article
A thorough and elegant treatment of the theory of matrix functions and numerical methods for computing them, including an overview of applications, new and unpublished research results, and improved algorithms. Key features include a detailed treatment of the matrix sign function and matrix roots; a development of the theory of conditioning and properties of the Fréchet derivative; Schur decomposition; block Parlett recurrence; a thorough analysis of the accuracy, stability, and computational cost of numerical methods; general results on convergence and stability of matrix iterations; and a chapter devoted to the f(A)b problem. Ideal for advanced courses and for self-study, its broad content, references and appendix also make this book a convenient general reference. Contains an extensive collection of problems with solutions and MATLAB implementations of key algorithms.