
Boosting the computation of the matrix exponential

J. Sastre^a,*, J. Ibáñez^b, E. Defez^c

Universitat Politècnica de València, Spain

^a Instituto de Telecomunicaciones y Aplicaciones Multimedia
^b Instituto de Instrumentación para Imagen Molecular
^c Instituto de Matemática Multidisciplinar

Abstract

This paper presents new Taylor algorithms for the computation of the matrix exponential, based on recent matrix polynomial evaluation methods that are more efficient than the well-known Paterson–Stockmeyer method. The cost of the proposed algorithms is lower than that of previous algorithms based on Taylor approximations. Tests comparing the MATLAB implementations of the new algorithms with a state-of-the-art Padé algorithm for the computation of the matrix exponential show higher accuracy and better cost performance.

Keywords:

Matrix exponential, scaling and squaring, Taylor series, efficient matrix polynomial evaluation.

1. Introduction

The computation of matrix functions has received remarkable attention in the last decades because of its numerous applications in science and engineering [1]. Among all matrix functions, the matrix exponential has been the most studied, and a large number of methods for its computation have been proposed [2, 1].

In 2009 the authors submitted their first work with Taylor based algorithms for computing the matrix exponential [3]. Until then, Padé approximants for the matrix exponential were preferred to Taylor approximations because Padé algorithms were more efficient than the existing Taylor algorithms for similar accuracy [1]. Applying and improving the algorithms for Padé approximants from [15] to Taylor approximations, the Taylor algorithms from [3] showed to be generally more accurate than the Padé algorithm from [15] in tests, at a slightly higher cost.

*Corresponding author: jsastrem@upv.es. This work has been supported by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF), grant TIN2014-59294-P.

In [4] the authors presented a scaling and squaring Taylor algorithm for computing the matrix exponential based on an improved mixed backward and forward error analysis. It was more accurate than the state-of-the-art Padé algorithm from [14] in the majority of tests, with a slightly higher cost. Subsequently, [5] provided a formula for the forward relative error of the matrix exponential Taylor approximation and proposed to increase the allowed error bounds depending on the matrix size and the Taylor approximation order. This algorithm reduced the computational cost in exchange for a small impact on accuracy. The method proposed in [6] simplified the algorithm of [4], preserving accuracy, and showed to be more accurate than the Padé algorithm from [14] in the majority of tests, being also more efficient in some cases. Finally, [8] used Taylor approximations combined with spline techniques to increase accuracy, also increasing the cost. In this work, we present new Taylor algorithms based on the efficient matrix polynomial evaluation methods from [9], which significantly increase the efficiency of the previous Taylor methods. We will show that the new algorithms are generally both more accurate and more efficient than the state-of-the-art Padé algorithm from [14].

Throughout this paper C^{n×n} denotes the set of complex matrices of size n×n, I denotes the identity matrix for this set, ρ(A) is the spectral radius of matrix A, and N denotes the set of positive integers. The matrix norm ||·|| denotes any subordinate matrix norm; in particular ||·||_1 is the 1-norm. The symbols ⌈·⌉ and ⌊·⌋ denote the smallest following and the largest previous integer, respectively. The cost of the Taylor algorithms will be given in terms of the number of evaluations of matrix products, denoting the cost of one matrix product by M. Note that the multiplication by the matrix inverse in Padé approximations is calculated as the solution of a multiple right-hand side linear system. The cost of the solution of a multiple right-hand side linear system AX = B, where A and B are n×n matrices, will be denoted by D. Taking into account that, see [10, App. C],
$$D \approx \tfrac{4}{3}M, \qquad (1)$$
the cost of evaluating rational approximations will also be given in terms of M. All the given algorithms are intended for IEEE double precision arithmetic. Their extension to different precision arithmetics is straightforward.

This paper is organized as follows: Section 2 presents a general scaling and squaring Taylor algorithm. Section 3 introduces efficient evaluation formulas for the Taylor matrix polynomial approximation of the matrix exponential based on [9]. Section 4 presents the scaling and squaring error analysis. The new algorithm is given in Section 5. Section 6 shows numerical results and Section 7 gives some conclusions. The next theorem, from [5], will be used in Section 4 to bound the norm of matrix power series.

Theorem 1. Let $h_l(x)=\sum_{k\geq l} b_k x^k$ be a power series with radius of convergence $R$, and let $\tilde h_l(x)=\sum_{k\geq l}|b_k| x^k$. For any matrix $A\in\mathbb{C}^{n\times n}$ with $\rho(A)<R$, if $a_k$ is an upper bound for $\|A^k\|$ ($\|A^k\|\leq a_k$), $p\in\mathbb{N}$, $1\leq p\leq l$, $p_0\in\mathbb{N}$ is the multiple of $p$ with $l\leq p_0\leq l+p-1$, and
$$\alpha_p=\max\{a_k^{1/k}:\; k=p,\, l,\, l+1,\, l+2,\ldots,\, p_0-1,\, p_0+1,\, p_0+2,\ldots,\, l+p-1\}, \qquad (2)$$
then $\|h_l(A)\|\leq \tilde h_l(\alpha_p)$.
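As an illustration of formula (2), the following minimal MATLAB sketch (ours, not part of the paper's software; the vector a is assumed to hold upper bounds a(k) ≥ ||A^k|| for the required indices) computes α_p:

function alpha = alpha_p(a, l, p)
% Sketch of (2): a(k) is an assumed upper bound for norm(A^k), defined at
% least for k = p and k = l, l+1, ..., l+p-1.
p0 = p*ceil(l/p);                % the multiple of p with l <= p0 <= l+p-1
ks = [p, setdiff(l:l+p-1, p0)];  % indices k appearing in (2)
alpha = max(a(ks).^(1./ks));     % alpha_p = max_k a(k)^(1/k)
end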

2. General Taylor algorithm

The Taylor approximation of order m of the matrix exponential of A ∈ C^{n×n}, denoted by T_m(A), is defined by the expression
$$T_m(A)=\sum_{k=0}^{m}\frac{A^k}{k!}. \qquad (3)$$
The scaling and squaring algorithms with Taylor approximation (3) are based on the approximation $e^A=(e^{2^{-s}A})^{2^s}\approx (T_m(2^{-s}A))^{2^s}$ [2], where the nonnegative integers m and s are chosen to achieve full machine accuracy at a minimum cost.
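The following minimal MATLAB sketch (ours; it is not the paper's expmpol and evaluates T_m term by term instead of using the efficient formulas of Section 3) illustrates the scaling and squaring idea:

function E = expm_taylor_ss(A, m, s)
% Naive sketch of e^A ~ (T_m(2^(-s)A))^(2^s): scale A, sum the Taylor
% series up to order m, and undo the scaling by s repeated squarings.
X = A/2^s;                 % scaled matrix 2^(-s)A
E = eye(size(A)); T = E;
for k = 1:m
    T = T*X/k;             % k-th Taylor term X^k/k!
    E = E + T;
end
for i = 1:s
    E = E*E;               % squaring phase
end
end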

A general scaling and squaring Taylor algorithm for computing the matrix exponential is presented in Algorithm 1, where m_M is the maximum allowed value of m.

Table 1: Values of q_k depending on the selection of m_M in [6].

 k          0  1  2  3  4   5   6   7   8   9
 m_k        1  2  4  6  9  12  16  20  25  30
 m_M = 20   1  2  2  3  3   4   4   4
 m_M = 25   1  2  2  3  3   4   4   5   5
 m_M = 30   1  2  2  3  3   4   4   5   5   5

Algorithm 1 General scaling and squaring Taylor algorithm for computing B = e^A, where A ∈ C^{n×n} and m_M is the maximum approximation order allowed.

1: Preprocessing of matrix A.
2: Choose m_k ≤ m_M and an adequate scaling parameter s ∈ N ∪ {0} for the Taylor approximation with scaling.
3: Compute the matrix polynomial B = T_{m_k}(A/2^s).
4: for i = 1:s do
5:     B = B^2
6: end for
7: Postprocessing of matrix B.

In this paper the evaluation of the Taylor matrix polynomial of Step 3 is improved. The preprocessing and postprocessing steps (1 and 7) are based on applying transformations to reduce the norm of matrix A, see [1], and will not be discussed in this paper. In Step 2, the optimal order of the Taylor approximation m_k ≤ m_M and the scaling parameter s will be chosen, improving the algorithm from [6].

In [6] the matrix polynomial T_m(2^{-s}A) was evaluated using the Paterson–Stockmeyer method, see evaluation formula (7) of [6] and [11]. The optimal Taylor orders m for that method are m_k ∈ {1, 2, 4, 6, 9, 12, 16, 20, 25, 30, ...}, k = 0, 1, ..., respectively, where the matrix powers A^2, A^3, ..., A^q are evaluated and stored to be reused in all the computations. Table 1, see [6, Table 1], shows some optimal values of q, denoted by q_k, used in [6] for orders m_k, k = 0, 1, 2, ..., M, and m_M = 20, 25 or 30. In this work T_m(2^{-s}A) will be computed using new evaluation methods based on [9], which are more efficient than the Paterson–Stockmeyer method.
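For reference, a minimal MATLAB sketch (ours, for illustration only) of the Paterson–Stockmeyer evaluation of T_m(A) with stored powers A^2, ..., A^q is given below; a careful implementation such as the one in [6] saves one further product when q divides m, which yields the costs k of Table 1.

function P = taylor_ps(A, m, q)
% Sketch of Paterson-Stockmeyer evaluation of T_m(A) = sum_{k=0}^m A^k/k!
% using the stored powers A^2,...,A^q and a Horner scheme in A^q.
n = size(A,1);
c = 1./factorial(0:m);          % Taylor coefficients 1/k!
Ap = cell(q,1); Ap{1} = A;
for j = 2:q
    Ap{j} = Ap{j-1}*A;          % q-1 matrix products
end
v = floor(m/q);
P = zeros(n);
for i = v:-1:0                  % Horner scheme in A^q
    B = c(i*q+1)*eye(n);        % constant term of the i-th chunk
    for j = 1:min(q-1, m-i*q)
        B = B + c(i*q+j+1)*Ap{j};
    end
    if i == v
        P = B;
    else
        P = P*Ap{q} + B;        % one matrix product per step
    end
end
end

For example, taylor_ps(A, 16, 4) stores A^2, A^3, A^4 and uses 7 products in this naive form (6 with the refinement mentioned above, matching Table 1).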

Finally, after the evaluation of T_m(2^{-s}A), the s repeated squarings are applied in Steps 4–6. The postprocessing is applied in Step 7 to obtain the matrix exponential approximation of the original matrix A. The computational cost of [6, Alg. 1] in terms of matrix products is
$$\mathrm{Cost}(m_k, s)=k+s. \qquad (4)$$
Note that if s = 0 then Cost(m_k, 0) = k. Using matrix polynomial evaluation methods based on [9], the costs from (4) will be reduced.

3. Efficient evaluation of the matrix exponential Taylor matrix approximation

In this section, formulas for evaluating Taylor based approximations of orders m = 8, 15, 21, 24 and 30 are given based on the results from [9]. MATLAB R2017a Symbolic Math Toolbox with 200 decimal digit arithmetic was used in all the calculations. Note that for T_m(A) with orders m = 1, 2, 4 the same evaluation formulas as in [6] will be used, i.e.
$$T_1(A)=A+I, \qquad (5)$$
$$T_2(A)=A^2/2+A+I, \qquad (6)$$
$$T_4(A)=((A^2/4+A)/3+I)A^2/2+A+I. \qquad (7)$$

3.1. Evaluation of T_8(A)

Following [9, Ex. 3.1] we can evaluate T_8(A) with the evaluation formulas
$$y_{02}(A)=A^2(c_1A^2+c_2A), \qquad (8)$$
$$T_8(A)=(y_{02}(A)+c_3A^2+c_4A)(y_{02}(A)+c_5A^2)+c_6y_{02}(A)+A^2/2+A+I, \qquad (9)$$
where the coefficients c_i, i = 1, 2, ..., 6, numbered correlatively, are given in IEEE double precision arithmetic in Table 2, see [9, Table 4], A^2 is computed once and stored to be reused in all the computations, and T_8(A) can be evaluated with a cost of 3M. Note that with a cost of 3M, taking s = 0 in (4) and using Table 1, the maximum Taylor order available with the Paterson–Stockmeyer method is m = 6.


Table 2: Coefficients for computing the matrix exponential Taylor approximation of order m = 8 using (8) and (9), see [9, Table 4].

 c_1   4.980119205559973×10^{-3}
 c_2   1.992047682223989×10^{-2}
 c_3   7.665265321119147×10^{-2}
 c_4   8.765009801785554×10^{-1}
 c_5   1.225521150112075×10^{-1}
 c_6   2.974307204847627×10^{0}
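As an illustration, a direct MATLAB transcription (ours) of (8) and (9) with the Table 2 coefficients, assuming a square matrix A is already defined, costs exactly three matrix products:

c   = [4.980119205559973e-3  1.992047682223989e-2  7.665265321119147e-2 ...
       8.765009801785554e-1  1.225521150112075e-1  2.974307204847627e0];
A2  = A*A;                                        % product 1
y02 = A2*(c(1)*A2 + c(2)*A);                      % product 2, formula (8)
T8  = (y02 + c(3)*A2 + c(4)*A)*(y02 + c(5)*A2) ...
      + c(6)*y02 + A2/2 + A + eye(size(A));       % product 3, formula (9)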

3.2. Evaluation of a Taylor based approximation of order 15

Following [9, Ex. 5.1] we can evaluate a Taylor based approximation of order m = 15 of the matrix exponential with the evaluation formulas
$$y_{02}(A)=A^2(c_1A^2+c_2A), \qquad (10)$$
$$y_{12}(A)=(y_{02}(A)+c_3A^2+c_4A)(y_{02}(A)+c_5A^2)+c_6y_{02}(A)+c_7A^2, \qquad (11)$$
$$y_{22}(A)=(y_{12}(A)+c_8A^2+c_9A)(y_{12}(A)+c_{10}y_{02}(A)+c_{11}A)+c_{12}y_{12}(A)+c_{13}y_{02}(A)+c_{14}A^2+c_{15}A+c_{16}I, \qquad (12)$$
where, analogously to [9, Ex. 5.1], A^2 is computed once and stored to be reused in all the computations, the degree of the polynomial y_{22}(A) is 16, and it can be evaluated with a cost of 4M. Note that with the Paterson–Stockmeyer method and cost 4M, using (4) and Table 1, the maximum available order for T_m(A) is m = 9.

Similarly to [9, Ex. 5.1], if we rewrite y_{22}(A) as a matrix polynomial of degree 16 and equate the coefficients of the matrix powers A^i, i = 0, 1, ..., 15, of the rewritten y_{22}(A) to the corresponding coefficients of T_{15}(A), we obtain a nonlinear system of 16 equations with 16 unknown coefficients c_i, i = 1, 2, ..., 16.

Note that the MATLAB Symbolic Math Toolbox solve function could not give the general solution of the expressions (57)–(59) proposed in [9, Ex. 5.1], returning one numerical approximation of the coefficients instead. Moreover, we had to use a special ordering of the coefficients so that solve obtained this numerical approximation, see [12, Sec. 4.2]. In the following we describe a method to obtain more solutions for the coefficients c_i. The stability recommendations from [9, p. 243] propose to select the solution giving the smallest error in the system of equations when substituting the coefficients c_i in IEEE double precision arithmetic.

y_{12}(A) from (11) can be written as a polynomial of degree 8,
$$y_{12}(A)=\sum_{i=2}^{8}a_iA^i. \qquad (13)$$
Taking into account the four existing solutions from [9, Ex. 1] for evaluating polynomials of degree 8 with cost 3M, see (25) and (26) from [9], it follows that
$$y_{02}(A)=\pm A^2\left(\sqrt{a_8}\,A^2+\frac{a_7}{2\sqrt{a_8}}\,A\right). \qquad (14)$$
If we write y_{22}(A) as a polynomial of degree 16 and equate the coefficients of the matrix powers A^i to the Taylor coefficients 1/i!, for i = 0, 1, ..., 15, a system of 16 equations in the 16 variables a_i, i = 2, 3, ..., 8, and c_i, i = 8, 9, ..., 16, arises. This system can be solved using the solve function from the MATLAB Symbolic Math Toolbox, obtaining the coefficients a_i, i = 2, 3, ..., 8, and c_i, i = 8, 9, ..., 16. Then, [9, Ex. 3.1] gives four solutions for the coefficients c_i, i = 1, 2, ..., 7, from (10) and (11) that allow to evaluate (13) with those coefficients a_i. From now on, we will call these different solutions of coefficients c_i based on the coefficients a_i nested solutions. From the four solutions we selected the one giving the smallest error in the original system of equations when rounding the coefficients to IEEE double precision arithmetic, see Table 3. For that solution, if we write y_{22}(A) as a polynomial of degree 16, the coefficient of A^{16}, denoted by b_{16}, and its relative error with respect to the corresponding Taylor polynomial coefficient are
$$b_{16}=2.608368698098254\times10^{-14}, \qquad |b_{16}-1/16!|\,16!\approx0.454, \qquad (15)$$
where the coefficient b_{16} is given in IEEE double precision arithmetic. Note that there is another real solution that gives |b_{16} − 1/16!| 16! ≈ 2.510 > 0.454, and therefore we discarded it. Note also that the relative error in (15) is the same as in [9, Ex. 5.1], see (60) from [9].

The fact that
$$y_{22}(A)=T_{15}(A)+b_{16}A^{16} \qquad (16)$$
will have interesting implications in the error analysis of the algorithm. The evaluation formulas (11) and (12) are slightly different from (58) and (59) of [9, Ex. 5.1], so that the coefficient of the last matrix A in (12) is 1, instead of c_2 from [9, Table 9].


Table 3: Coefficients of y_{02}, y_{12}, y_{22} from (10)-(12) for computing a Taylor based approximation of the matrix exponential of order m = 15.

 c_1    4.018761610201036×10^{-4}     c_9     2.224209172496374×10^{0}
 c_2    2.945531440279683×10^{-3}     c_10   -5.792361707073261×10^{0}
 c_3   -8.709066576837676×10^{-3}     c_11   -4.130276365929783×10^{-2}
 c_4    4.017568440673568×10^{-1}     c_12    1.040801735231354×10^{1}
 c_5    3.230762888122312×10^{-2}     c_13   -6.331712455883370×10^{1}
 c_6    5.768988513026145×10^{0}      c_14    3.484665863364574×10^{-1}
 c_7    2.338576034271299×10^{-2}     c_15    1
 c_8    2.381070373870987×10^{-1}     c_16    1
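Analogously, a direct MATLAB transcription (ours) of (10)-(12) with the Table 3 coefficients evaluates y_{22}(A) = T_{15}(A) + b_{16}A^{16}, see (16), with four matrix products:

c = [ 4.018761610201036e-4   2.945531440279683e-3  -8.709066576837676e-3 ...
      4.017568440673568e-1   3.230762888122312e-2   5.768988513026145e0  ...
      2.338576034271299e-2   2.381070373870987e-1   2.224209172496374e0  ...
     -5.792361707073261e0   -4.130276365929783e-2   1.040801735231354e1  ...
     -6.331712455883370e1    3.484665863364574e-1   1  1];
A2  = A*A;                                                   % product 1
y02 = A2*(c(1)*A2 + c(2)*A);                                 % product 2, (10)
y12 = (y02 + c(3)*A2 + c(4)*A)*(y02 + c(5)*A2) ...
      + c(6)*y02 + c(7)*A2;                                  % product 3, (11)
y22 = (y12 + c(8)*A2 + c(9)*A)*(y12 + c(10)*y02 + c(11)*A) ...
      + c(12)*y12 + c(13)*y02 + c(14)*A2 + c(15)*A ...
      + c(16)*eye(size(A));                                  % product 4, (12)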

3.3. Evaluation of a Taylor based approximation of order 21

In a similar way to [9, Ex. 5.1], we developed the following formulas to evaluate a Taylor based approximation of order m = 21 of the matrix exponential:
$$y_{03}(A)=A^3(c_1A^3+c_2A^2+c_3A), \qquad (17)$$
$$y_{13}(A)=(y_{03}(A)+c_4A^3+c_5A^2+c_6A)(y_{03}(A)+c_7A^3+c_8A^2)+c_9y_{03}(A)+c_{10}A^3+c_{11}A^2, \qquad (18)$$
$$y_{23}(A)=(y_{13}(A)+c_{12}A^3+c_{13}A^2+c_{14}A)(y_{13}(A)+c_{15}y_{03}(A)+c_{16}A)+c_{17}y_{13}(A)+c_{18}y_{03}(A)+c_{19}A^3+c_{20}A^2+A+I, \qquad (19)$$
where A^2 and A^3 are computed once and reused in all the computations, the degree of the polynomial y_{23}(A) is 24, and it can be evaluated with a cost of 5M. Note that with cost 5M the maximum available order for T_m(A) using the Paterson–Stockmeyer method is m = 12, see (4) and Table 1.

Proceeding analogously to Section 3.2, writing
$$y_{13}(A)=\sum_{i=2}^{12}a_iA^i, \qquad (20)$$
and taking into account the solutions of (38) from [9], it follows that
$$y_{03}(A)=\pm A^3\left(\sqrt{a_{12}}\,A^3+\frac{a_{11}}{2\sqrt{a_{12}}}\,A^2+\frac{4a_{10}a_{12}-a_{11}^2}{8a_{12}^{3/2}}\,A\right). \qquad (21)$$
Then, similarly to Section 3.2, we use the function solve from the MATLAB Symbolic Math Toolbox to obtain a_i, i = 2, 3, ..., 12, and c_i, i = 12, 13, ..., 20, so that y_{23}(x) = T_{21}(x) + O(x^{22}) for scalar x.


Table 4: Coefficients of y_{03}, y_{13}, y_{23} from (17)-(19) for computing a Taylor based matrix exponential approximation of order m = 21.

 c_1    1.161658834444880×10^{-6}     c_11    1.392249143769798×10^{-1}
 c_2    4.500852739573010×10^{-6}     c_12   -2.269101241269351×10^{-3}
 c_3    5.374708803114821×10^{-5}     c_13   -5.394098846866402×10^{-2}
 c_4    2.005403977292901×10^{-3}     c_14    3.112216227982407×10^{-1}
 c_5    6.974348269544424×10^{-2}     c_15    9.343851261938047×10^{0}
 c_6    9.418613214806352×10^{-1}     c_16    6.865706355662834×10^{-1}
 c_7    2.852960512714315×10^{-3}     c_17    3.233370163085380×10^{0}
 c_8   -7.544837153586671×10^{-3}     c_18   -5.726379787260966×10^{0}
 c_9    1.829773504500424×10^{0}      c_19   -1.413550099309667×10^{-2}
 c_10   3.151382711608315×10^{-2}     c_20   -1.638413114712016×10^{-1}

Then, we use the solutions from [9, pp. 237–240] to evaluate (20) with (17) and (18), obtaining different nested solutions for the coefficients c_i, i = 1, 2, ..., 11. Again, according to the stability recommendations from [9, p. 243], we selected the solution given in Table 4. For that solution, if we write y_{23}(A) from (19) as a polynomial of degree 24, the coefficients of A^{22}, A^{23} and A^{24}, denoted by b_{22}, b_{23} and b_{24}, respectively, and their relative errors with respect to the corresponding Taylor polynomial coefficients are
$$b_{22}=5.010366348377648\times10^{-22}, \qquad |b_{22}-1/22!|\,22!\approx0.437, \qquad (22)$$
$$b_{23}=2.822218236752230\times10^{-23}, \qquad |b_{23}-1/23!|\,23!\approx0.270, \qquad (23)$$
$$b_{24}=1.821018669767511\times10^{-24}, \qquad |b_{24}-1/24!|\,24!\approx0.130, \qquad (24)$$
where the coefficients b_{22}, b_{23} and b_{24} are given in IEEE double precision arithmetic. Again, the fact that
$$y_{23}(A)=T_{21}(A)+b_{22}A^{22}+b_{23}A^{23}+b_{24}A^{24} \qquad (26)$$
will have interesting implications in the error analysis of the algorithm.

3.4. Evaluation of Taylor approximations T_{24}(A) and T_{30}(A)

The coefficients for the evaluation of T_{24}(A) corresponding to
$$y_{04}(A)=A^4(c_1A^4+c_2A^3+c_3A^2+c_4A), \qquad (27)$$
$$y_{14}(A)=(y_{04}(A)+c_5A^4+c_6A^3+c_7A^2+c_8A)(y_{04}(A)+c_9A^4+c_{10}A^3+c_{11}A^2)+c_{12}y_{04}(A)+c_{13}A^4+c_{14}A^3+c_{15}A^2+c_{16}A, \qquad (28)$$
$$T_{24}(A)=y_{14}(A)(y_{04}(A)+c_{17}A^4+c_{18}A^3+c_{19}A^2+c_{20}A)+c_{21}A^4+c_{22}A^3+c_{23}A^2+A+I, \qquad (29)$$
can be obtained in a similar way to Sections 3.2 and 3.3. For the values of the coefficients in IEEE double precision arithmetic see the case m = 24 of the nested function EFFEVPOL in the MATLAB implementation of the proposed algorithm expmpol, available at
http://personales.upv.es/~jorsasma/software/expmpol.m
Note that using (27)–(29), T_{24}(A) can be evaluated with a cost of 6M. For 6M the maximum available order with the Paterson–Stockmeyer method is 16, see Table 1.

Similarly, the coefficients for the evaluation of T_{30}(A) corresponding to
$$y_{05}(A)=A^5(c_1A^5+c_2A^4+c_3A^3+c_4A^2+c_5A), \qquad (30)$$
$$y_{15}(A)=(y_{05}(A)+c_6A^5+c_7A^4+c_8A^3+c_9A^2+c_{10}A)\times(y_{05}(A)+c_{11}A^5+c_{12}A^4+c_{13}A^3+c_{14}A^2)+c_{15}y_{05}(A)+c_{16}A^5+c_{17}A^4+c_{18}A^3+c_{19}A^2+c_{20}A, \qquad (31)$$
$$T_{30}(A)=y_{15}(A)(y_{05}(A)+c_{21}A^5+c_{22}A^4+c_{23}A^3+c_{24}A^2+c_{25}A)+c_{26}A^5+c_{27}A^4+c_{28}A^3+c_{29}A^2+A+I, \qquad (32)$$
can be obtained. For the coefficient values see the case m = 30 of the nested function EFFEVPOL in expmpol.
Note that using (30)–(32), T_{30}(A) can be evaluated with a cost of 7M, whereas for that cost the maximum available order with the Paterson–Stockmeyer method is 20, see Table 1.

Finally, note that in the evaluation of the Padé approximants from [14] many of the products of the numerator can be reused for the denominator. In general, this is not so advantageous if we compute the numerator and denominator with the evaluation algorithms from [9]. For instance, to evaluate r_{88} using the method of Section 3.1 for both the numerator and denominator polynomials of degree 8, we need one matrix product to evaluate A^2, two more products for the numerator and two more for the denominator. Therefore, the cost of evaluating r_{88} is 5M + 1D, i.e. about 6.33M using (1), and it provides an approximation of order 16. Note that with a lower cost of 5M, y_{23} from Section 3.3 gives an approximation of order 21.


4. Error analysis

For completeness of the exposition we summarize some results from the error analysis of [6]. Denoting the remainder of the Taylor series by $R_m(A)=\sum_{k\geq m+1}A^k/k!$, for a scaled matrix 2^{-s}A, s ∈ N ∪ {0}, we can write
$$\left[T_m(2^{-s}A)\right]^{2^s}=e^A\left[I+g_{m+1}(2^{-s}A)\right]^{2^s}=e^{A+2^sh_{m+1}(2^{-s}A)}, \qquad (33)$$
$$g_{m+1}(2^{-s}A)=-e^{-2^{-s}A}R_m(2^{-s}A), \qquad h_{m+1}(2^{-s}A)=\log\left(I+g_{m+1}(2^{-s}A)\right), \qquad (34)$$
where log denotes the principal logarithm, h_{m+1}(X) is defined in the set $\Omega_m=\{X\in\mathbb{C}^{n\times n}:\rho(e^{-X}T_m(X)-I)<1\}$, and both g_{m+1}(2^{-s}A) and h_{m+1}(2^{-s}A) are holomorphic functions of A in Ω_m and therefore commute with A. As shown in [4], ΔA = h_{m+1}(2^{-s}A) is the backward absolute error, and ΔE = g_{m+1}(2^{-s}A) is the forward relative error, of the approximation of e^A by the Taylor series with scaling and squaring in exact arithmetic.

Using the scalar Taylor series in (34) one gets
$$g_{m+1}(x)=\sum_{k\geq m+1}b_k^{(m)}x^k, \qquad h_{m+1}(x)=\sum_{k\geq1}\frac{(-1)^{k+1}\left(g_{m+1}(x)\right)^k}{k}=\sum_{k\geq m+1}c_k^{(m)}x^k, \qquad (35)$$
where $b_k^{(m)}$ and $c_k^{(m)}$ depend on the order m. Moreover, $b_k^{(m)}=c_k^{(m)}$ for k = m+1, m+2, ..., 2m+1, and if ||h_{m+1}(2^{-s}A)|| ≪ 1 or ||g_{m+1}(2^{-s}A)|| ≪ 1, then
$$\Delta A=h_{m+1}(2^{-s}A)\approx g_{m+1}(2^{-s}A)=\Delta E, \qquad (36)$$
see [5].

Choosing s so that
$$\left\|h_{m+1}(2^{-s}A)\right\|\leq\max\{1,\|2^{-s}A\|\}\,u, \qquad (37)$$
where u = 2^{-53} is the unit roundoff in IEEE double precision arithmetic, then:

• If 2^{-s}||A|| ≥ 1, then ||ΔA|| ≤ ||A||u and, using (33), one gets (T_m(2^{-s}A))^{2^s} = e^{A+ΔA} ≈ e^A.

• If 2^{-s}||A|| < 1, using (33)–(36) and the Taylor series, if (37) holds one gets
$$\left\|R_m(2^{-s}A)\right\|=\left\|e^{2^{-s}A}g_{m+1}(2^{-s}A)\right\|\approx\left\|T_m(2^{-s}A)\,h_{m+1}(2^{-s}A)\right\|\leq\left\|T_m(2^{-s}A)\right\|u. \qquad (38)$$
Hence, by (38), in IEEE double precision arithmetic one gets that T_m(2^{-s}A) + R_m(2^{-s}A) ≈ T_m(2^{-s}A).

Using the MATLAB Symbolic Math Toolbox, high precision arithmetic, 200 series terms and a zero finder, we obtained the maximal values Θ_m of Θ = ||2^{-s}A||, see Table 5, such that, using the notation of Theorem 1,
$$\|h_{m+1}(2^{-s}A)\|\leq\tilde h_{m+1}(\Theta)=\sum_{k\geq m+1}|c_k^{(m)}|\,\Theta^k\leq\max\{1,\Theta\}\,u. \qquad (39)$$
Hence, if ||2^{-s}A|| ≤ Θ_m then (37) holds. For the orders m = 20, 25, 30, where Θ_m > 1, by (14) of [6] the following bound holds:
$$\|h_{m+1}(2^{-s}A)\|\leq\tilde h_{m+1}(\|2^{-s}A\|)=\tilde h_{m+1}(\Theta)\leq\Theta u, \qquad 0\leq\Theta\leq\Theta_m. \qquad (40)$$

The previous analysis is valid for the orders m_k used in [6], and also for the new evaluation formulas for T_8(A), T_{24}(A) and T_{30}(A) from Section 3. For y_{22}(A) from (16) it is easy to show that all the previous error analysis remains valid substituting R_m(2^{-s}A) in (34) by
$$\tilde R_{15+}(2^{-s}A)=\exp(2^{-s}A)-y_{22}(2^{-s}A)=R_{15}(2^{-s}A)-b_{16}A^{16}. \qquad (41)$$
Similarly, for y_{23}(A) from (26) all the previous error analysis is also valid substituting R_m(2^{-s}A) in (34) by
$$\tilde R_{21+}(2^{-s}A)=\exp(2^{-s}A)-y_{23}(2^{-s}A)=R_{21}(2^{-s}A)-\sum_{i=22}^{24}b_iA^i. \qquad (42)$$

Therefore, the new values Θ_{15+} and Θ_{21+}, where the suffix "+" has been added, are obtained in the same way as the other values in [6]. The corresponding values are given in Table 5. Comparing them to the values obtained for T_{15}(A) and T_{21}(A), i.e. Θ_{15} ≈ 0.658 and Θ_{21} ≈ 1.624, we see that both are lower than the corresponding Θ_{15+} and Θ_{21+}. That is the reason for adding the suffix "+" in m = 15+ and m = 21+, a notation that will be used from now on.


Table 5: Maximal values Θ_m = ||2^{-s}A|| such that h̃_{m+1}(Θ_m) ≤ max{1, Θ_m}u, coefficient ratios c^{(m)}_{m+1}/c^{(m)}_{m+2} and values u/c^{(m)}_{m+2}. Note that m = 15+ and m = 21+ correspond to the approximations (16) and (26), which are not exactly T_{15}(A) and T_{21}(A).

 m     Θ_m                          |c^{(m)}_{m+1}/c^{(m)}_{m+2}|   |u/c^{(m)}_{m+2}|
 1     1.490116111983279×10^{-8}    3/2      3.33×10^{-16}
 2     8.733457513635361×10^{-6}    4/3      8.88×10^{-16}
 4     1.678018844321752×10^{-3}    6/5      1.60×10^{-14}
 8     1.773082199654024×10^{-2}    10/9     4.48×10^{-11}
 15+   6.950240768069781×10^{-1}    1.15     5.87×10^{-3}
 21+   1.682715644786316            1.03     2.93×10^{5}
 24    2.219048869365090            26/25    1.79×10^{9}
 30    3.539666348743690            32/31    9.42×10^{17}

The higher values Θ_{15+} and Θ_{21+} imply a lower order or a lower scaling for certain matrices; for instance, this may be the case for matrices with Θ_{15} ≤ ||A|| ≤ Θ_{15+} or Θ_{21} ≤ ||A|| ≤ Θ_{21+}, see Section 5. We have checked that the bound (40) also holds for the approximation y_{23}(A) from (26) and for T_{24}(A). Hence, the optimal orders m to be used are in the set
$$m_k^*=\{1,\,2,\,4,\,8,\,15+,\,21+,\,24,\,30\}. \qquad (43)$$
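To illustrate how the Θ_m values of Table 5 and the order set (43) are used, the following deliberately simplified MATLAB sketch (ours) selects m and s using only ||A||_1; Algorithm 2 in Section 5 replaces this by the sharper two-term test (45) based on Theorem 1:

function [m, s] = order_scaling_theta(A)
% Simplified sketch: take the smallest order whose Theta_m covers ||A||_1
% (no scaling needed); otherwise take the largest order and scale down.
orders = [1 2 4 8 15 21 24 30];          % 15 and 21 stand for 15+ and 21+
Theta  = [1.490116111983279e-8 8.733457513635361e-6 1.678018844321752e-3 ...
          1.773082199654024e-2 6.950240768069781e-1 1.682715644786316 ...
          2.219048869365090    3.539666348743690];
nA = norm(A, 1);
k  = find(nA <= Theta, 1);               % first order with ||A|| <= Theta_m
if ~isempty(k)
    m = orders(k); s = 0;
else
    m = 30; s = ceil(log2(nA/Theta(end)));  % so that ||2^(-s)A|| <= Theta_30
end
end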

5. New Taylor algorithms

This section summarizes the results of the scaling algorithm from [6] and gives the new algorithms. The scaling algorithm from [6] can be applied directly to the Taylor approximations T_8(A), T_{24}(A), y_{22}(A) from (16) and y_{23}(A) from (26) of Section 3, using the new values of all the parameters from Table 5, see [6, Table 2]. For the values corresponding to m = 15+ and 21+, the remainders (41) and (42) were used in (34), and the coefficients c^{(m)}_{m+1} and c^{(m)}_{m+2} from (35) were obtained symbolically.

For all norms appearing in the scaling algorithm we will use the 1-norm, and m_M will be the maximum allowed Taylor order. In [6, Sec. 4.1] the minimum value of α_p from Theorem 1, denoted by α_min, was used to determine an initial scaling parameter. It was taken as, see (16) from [6],
$$\alpha_{\min}\approx\max\{a_{m+1}^{1/(m+1)},\,a_{m+2}^{1/(m+2)}\}, \qquad (44)$$
where a_{m+1} and a_{m+2} are 1-norm estimations of ||A^{m+1}|| and ||A^{m+2}|| obtained with the block 1-norm estimation algorithm of [13]. This algorithm has a cost O(n^2), negligible compared to the cost O(n^3) of a matrix product if the matrix A is big enough. Alternatively, we can also use bounds for ||A^{m+1}|| and ||A^{m+2}|| based on products of norms of the matrix powers computed at each step of the algorithm, similarly to [7, Alg. 3] and [8, Alg. 3]. Then, no norm estimations are used. For instance, if m = 4 and A^2 is known, then ||A^{m+1}|| = ||A^5|| ≤ ||A^2||^2 ||A||.
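For instance, once A^2 and A^3 are available, the bounds for ||A^22|| and ||A^23|| used for order 21+ can be obtained as in this short sketch (ours), which corresponds to Step 15 of Algorithm 2 below:

A2 = A*A;  A3 = A2*A;
a1 = norm(A,1); a2 = norm(A2,1); a3 = norm(A3,1);
a22 = min([a2^11, a3^6*a2^2, a3^7*a1]);   % bound for ||A^22||
a23 = min([a2^10*a3, a3^7*a2]);           % bound for ||A^23||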

In numerical tests with 128×128 matrices, see Section 6, we detected that the cost of the 1-norm estimation algorithm of [13] is not negligible. Therefore, a MATLAB implementation of Algorithm 1 with no norm estimations is given. Moreover, an implementation that reduces the number of norm estimations compared to that of [6, Alg. 2] is also given. Both options are available in the function expmpol(A,kmax,NormEst) from
http://personales.upv.es/~jorsasma/software/expmpol.m
This function can use m_M = 24 (kmax=6) or m_M = 30 (kmax=7), selecting whether the norm estimation algorithm from [13] is used (NormEst=1) or not (NormEst=0). expmpol does not consider preprocessing and postprocessing of matrix A, as indicated in Section 2.

Algorithm 2, called ms_selectNoNormEst, presents the proposed algorithm for the selection of the order m and the scaling parameter s to compute the matrix exponential using no norm estimations and m_M = 24. It corresponds to Step 2 of Algorithm 1. The selection of m and s is analogous to that of [6, Alg. 2], but uses bounds for ||A^{m+1}|| and ||A^{m+2}|| based on products of norms of matrix powers. The version of Algorithm 2 allowing both m_M = 24 and m_M = 30 is straightforward, and a MATLAB implementation allowing both values of m_M is given as the nested function ms_selectNoNormEst of expmpol.

In Steps 1–18, Algorithm 2 checks whether any of the Taylor optimal orders m_k = 1, 2, 4, 8, 15+, 21+ and m_M = 24 satisfies (37) without scaling, i.e. s = 0, using the bounds provided in Section 4 and considering the first two terms of the series of h_{m+1}(x) from (35), in the same way as (19) from [6]:
$$\frac{\|h_{m+1}(2^{-s}A)\|}{|c_{m+2}^{(m)}|}\leq\sum_{k=m+1}^{m+2}\left|\frac{c_k^{(m)}}{c_{m+2}^{(m)}}\right|\frac{a_k}{2^{sk}}\leq\frac{\max\{1,\|2^{-s}A\|\}\,u}{|c_{m+2}^{(m)}|}. \qquad (45)$$

As mentioned above, we obtain bounds for ||A^{m+1}|| and ||A^{m+2}|| using products of norms of previously computed matrix powers, see Steps 6, 8, etc.
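As a concrete instance of (45) (ours, with the m = 8 constants taken from Table 5), the null-scaling test for m = 8 reads in MATLAB:

% Two-term test (45) for m = 8 with s = 0; a9, a10 are the power-norm
% bounds of Steps 10-11 of Algorithm 2 below.
ratio8 = 10/9;  u_c10 = 4.48e-11;          % |c9/c10| and u/|c10| for m = 8
a1 = norm(A,1); a2 = norm(A*A,1);
a9 = a2^4*a1;  a10 = a2^5;                 % bounds for ||A^9||, ||A^10||
ok = ratio8*a9 + a10 <= max(1, a1)*u_c10;  % true: m = 8 needs no scaling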


If no value of m_k ≤ m_M satisfies (45) with s = 0, the algorithm computes α_min using only the bounds for ||A^{m+1}|| and ||A^{m+2}||, and determines the initial scaling parameter s in Steps 19–20. Then, if s > 0, the algorithm checks in Steps 22–26 whether the initial scaling parameter can be reduced, testing if (45) holds with s_red = s − 1. Finally, in Steps 27–32 we check whether m_{M−1} = 21+ can be used with the same scaling s, reducing the cost, similarly to [6, Alg. 2].

After selecting m and s, the matrix polynomial evaluation formulas from Section 3 are used. The implementation of this part is straightforward using the formulas and coefficients from Section 3; a MATLAB implementation can be seen in the function EFFEVPOL of expmpol. Finally, the s squarings of Steps 4–6 of Algorithm 1 are done. See the function expmpol for a complete implementation of the algorithm.

For simplicity, the special case of nilpotent matrices, i.e. A^i = 0 for some i = 2, 3, 4 or 5, is not included in Algorithm 2, but it is considered in the MATLAB implementation ms_selectNoNormEst in expmpol.


Algorithm 2 Algorithm ms_selectNoNormEst: given a matrix A ∈ C^{n×n}, this algorithm obtains the order m and the scaling parameter s for a Taylor approximation of order m ≤ m_M = 24 using no norm estimations of matrix powers.

Input: A ∈ C^{n×n}.
Output: Order m, scaling parameter s, and some matrix powers A^i, 1 ≤ i ≤ 3.

1: Set the Θ_m, c^{(m)}_{m+1}/c^{(m)}_{m+2} and u/c^{(m)}_{m+2} values from Table 5
2: a_1 = ||A||
3: s = 0  ▷ Null scaling tests
4: if a_1 < Θ_1, m = 1, quit  ▷ Test m = 1 with s = 0
5: A_2 = A^2, a_2 = ||A_2||
6: a_3 = a_2 a_1, a_4 = a_2^2, b = max{1, ||A||}·u/c^{(2)}_4
7: if c^{(2)}_3/c^{(2)}_4 · a_3 + a_4 ≤ b, m = 2, quit  ▷ Test m = 2 with s = 0
8: a_5 = a_2^2 a_1, a_6 = a_2^3, b = max{1, ||A||}·u/c^{(4)}_6
9: if c^{(4)}_5/c^{(4)}_6 · a_5 + a_6 ≤ b, m = 4, quit  ▷ Test m = 4 with s = 0
10: a_9 = a_2^4 a_1, a_10 = a_2^5, b = max{1, ||A||}·u/c^{(8)}_10
11: if c^{(8)}_9/c^{(8)}_10 · a_9 + a_10 ≤ b, m = 8, quit  ▷ Test m = 8 with s = 0
12: a_16 = a_2^8, a_17 = a_2^8 a_1, b = max{1, ||A||}·u/c^{(15)}_17
13: if c^{(15)}_16/c^{(15)}_17 · a_16 + a_17 ≤ b, m = 15, quit  ▷ Test m = 15+ with s = 0
14: A_3 = A_2 A, a_3 = ||A_3||
15: a_22 = min{a_2^11, a_3^6 a_2^2, a_3^7 a_1}, a_23 = min{a_2^10 a_3, a_3^7 a_2}, b = max{1, ||A||}·u/c^{(21)}_23
16: if c^{(21)}_22/c^{(21)}_23 · a_22 + a_23 ≤ b, m = 21, quit  ▷ Test m = 21+ with s = 0
17: a_25 = min{a_2^11 a_3, a_3^7 a_2^2, a_3^8 a_1}, a_26 = min{a_2^13, a_3^8 a_2}, b = max{1, ||A||}·u/c^{(24)}_26
18: if c^{(24)}_25/c^{(24)}_26 · a_25 + a_26 ≤ b, m = 24, quit  ▷ Test m = 24 with s = 0
19: α_min = max{a_25^{1/25}, a_26^{1/26}}
20: s = ⌈log_2(α_min/Θ_m)⌉  ▷ Calculate the initial scaling s
21: if s > 0 then
22:     s_red = s − 1  ▷ Check if (45) holds reducing the scaling
23:     b = max{1, ||A||/2^{s_red}}·u/c^{(24)}_26
24:     if c^{(24)}_25/c^{(24)}_26 · a_25/2^{25 s_red} + a_26/2^{26 s_red} ≤ b then
25:         s = s_red  ▷ (45) holds, then s = s_red
26:     end if
27:     b = max{1, ||A||/2^s}·u/c^{(21)}_23  ▷ Test if the scaled matrix allows using m_{M−1} = 21+
28:     if c^{(21)}_22/c^{(21)}_23 · a_22/2^{22s} + a_23/2^{23s} ≤ b then
29:         m = 21, quit
30:     else
31:         m = 24, quit  ▷ The scaled matrix does not allow using m_{M−1}
32:     end if
33: end if


The selection of m and s in Algorithm 2 is analogous to that of [6, Alg. 2] with the new optimal orders m from Section 3, and using no norm estimations for ||A^{m+1}|| and ||A^{m+2}||.

With respect to algorithms for the selection of m and s using norm estimations of matrix powers [13], it is easy to implement a direct adaptation of [6, Alg. 2] that selects the optimal m and s for the new values m_k^* from (43) using the corresponding parameter values from Table 5. We will denote this algorithm by expmpol_orig. However, [6, Alg. 2] computes norm estimations of matrix powers that can be avoided. The MATLAB function ms_selectNormEst has been developed to reduce the number of norm estimations of matrix powers with respect to expmpol_orig, see expmpol. It is similar to the function ms_selectNormEst from [7] for selecting m and s when computing the matrix cosine with a Taylor algorithm. The reduction of estimations increases the efficiency for small matrices, see Section 6. The selection of m and s in ms_selectNormEst is based essentially on testing whether the order m_k^* satisfies (45) using products of known norms of matrix powers and, if so, testing whether m_{k−1}^* also satisfies (45) using estimations of ||A^{m_{k−1}^*+1}|| and ||A^{m_{k−1}^*+2}||.
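These estimations can be obtained without forming the matrix powers explicitly. A minimal sketch (ours), assuming the documented function-handle interface of MATLAB's normest1, which implements the block 1-norm estimator of [13], is:

function est = normAm_est(A, p)
% Sketch: estimate ||A^p||_1 without forming A^p, by applying A (or A')
% p times to the block of vectors handled by normest1.
est = normest1(@afun);
    function Y = afun(flag, X)
        switch flag
            case 'dim',      Y = size(A, 1);
            case 'real',     Y = isreal(A);
            case 'notransp', Y = X; for i = 1:p, Y = A*Y;  end
            case 'transp',   Y = X; for i = 1:p, Y = A'*Y; end
        end
    end
end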

General steps of ms_selectNormEst with m_M = 24 are:

1. Set s = 0.

2. Test if one order m = m_k^* from (43), with k = 2, 3, 4, i.e. m ∈ {4, 8, 15+}, satisfies (45) with s = 0, using bounds of matrix powers based on products of ||A|| and ||A^2||. If so, then: if (45) holds with m = m_{k−1}^* using estimations of ||A^{m+1}|| and ||A^{m+2}||, then m = m_{k−1}^*, else m = m_k^*, and the algorithm quits. Note that, similarly to [6, Alg. 2], we will not estimate ||A^{m+2}|| if
$$\left|\frac{c_{m+1}^{(m)}}{c_{m+2}^{(m)}}\right|\frac{\|A^{m+1}\|}{2^{s(m+1)}}>\frac{\max\{1,\|2^{-s}A\|\}\,u}{|c_{m+2}^{(m)}|}, \qquad (46)$$
see (45).

3. If m = 15+ satisfies (45) with s = 0 using estimations of ||A^16|| and ||A^17||, then: if (45) holds with m = 8 using estimations of ||A^9|| and ||A^10||, then m = 8, else m = 15+, and the algorithm quits. Testing m = 15+ with norm estimations is necessary in this step because A^3 is computed in the following step, and A^3 is not used for the evaluation of the approximation y_{22}(A) of order m = 15+ from (10)–(12).

4. A^3 is computed.


5. If m = 21+ satisfies (45) with s = 0 using products of known norms of matrix powers, including the estimations from Step 3, then m = 21+ is selected and the algorithm quits. For instance, a bound for ||A^{m+1}|| is ||A^22|| ≤ ||A^16|| min{||A^3||^2, ||A^2||^3}, where ||A^16|| was estimated in Step 3.

6. If m = 24 satisfies (45) using products of known norms of matrix powers, then: if (45) holds with m = 21+ and estimations of ||A^22|| and ||A^23||, then m = 21+, else m = 24, and the algorithm quits.

7. If m = 24 satisfies (45) using estimations of ||A^25|| and ||A^26||, then: if (45) holds with m = 21+ and estimations of ||A^22|| and ||A^23||, then m = 21+, else m = 24, and the algorithm quits.

8. The rest is similar to Steps 22–32 of Algorithm 2, combining norm estimations of matrix powers and bounds based on products of known norms of matrix powers.

The extension of the previous algorithm to m_M = 30 is straightforward, and a complete MATLAB implementation can be seen in the nested function ms_selectNormEst of expmpol, allowing both m_M = 24 and m_M = 30.

6. Numerical experiments and conclusions

In this section we compare a MATLAB implementation of the new algorithm, expmpol, available at
http://personales.upv.es/~jorsasma/software/expmpol.m
with the following three legacy MATLAB implementations for the computation of the matrix exponential:

• expmspl: MATLAB implementation based on matrix splines from [8], available at http://personales.upv.es/~jorsasma/software/expmspl.m. This function uses Taylor approximations with a fixed maximum order m_M = 30.

• exptayns: MATLAB implementation of exptaynsv3 from [6], available at http://personales.upv.es/~jorsasma/software/exptaynsv3.m. For this function the argument denoted kmax determines the maximum Taylor approximation order used.


• expm_new: MATLAB implementation based on Padé approximation from [14]. For this function the argument denoted schur_fact is not used, i.e. an initial transformation to complex or real Schur form is not used. Anyway, this kind of transformation can also be applied to our algorithms; see Test 4 of [7, Sec. 3] for the application of a Schur form to a Taylor algorithm for the matrix cosine, where the accuracy results were similar to those of a state-of-the-art Padé algorithm that used the same Schur form, but with a lower cost.

We use the following notation for the different versions of the algorithms proposed in this paper:

• expmpol_orig: direct MATLAB implementation based on [6, Alg. 2], i.e. evaluating the Taylor approximations with the methods from Section 3, changing the optimal orders m_k to m_k^* from (43) in [6, Alg. 2] and using norm estimations.

• expmpol: MATLAB implementation using ms_selectNormEst from Section 5 to select m and s (argument NormEst=1 in function expmpol), which reduces the number of norm estimations with respect to expmpol_orig.

• expmpoln: MATLAB implementation using ms_selectNoNormEst from Section 5 to select m and s (Algorithm 2, with no norm estimations, argument NormEst=0 in function expmpol).

The experiments were carried out on an Intel i7-6700HQ @2.60GHz with 32GB RAM, using MATLAB R2017a. The accuracy was tested by computing the relative error
$$E=\frac{\|e^A-\tilde X\|_1}{\|e^A\|_1},$$
where X̃ is the computed approximation, and the cost is given in terms of matrix products. We used the following sets of matrices for testing:

1. One hundred diagonalizable 128×128 matrices. These matrices have the form V^T D V, where D is a diagonal matrix whose diagonal elements are random values between −k and k, with different integer values of k, and V is an orthogonal matrix obtained as V = H/16, where H is the Hadamard matrix (a MATLAB sketch of this construction is given below).

2. Eighty 128×128 matrices with multiple eigenvalues. These matrices have the form V^T D V, where D is a block diagonal matrix whose diagonal blocks are Jordan blocks with random dimensions and random eigenvalues between −50 and 50, and V is an orthogonal matrix obtained as V = H/16, where H is the Hadamard matrix.

3. 128×128 matrices from the function matrix of the Matrix Computation Toolbox [17] and matrices from the MATLAB Eigtool package available at http://www.cs.ox.ac.uk/pseudospectra/eigtool/. These matrices appear in the state of the art in matrix exponential computation [14, 15]. Matrices whose exponential cannot be represented in double precision arithmetic due to overflow were excluded from all the test matrices. One matrix from the Eigtool package for which MATLAB gave the warning "Matrix is singular to working precision" with the function expm_new was also excluded; note that Taylor methods involve no matrix inversions. Finally, the test consisted of 31 matrices from the Matrix Computation Toolbox and 7 matrices from the Eigtool package.

The "exact" value of the matrix exponential for matrix sets 1 and 2 was computed using the transformation e^A = V^T e^D V, where V^T e^D V was computed with the vpa function from MATLAB's Symbolic Math Toolbox with 256 decimal digit precision. For matrix set 3, we used a quadruple precision Taylor algorithm in Fortran with different orders and scaling parameters for each matrix to check the correctness of the results.
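As an illustration of the Test 1 construction and of the error measure E, the following MATLAB sketch (ours) generates one such matrix and evaluates the error of MATLAB's expm against the exact diagonalization; here V = H/sqrt(n) is used so that V is exactly orthogonal, and kmax is an illustrative choice:

n = 128; kmax = 10;                   % illustrative eigenvalue spread
d = -kmax + 2*kmax*rand(n, 1);        % random diagonal entries in [-kmax, kmax]
V = hadamard(n)/sqrt(n);              % orthogonal factor built from the Hadamard matrix
A = V'*diag(d)*V;                     % test matrix of the form V^T D V
Eref = V'*diag(exp(d))*V;             % exact e^A for this construction
X = expm(A);                          % stand-in for the algorithm under test
E = norm(Eref - X, 1)/norm(Eref, 1);  % relative error used in this section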

Table 6 shows the total cost in terms of matrix products for the function expmpol, and the cost comparison (in %) between this function and expmpoln, expmspl, exptayns and expm_new for Tests 1–3, taking m_M = 30 in all the Taylor based functions. Table 7 shows the same comparison for the execution times.

Table 8 shows the total cost in terms of matrix products for the function expmpol with kmax=6 (m_M = 24), and the cost comparison (in %) between this function and expmpoln with m_M = 24, expmspl (m_M = 30), exptayns (m_M = 25) and expm_new for Tests 1–3. Table 9 shows the same comparison for the execution times. For the execution times in MATLAB to be accurate, the experiments were repeated 100 times and the mean values were used. The standard deviation of the values was in all cases less than or equal to 6.56%.

Figures 1, 2 and 3 show the results obtained in the three case studies. Subfigures 1a, 2a and 3a show the normwise relative errors of the functions expmpol, expmpoln, expmspl, exptayns and expm_new, with all the Taylor functions using m_M = 30. These graphics show the numerical stability of the functions: they plot the relative errors of all implementations together with a solid line that represents the unit roundoff multiplied by the relative condition number of the exponential function at X [1, p. 55]. The relative condition number was computed using the MATLAB function funm_condest1 from the Matrix Function Toolbox [1, Appendix D]. For a method to perform in a backward and forward stable manner, its error should lie not far above this line on the graph [15, p. 1188]. Subfigures 1b, 2b and 3b show the performance profiles of the functions [16]. The value of α varies between 1 and 5 with a step size of 0.1; the value p is the probability that the considered algorithm has a relative error lower than or equal to α times the smallest error over all the methods. Subfigures 1c, 2c and 3c show the ratios of errors of expmpol with respect to expmpoln, expmspl, exptayns and expm_new, ordering the matrices according to the ratio of relative errors E(expmpol)/E(expm_new). Subfigures 1d, 2d and 3d show the ratios of the matrix products of expmpol with respect to expmspl, exptayns and expm_new.

Figure 4 shows the results obtained in case study 3 when kmax=6 (m_M = 24) is considered for expmpol and expmpoln, and kmax=8 for exptayns, giving a similar maximum order (m_M = 25). As mentioned above, expmspl has a fixed maximum order m_M = 30.

According to the results shown in the above tables and figures we can outline the following conclusions:

• All the functions performed in a numerically stable way in the three case studies.

• The implementations based on Taylor series are in general more accurate than the implementation based on Padé approximants. expmpol and expmpoln were the most accurate functions in Test 2, behaving similarly. In Tests 1 and 3 the most accurate functions were expmspl and exptayns, also behaving similarly.

• The execution times of expm_new for Tests 1–3 (128×128 matrices) were between 204.12% and 244.65% of the expmpol times, and higher than those of all the Taylor functions.

• expmpol with m_M = 24 gives the lowest cost in terms of matrix products of all the functions, the cost of expm_new being between 123.51% and 135.89% of the cost of expmpol, and the cost of the Paterson–Stockmeyer based functions exptayns and expmspl between 127.62% and 140.18% of the cost of expmpol.


Table 6: Cost of expmpol in terms of matrix products, denoted by C_0 = C_expmpol, and relative cost comparison R = C_function/C_0 (%) between expmpol and the functions expmpoln, expmpol_orig, expmspl, exptayns and expm_new. All the Taylor approximations use maximum order m_M = 30.

          C_0    R_expmpoln   R_expmpol_orig   R_expmspl   R_exptayns   R_expm_new
 Test 1   948    104.01       100              125.84      119.41       127.14
 Test 2   703    104.84       100              132.01      121.48       116.31
 Test 3   270    108.52       100              132.84      128.15       121.73

• The cost in terms of matrix products of expmpoln (without estimation) is between 104.01% and 109.27% of the cost of expmpol (with estimation), but its execution times are between 51.08% and 65.68% of the execution times of expmpol. Therefore, the cost of the estimation algorithm is not negligible for 128×128 matrices, and it is important to reduce the number of estimations.

• The cost of expmpol_orig in terms of matrix products is equal to the cost of expmpol, but its execution times are between 122.08% and 154.13% of the execution times of expmpol. Therefore, the reduction of estimations in expmpol is noticeable with respect to the original algorithm expmpol_orig, with no increase in the number of matrix products.

• In Test 3 with expmpol and expmpoln with m_M = 24, exptayns with m_M = 25 and expmspl with m_M = 30, expmspl was the most accurate function at a higher cost, and the second most accurate was exptayns with m_M = 25.

• According to Tables 6–9, the maximum order m_M = 24 is recommended in expmpol for maximum efficiency. Similarly to [4] and [3], we checked that using m_M = 30 provides higher accuracy at a slightly higher cost.

Table 7: Execution times of expmpol in seconds, denoted by t_0 = t_expmpol, and relative execution time comparison R = t_function/t_0 (%) between expmpol and the functions expmpoln, expmpol_orig, expmspl, exptayns and expm_new. All the Taylor approximations use maximum order m_M = 30.

          t_0      R_expmpoln   R_expmpol_orig   R_expmspl   R_exptayns   R_expm_new
 Test 1   0.2244   60.39        141.10           156.80      123.88       232.08
 Test 2   0.1798   57.62        146.25           161.32      127.49       229.63
 Test 3   0.0825   65.68        154.13           184.78      143.71       206.19

Table 8: Cost of expmpol with m_M = 24 in terms of matrix products, denoted by C_0 = C_expmpol, and relative cost comparison R = C_function/C_0 (%) between expmpol (m_M = 24) and the functions expmpoln (m_M = 24), expmpol_orig (m_M = 24), expmspl (m_M = 30), exptayns (m_M = 25) and expm_new.

          C_0    R_expmpoln   R_expmpol_orig   R_expmspl   R_exptayns   R_expm_new
 Test 1   887    107.22       100              134.50      127.62       135.89
 Test 2   662    104.38       100              140.18      129.00       123.51
 Test 3   259    109.27       100              138.48      133.59       126.90

Table 9: Execution times of expmpol with m_M = 24 in seconds, denoted by t_0 = t_expmpol, and relative execution time comparison R = t_function/t_0 (%) between expmpol (m_M = 24) and the functions expmpoln (m_M = 24), expmpol_orig (m_M = 24), expmspl (m_M = 30), exptayns (m_M = 25) and expm_new.

          t_0      R_expmpoln   R_expmpol_orig   R_expmspl   R_exptayns   R_expm_new
 Test 1   0.2112   56.55        122.08           165.38      119.29       244.65
 Test 2   0.1730   51.08        126.39           166.27      120.43       236.35
 Test 3   0.0820   60.44        137.85           183.76      135.12       204.12

7. Conclusions

In this paper we have given two Taylor algorithms for the computation of the matrix exponential. They are based on the matrix polynomial evaluation methods from [9] and on an improved version of the Taylor scaling algorithm from [4], simplified in [6]. These algorithms achieve maximum efficiency using maximum order m_M = 24, and maximum accuracy with m_M = 30.

In recent years Taylor algorithms have shown to be significantly more accurate than Padé algorithms, being also more efficient in some cases. With the matrix polynomial evaluation methods from [9] they are now considerably more efficient, and the proposed Taylor algorithms were superior in tests to the state-of-the-art Padé algorithm from [14] in both accuracy and efficiency.

Figure 1 (panels (a) normwise relative errors, (b) performance profiles, (c) ratio of relative errors, (d) ratio of matrix products): Experimental results with Test 1. All the Taylor functions use maximum order m_M = 30.

Taylor methods based on the Paterson–Stockmeyer matrix polynomial evaluation method seem to be more accurate in certain cases than the proposed algorithms based on the methods from [9]. Future work is addressed to:

• Increasing the accuracy of the methods based on [9].

• Increasing the efficiency by searching for Taylor approximations based on evaluation formulas (62)–(65) from [9] of the type y_{kj}(A), with k ≥ 2 and j ≥ 2, having higher orders of approximation than the ones given in this paper for the same cost.


Figure 2 (panels (a) normwise relative errors, (b) performance profiles, (c) ratio of relative errors, (d) ratio of matrix products): Experimental results with Test 2. All the Taylor functions use maximum order m_M = 30.

Figure 3 (panels (a) normwise relative errors, (b) performance profiles, (c) ratio of relative errors, (d) ratio of matrix products): Experimental results with Test 3. All the Taylor functions use maximum order m_M = 30.

Figure 4 (panels (a) normwise relative errors, (b) performance profiles, (c) ratio of relative errors, (d) ratio of matrix products): Experimental results with Test 3: expmpol and expmpoln with m_M = 24, exptayns with m_M = 25 and expmspl with m_M = 30.

8. References

[1] N.J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.

[2] C.B. Moler, C.F. Van Loan, Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Rev. 45 (2003) 3–49.

[3] J. Sastre, J. J. Ibáñez, E. Defez, P. A. Ruiz, Efficient scaling-squaring Taylor method for computing matrix exponential, SIAM J. Sci. Comput. 37 (1) (2015) A439–A455.

[4] J. Sastre, J. Ibáñez, E. Defez, P. Ruiz, Accurate matrix exponential computation to solve coupled differential models in engineering, Math. Comput. Model. 54 (2011) 1835–1840.

[5] J. Sastre, J. Ibáñez, E. Defez, P. Ruiz, Accurate and efficient matrix exponential computation, Int. J. Comput. Math. 91 (1) (2014) 97–112.

[6] P. Ruiz, J. Sastre, J. Ibáñez, E. Defez, High performance computing of the matrix exponential, J. Comput. Appl. Math. 291 (2016) 370–379.

[7] J. Sastre, J. Ibáñez, P. Alonso, J. Peinado, E. Defez, Two algorithms for computing the matrix cosine function, Appl. Math. Comput. 312 (2017) 66–77.

[8] E. Defez, J. Ibáñez, J. Sastre, J. Peinado, P. Alonso, A new efficient and accurate spline algorithm for the matrix exponential computation, J. Comput. Appl. Math. (2017), in press, https://doi.org/10.1016/j.cam.2017.11.029.

[9] J. Sastre, Efficient evaluation of matrix polynomials, Linear Algebra Appl. 539 (2018) 229–250.

[10] S. Blackford, J. Dongarra, Installation guide for LAPACK, LAPACK Working Note 41, Department of Computer Science, University of Tennessee, 1999.

[11] M. S. Paterson, L. J. Stockmeyer, On the number of nonscalar multiplications necessary to evaluate polynomials, SIAM J. Comput. 2 (1) (1973) 60–66.

[12] J. Sastre, On the polynomial approximation of matrix functions, early unpublished version of [9] submitted to AMC, Feb. 19, 2016, available at http://personales.upv.es/~jorsasma/AMC-S-16-00951.pdf.

[13] N. J. Higham, Fortran codes for estimating the one-norm of a real or complex matrix, with applications to condition estimation, ACM Trans. Math. Softw. 14 (4) (1988) 381–396.

[14] A.H. Al-Mohy, N.J. Higham, A new scaling and squaring algorithm for the matrix exponential, SIAM J. Matrix Anal. Appl. 31 (3) (2009) 970–989.

[15] N.J. Higham, The scaling and squaring method for the matrix exponential revisited, SIAM J. Matrix Anal. Appl. 26 (4) (2005) 1179–1193.

[16] E.D. Dolan, J.J. Moré, Benchmarking optimization software with performance profiles, Math. Programming 91 (2002) 201–213.

[17] N. J. Higham, The Matrix Computation Toolbox, http://www.ma.man.ac.uk/~higham/mctoolbox.