
Eﬃcient evaluation of matrix polynomials

J. Sastre^a

^a Instituto de Telecomunicaciones y Aplicaciones Multimedia, Universitat Politècnica de
València, Camino de Vera s/n, 46022-Valencia (Spain)

Email address: jsastrem@upv.es (J. Sastre)

Preprint submitted to Elsevier, November 24, 2017

Abstract

This paper presents a new family of methods for evaluating matrix polynomi-

als more eﬃciently than the state-of-the-art Paterson–Stockmeyer method.

Examples of the application of the methods to the Taylor polynomial approx-

imation of matrix functions like the matrix exponential and matrix cosine are

given. Their eﬃciency is compared with that of the best existing evaluation

schemes for general polynomial and rational approximations, and also with

a recent method based on mixed rational and polynomial approximants. For

many years, the Paterson–Stockmeyer method has been considered the most

eﬃcient general method for the evaluation of matrix polynomials. In this pa-

per we show that this statement is no longer true. Moreover, for many years

rational approximations have been considered more eﬃcient than polynomial

approximations, although recently it has been shown that often this is not

the case in the computation of the matrix exponential and matrix cosine. In

this paper we show that in fact polynomial approximations provide a higher

order of approximation than the state-of-the-art computational methods for

rational approximations for the same cost in terms of matrix products.

Keywords: matrix, polynomial, rational, mixed rational and polynomial,

approximation, computation, matrix function.

PACS: 87.64.Aa

1. Introduction

In this paper we propose a new family of methods for evaluating matrix

polynomials more eﬃciently than the state-of-the-art Paterson–Stockmeyer

method combined with Horner’s method [1], [2, Sec. 4.2]. The proposed


methods are applied to efficiently compute Taylor polynomial approxima-

tions of matrix functions. The computation of matrix functions is a research

ﬁeld with applications in many areas of science and many algorithms for

their computation have been proposed [2, 3]. Among all matrix functions,

the matrix exponential has attracted special attention, see [4, 5, 6] and the

references therein, and lately the matrix cosine, see [7, 8] and the references

therein. The main methods for computing matrix functions are those based

on rational approximations, like Padé or Chebyshev approximations, polyno-

mial approximations, like Taylor approximation, similarity transformations

and matrix iterations [2]. Moreover, a new kind of approximation based on

mixed rational and polynomial approximants has been proposed in [9].

Recently, it has been shown that using the combination of Horner and

Paterson–Stockmeyer methods [1], [2, Sec. 4.2], polynomial approximations

may be more efficient than rational Padé approximations for both the matrix

exponential and cosine [6, 8]. In this paper we show that using the proposed

matrix polynomial evaluation methods, polynomial approximations are more

accurate than existing state-of-the-art methods for evaluating both polyno-

mial and rational approximants for the same computing cost. Moreover, we

show that the new methods are more eﬃcient than the recent mixed ratio-

nal and polynomial approximation [9] in some cases, and examples for the

computation of the matrix exponential and the matrix cosine are given.

Throughout this paper ⌈x⌉ denotes the lowest integer not less than x,
⌊x⌋ denotes the highest integer not exceeding x, N denotes the set of positive
integers, C^{n×n} and R^{n×n} denote the sets of complex and real matrices
of size n×n, respectively, I denotes the identity matrix for both sets, and
R_{k,m} denotes the space of rational functions with numerator and denominator
of degrees at most k and m, respectively.

Note that the multiplication by the matrix inverse in matrix rational

approximations is calculated as the solution of a multiple right-hand side

linear system. Therefore, the cost of evaluating polynomial and rational

approximations will be given in terms of the number of matrix products,

denoted by M, and the cost of the solution of multiple right-hand side linear

systems AX = B, where the matrices A and B are n×n, denoted by D. From
[10, App. C] it follows that, see [9, p. 11940]:

    D ≈ 4/3 M.    (1)

This paper is organized as follows. Section 2 recalls some results for

efficient Taylor, Padé, and mixed rational and polynomial approximation

of general matrix functions. Section 3 deals with the new matrix polyno-

mial evaluation methods giving examples for the computation of the matrix

exponential and the matrix cosine. Section 4 compares the new techniques

with eﬃcient state-of-the-art evaluation schemes for polynomial, rational and

mixed rational and polynomial approximants. Section 5 gives examples for

the matrix exponential computation even more eﬃcient than the ones given

in Section 3, suggesting more general formulas for evaluating matrix polyno-

mials. Finally, conclusions are given in Section 6.

2. Polynomial, rational, and mixed rational and polynomial ap-

proximants

This section summarizes some results of the computational costs of Tay-

lor, Padé, and the mixed rational and polynomial approximants given in

[9].

2.1. Taylor approximation of matrix functions

If f(A) is a matrix function deﬁned by a Taylor series according to Theo-

rem 4.7 of [2, p. 76], where A is a complex square matrix, then we will denote
by T_m(A) the matrix polynomial defined by the truncated Taylor series of
degree m of f(A). For scalar x ∈ C it follows that

    f(x) − T_m(x) = O(x^{m+1}),    (2)

about the origin, and, from now on, we will refer to m as the order of the
Taylor approximation. The most efficient method in the literature to evaluate
a matrix polynomial

    P_m(A) = \sum_{i=0}^{m} b_i A^i,    (3)

is the combination of Horner and Paterson–Stockmeyer methods [1] given by

    PS_m(A) = (···((b_m A^s + b_{m−1} A^{s−1} + ... + b_{m−s+1} A + b_{m−s} I)
               × A^s + b_{m−s−1} A^{s−1} + b_{m−s−2} A^{s−2} + ... + b_{m−2s+1} A + b_{m−2s} I)
               × A^s + b_{m−2s−1} A^{s−1} + b_{m−2s−2} A^{s−2} + ... + b_{m−3s+1} A + b_{m−3s} I)
               ···
               × A^s + b_{s−1} A^{s−1} + b_{s−2} A^{s−2} + ··· + b_1 A + b_0 I,    (4)

    m*     1  2  4  6  9  12  16  20  25  30  36
    C_PS   0  1  2  3  4   5   6   7   8   9  10

Table 1: Cost C_PS in terms of matrix products for the evaluation of the polynomial P_m(A)
with the Horner and Paterson–Stockmeyer methods, for the first eleven values m* that maxi-
mize the polynomial degree obtained for a given cost.

where the integer s > 0 divides m and the matrix powers A^2, A^3, ..., A^s are
computed and stored previously.

Table 1 shows the maximum values of m that can be obtained for a given
number of matrix products in T_m(A) using the Paterson–Stockmeyer method,
corresponding to m = s^2 and m = s(s+1), for s ∈ N. The cost of evaluating
(4), denoted by C_PS, for the values in m* is given by [9, Eq. (6)]

    C_PS = (r + s − 2) M,  with r = m/s, m ∈ m*.    (5)

Table 1 presents the cost C_PS of evaluating (4) in terms of matrix products
for the first eleven values of m*. For orders m ∉ m* we evaluate P_m(A) =
PS_{m0}(A) using (4), taking m0 = min{m1 ∈ m*, m1 > m} and setting the
coefficients b_i = 0 in (4) for i = m0, m0−1, ..., m+1, at the same cost
as evaluating PS_{m0}(A). Note that because of the way the polynomial is
evaluated, the cost of using (4) is lower than that of Paterson–Stockmeyer
as implemented in [2, Sec. 4.2] (compare (5) and [2, Eq. (4.3)]).
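As an illustration, the following Python/NumPy sketch (our own, not taken from the paper; the function and variable names are assumptions) evaluates (4) for an arbitrary coefficient vector, using s−1 products for the stored powers and m/s−1 products for the Horner recursion in A^s, i.e. (r+s−2) M as in (5).

```python
import numpy as np

def paterson_stockmeyer(b, A, s):
    """Evaluate P_m(A) = sum_i b[i] A^i via the Paterson-Stockmeyer scheme (4).
    Assumes s divides m = len(b)-1.  Cost: (m/s + s - 2) matrix products."""
    n, m = A.shape[0], len(b) - 1
    assert m % s == 0
    P = [np.eye(n), A]                      # I, A, A^2, ..., A^s
    for _ in range(2, s + 1):
        P.append(P[-1] @ A)                 # s-1 products
    Y = sum(b[m - s + j] * P[j] for j in range(s + 1))        # innermost block
    for k in range(m // s - 1, 0, -1):      # Horner in A^s: m/s - 1 products
        Y = Y @ P[s] + sum(b[(k - 1) * s + j] * P[j] for j in range(s))
    return Y
```

For instance, with b_i = 1/i! this reproduces the truncated exponential Taylor series T_m(A); this scheme is the baseline against which the new evaluation methods below are compared.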

The matrix exponential is the most studied matrix function [4], [2, Chap.
10]. For A ∈ C^{n×n} the matrix exponential of A can be defined by the Taylor
series

    exp(A) = \sum_{i≥0} A^i / i!.    (6)

Another matrix function that has received attention recently is the matrix
cosine, which can be defined analogously by means of its Taylor series

    cos(A) = \sum_{i≥0} (−1)^i A^{2i} / (2i)!.    (7)

Several eﬃcient algorithms based on Taylor approximations have been pro-

posed recently for the computation of the matrix exponential and cosine

[6, 8].

    m+    1     2     3     4     6     8     10    12    15    18     21
    C_R   1.33  2.33  3.33  4.33  5.33  6.33  7.33  8.33  9.33  10.33  11.33
    d_R   2     4     6     8     12    16    20    24    30    36     42

Table 2: Cost C_R in terms of matrix products for the diagonal rational approximation r_mm(A),
taking D = 4/3 M. Approximation order d_R if r_mm is a Padé approximant of a given
function f.

2.2. Padé approximations of matrix functions

The rational scalar function r_km(x) = p_km(x)/q_km(x) is a [k/m] Padé
approximant of the scalar function f(x) if r_{k,m} ∈ R_{k,m}, q_km(0) = 1, and

    f(x) − r_km(x) = O(x^{k+m+1}).    (8)

From now on, d_R will denote the degree of the last term of the Taylor series
of f about the origin that r_km(x) agrees with, i.e. d_R = k + m, and we will
refer to d_R as the order of the Padé approximation. Table 2 (see [9, Table 2])
shows the maximum values of m that can be obtained for a given number of
matrix products in r_mm(A), denoted by the set m+, and the corresponding
computing cost, denoted by C_R, given by

    C_R = (2r + s − 3) M + D ≈ (2r + s − 1 − 2/3) M,  r = m/s,    (9)

where s takes whichever of the values s = ⌈√(2m)⌉ or s = ⌊√(2m)⌋ divides m and
gives the smaller C_R. Table 2 also gives the corresponding order d_R of the
approximation r_mm(x) if it is a Padé approximant of a given function f(x),
i.e. d_R = 2m.

Finally, it is important to note that for a given f, k and m, a [k/m] Padé
approximant might not exist. Moreover, when computing rational approxi-
mations r_km of a function f for a given square matrix A, we must verify that

the matrix qkm(A) is nonsingular, and, for an accurate computation, that it

is well conditioned. This is not the case for polynomial approximations, since

they do not require matrix inversions.

2.3. Mixed rational and polynomial approximants.

For a square matrix A the method proposed in [9] is based on using

aggregations of mixed rational and polynomial approximants of the type

    t_{ijs}(A) = (···((u_s^{(i)}(A)(v_s^{(i)}(A))^{−1} + u_s^{(i−1)}(A))(v_s^{(i−1)}(A))^{−1} + u_s^{(i−2)}(A))
                 (v_s^{(i−2)}(A))^{−1} + ··· + u_s^{(1)}(A))(v_s^{(1)}(A))^{−1} + w_{js}(A),    (10)

where v_s^{(k)}(A), u_s^{(k)}(A), k = 1, 2, ..., i, are polynomials of A of degrees at
most s, w_{js}(A) is a polynomial of A of degree at most js, and i ≥ 0, s ≥ 0
and j ≥ 0. Note that if i = 0 we consider that t_{ijs}(A) = w_{js}(A), having no
rational part. In [9, Sec. 4] a method to obtain t_{ijs} from rational approxi-
mations is given. Similarly to rational approximations, each multiplication
by a matrix inverse is calculated as the solution of a multiple right-hand side
linear system. Therefore, when computing t_{ijs}(A) it is important to verify
that the matrices v_s^{(1)}(A), v_s^{(2)}(A), ..., v_s^{(i)}(A) are nonsingular and well con-
ditioned. The total cost of computing (10), denoted by C_RP, is given by (see
[9, Sec. 5])

    C_RP = (s + j − 2) M + i D ≈ (s + j − 2 + 4i/3) M,  j > 0, s > 0, i ≥ 0.    (11)

Note that, in the case where approximation (10) is intended to reproduce
the first terms of the Taylor series of a given function f, it is equivalent to a
[(i+j)s/is] Padé approximant, and then, whenever it exists, t_{ijs} for scalar
x ∈ C satisfies

    f(x) − t_{ijs}(x) = O(x^{(2i+j)s+1}).    (12)

In that case we denote by d_RP the order of the mixed rational and polynomial
approximation,

    d_RP = (2i + j)s.    (13)

Table 3 (see [9, Table 3]) shows, for t_{ijs}(A), the approximation order d_RP
if t_{ijs} reproduces the first terms of the Taylor series of a given function f,
and the cost C_RP in terms of matrix products for the values of i, j, s that
maximize d_RP for a given cost. See [9] for a complete description.

3. On the evaluation of matrix polynomials. Application to the

approximation of matrix functions

This section gives new general methods for evaluating matrix polynomi-

als in a more eﬃcient way than the combination of Horner and Paterson–

Stockmeyer methods. Examples for computing the Taylor matrix polynomial

approximation of degree m of the matrix exponential and the matrix cosine

are given. These examples allow us to compute both approximations at a

lower cost than Horner and Paterson–Stockmeyer methods. Note that in this

section we used MATLAB R2017a for all the computations.

    d_RP   1     2     3     4     6     9     10    12    15    16    20    21
    i      0     0     1     0     1     1     2     1     2     1     2     3
    j      1     1     1     2     1     1     1     1     1     2     1     1
    s      1     2     1     2     2     3     2     4     3     4     4     3
    C_RP   0     1     1.33  2     2.33  3.33  3.67  4.33  4.67  5.33  5.67  6

    d_RP   25    28    30    35    36    42    45    49    54     55     56    63
    i      2     3     2     3     4     3     4     3     4      5      3     4
    j      1     1     1     1     1     1     1     1     1      1      1     1
    s      5     4     6     5     4     6     5     7     6      5      8     7
    C_RP   6.67  7     7.67  8     8.33  9     9.33  10    10.33  10.67  11    11.33

Table 3: Approximation order d_RP if the mixed rational and polynomial approximation
t_{ijs}(A) from Section 2.3 reproduces the first d_RP terms of the Taylor series of a given
function f, cost C_RP in terms of matrix products for the mixed rational and polynomial
approximation t_{ijs}(A), taking D = 4/3 M, and values of i, j and s that maximize d_RP for
a given cost.

Example 3.1. Let

    y_{02}(A) = A^2 (c_4 A^2 + c_3 A),    (14)
    y_{12}(A) = (y_{02}(A) + d_2 A^2 + d_1 A)(y_{02}(A) + e_2 A^2)    (15)
                + e_0 y_{02}(A) + f_2 A^2 + f_1 A + f_0 I,

where c_4, c_3, d_2, d_1, e_2, e_0, f_2, f_1 and f_0 are scalar coefficients. In order to
evaluate a matrix polynomial (3) of degree m = 8, taking y_{12}(A) = P_m(A)
and equating the coefficients of the matrix powers A^i, i = 8, 7, ..., 0, the
following system of equations arises:

    c_4 c_4 A^8 = b_8 A^8,    (16)
    2 c_3 c_4 A^7 = b_7 A^7,    (17)
    (c_4 (d_2 + e_2) + c_3 c_3) A^6 = b_6 A^6,    (18)
    (c_4 d_1 + c_3 (d_2 + e_2)) A^5 = b_5 A^5,    (19)
    (d_2 e_2 + c_3 d_1 + c_4 e_0) A^4 = b_4 A^4,    (20)
    (d_1 e_2 + c_3 e_0) A^3 = b_3 A^3,    (21)
    f_2 A^2 = b_2 A^2,    (22)
    f_1 A = b_1 A,    (23)
    f_0 I = b_0 I.    (24)

Note that for clarity the coeﬃcient indices were chosen so that the sum

of the indices is equal to the exponent of the power of A that coefficient is

multiplying. For instance, for (16) one gets 4 + 4 = 8, for (17) one gets

3 + 4 = 7, for (18) one gets 4 + 2 = 6 and 3 + 3 = 6, and so on.

We can solve the previous system using the equations (16)-(24) from top

to bottom. Using (16)-(19), one gets

    c_4 = ±√b_8,    (25)
    c_3 = b_7/(2 c_4),    (26)
    d_2 + e_2 = (b_6 − c_3^2)/c_4,    (27)
    d_1 = (b_5 − c_3 (d_2 + e_2))/c_4.    (28)

If b_8 ≠ 0 then c_4 ≠ 0, and therefore c_4, c_3, the sum d_2 + e_2 and d_1 can be
obtained explicitly. From now on we will denote de_2 = d_2 + e_2 to simplify the
notation and to remark that this quantity can be computed explicitly. Using
(20) it follows that

    e_0 = (b_4 − c_3 d_1 − de_2 e_2 + e_2^2)/c_4,    (29)

where, using (25)-(28), e_0 is a polynomial of second order in the variable e_2.
Hence, using (21) and (29) one gets

    d_1 e_2 + c_3 e_0 = b_3  ⇒  −b_3 + d_1 e_2 + c_3 (b_4 − c_3 d_1 − de_2 e_2 + e_2^2)/c_4 = 0,    (30)

which is an equation of second order in the variable e_2, and therefore, using
(25)-(28), the equation on the right-hand side of (30) has the solutions

    e_2 = \frac{\frac{c_3}{c_4} de_2 − d_1 ± \sqrt{\left(d_1 − \frac{c_3}{c_4} de_2\right)^2
          + 4\left(\frac{c_3}{c_4} b_3 + \frac{c_3^3}{c_4^2} d_1 − \frac{c_3^2}{c_4^2} b_4\right)}}{2 c_3/c_4},    (31)

i.e., two solutions if we take c_4 = √b_8 from (25), and another two solutions if
we take c_4 = −√b_8. Substituting the four solutions of e_2 in (27) and (29),
four solutions are obtained for d_2 = de_2 − e_2 and e_0, respectively, and from
(22)-(24) it follows that

    f_2 = b_2,  f_1 = b_1,  f_0 = b_0.    (32)
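For concreteness, the following short Python sketch (ours, not code from the paper) carries out the solution (25)-(32) for the exponential coefficients b_i = 1/i!, i = 0, ..., 8, taking the positive root in (25) and the '+' branch of (31); this choice reproduces the exponential column of Table 4 below.

```python
from math import factorial, sqrt

b = [1 / factorial(i) for i in range(9)]      # exponential Taylor, m = 8

c4 = sqrt(b[8])                               # (25), positive root
c3 = b[7] / (2 * c4)                          # (26)
de2 = (b[6] - c3**2) / c4                     # (27): de2 = d2 + e2
d1 = (b[5] - c3 * de2) / c4                   # (28)
# (31): quadratic in e2 obtained from (30)
q = c3 / c4
disc = (d1 - q * de2)**2 + 4 * (q * b[3] + q**2 * c3 * d1 - q**2 * b[4])
e2 = (q * de2 - d1 + sqrt(disc)) / (2 * q)    # '+' branch of (31)
d2 = de2 - e2
e0 = (b[4] - c3 * d1 - de2 * e2 + e2**2) / c4 # (29)
f2, f1, f0 = b[2], b[1], b[0]                 # (32)

print(c4, c3, d2, d1, e2, e0)   # matches the exponential column of Table 4
```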

The cost of evaluating (15) is 3M, i.e. one matrix product to compute
and store A^2, and then two matrix products to compute (14) and (15), y_{12}(A)
being a polynomial of degree 8. From Table 1, the polynomial of maximum
degree that can be computed with the Horner and Paterson–Stockmeyer methods
at a cost of 3M has the lower degree d_PS = 6.

           exp                         cos
    c_4    4.980119205559973×10^{-3}    2.186201576339059×10^{-7}
    c_3    1.992047682223989×10^{-2}   -2.623441891606870×10^{-5}
    d_2    7.665265321119147×10^{-2}    6.257028774393310×10^{-3}
    d_1    8.765009801785554×10^{-1}   -4.923675742167775×10^{-1}
    e_2    1.225521150112075×10^{-1}    1.441694411274536×10^{-4}
    e_0    2.974307204847627×10^{0}     5.023570505224926×10^{1}

Table 4: One possible choice for the coefficients in (14) and (15) for the Taylor approximation
of the exponential and cosine of order m = 8.

Table 4 shows one of the four solutions in IEEE double precision arith-
metic for the coefficients of the Taylor approximation of the exponential
and cosine, where b_i = 1/i! and b_i = (−1)^i/(2i)!, respectively, for i =
0, 1, ..., 8. Note that all four solutions are real, avoiding complex arith-
metic if A ∈ R^{n×n}. In order to check the stability of the double precision
arithmetic solutions c_i, d_i and e_i from Table 4, they were substituted in
equations (16)-(21) to compute the relative error for each coefficient b_i, for
i = 3, 4, ..., 8. For instance, from (21) it follows that the relative error for
b_3 is |b_3 − (d_1 e_2 + c_3 e_0)|/|b_3|. We checked that all the relative errors for all
b_i, i = 3, 4, ..., 8, were below the unit roundoff in IEEE double precision
arithmetic, u = 2^{−53} ≈ 1.11×10^{−16}.
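The evaluation itself takes only three products. The following Python sketch (our own illustration, with hypothetical function names) evaluates (14)-(15) with the exponential coefficients of Table 4 and compares the result with the truncated Taylor series of degree 8.

```python
import numpy as np

# Coefficients from Table 4 (matrix exponential, order m = 8)
c4, c3 = 4.980119205559973e-3, 1.992047682223989e-2
d2, d1 = 7.665265321119147e-2, 8.765009801785554e-1
e2, e0 = 1.225521150112075e-1, 2.974307204847627e0
f2, f1, f0 = 1/2, 1.0, 1.0          # f_i = b_i = 1/i! for i = 2, 1, 0, cf. (32)

def y12_exp(A):
    """Degree-8 Taylor polynomial of exp(A) via (14)-(15): cost 3 products."""
    n = A.shape[0]
    A2 = A @ A                                           # 1st product
    y02 = A2 @ (c4 * A2 + c3 * A)                        # 2nd product
    return ((y02 + d2 * A2 + d1 * A) @ (y02 + e2 * A2)   # 3rd product
            + e0 * y02 + f2 * A2 + f1 * A + f0 * np.eye(n))

# Compare with the truncated Taylor series sum_{i<=8} A^i / i!
A = np.random.randn(4, 4) / 4
T, term = np.eye(4), np.eye(4)
for i in range(1, 9):
    term = term @ A / i
    T += term
print(np.linalg.norm(y12_exp(A) - T) / np.linalg.norm(T))  # ~machine precision
```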

Note that if we take

    y_{12}(A) = (y_{02}(A) + d_2 A^2 + d_1 A)(y_{02}(A) + e_2 A^2 + e_1 A) + f_2 A^2 + f_1 A + f_0 I,    (33)

instead of (15), the four solutions for the corresponding coefficients for the
exponential and cosine Taylor approximations of order m = 8 are complex.
Therefore, if A is real, using (33) instead of (15) is not efficient for the

computation of either matrix function since it is necessary to use complex

arithmetic for evaluating (33).

Following Example 3.1 we can take, in general,

    y_{0s}(A) = A^s \sum_{i=1}^{s} c_{s+i} A^i,    (34)

    y_{1s}(A) = \left( y_{0s}(A) + \sum_{i=1}^{s} d_i A^i \right) \left( y_{0s}(A) + \sum_{i=2}^{s} e_i A^i \right)
                + e_0 y_{0s}(A) + \sum_{i=0}^{s} f_i A^i,    (35)

where A^i, i = 2, 3, ..., s, can be computed once and stored to be reused in
all the computations; then y_{1s}(A) is a matrix polynomial whose degree,
denoted by d_{y1s}, and computing cost, denoted by C_{y1s}, are

    d_{y1s} = 4s,  C_{y1s} = (s + 1) M,  s = 2, 3, ....    (36)

Note that (14) and (15) are a particular case of (34) and (35) where s= 2.
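A direct transcription of (34)-(35) into Python (an illustrative sketch of ours, with our own argument conventions) makes the (s+1)-product cost of (36) explicit:

```python
import numpy as np

def y1s(A, c, d, e, f):
    """Evaluate y_{1s}(A) from (34)-(35), a degree-4s polynomial in s+1 products.

    c[i] holds c_{s+i} (i = 1..s), d[i] holds d_i (i = 1..s),
    e[i] holds e_i (i = 2..s) plus e[0] = e_0, f[i] holds f_i (i = 0..s);
    the index-0 entries of c and d and e[1] are unused."""
    n, s = A.shape[0], len(d) - 1
    P = [np.eye(n), A]                       # I, A, A^2, ..., A^s: s-1 products
    for _ in range(2, s + 1):
        P.append(P[-1] @ A)
    y0 = P[s] @ sum(c[i] * P[i] for i in range(1, s + 1))          # 1 product
    return ((y0 + sum(d[i] * P[i] for i in range(1, s + 1)))
            @ (y0 + sum(e[i] * P[i] for i in range(2, s + 1)))     # 1 product
            + e[0] * y0 + sum(f[i] * P[i] for i in range(s + 1)))
```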

Again, in order to evaluate a matrix polynomial P_m(A) of degree m = 4s, we
take y_{1s}(A) = P_m(A) and equate the coefficients of the matrix powers A^i,
i = m, m−1, ..., 0, from y_{1s}(A) and P_m(A). The solution for the coefficients
taking s = 2 is given in Example 3.1, where the substitution of variables gives
a polynomial equation in e_s = e_2 of degree 2 with the exact solution given by
(31). In the following a general solution is given for s > 2. The s equations
corresponding to the coefficients of the powers A^{4s−k}, for k = 0, 1, ..., s−1,
are, respectively,

    \sum_{i=0}^{k} c_{2s−i} c_{2s+i−k} = b_{4s−k},  k = 0, 1, ..., s−1.    (37)

Since (37) is a triangular system, if b_{4s} ≠ 0 then c_{2s} ≠ 0 and it follows that

    c_{2s} = ±√b_{4s},
    c_{2s−1} = b_{4s−1}/(2 c_{2s}),    (38)
    c_{2s−k} = \left( b_{4s−k} − \sum_{i=1}^{k−1} c_{2s−i} c_{2s+i−k} \right) / (2 c_{2s}),  k = 2, 3, ..., s−1.

Note that if b_{4s} < 0, to prevent c_{2s} from being complex we can compute
y_{1s}(A) = −P_m(A) using (35), where now the leading coefficient −b_{4s} > 0, which gives
P_m(A) = −y_{1s}(A).

Taking again de_i = d_i + e_i for abbreviation, and de_1 = d_1, since there is
no coefficient e_1 in (35), the equations corresponding to the coefficients of the
powers A^{3s−k}, for k = 0, 1, ..., s−1, are, respectively,

    \sum_{j=s−k}^{s} c_{3s−k−j} de_j + \sum_{i=1}^{s−k−1} c_{2s−k−i} c_{s+i} = b_{3s−k},  k = 0, 1, ..., s−2,    (39)
    \sum_{j=s−k}^{s} c_{3s−k−j} de_j = b_{3s−k},  k = s−1,

and using (38) it follows that

    de_s = \left( b_{3s} − \sum_{i=1}^{s−1} c_{2s−i} c_{s+i} \right) / c_{2s},
    de_{s−k} = \left( b_{3s−k} − \sum_{j=s+1−k}^{s} c_{3s−k−j} de_j − \sum_{i=1}^{s−1−k} c_{2s−k−i} c_{s+i} \right) / c_{2s},    (40)
              k = 1, 2, ..., s−2,
    d_1 = \left( b_{2s+1} − \sum_{j=2}^{s} c_{2s+1−j} de_j \right) / c_{2s},

where, if c_{2s} ≠ 0, each sum de_i = d_i + e_i, i = s, s−1, ..., 2, and the coefficient
d_1 can be obtained explicitly using the coefficients c_i, i = s+1, s+2, ..., 2s,
obtained from (38).
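The triangular structure of (37)-(40) can be mirrored directly in code. The sketch below (our own, assuming b_{4s} > 0 so that the positive root in (38) can be taken) computes c_{s+1}, ..., c_{2s} and the sums de_i = d_i + e_i:

```python
from math import sqrt

def c_and_de_coefficients(b, s):
    """Triangular solves (37)-(38) and (39)-(40): compute c_{s+1..2s} and the
    sums de_j = d_j + e_j from the target coefficients b[0..4s].
    A sketch assuming b[4s] > 0 (positive root in (38))."""
    c = [0.0] * (2 * s + 1)                  # entries c[s+1..2s] will be filled
    c[2 * s] = sqrt(b[4 * s])
    c[2 * s - 1] = b[4 * s - 1] / (2 * c[2 * s])
    for k in range(2, s):                    # (38), k = 2, ..., s-1
        c[2 * s - k] = (b[4 * s - k]
                        - sum(c[2 * s - i] * c[2 * s + i - k] for i in range(1, k))
                        ) / (2 * c[2 * s])
    de = [0.0] * (s + 1)                     # de[j] = d_j + e_j, de[1] = d_1
    de[s] = (b[3 * s] - sum(c[2 * s - i] * c[s + i] for i in range(1, s))) / c[2 * s]
    for k in range(1, s - 1):                # (40), k = 1, ..., s-2
        de[s - k] = (b[3 * s - k]
                     - sum(c[3 * s - k - j] * de[j] for j in range(s + 1 - k, s + 1))
                     - sum(c[2 * s - k - i] * c[s + i] for i in range(1, s - k))
                     ) / c[2 * s]
    de[1] = (b[2 * s + 1] - sum(c[2 * s + 1 - j] * de[j] for j in range(2, s + 1))) / c[2 * s]
    return c, de
```

The remaining unknowns e_s, ..., e_2, e_0 then follow from the equations (41)-(50) derived next.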

The equations corresponding to the coefficients of the powers A^{2s−k}, for k =
0, 1, ..., s−1, are

    \sum_{i=0}^{k} d_{s−i} e_{s−k+i} + g_k + e_0 c_{2s−k} = b_{2s−k},  k = 0, 1, ..., s−1,    (41)

where

    g_k = \sum_{i=1}^{s−1−k} c_{s+i} de_{s−i−k},  k = 0, 1, ..., s−2,  g_{s−1} = 0,    (42)

and the coefficients g_k can be computed explicitly using (38) and (40).
Using (41) with k = 0 it follows that

    e_s de_s − e_s^2 + g_0 + e_0 c_{2s} = b_{2s}  ⇔  e_0 = (b_{2s} − g_0 − e_s de_s + e_s^2)/c_{2s},    (43)

provided that c_{2s} ≠ 0. Hence, since de_s, g_0 and c_{2s} can be computed using
(38) and (40), the coefficient e_0 is a polynomial of second order in the variable
e_s. Using now (41) with k = 1 one gets

    e_{s−1}(de_s − 2 e_s) + e_s de_{s−1} + g_1 + e_0 c_{2s−1} = b_{2s−1},    (44)

and then, if d_s ≠ e_s, it follows that de_s − 2 e_s = d_s − e_s ≠ 0 and

    e_{s−1} = (b_{2s−1} − g_1 − e_0 c_{2s−1} − e_s de_{s−1})/(de_s − 2 e_s),    (45)

where e_{s−1} is a rational function of e_s, since by (43) e_0 is a polynomial of
e_s of second order, and all the remaining quantities can be computed using
(38), (40) and (42). Note that, analogously, using (41) with k = 2 it follows
that

    e_{s−2}(de_s − 2 e_s) + e_s de_{s−2} + e_{s−1} de_{s−1} − e_{s−1}^2 + g_2 + e_0 c_{2s−2} = b_{2s−2},    (46)

and then, again if d_s ≠ e_s, it follows that

    e_{s−2} = (b_{2s−2} − g_2 − e_0 c_{2s−2} − e_s de_{s−2} − e_{s−1} de_{s−1} + e_{s−1}^2)/(de_s − 2 e_s),    (47)

where similarly e_{s−2} is also a rational function of e_s, since by (43) and (45)
one gets that e_0 is a polynomial of e_s and e_{s−1} is a rational function of e_s,
and all the remaining quantities can be computed using (38), (40) and (42).
Note that from (45) and (47) it follows that the rational function e_{s−2} has
denominator (de_s − 2 e_s)^3.

Analogously, it is easy to show that

    e_{s−k} = \Big[ b_{2s−k} − g_k − e_0 c_{2s−k} − e_s de_{s−k}
              − \sum_{i=1}^{\lceil k/2 \rceil − 1} \big( e_{s−i} de_{s−k+i} + e_{s−k+i}(de_{s−i} − 2 e_{s−i}) \big)    (48)
              + \begin{cases} 0, & \text{odd } k,\ 2 < k \le s−2, \\
                              −e_{s−k/2}\, de_{s−k/2} + e_{s−k/2}^2, & \text{even } k,\ 2 < k \le s−2, \end{cases}
              \Big] \Big/ (de_s − 2 e_s),    (49)

where e_{s−k} is also a rational function of e_s with denominator (de_s − 2 e_s)^{i_{k,s}},
where i_{k,s} > 0 is an integer depending on k and s.

The last equation of this group is

    0 = −b_{s+1} + e_0 c_{s+1} + e_s d_1 + \sum_{i=1}^{\lceil s/2 \rceil − 1} \big( e_{s−i} de_{1+i} + e_{1+i}(de_{s−i} − 2 e_{s−i}) \big)
        + \begin{cases} 0, & \text{even } s > 2, \\
                        −e_{(s+1)/2}\, de_{(s+1)/2} + e_{(s+1)/2}^2, & \text{odd } s > 2. \end{cases}    (50)

Using the expressions (45), (47) and (48) obtained for e_{s−k}, k = 1, 2, ..., s−2,
as rational functions of e_s, and e_0 in (43) as a polynomial of e_s,
it follows that expression (50) is a rational function of e_s; multiplying it
by (de_s − 2 e_s)^{i_s}, where i_s is an integer depending on s, expression
(50) can be written as a polynomial of e_s, provided that de_s − 2 e_s = d_s −
e_s ≠ 0. Hence, it has as many solutions as the resulting polynomial degree.
Substituting these solutions in the expressions (45), (47) and (48) obtained
for e_{s−k}, k = 1, 2, ..., s−2, and e_0 from (43), the coefficients e_0 and e_{s−k},
k = 1, 2, ..., s−2, can be obtained. The coefficients d_i, for i = 1, 2, ..., s,
can be obtained using the coefficients e_i, for i = 0, 2, 3, ..., s, and (40). The
solution for the coefficients with s = 3 and s = 4 gives polynomial equations
in the variable e_s of degrees 4 and 6, respectively, and for s ≥ 5 larger degree
polynomials are obtained, and then there are even more solutions for e_s.

Finally, from the equations involving A^i, for i = s, s−1, ..., 0, it is easy
to show that

    f_{s−k} = b_{s−k} − \sum_{i=1}^{s−k−2} d_i e_{s−k−i},    (51)
    f_i = b_i,  i = 2, 1, 0.

Using (36) and Table 1, Table 5 shows the maximum orders that can be

achieved for a given cost C(M) in terms of matrix products with Horner

and Paterson–Stockmeyer methods and with the method given by y_{1s}(A) using
(34) and (35). Note that y_{1s}(A) allows evaluating a polynomial of degree
greater than the Horner and Paterson–Stockmeyer methods for costs from 3M
to 9M, i.e. polynomial degrees from d_{y1s} = 8 to 32, corresponding to s =
2, 3, ..., 8 in y_{1s}(A). We checked that there were at least 4 real solutions for
all the coefficients in (34) and (35) when y_{1s}(A) was equal to the exponential
and cosine Taylor approximations of the corresponding degrees d_{y1s}, avoiding
complex arithmetic if A is a real square matrix.

    C(M)     3   4   5   6   7   8   9  10  11  12
    d_PS     6   9  12  16  20  25  30  36  42  49
    d_y1s    8  12  16  20  24  28  32  36  40  44

Table 5: Order of the approximation d_PS that can be achieved using the Horner and Paterson–
Stockmeyer methods and order d_y1s using the method given by (34) and (35), for a given cost
C in terms of matrix products.

3.1. Combination of y_{1s}(A) with the Horner and Paterson–Stockmeyer methods

The following proposition combines Horner and Paterson–Stockmeyer

evaluation formula (4) with (35) to increase the degree of the resulting poly-

nomial to be evaluated:

Proposition 1. Let z_{1ps}(x) be

    z_{1ps}(x) = (···((y_{1s}(x) x^s + a_{p−1} x^{s−1} + a_{p−2} x^{s−2} + ... + a_{p−s+1} x + a_{p−s})
                 × x^s + a_{p−s−1} x^{s−1} + a_{p−s−2} x^{s−2} + ... + a_{p−2s+1} x + a_{p−2s})
                 × x^s + a_{p−2s−1} x^{s−1} + a_{p−2s−2} x^{s−2} + ... + a_{p−3s+1} x + a_{p−3s})
                 ···
                 × x^s + a_{s−1} x^{s−1} + a_{s−2} x^{s−2} + ··· + a_1 x + a_0,    (52)

where p is a multiple of s and y_{1s}(x) is computed with (34) and (35). Then
the degree of z_{1ps}(x) and its computational cost for x = A ∈ C^{n×n} are

    d_{z1ps} = 4s + p,  C_{z1ps} = (1 + s + p/s) M.    (53)

Proof. The value of d_{z1ps} follows from (36) and (52). For the value of C_{z1ps},
note that the matrix powers A^i, i = 2, 3, ..., s, evaluated for the Horner and
Paterson–Stockmeyer evaluation formula can be reused to compute y_{1s}(A),
and note also that one matrix product is needed to compute y_{1s}(A) A^s in
(52). Then, if p is a multiple of s, using (36) and (52) the value of
C_{z1ps} in (53) follows.

If we apply the evaluation formula (52) to evaluate a polynomial of degree

m+p, i.e. Pm+p(A), it follows that

    z_{1ps}(A) = y_{1s}(A) A^p + \sum_{i=0}^{p−1} a_i A^i = P_{m+p}(A) = \sum_{i=0}^{m+p} b_i A^i.    (54)

    m            8  12  16  20  20  25  30  30  36  42  42  49  56  56  ···
    s            2   3   4   4   5   5   5   6   6   6   7   7   7   8  ···
    p            0   0   0   4   0   5  10   6  12  18  14  21  28  24  ···
    C_PS (M)     4   5   6   7   7   8   9   9  10  11  11  12  13  13  ···
    C_z1ps (M)   3   4   5   6   6   7   8   8   9  10  10  11  12  12  ···

Table 6: Parameters s and p for z_{1ps}(x) from (52) to obtain the same approximation order
m as the Horner and Paterson–Stockmeyer methods with a saving of one matrix product, where
C_PS is the cost of evaluating (4) and C_z1ps is the cost of computing z_{1ps}(x), both costs
in terms of matrix products. The first row shows the maximum values of m obtained in
z_{1ps}(x) for a given number of matrix products.

Therefore, the coefficients a_i, i = 0, 1, ..., p−1, are directly the corresponding
coefficients b_i, i = 0, 1, ..., p−1, from (54), and the coefficients of y_{1s}(A)
can be obtained by changing b_i to b_{i+p} in (38), (40), (43), (45), (47), (48), (50)
and (51).
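Putting the two pieces together, an illustrative Python sketch of (52) (our own function and argument names; the a_i are the low-order coefficients b_0, ..., b_{p−1} and the y_{1s} coefficients come from the shifted solution just described) could look as follows:

```python
import numpy as np

def z1ps(A, a, c, d, e, f):
    """Evaluate z_{1ps}(A) from (52): the top Horner block is y_{1s}(A), the
    remaining p/s blocks hold a[0..p-1] (p a multiple of s).
    Cost: (1 + s + p/s) matrix products, as in (53)."""
    n, s, p = A.shape[0], len(d) - 1, len(a)
    assert p % s == 0
    P = [np.eye(n), A]                       # I, A, A^2, ..., A^s: s-1 products
    for _ in range(2, s + 1):
        P.append(P[-1] @ A)
    y0 = P[s] @ sum(c[i] * P[i] for i in range(1, s + 1))          # 1 product
    Z = ((y0 + sum(d[i] * P[i] for i in range(1, s + 1)))
         @ (y0 + sum(e[i] * P[i] for i in range(2, s + 1)))        # 1 product
         + e[0] * y0 + sum(f[i] * P[i] for i in range(s + 1)))
    for k in range(p // s, 0, -1):           # p/s products in the Horner part
        Z = Z @ P[s] + sum(a[(k - 1) * s + j] * P[j] for j in range(s))
    return Z
```

With p = 0 the loop is empty and the sketch reduces to y_{1s}(A), in line with (55) below.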

Using (53), Table 6 shows the parameters s and p to evaluate a polynomial
of maximum degree m for a given cost using z_{1ps}(A) from (52), and it is
compared to the cost of the Paterson–Stockmeyer method for the same values
of m. Except for m = 8, all the values are in the set m* from Table 1,
and for all of them one matrix product is saved with respect to using only
the Paterson–Stockmeyer method. The evaluation scheme z_{1ps}(A) allows
evaluating polynomials of higher degree than that of the Paterson–Stockmeyer
method for a cost greater than or equal to 3M. Note that for a cost lower
than or equal to 5M the maximum degree is obtained using

    z_{1,p=0,s}(A) = y_{1s}(A),    (55)

from (35). Therefore, z_{1ps}(A) can be considered a generalization of y_{1s}(A).

In order to evaluate polynomials of degrees different from those given in
Table 6, other combinations z_{1ps}(A) of the new method with the Paterson–
Stockmeyer method can be used, where p is not a multiple of s. For instance,
a polynomial of degree m = 23 can be written as

    P_{23}(x) = z_{1,7,4}(x) = (y_{1,4}(x) x^3 + a_6 x^2 + a_5 x + a_4) x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0,    (56)

where the coefficients of y_{1,4}(x) can be obtained similarly to those of y_{1s}(x)
in (54).

    c_10  -6.140022498994532×10^{-17}    e_4  -2.785084196756015×10^{-9}
    c_9   -9.210033748491798×10^{-16}    e_3  -4.032817333361947×10^{-8}
    c_8   -1.980157255925737×10^{-14}    e_2  -5.100472475630675×10^{-7}
    c_7   -4.508311519886735×10^{-13}    e_0  -1.023463999572971×10^{-3}
    c_6   -1.023660713518307×10^{-11}    f_5   4.024189993755686×10^{-13}
    d_5   -1.227011356117036×10^{-10}    f_4   7.556768134694921×10^{-12}
    d_4   -6.770221628797445×10^{-9}     f_3   1.305311326377090×10^{-10}
    d_3   -1.502070379373464×10^{-7}     f_2   2.087675698786810×10^{-9}
    d_2   -3.013961104055248×10^{-6}     f_1   2.505210838544172×10^{-8}
    d_1   -5.893435534477677×10^{-5}     f_0   2.755731922398589×10^{-7}
    e_5   -3.294026127901678×10^{-10}

Table 7: One real solution for the coefficients from (34) and (35) for computing the Taylor
approximation of the exponential of order m = 30 with (52), taking s = 5 and p = 10. Note
that in this case the coefficients in (54) are b_i = 1/i!, i = 0, 1, ..., 30.

Example 3.2. Table 7 presents one solution for the coefficients for an exam-
ple of z_{1ps}(x) from (52), combining (34) and (35) with the Horner and Paterson–
Stockmeyer methods with p = 10 and s = 5, to compute the Taylor approximation
of the matrix exponential of order m = 30.

From (53) the cost of computing z_{1,10,5}(A) is C_{z1,10,5} = 8M, one matrix
product less than using the Horner and Paterson–Stockmeyer methods, see Table
6.

Analogously, using z_{1ps}(x) from (52) with (34) and (35), we computed the
coefficients from (34) and (35) for the Taylor exponential and cosine
approximation polynomials for all the approximation orders m in Table 6,
up to approximation order m = 81. This process always gave several real
solutions for all the coefficients involved. The maximum degree used in the
Taylor approximation of the matrix exponential in double precision arithmetic
from [6] is m = 30, and in the matrix cosine in [8] it is m = 16. Note that
the values from Table 7 can be used directly to evaluate the Taylor approximation
of order m = 30 in the algorithm from [6]. We also checked that using
z_{1,p=0,s}(A) = y_{1s}(A) from (35) also gave real coefficients for computing the Taylor
exponential and cosine approximation polynomials with s = 2, 3, 4. Hence,
if A is a real square matrix, using z_{1ps}(A) we can compute the exponential
and cosine approximations using real arithmetic, saving 1M with respect to
the algorithms in [6, 8] for Taylor polynomial degrees m ∈ m* from Table 1,
m ≥ 12.

Finally, similarly to Example 3.1, we checked the stability of the solutions
for the coefficients in IEEE double precision arithmetic from Table 7, substi-
tuting them in the system of equations (37), (39) (taking de_i = d_i + e_i, where d_i
and e_i are the values from Table 7), (41) and (51). Analogously, in all cases
the relative error |b_i − 1/i!| i!, i = p, p+1, ..., m+p, see (54), was lower than
the unit roundoff u.

In a similar way we also checked the stability for the computation of the
exponential Taylor polynomial approximation for all the degrees m from Table
6 up to m = 81, obtaining the following results:

• There were 4 real solutions for all orders except for m = 25, with 12
  real solutions, m = 49, 64, and 56 (with parameters s = 8, p = 24)
  with 8 real solutions, and m = 42 (with p = 14, s = 7) with 20 real
  solutions.

• The solutions for e_s were of decreasing modulus, from m = 12 with |e_s|
  of order 10^{-2} to m = 81 with |e_s| of order 10^{-44}.

• In the case m = 42 (with p = 14, s = 7) the 20 solutions all had
  positive values e_s ∈ [2.23×10^{-16}, 8.07×10^{-16}]. Taking the solutions
  e_s in double precision arithmetic, from the 20 solutions there were 12
  solutions that gave a maximum relative error for all coefficients b_i less
  than 3u, being stable. However, 8 solutions showed certain signs of
  instability, giving a maximum relative error for the coefficients b_i between
  5.04×10^{-12} and 2.99×10^{-10} > u. Therefore, it is important to select a
  solution for e_s in double precision arithmetic that gives relative errors
  for all coefficients b_i of order u.

We also checked the stability for the Taylor approximation of the matrix
exponential in all the cases from Table 5 and found that the worst case
was m = 28 with s = 7. This is not a case of practical use since, from
Table 5, it has a cost 8M, and from Table 6, using z_{1ps}(A) with p = 10
and s = 5 gives the greater order m = 30 for the same cost, and that
option was checked above to be stable. However, we checked its stability
as a worst case study. This case gave 3 real solutions, one of
them with multiplicity 10. For the coefficients obtained using the two solutions
e_s with multiplicity 1, the maximum relative errors for all coefficients
b_i were of order 10^{-15} > u. We also checked the scalar case A = 1,
giving relative errors |exp(1) − y_{1,s=7}(1)|/exp(1) = 4.36×10^{-16} and
3.70×10^{-15}, respectively. However, using the solution with multiplicity
10 gave a maximum relative error 10.75 ≫ u for coefficient b_8. For
the rest of the coefficients the maximum relative error was 1.49×10^{-14},
and |exp(1) − y_{1,s=7}(1)|/exp(1) = 9.81×10^{-5}, so the accuracy was
much lower when using the solution of e_s with multiplicity 10.

Therefore, it is necessary to check the stability of the solutions for e_s
before using the method to evaluate a given polynomial. In general, we
propose to select the solution for e_s in double precision arithmetic that
gives the lowest maximum relative error for all coefficients b_i. If there is
no solution giving relative errors of order u for a given polynomial with
degree m, a different parameter selection from Tables 6 and 5 should be
tested, since, from Tables 5 and 6, for m > 16 there are two possibilities for p and
s that give each value of m.

4. Comparison with existing methods

Using (36), (53) and Tables 1, 2, 3, 5 and 6, Table 8 shows
the approximation orders that can be obtained with Taylor poly-
nomial approximations evaluated using the Horner and Paterson–Stockmeyer
methods PS_m(A), y_{1s}(x) from (35), z_{1ps}(A) from (52), the Padé rational ap-

proximation from Section 2.2, and the mixed rational and polynomial ap-

proximation from Section 2.3, for a given cost in terms of matrix products, if

each approximation reproduces the ﬁrst terms of the Taylor series of a given

function f, whenever all the approximations exist. Note that the cost of

solving the multiple right-hand side linear system in rational approximations

was taken as 4/3M.

Table 8 shows that the polynomial approximation that allows for the
highest approximation order is y_{1s}(A) for a cost C ≤ 6M and z_{1ps}(A) for C ≥
3M. Note that in Section 3.1, for C ≤ 5M, we took z_{1ps}(A) = z_{1,p=0,s}(A) =
y_{1s}(A), see (55). Hence, the approximation orders allowed by z_{1ps}(A) for C ≥
3M are higher than the approximation orders available with both the Paterson–
Stockmeyer and the rational Padé methods. The highest order for C ≥ 6M is
given by the mixed rational and polynomial approximation t_{ijs}(A) (10). In
the following section particular examples are given in order to increase the
efficiency of polynomial approximations even more.

    C(M)   PS_m(A)  y_1s(A)  z_1ps(A)  C_R(M)  r_mm(A)  C_RP(M)  t_ijs(A)
    3       6        8        8        3.33     6       3.33      9
    4       9       12       12        4.33     8       4.33     12
    5      12       16       16        5.33    12       5.33     16
    6      16       20       20        6.33    16       6        21
    7      20       24       25        7.33    20       7        28
    8      25       28       30        8.33    24       8        35
    9      30       32       36        9.33    30       9        42
    10     36       36       42       10.33    36      10        49
    11     42       40       49       11.33    42      11        56

Table 8: Maximum approximation orders if any of the approximations reproduce the first
terms of the Taylor series of a given function f, for a given cost C for polynomial ap-
proximations, C_R for rational approximations and C_RP for mixed rational and polynomial
approximants, where rational approximations are computed as in Section 2.2 and mixed
rational and polynomial approximants are evaluated as in Section 2.3. The polynomial
approximations considered are the Horner and Paterson–Stockmeyer PS_m(A) from Section
2.1, and y_{1s}(A) and z_{1ps}(A) from Section 3. Bold style is applied to the maximum degrees
over all polynomial approximations, and to t_{ijs}(A) when it provides the maximum degree
over all approximations with an integer cost.

5. General expressions

This section gives examples that suggest new general expressions for eval-

uating matrix polynomials more eﬃciently than the evaluation schemes given

in Section 3.

Example 5.1. Consider

    y_{02}(A) = A^2 (c_{16} A^2 + c_{15} A),    (57)
    y_{12}(A) = (y_{02}(A) + c_{14} A^2 + c_{13} A)(y_{02}(A) + c_{12} A^2 + c_{11} I) + c_{10} y_{02}(A),    (58)
    y_{22}(A) = (y_{12}(A) + c_9 A^2 + c_8 A)(y_{12}(A) + c_7 y_{02}(A) + c_6 A)
                + c_5 y_{12}(A) + c_4 y_{02}(A) + c_3 A^2 + c_2 A + c_1 I,    (59)

where the coefficients are numbered consecutively and A^2 is computed once
and stored to be reused in all the computations. It is easy to show that the
degree of the polynomial y_{22}(A) is m = 16 and that it can be evaluated at a cost
C_{y22} = 4M.

Using the function solve from the MATLAB Symbolic Math Toolbox, Table 9
gives one solution for the coefficients to compute the exponential Taylor ap-
proximation P_m(A) of order m = 15, i.e. b_i = 1/i!, i = 0, 1, ..., 15.

    c_16   4.018761610201036×10^{-4}    c_8    2.116367017255747×10^{0}
    c_15   2.945531440279683×10^{-3}    c_7   -5.792361707073261×10^{0}
    c_14   8.712167566050691×10^{-2}    c_6   -1.491449188999246×10^{-1}
    c_13   4.017568440673568×10^{-1}    c_5    1.040801735231354×10^{1}
    c_12  -6.352311335612147×10^{-2}    c_4   -6.331712455883370×10^{1}
    c_11   2.684264296504340×10^{-1}    c_3    3.484665863364574×10^{-1}
    c_10   1.857143141426026×10^{1}     c_2   -1.224230230553340×10^{-1}
    c_9    2.381070373870987×10^{-1}    c_1    1

Table 9: Coefficients of y_{02}, y_{12}, y_{22} from (57)-(59) for computing the matrix exponential
Taylor approximation of order m = 15.

For the solution given in Table 9, if we write y_{22}(A) as a polynomial P_m(A) of degree
m = 16, the relative error for b_{16} with respect to the corresponding Taylor
polynomial coefficient is

    (b_{16} − 1/16!) 16! = −0.454,    (60)

showing three significant digits.
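As a check (our own sketch, not one of the paper's codes), evaluating (57)-(59) with the Table 9 coefficients in Python and comparing against the degree-15 Taylor polynomial of the exponential shows agreement up to the unmatched degree-16 term discussed in (60):

```python
import numpy as np

# Table 9 coefficients (exponential Taylor approximation of order m = 15)
c = {16: 4.018761610201036e-4, 15: 2.945531440279683e-3, 14: 8.712167566050691e-2,
     13: 4.017568440673568e-1, 12: -6.352311335612147e-2, 11: 2.684264296504340e-1,
     10: 1.857143141426026e1,   9: 2.381070373870987e-1,   8: 2.116367017255747e0,
      7: -5.792361707073261e0,  6: -1.491449188999246e-1,  5: 1.040801735231354e1,
      4: -6.331712455883370e1,  3: 3.484665863364574e-1,   2: -1.224230230553340e-1,
      1: 1.0}

def y22_exp15(A):
    """Evaluate (57)-(59): a degree-16 polynomial matching exp's Taylor series
    up to order 15, at a cost of 4 matrix products."""
    n = A.shape[0]
    I = np.eye(n)
    A2 = A @ A                                                    # product 1
    y02 = A2 @ (c[16] * A2 + c[15] * A)                           # product 2
    y12 = ((y02 + c[14] * A2 + c[13] * A)
           @ (y02 + c[12] * A2 + c[11] * I) + c[10] * y02)        # product 3
    return ((y12 + c[9] * A2 + c[8] * A)
            @ (y12 + c[7] * y02 + c[6] * A)                       # product 4
            + c[5] * y12 + c[4] * y02 + c[3] * A2 + c[2] * A + c[1] * I)

A = np.random.randn(4, 4) / 4
T, term = np.eye(4), np.eye(4)
for i in range(1, 16):                     # degree-15 Taylor polynomial
    term = term @ A / i
    T += term
# small residual, dominated by the unmatched A^16 term, cf. (60)
print(np.linalg.norm(y22_exp15(A) - T) / np.linalg.norm(T))
```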

We explored different possibilities for a new coefficient c_0 added in (57)-
(59), trying to compute the matrix exponential and the matrix cosine Taylor
approximations of order 16, for instance changing (58) to

    y_{12}(A) = (y_{02}(A) + c_{14} A^2 + c_{13} A + c_0 I)(y_{02}(A) + c_{12} A^2 + c_{11} I) + c_{10} y_{02}(A),    (61)

and other options. However, sometimes MATLAB could not find an explicit
solution for the coefficients, and at other times MATLAB gave solutions
with numerical instability.

Note that in Example 5.1 the degree of y_{k,2}(A), k = 1, 2, is twice the de-
gree of the polynomial y_{k−1,2}(A), increasing the cost by just 1M when com-
puting y_{k,2}(A) using y_{k−1,2}(A). Therefore, the polynomial degree increases
exponentially while the cost increases linearly. Following this idea, Proposi-
tion 2 gives expressions y_{ks}(A), k ≥ 1, more general than (34) and (35), where
the degree of the polynomial y_{ks}(A) is twice the degree of the polynomial
y_{k−1,s}(A), k ≥ 1, while the cost increases by 1M when computing y_{ks}(A)
using y_{k−1,s}(A):

Proposition 2. Let

    y_{0s}(x) = x^s \sum_{i=1}^{s} c_i^{(0,1)} x^i + \sum_{i=0}^{s} c_i^{(0,2)} x^i,    (62)

    y_{1s}(x) = \left( \sum_{i=0}^{0} c_i^{(1,1)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(1,2)} x^i \right)
                \left( \sum_{i=0}^{0} c_i^{(1,3)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(1,4)} x^i \right)
                + \sum_{i=0}^{0} c_i^{(1,5)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(1,6)} x^i,    (63)

    y_{2s}(x) = \left( \sum_{i=0}^{1} c_i^{(2,1)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(2,2)} x^i \right)
                \left( \sum_{i=0}^{1} c_i^{(2,3)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(2,4)} x^i \right)
                + \sum_{i=0}^{1} c_i^{(2,5)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(2,6)} x^i,    (64)
    ...
    y_{ks}(x) = \left( \sum_{i=0}^{k−1} c_i^{(k,1)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(k,2)} x^i \right)
                \left( \sum_{i=0}^{k−1} c_i^{(k,3)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(k,4)} x^i \right)
                + \sum_{i=0}^{k−1} c_i^{(k,5)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(k,6)} x^i,    (65)

where y_{ks}(x) is a polynomial in x. Then the maximum polynomial degree,
denoted by d_{yks}, and the computing cost if x = A, A ∈ C^{n×n}, in terms of
matrix products, denoted by C_{yks}, are given by

    d_{yks} = 2^{k+1} s,  C_{yks} = (s + k) M.    (66)

Proof. From (62), the maximum degree of the polynomial y_{0s}(x) is 2s.
Then, using (62)-(65), the maximum degree of y_{is}(x), i ≤ k, is 2^{i+1} s.
If x = A, A ∈ C^{n×n}, then the cost of computing y_{ks}(A) is s − 1 matrix
products for computing A^i, i = 2, 3, ..., s, and one matrix product in
each iteration of (62)-(65), i.e. k + 1 further products. Therefore, C_{yks} = (s + k) M.

Note that (34) and (35) are particular cases of Proposition 2 where k = 1
and some coefficients c_i^{(l,j)}, l = 0, 1, in (62) and (63) are zero. Similarly,
(57)-(59) are particular cases of (62)-(64) where k = 2, s = 2 and some
coefficients c_i^{(l,j)}, l = 0, 1, 2, are also zero.
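To make the recursion concrete, the following Python sketch (our own; the coefficient containers and their layout are assumptions, not the paper's notation) evaluates y_{ks}(A) from (62)-(65) for given coefficient lists, with the (s+k)-product cost of (66):

```python
import numpy as np

def yks(A, c01, c02, levels):
    """Evaluate y_{ks}(A) from (62)-(65).  c01[1..s], c02[0..s] define y_{0s};
    levels[k-1] = (c1, c2, c3, c4, c5, c6) holds the level-k coefficients,
    where c1, c3, c5 have length k (weights of y_{0s}, ..., y_{k-1,s}) and
    c2, c4, c6 have length s+1 (weights of I, A, ..., A^s)."""
    n, s = A.shape[0], len(c02) - 1
    P = [np.eye(n), A]
    for _ in range(2, s + 1):
        P.append(P[-1] @ A)                                   # s-1 products
    def lin(w, V):                                            # sum_i w[i] V[i]
        return sum(wi * Vi for wi, Vi in zip(w, V))
    Y = [P[s] @ lin(c01[1:], P[1:s + 1]) + lin(c02, P)]       # y_{0s}: 1 product
    for c1, c2, c3, c4, c5, c6 in levels:                     # 1 product per level
        Y.append((lin(c1, Y) + lin(c2, P)) @ (lin(c3, Y) + lin(c4, P))
                 + lin(c5, Y) + lin(c6, P))
    return Y[-1]
```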

If we write (65) in powers of x as

    y_{ks}(x) = \sum_{i=0}^{m} a_i x^i,    (67)

then a_i, i = 0, 1, ..., m, are functions of the coefficients c_i^{(l,j)}, for all i, j, l in
(62)-(65). Hence, it is possible to evaluate the matrix polynomial P_m(A) using
(62)-(65) if the system of equations

    a_m(c_i^{(l,j)}) = b_m,
    a_{m−1}(c_i^{(l,j)}) = b_{m−1},    (68)
    ...
    a_0(c_i^{(l,j)}) = b_0,

in all the coefficients c_i^{(l,j)} from (62)-(65) involved in each coefficient a_i, i =
0, 1, ..., m, has at least one solution, where b_i are the polynomial coefficients

of Pm(A). We have obtained a general solution for evaluating polynomials

using (34) and (35) corresponding to particular cases of (62) and (63). And

we obtained one solution for computing the exponential Taylor approxima-

tion of order 15 with (57)-(59). Future work will address finding general
solutions for evaluating matrix polynomials of different degrees using (62)-
(65), and studying whether there are at least particular solutions for evaluating
polynomials such as the Taylor polynomial approximations of certain de-
grees of different matrix functions. That is the case of Example 5.1, which
provides formulas for computing the exponential Taylor approximation poly-
nomial of order m = 15 with a cost C = 4M. From Table 8 it follows that
with a cost of 4M the Paterson–Stockmeyer method allows computing the ma-
trix exponential Taylor approximation polynomial of order only m = 9, the Padé
rational method r_mm(A) allows an order less than 8, the mixed rational and
polynomial approximation t_{ijs}(A) allows an order less than 12, and the new
method based on (34) and (35) allows an order m = 12.

In the following example we consider the computation of the Taylor ex-

ponential approximation of order 16 by using the product of two polynomials

of degree 8, both evaluated using (14) and (15).

Example 5.2. Let

    h_{2m_1}(A) = P_{m_1}(A) P'_{m_1}(A) + β_0 = \left( \sum_{i=0}^{m_1} b_i A^i \right) \left( \sum_{i=0}^{m_1} b'_i A^i \right) + β_0,    (69)

    b_8    2.186201576339059×10^{-7}    b'_8   2.186201576339059×10^{-7}
    b_7    9.839057366529322×10^{-7}    b'_7   2.514016785489562×10^{-6}
    b_6    1.058964584814256×10^{-5}    b'_6   3.056479369585950×10^{-5}
    b_5    1.554700173279057×10^{-4}    b'_5   3.197607034851565×10^{-4}
    b_4    2.256892506343887×10^{-3}    b'_4   2.585006547542889×10^{-3}
    b_3    2.358987357109499×10^{-2}    b'_3   1.619043970183846×10^{-2}
    b_2    1.673139636901279×10^{-1}    b'_2   8.092036376147299×10^{-2}
    b_1    7.723603212944010×10^{-1}    b'_1   3.229486011362677×10^{-1}
    b_0    3.096467971936040×10^{0}     β_0    1

Table 10: Coefficients from (69) for computing the matrix exponential Taylor approxima-
tion of order m = 16, where the coefficients b'_8 = b_8 and b'_0 = 0.

    c_4    4.675683454147702×10^{-4}    c'_4   4.675683454147702×10^{-4}
    c_3    1.052151783051235×10^{-3}    c'_3   2.688394980266927×10^{-3}
    d_2   -3.289442879547955×10^{-2}    d'_2   2.219811707032801×10^{-2}
    d_1    2.868706220817633×10^{-1}    d'_1   3.968985915411500×10^{-1}
    e_2    5.317514832355802×10^{-2}    e'_2   2.771400028062960×10^{-2}
    e_0    7.922322450524197×10^{0}     e'_0   1.930814505527068×10^{0}
    f_2    1.673139636901279×10^{-1}    f'_2   8.092036376147299×10^{-2}
    f_1    7.723603212944010×10^{-1}    f'_1   1.614743005681339×10^{-1}
    f_0    3.096467971936040×10^{0}     f'_0   0

Table 11: Coefficients from the system (16)-(24) for evaluating the polynomials y_1(A) = P_{m_1}(A)
and y'_1(A) = P'_{m_1}(A) from (69), with coefficients given by Table 10. Note that f'_0 = 0 since
y'_1(0) = b'_0 = 0.

where we took m_1 = 8, b'_8 = b_8, b'_0 = 0 and h_{2m_1}(0) = β_0, and, therefore,
P_{m_1}(A) and P'_{m_1}(A) are both polynomials of the form (3) of degree 8, and h_{2m_1}(A)
can be written as a polynomial of degree 16 with 17 coefficients, i.e. b_i,
i = 0, 1, ..., 8, b'_i, i = 1, ..., 7, and β_0. Using the MATLAB Symbolic Math
Toolbox solve function, Table 10 presents one solution for the coefficients of
an example where h_{2m_1}(A) = \sum_{i=0}^{16} A^i/i!, i.e. the exponential Taylor polyno-
mial approximation of degree m = 16.

Note that one can evaluate both polynomials P_{m_1}(A) and P'_{m_1}(A) using
the evaluation scheme (14) and (15), see Example 3.1. Finally, from (69) it
follows that β_0 = 1, so that h_{2m_1}(0) = exp(0) = 1. Table 11 shows one solution
for the coefficients from (16)-(24) using (25)-(32), taking y_{1s}(A) = P_{m_1}(A),
and the coefficients taking y'_{1s}(A) = P'_{m_1}(A), corresponding to c'_4, c'_3, d'_2, d'_1,
e'_2, e'_0, f'_2, f'_1 and f'_0.
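As a quick consistency check of (69) (an illustrative Python sketch of ours; here the two degree-8 factors are evaluated by plain Horner just for verification, whereas in the cost count each would be evaluated with the scheme (14)-(15) and the coefficients of Table 11), the Table 10 coefficients indeed reproduce the order-16 exponential Taylor polynomial:

```python
import numpy as np

# Coefficients b_i and b'_i of the two degree-8 factors in (69), from Table 10
b = [3.096467971936040e0, 7.723603212944010e-1, 1.673139636901279e-1,
     2.358987357109499e-2, 2.256892506343887e-3, 1.554700173279057e-4,
     1.058964584814256e-5, 9.839057366529322e-7, 2.186201576339059e-7]
bp = [0.0, 3.229486011362677e-1, 8.092036376147299e-2, 1.619043970183846e-2,
      2.585006547542889e-3, 3.197607034851565e-4, 3.056479369585950e-5,
      2.514016785489562e-6, 2.186201576339059e-7]
beta0 = 1.0

def horner(coeffs, A):
    """Plain Horner evaluation of sum_i coeffs[i] A^i (for checking only)."""
    n = A.shape[0]
    Y = coeffs[-1] * np.eye(n)
    for cc in reversed(coeffs[:-1]):
        Y = Y @ A + cc * np.eye(n)
    return Y

A = np.random.randn(4, 4) / 4
H = horner(b, A) @ horner(bp, A) + beta0 * np.eye(4)    # h_{2m_1}(A) from (69)
T, term = np.eye(4), np.eye(4)
for i in range(1, 17):                                  # order-16 exp Taylor
    term = term @ A / i
    T += term
print(np.linalg.norm(H - T) / np.linalg.norm(T))        # small residual expected
```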

    C(M)      6   7   8   9  10  11  12
    d_PS     16  20  25  30  36  42  49
    d_z1ps   20  25  30  36  42  49  56
    d_hm1    16  24  32  40  48  56  64

Table 12: Order of the approximation d_PS that can be obtained using the Horner and
Paterson–Stockmeyer methods, order d_z1ps that can be obtained using z_{1ps}(A) from (52),
and order d_hm1 that can be obtained using the method given by h_{m_1}(A) from (69), using (34)
and (35) to evaluate the polynomials therein, for a given cost C in terms of matrix
products, whenever the solutions for the coefficients from (69), (34) and (35) exist.

In general, if we evaluate both polynomials P_{m_1}(A) and P'_{m_1}(A) by using
(34) and (35) with m_1 = 4s, and if there exists a solution for the coefficients b_i
and b'_i of P_{m_1}(A) and P'_{m_1}(A), then using (36) the degree of the matrix polynomial
h_{2m_1}(A) and its computing cost are

    d_{h2m_1} = 8s,  C_{h2m_1} = (s + 4) M.    (70)

Table 12 shows the comparison of the polynomial degrees that can be

obtained by Horner and Paterson–Stockmeyer methods, z1ps(A) from (52)

and h2m1(A) given by (69) varying m1, for a given cost, whenever a solution

for all the coefficients involved in h_{2m_1}(A) exists. Since for C > 6M they
would be more efficient than the Paterson–Stockmeyer method, and for C > 7M
they would be more efficient than the method given by (52), it is worth
studying whether evaluation schemes like (69) exist in general, or whether at least
they exist for the polynomial approximation of specific matrix functions or
for the evaluation of matrix polynomials arising in applications. Moreover, in
order to obtain a polynomial degree equal to 2m_1, note that one can think
of other possibilities for having 2m_1 + 1 coefficients in h_{2m_1}(A) different from
selecting b_{m_1} = b'_{m_1} and b'_0 = 0 as in Example 5.2.

Note that, similarly to Section 3.1, the Paterson–Stockmeyer method can be
combined with any other method proposed above. And analogously to Ex-
ample 5.2, we can also obtain new methods for evaluating matrix polynomi-
als and matrix polynomial approximations using products of the evaluation
schemes proposed above, whenever a solution for all the coefficients in-
volved exists. The same powers A^i, i = 1, 2, ..., s, should be used in each
evaluation scheme involved, so that they can be reused in all the computa-
tions. It is important to note that, even in the case of the well known Padé

approximations, for a given function f, k and m, a [k/m] Padé approximant
r_{k,m} might not exist, see Section 2.2. Therefore, the existence of particular
cases of the methods proposed in this section for computing matrix functions
that arise often in applications is useful if they are more efficient than the

existing methods in those concrete cases. That is the case of Example 5.1

with the matrix exponential Taylor approximation of order 15 which can be

computed with just 4M.

6. Conclusions

This paper proposes new general evaluation schemes for matrix poly-
nomials, given by y_{0s}(A) (34), y_{1s}(A) (35) and z_{1ps}(A) (52), and a method to
check their stability was given. It was shown that these evaluation schemes
allow evaluating polynomials of degree higher than that of the Paterson–
Stockmeyer method for the same cost. It was also shown that they provide
a greater Taylor approximation order than diagonal Padé approximation for
the same cost. Moreover, the new evaluation schemes are more efficient than
the recent mixed rational and polynomial approximation from [9] for several
orders of approximation.

Through Examples 5.1 and 5.2, we suggest the study of more general poly-

nomial evaluation schemes that can be even more eﬃcient, and applications

to the Taylor approximation of matrix functions were given.

With the proposed methods we can state that the combination of Horner

and Paterson–Stockmeyer methods is no longer the most eﬃcient general

method for evaluating matrix polynomials, and that Padé approximations

are no longer more accurate than polynomial approximations for the same

cost either.

Future work is:

•To determine if it is possible to ﬁnd general solutions for evaluating

matrix polynomials using (62)-(65) with s≥2 and k≥2, or at least

particular solutions for cases of interest as in Example 5.1.

•To study if there are general solutions, or at least particular solutions

for the matrix polynomial evaluation using products of the new pro-

posed matrix polynomial evaluation schemes, similarly to Example 5.2.


7. Acknowledgements

This work has been supported by the Spanish Ministerio de Economía y
Competitividad and the European Regional Development Fund (ERDF) grant

TIN2014-59294-P. We thank the anonymous referee who revised this paper

so thoroughly and carefully.

[1] M. S. Paterson, L. J. Stockmeyer, On the number of nonscalar mul-

tiplications necessary to evaluate polynomials, SIAM J. Comput.,

2(1)(1973), pp. 60–66.

[2] N. J. Higham, Functions of Matrices: Theory and Computation, Society

for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.

[3] G. H. Golub, C. F. Van Loan, Matrix Computations, 3rd Ed., Johns Hopkins

Studies in Math. Sci., The Johns Hopkins University Press, 1996.

[4] C. B. Moler, C. F. Van Loan, Nineteen dubious ways to compute the expo-

nential of a matrix, twenty-ﬁve years later, SIAM Rev., 45 (2003), pp.

3–49.

[5] A. H. Al-Mohy, N. J. Higham, A new scaling and squaring algorithm

for the matrix exponential, SIAM J. Matrix Anal. Appl., 31(3)(2009)

970–989.

[6] P. Ruiz, J. Sastre, J. Ibáñez, E. Defez, High performance computing

of the matrix exponential, J. Comput. Appl. Math., 291 (2016), pp.

370-379.

[7] A. H. Al-Mohy, N. J. Higham, S. Relton, New Algorithms for Computing

the Matrix Sine and Cosine Separately or Simultaneously. SIAM J. Sci.

Comput., 37(1)(2015), pp. A456-A487.

[8] J. Sastre, J. Ibáñez, P. Alonso, J. Peinado, E. Defez, Two algorithms

for computing the matrix cosine function, Appl. Math. Comput., 312

(2017), pp. 66-77.

[9] J. Sastre, Eﬃcient mixed rational and polynomial approximation of ma-

trix functions, Appl. Math. Comput., 218(24)(2012), pp. 11938–11946.

[10] S. Blackford, J. Dongarra, Installation guide for LAPACK, LAPACK

Working Note 41, Department of Computer Science University of Ten-

nessee, 1999.
