Efficient evaluation of matrix polynomials
J. Sastre$^a$
$^a$Instituto de Telecomunicaciones y Aplicaciones Multimedia, Universitat Politècnica de València, Camino de Vera s/n, 46022-Valencia (Spain)
Abstract
This paper presents a new family of methods for evaluating matrix polynomi-
als more efficiently than the state-of-the-art Paterson–Stockmeyer method.
Examples of the application of the methods to the Taylor polynomial approx-
imation of matrix functions like the matrix exponential and matrix cosine are
given. Their efficiency is compared with that of the best existing evaluation
schemes for general polynomial and rational approximations, and also with
a recent method based on mixed rational and polynomial approximants. For
many years, the Paterson–Stockmeyer method has been considered the most
efficient general method for the evaluation of matrix polynomials. In this pa-
per we show that this statement is no longer true. Moreover, for many years
rational approximations have been considered more efficient than polynomial
approximations, although recently it has been shown that often this is not
the case in the computation of the matrix exponential and matrix cosine. In
this paper we show that in fact polynomial approximations provide a higher
order of approximation than the state-of-the-art computational methods for
rational approximations for the same cost in terms of matrix products.
Keywords: matrix, polynomial, rational, mixed rational and polynomial,
approximation, computation, matrix function.
PACS: 87.64.Aa
1. Introduction
In this paper we propose a new family of methods for evaluating matrix polynomials more efficiently than the state-of-the-art Paterson–Stockmeyer method combined with Horner's method [1], [2, Sec. 4.2]. The proposed methods are applied to compute efficiently Taylor polynomial approximations of matrix functions. The computation of matrix functions is a research field with applications in many areas of science and many algorithms for their computation have been proposed [2, 3]. Among all matrix functions, the matrix exponential has attracted special attention, see [4, 5, 6] and the references therein, and lately the matrix cosine, see [7, 8] and the references therein. The main methods for computing matrix functions are those based on rational approximations, like Padé or Chebyshev approximations, polynomial approximations, like Taylor approximation, similarity transformations and matrix iterations [2]. Moreover, a new kind of approximation based on mixed rational and polynomial approximants has been proposed in [9].
Email address: jsastrem@upv.es (J. Sastre)
Preprint submitted to Elsevier November 24, 2017
Recently, it has been shown that using the combination of Horner and
Paterson–Stockmeyer methods [1], [2, Sec. 4.2], polynomial approximations
may be more efficient than rational Padé approximations for both the matrix
exponential and cosine [6, 8]. In this paper we show that using the proposed
matrix polynomial evaluation methods, polynomial approximations are more
accurate than existing state-of-the-art methods for evaluating both polyno-
mial and rational approximants for the same computing cost. Moreover, we
show that the new methods are more efficient than the recent mixed ratio-
nal and polynomial approximation [9] in some cases, and examples for the
computation of the matrix exponential and the matrix cosine are given.
Throughout this paper $\lceil x \rceil$ denotes the lowest integer not less than $x$, $\lfloor x \rfloor$ denotes the highest integer not exceeding $x$, $\mathbb{N}$ denotes the set of positive integer numbers, $\mathbb{C}^{n\times n}$ and $\mathbb{R}^{n\times n}$ denote the sets of complex and real matrices of size $n\times n$, respectively, $I$ denotes the identity matrix for both sets, and $\mathcal{R}_{k,m}$ denotes the space of rational functions with numerator and denominator of degrees at most $k$ and $m$, respectively.
Note that the multiplication by the matrix inverse in matrix rational
approximations is calculated as the solution of a multiple right-hand side
linear system. Therefore, the cost of evaluating polynomial and rational
approximations will be given in terms of the number of matrix products,
denoted by $M$, and the cost of the solution of multiple right-hand side linear systems $AX = B$, where matrices $A$ and $B$ are $n\times n$, denoted by $D$. From [10, App. C] it follows that, see [9, p. 11940]:
$$ D \approx \tfrac{4}{3} M. \tag{1} $$
This paper is organized as follows. Section 2 recalls some results for
efficient Taylor, Padé, and mixed rational and polynomial approximation of general matrix functions. Section 3 deals with the new matrix polynomial evaluation methods giving examples for the computation of the matrix
exponential and the matrix cosine. Section 4 compares the new techniques
with efficient state-of-the-art evaluation schemes for polynomial, rational and
mixed rational and polynomial approximants. Section 5 gives examples for
the matrix exponential computation even more efficient than the ones given
in Section 3, suggesting more general formulas for evaluating matrix polyno-
mials. Finally, conclusions are given in Section 6.
2. Polynomial, rational, and mixed rational and polynomial ap-
proximants
This section summarizes some results on the computational costs of Taylor, Padé, and the mixed rational and polynomial approximants given in [9].
2.1. Taylor approximation of matrix functions
If $f(A)$ is a matrix function defined by a Taylor series according to Theorem 4.7 of [2, p. 76], where $A$ is a complex square matrix, then we will denote by $T_m(A)$ the matrix polynomial defined by the truncated Taylor series of degree $m$ of $f(A)$. For scalar $x \in \mathbb{C}$ it follows that
$$ f(x) - T_m(x) = O(x^{m+1}), \tag{2} $$
about the origin, and, from now on, we will refer to $m$ as the order of the Taylor approximation. The most efficient method in the literature to evaluate a matrix polynomial
$$ P_m(A) = \sum_{i=0}^{m} b_i A^i, \tag{3} $$
is the combination of Horner and Paterson–Stockmeyer methods [1] given by
$$
\begin{aligned}
PS_m(A) = {} & \Big(\Big(\cdots\Big(b_m A^s + b_{m-1}A^{s-1} + \cdots + b_{m-s+1}A + b_{m-s}I\Big) \\
& \times A^s + b_{m-s-1}A^{s-1} + b_{m-s-2}A^{s-2} + \cdots + b_{m-2s+1}A + b_{m-2s}I\Big) \\
& \times A^s + b_{m-2s-1}A^{s-1} + b_{m-2s-2}A^{s-2} + \cdots + b_{m-3s+1}A + b_{m-3s}I\Big) \\
& \;\;\vdots \\
& \times A^s + b_{s-1}A^{s-1} + b_{s-2}A^{s-2} + \cdots + b_1 A + b_0 I,
\end{aligned} \tag{4}
$$
where the integer $s > 0$ divides $m$ and the matrix powers $A^2, A^3, \ldots, A^s$ are computed and stored previously.

m*      1   2   4   6   9   12   16   20   25   30   36
C_PS    0   1   2   3   4    5    6    7    8    9   10

Table 1: Cost $C_{PS}$ in terms of matrix products for the evaluation of polynomial $P_m(A)$ with Horner and Paterson–Stockmeyer methods for the first eleven values $m^*$ that maximize the polynomial degree obtained for a given cost.
Table 1 shows the maximum values of $m$ that can be obtained for a given number of matrix products in $T_m(A)$ using the Paterson–Stockmeyer method, corresponding to $m = s^2$ and $m = s(s+1)$, for $s \in \mathbb{N}$; these values form the set $m^*$. The cost of evaluating (4), denoted by $C_{PS}$, for the values in $m^*$ is given by [9, Eq. (6)]
$$ C_{PS} = (r + s - 2)M, \quad \text{with } r = m/s,\ m \in m^*. \tag{5} $$
Table 1 presents the cost $C_{PS}$ of evaluating (4) in terms of matrix products for the first eleven values of $m^*$. For orders $m \notin m^*$ we evaluate $P_m(A) = PS_{m_0}(A)$ using (4), taking $m_0 = \min\{m_1 \in m^*,\ m_1 > m\}$ and setting the coefficients $b_i = 0$ in (4) for $i = m_0, m_0 - 1, \ldots, m+1$, at the same cost as evaluating $PS_{m_0}(A)$. Note that because of the way the polynomial is evaluated, the cost of using (4) is lower than that of Paterson–Stockmeyer as implemented in [2, Sec. 4.2] (compare (5) and [2, Eq. (4.3)]).
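As an illustration of (4) and (5), the following is a minimal sketch in plain Python, not the author's implementation. The helper names `matmul`, `lincomb` and `paterson_stockmeyer` are hypothetical, matrices are nested lists, and exact rational arithmetic is assumed so that the product count and the result can be checked without rounding effects.

```python
from fractions import Fraction


def matmul(X, Y):
    """Dense product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(coeffs, mats):
    """Return sum_k coeffs[k] * mats[k]; this costs no matrix products."""
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]


def paterson_stockmeyer(b, A, s):
    """Evaluate P_m(A) = sum_i b[i] A^i as in (4).

    Assumes s > 0 divides m = len(b) - 1. Returns (value, products),
    where products counts the matrix products actually performed.
    """
    m = len(b) - 1
    assert s > 0 and m % s == 0
    n = len(A)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    powers = [identity, A]            # powers[i] holds A^i
    products = 0
    for _ in range(2, s + 1):         # A^2, ..., A^s: s - 1 products
        powers.append(matmul(powers[-1], A))
        products += 1
    # Leading block b_m A^s + ... + b_{m-s} I needs no product.
    acc = lincomb(b[m - s:m + 1], powers)
    # Each remaining block costs one product by A^s: r - 1 products.
    for top in range(m - s, 0, -s):
        low = lincomb(b[top - s:top], powers[:s])
        acc = lincomb([1, 1], [matmul(acc, powers[s]), low])
        products += 1
    return acc, products
```

For $m = 9$ and $s = 3$ the sketch performs $r + s - 2 = 4$ products, in agreement with (5) and Table 1.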
The matrix exponential is the most studied matrix function [4], [2, Chap.
10]. For $A \in \mathbb{C}^{n\times n}$ the matrix exponential of $A$ can be defined by the Taylor series
$$ \exp(A) = \sum_{i \ge 0} \frac{A^i}{i!}. \tag{6} $$
Another matrix function that has received attention recently is the matrix cosine, which can be defined analogously by means of its Taylor series
$$ \cos(A) = \sum_{i \ge 0} \frac{(-1)^i A^{2i}}{(2i)!}. \tag{7} $$
Several efficient algorithms based on Taylor approximations have been pro-
posed recently for the computation of the matrix exponential and cosine
[6, 8].
m+     1      2      3      4      6      8      10     12     15     18      21
C_R    1.33   2.33   3.33   4.33   5.33   6.33   7.33   8.33   9.33   10.33   11.33
d_R    2      4      6      8      12     16     20     24     30     36      42

Table 2: Cost $C_R$ in terms of matrix products for diagonal rational approximation $r_{mm}(A)$ taking $D = 4/3M$. Approximation order $d_R$ if $r_{mm}$ is a Padé approximant of a given function $f$.
2.2. Padé approximations of matrix functions
The rational scalar function $r_{km}(x) = p_{km}(x)/q_{km}(x)$ is a $[k/m]$ Padé approximant of the scalar function $f(x)$ if $r_{km} \in \mathcal{R}_{k,m}$, $q_{km}(0) = 1$, and
$$ f(x) - r_{km}(x) = O(x^{k+m+1}). \tag{8} $$
From now on, $d_R$ will denote the degree of the last term of the Taylor series of $f$ about the origin that $r_{km}(x)$ agrees with, i.e. $d_R = k + m$, and we will refer to $d_R$ as the order of the Padé approximation. Table 2 (see [9, Table 2]) shows the maximum values of $m$ that can be obtained for a given number of matrix products in $r_{mm}(A)$, denoted by the set $m^+$, and the corresponding computing cost, denoted by $C_R$, given by
$$ C_R = (2r + s - 3)M + D \approx (2r + s - 1 - 2/3)M, \quad r = m/s, \tag{9} $$
where $s$ takes whichever value $s = \lceil \sqrt{2m}\, \rceil$ or $s = \lfloor \sqrt{2m} \rfloor$ that divides $m$ and gives the smaller $C_R$. Table 2 also gives the corresponding order $d_R$ of the approximation $r_{mm}(x)$ if it is a Padé approximant of a given function $f(x)$, i.e. $d_R = 2m$.
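The selection rule for $s$ in (9) can be sketched as follows. This is only an illustration under the assumption $D = 4/3M$ from (1); the function name `rational_cost` is hypothetical, and the sketch assumes $m \in m^+$ so that at least one of the two candidate values of $s$ divides $m$.

```python
import math


def rational_cost(m):
    """Cost C_R of r_mm(A) from (9), in units of M, taking D = 4/3 M.

    Tries s = floor(sqrt(2m)) and s = ceil(sqrt(2m)) and keeps the
    candidates that divide m. Raises ValueError if none does.
    """
    root = math.isqrt(2 * m)                       # floor(sqrt(2m))
    candidates = {root} if root * root == 2 * m else {root, root + 1}
    costs = [2 * (m // s) + s - 3 + 4 / 3
             for s in candidates if s > 0 and m % s == 0]
    if not costs:
        raise ValueError("neither candidate value of s divides m")
    return min(costs)


def pade_order(m):
    """Order d_R = 2m of a diagonal [m/m] Pade approximant."""
    return 2 * m
```

Applied to the values of $m^+$ in Table 2, the sketch returns the costs listed there after rounding to two decimals.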
Finally, it is important to note that for a given $f$, $k$ and $m$, a $[k/m]$ Padé approximant might not exist. Moreover, when computing rational approximations $r_{km}$ of a function $f$ for a given square matrix $A$, we must verify that the matrix $q_{km}(A)$ is nonsingular, and, for an accurate computation, that it is well conditioned. This is not the case for polynomial approximations, since they do not require matrix inversions.
2.3. Mixed rational and polynomial approximants.
For a square matrix $A$ the method proposed in [9] is based on using aggregations of mixed rational and polynomial approximants of the type
$$
\begin{aligned}
t_{ijs}(A) = {} & \Big(\cdots\Big(\Big(u_s^{(i)}(A)\big(v_s^{(i)}(A)\big)^{-1} + u_s^{(i-1)}(A)\Big)\big(v_s^{(i-1)}(A)\big)^{-1} + u_s^{(i-2)}(A)\Big) \\
& \times \big(v_s^{(i-2)}(A)\big)^{-1} + \cdots + u_s^{(1)}(A)\Big)\big(v_s^{(1)}(A)\big)^{-1} + w_{js}(A),
\end{aligned} \tag{10}
$$
where $v_s^{(k)}(A)$, $u_s^{(k)}(A)$, $k = 1, 2, \ldots, i$, are polynomials of $A$ of degrees at most $s$, $w_{js}(A)$ is a polynomial of $A$ with degree at most $js$, and $i \ge 0$, $s \ge 0$ and $j \ge 0$. Note that if $i = 0$ we consider that $t_{ijs}(A) = w_{js}(A)$, having no rational part. In [9, Sec. 4] a method to obtain $t_{ijs}$ from rational approximations is given. Similarly to rational approximations, each multiplication by a matrix inverse is calculated as the solution of a multiple right-hand side linear system. Therefore, when computing $t_{ijs}(A)$ it is important to verify that the matrices $v_s^{(1)}(A), v_s^{(2)}(A), \ldots, v_s^{(i)}(A)$ are nonsingular and well conditioned. The total cost for computing (10), denoted by $C_{RP}$, is given by, see [9, Sec. 5],
$$ C_{RP} = (s + j - 2)M + iD \approx (s + j - 2 + 4i/3)M, \quad j > 0,\ s > 0,\ i \ge 0. \tag{11} $$
Note that for the case where approximation (10) is intended to reproduce the first terms of the Taylor series of a given function $f$, it is equivalent to a $[(i+j)s/is]$ Padé approximant, and then, whenever it exists, $t_{ijs}$ for scalar $x \in \mathbb{C}$ satisfies
$$ f(x) - t_{ijs}(x) = O(x^{(2i+j)s+1}). \tag{12} $$
In that case we denote by $d_{RP}$ the order of the mixed rational and polynomial approximation
$$ d_{RP} = (2i + j)s. \tag{13} $$
Table 3 (see [9, Table 3]) shows for $t_{ijs}(A)$ the approximation order $d_{RP}$ if $t_{ijs}$ reproduces the first terms of the Taylor series of a given function $f$, and the cost $C_{RP}$ in terms of matrix products for the values of $i$, $j$, $s$ that maximize $d_{RP}$ for a given cost. See [9] for a complete description.
3. On the evaluation of matrix polynomials. Application to the
approximation of matrix functions
This section gives new general methods for evaluating matrix polynomi-
als in a more efficient way than the combination of Horner and Paterson–
Stockmeyer methods. Examples for computing the Taylor matrix polynomial
approximation of degree mof the matrix exponential and the matrix cosine
are given. These examples allow us to compute both approximations at a
lower cost than Horner and Paterson–Stockmeyer methods. Note that in this
section we used MATLAB R2017a for all the computations.
d_RP   1     2     3      4     6      9      10     12     15     16     20     21
i      0     0     1      0     1      1      2      1      2      1      2      3
j      1     1     1      2     1      1      1      1      1      2      1      1
s      1     2     1      2     2      3      2      4      3      4      4      3
C_RP   0     1     1.33   2     2.33   3.33   3.67   4.33   4.67   5.33   5.67   6

d_RP   25     28    30     35    36     42    45     49    54      55      56    63
i      2      3     2      3     4      3     4      3     4       5       3     4
j      1      1     1      1     1      1     1      1     1       1       1     1
s      5      4     6      5     4      6     5      7     6       5       8     7
C_RP   6.67   7     7.67   8     8.33   9     9.33   10    10.33   10.67   11    11.33

Table 3: Approximation order $d_{RP}$ if the mixed rational and polynomial approximation $t_{ijs}(A)$ from Section 2.3 reproduces the $d_{RP}$ first terms of the Taylor series of a given function $f$, cost in terms of matrix products $C_{RP}$ for the mixed rational and polynomial approximation $t_{ijs}(A)$, taking $D = 4/3M$, and values of $i$, $j$ and $s$ that maximize $d_{RP}$ for a given cost.
Example 3.1. Let
$$ y_{02}(A) = A^2(c_4A^2 + c_3A), \tag{14} $$
$$ y_{12}(A) = (y_{02}(A) + d_2A^2 + d_1A)(y_{02}(A) + e_2A^2) + e_0y_{02}(A) + f_2A^2 + f_1A + f_0I, \tag{15} $$
where $c_4$, $c_3$, $d_2$, $d_1$, $e_2$, $e_0$, $f_2$, $f_1$ and $f_0$ are scalar coefficients. In order to evaluate a matrix polynomial (3) of degree $m = 8$, taking $y_{12}(A) = P_m(A)$ and equating the coefficients of the matrix powers $A^i$, $i = 8, 7, \ldots, 0$, the following system of equations arises
$$ c_4c_4A^8 = b_8A^8, \tag{16} $$
$$ 2c_3c_4A^7 = b_7A^7, \tag{17} $$
$$ (c_4(d_2 + e_2) + c_3c_3)A^6 = b_6A^6, \tag{18} $$
$$ (c_4d_1 + c_3(d_2 + e_2))A^5 = b_5A^5, \tag{19} $$
$$ (d_2e_2 + c_3d_1 + c_4e_0)A^4 = b_4A^4, \tag{20} $$
$$ (d_1e_2 + c_3e_0)A^3 = b_3A^3, \tag{21} $$
$$ f_2A^2 = b_2A^2, \tag{22} $$
$$ f_1A = b_1A, \tag{23} $$
$$ f_0I = b_0I. \tag{24} $$
Note that for clarity the coefficient indices were chosen so that the sum of the indices is equal to the exponent of the power of $A$ that the coefficient is multiplying. For instance, for (16) one gets $4 + 4 = 8$, for (17) one gets $3 + 4 = 7$, for (18) one gets $4 + 2 = 6$ and $3 + 3 = 6$, and so on.
We can solve the previous system using the equations (16)-(24) from top to bottom. Using (16)-(19), one gets
$$ c_4 = \pm\sqrt{b_8}, \tag{25} $$
$$ c_3 = b_7/(2c_4), \tag{26} $$
$$ d_2 + e_2 = (b_6 - c_3^2)/c_4, \tag{27} $$
$$ d_1 = (b_5 - c_3(d_2 + e_2))/c_4. \tag{28} $$
If $b_8 \neq 0$ then $c_4 \neq 0$ and therefore $c_4$, $c_3$, the sum $d_2 + e_2$ and $d_1$ can be obtained explicitly. From now on we will denote $de_2 = d_2 + e_2$ to simplify the notation and to remark that this quantity can be computed explicitly. Using (20) it follows that
$$ e_0 = (b_4 - c_3d_1 - de_2\,e_2 + e_2^2)/c_4, \tag{29} $$
where using (25)-(28) $e_0$ is a polynomial of second order in the variable $e_2$. Hence, using (21) and (29) one gets
$$ d_1e_2 + c_3e_0 = b_3 \;\Rightarrow\; -b_3 + d_1e_2 + c_3(b_4 - c_3d_1 - de_2\,e_2 + e_2^2)/c_4 = 0, \tag{30} $$
which is an equation of second order in the variable $e_2$, and therefore, using (25)-(28), the equation on the right-hand side of (30) has the solutions
$$ e_2 = \frac{\dfrac{c_3}{c_4}de_2 - d_1 \pm \sqrt{\left(d_1 - \dfrac{c_3}{c_4}de_2\right)^2 + 4\dfrac{c_3}{c_4}\left(b_3 + \dfrac{c_3^2}{c_4}d_1 - \dfrac{c_3}{c_4}b_4\right)}}{2c_3/c_4}, \tag{31} $$
i.e., two solutions if we take $c_4 = \sqrt{b_8}$ from (25), and another two solutions if we take $c_4 = -\sqrt{b_8}$. Substituting the four solutions of $e_2$ in (27) and (29), four solutions are obtained for $d_2 = de_2 - e_2$ and $e_0$, respectively, and from (22)-(24) it follows that
$$ f_2 = b_2, \quad f_1 = b_1, \quad f_0 = b_0. \tag{32} $$
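The top-to-bottom solution (25)-(32) can be sketched in a few lines of Python. This is an illustrative sketch, not the author's MATLAB code: the function name `solve_example_3_1` and the sign parameters are hypothetical, and it assumes $b_8 > 0$ and $b_7 \neq 0$ so that $c_4$ is real and the division by $2c_3/c_4$ in (31) is defined.

```python
import math


def solve_example_3_1(b, sign_c4=1, sign_root=1):
    """Return one solution of (16)-(24) as a dict, or None if e2 is not real.

    b[i] is the coefficient of A^i, i = 0..8. sign_c4 selects the sign
    in (25) and sign_root the sign in front of the square root in (31).
    """
    c4 = sign_c4 * math.sqrt(b[8])                       # (25)
    c3 = b[7] / (2 * c4)                                 # (26)
    de2 = (b[6] - c3 ** 2) / c4                          # (27), de2 = d2 + e2
    d1 = (b[5] - c3 * de2) / c4                          # (28)
    q = c3 / c4
    disc = (d1 - q * de2) ** 2 + 4 * q * (b[3] + c3 * q * d1 - q * b[4])
    if disc < 0:
        return None                                      # complex e2
    e2 = (q * de2 - d1 + sign_root * math.sqrt(disc)) / (2 * q)   # (31)
    e0 = (b[4] - c3 * d1 - de2 * e2 + e2 ** 2) / c4      # (29)
    return {"c4": c4, "c3": c3, "d2": de2 - e2, "d1": d1,
            "e2": e2, "e0": e0, "f2": b[2], "f1": b[1], "f0": b[0]}
```

Calling the sketch with $b_i = 1/i!$ and the four sign combinations gives the four candidate coefficient sets for the degree-8 Taylor polynomial of the exponential.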
The cost of evaluating (15) is $3M$, i.e. one matrix product to compute and store $A^2$, and then two matrix products to compute (14) and (15), $y_{12}(A)$ being a polynomial of degree 8. From Table 1, the polynomial of maximum degree that can be computed with Horner and Paterson–Stockmeyer methods at cost $3M$ has the lower degree $d_{PS} = 6$.

        exp                          cos
c_4     4.980119205559973×10^{-3}    2.186201576339059×10^{-7}
c_3     1.992047682223989×10^{-2}    -2.623441891606870×10^{-5}
d_2     7.665265321119147×10^{-2}    6.257028774393310×10^{-3}
d_1     8.765009801785554×10^{-1}    -4.923675742167775×10^{-1}
e_2     1.225521150112075×10^{-1}    1.441694411274536×10^{-4}
e_0     2.974307204847627×10^{0}     5.023570505224926×10^{1}

Table 4: One possible choice for the coefficients in (14) and (15) for Taylor approximation of exponential and cosine of order $m = 8$.
Table 4 shows one of the four solutions in IEEE double precision arithmetic for the coefficients of the Taylor approximation of the exponential and cosine, where $b_i = 1/i!$ and $b_i = (-1)^i/(2i)!$, respectively, for $i = 0, 1, \ldots, 8$. Note that all four solutions are real, avoiding complex arithmetic if $A \in \mathbb{R}^{n\times n}$. In order to check the stability of the double precision arithmetic solutions $c_i$, $d_i$ and $e_i$ from Table 4, they were substituted in equations (16)-(21) to compute the relative error for each coefficient $b_i$, for $i = 3, 4, \ldots, 8$. For instance, from (21) it follows that the relative error for $b_3$ is $|b_3 - (d_1e_2 + c_3e_0)|/|b_3|$. We checked that all the relative errors for all $b_i$, for $i = 3, 4, \ldots, 8$, were below the unit roundoff in IEEE double precision arithmetic, i.e. $u = 2^{-53} \approx 1.11 \times 10^{-16}$.
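The substitution check described above can be sketched as follows, using the values of Table 4 and the left-hand sides of (16)-(21). The names `TABLE4`, `taylor_coefficients` and `relative_errors` are hypothetical. Since the residuals are themselves evaluated in double precision here, the computed errors may come out as a small multiple of $u$ rather than strictly below it.

```python
import math

# Coefficients copied from Table 4.
TABLE4 = {
    "exp": dict(c4=4.980119205559973e-3, c3=1.992047682223989e-2,
                d2=7.665265321119147e-2, d1=8.765009801785554e-1,
                e2=1.225521150112075e-1, e0=2.974307204847627e0),
    "cos": dict(c4=2.186201576339059e-7, c3=-2.623441891606870e-5,
                d2=6.257028774393310e-3, d1=-4.923675742167775e-1,
                e2=1.441694411274536e-4, e0=5.023570505224926e1),
}


def taylor_coefficients(name):
    """b_i = 1/i! for exp and b_i = (-1)^i/(2i)! for cos, i = 0..8."""
    if name == "exp":
        return [1 / math.factorial(i) for i in range(9)]
    return [(-1) ** i / math.factorial(2 * i) for i in range(9)]


def relative_errors(name):
    """Relative error of each b_i, i = 3..8, rebuilt from (16)-(21)."""
    t = TABLE4[name]
    b = taylor_coefficients(name)
    c4, c3, d2, d1, e2, e0 = (t[k] for k in ("c4", "c3", "d2", "d1", "e2", "e0"))
    rebuilt = {
        8: c4 * c4,                          # (16)
        7: 2 * c3 * c4,                      # (17)
        6: c4 * (d2 + e2) + c3 * c3,         # (18)
        5: c4 * d1 + c3 * (d2 + e2),         # (19)
        4: d2 * e2 + c3 * d1 + c4 * e0,      # (20)
        3: d1 * e2 + c3 * e0,                # (21)
    }
    return {i: abs(b[i] - v) / abs(b[i]) for i, v in rebuilt.items()}
```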
Note that if we take
$$ y_{12}(A) = (y_{02}(A) + d_2A^2 + d_1A)(y_{02}(A) + e_2A^2 + e_1A) + f_2A^2 + f_1A + f_0I, \tag{33} $$
instead of (15), the four solutions for the corresponding coefficients for the exponential and cosine Taylor approximations of order $m = 8$ are complex. Therefore, if $A$ is real, using (33) instead of (15) is not efficient for the computation of either matrix function since it is necessary to use complex arithmetic for evaluating (33).
Following Example 3.1 we can take in general
$$ y_{0s}(A) = A^s \sum_{i=1}^{s} c_{s+i}A^i, \tag{34} $$
$$ y_{1s}(A) = \left(y_{0s}(A) + \sum_{i=1}^{s} d_iA^i\right)\left(y_{0s}(A) + \sum_{i=2}^{s} e_iA^i\right) + e_0y_{0s}(A) + \sum_{i=0}^{s} f_iA^i, \tag{35} $$
where $A^i$, $i = 2, 3, \ldots, s$, can be computed once and stored to be reused in all the computations, and, then, $y_{1s}(A)$ is a matrix polynomial of degree, denoted by $d_{y_{1s}}$, and computing cost, denoted by $C_{y_{1s}}$,
$$ d_{y_{1s}} = 4s, \quad C_{y_{1s}} = s + 1, \quad s = 2, 3, \ldots \tag{36} $$
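A minimal sketch of the evaluation scheme (34)-(35) with an explicit product counter is given below. It is an illustration in plain Python with nested-list matrices, not the author's implementation; the names `matmul`, `lincomb` and `eval_y1s` and the argument layout are assumptions made for this example.

```python
def matmul(X, Y):
    """Dense product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(coeffs, mats):
    """Return sum_k coeffs[k] * mats[k]; this costs no matrix products."""
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]


def eval_y1s(A, c, d, e, e0, f):
    """Evaluate y_1s(A) from (34)-(35); returns (value, products).

    Assumed layout: c = [c_{s+1}, ..., c_{2s}], d = [d_1, ..., d_s],
    e = [e_2, ..., e_s], f = [f_0, ..., f_s].
    """
    s = len(d)
    assert len(c) == s and len(e) == s - 1 and len(f) == s + 1
    n = len(A)
    identity = [[float(i == j) for j in range(n)] for i in range(n)]
    powers = [identity, A]                 # powers[i] holds A^i
    products = 0
    for _ in range(2, s + 1):              # A^2, ..., A^s: s - 1 products
        powers.append(matmul(powers[-1], A))
        products += 1
    y0 = matmul(powers[s], lincomb(c, powers[1:]))            # (34)
    products += 1
    left = lincomb([1] + list(d), [y0] + powers[1:])
    right = lincomb([1] + list(e), [y0] + powers[2:])
    y1 = lincomb([1, e0] + list(f),
                 [matmul(left, right), y0] + powers)          # (35)
    products += 1
    return y1, products
```

The counter returned by the sketch equals $s + 1$, matching (36).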
Note that (14) and (15) are a particular case of (34) and (35) where $s = 2$. Again, in order to evaluate a matrix polynomial $P_m(A)$ of degree $m = 4s$, we take $y_{1s}(A) = P_m(A)$, and equate the coefficients of the matrix powers $A^i$, $i = m, m-1, \ldots, 0$, from $y_{1s}(A)$ and $P_m(A)$. The solution for the coefficients taking $s = 2$ is given in Example 3.1, where the substitution of variables gives a polynomial equation in $e_s = e_2$ of degree 2 with the exact solution given by (31). In the following a general solution is given for $s > 2$. The $s$ equations corresponding to the coefficients of the powers $A^{4s-k}$, for $k = 0, 1, \ldots, s-1$, are, respectively,
$$ \sum_{i=0}^{k} c_{2s-i}\,c_{2s+i-k} = b_{4s-k}, \quad k = 0, 1, \ldots, s-1. \tag{37} $$
Since (37) is a triangular system, if $b_{4s} \neq 0$ then $c_{2s} \neq 0$ and it follows that
$$
\begin{aligned}
c_{2s} &= \pm\sqrt{b_{4s}}, \\
c_{2s-1} &= b_{4s-1}/(2c_{2s}), \\
c_{2s-k} &= \Big(b_{4s-k} - \sum_{i=1}^{k-1} c_{2s-i}\,c_{2s+i-k}\Big)\Big/(2c_{2s}), \quad k = 2, 3, \ldots, s-1.
\end{aligned} \tag{38}
$$
Note that if $b_{4s} < 0$, to prevent $c_{2s}$ from being complex we can compute $y_{1s}(A) = -P_m(A)$ using (35), where $c_{2s}^2 = -b_{4s} > 0$, which gives $P_m(A) = -y_{1s}(A)$.
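The recurrence (38) can be sketched directly. The following is an illustrative Python version, assuming $b_{4s} > 0$; the function name `leading_coefficients` and the convention `b_top[k]` $= b_{4s-k}$ are hypothetical choices for this example.

```python
import math


def leading_coefficients(b_top, s, sign=1):
    """Solve the triangular system (37) through (38).

    b_top[k] holds b_{4s-k} for k = 0..s-1 and sign selects the sign of
    c_{2s}. Returns a dict mapping each index j = s+1..2s to c_j.
    """
    assert len(b_top) >= s and b_top[0] > 0
    c = {2 * s: sign * math.sqrt(b_top[0])}
    for k in range(1, s):
        # Products of coefficients already computed, as in (38).
        cross = sum(c[2 * s - i] * c[2 * s + i - k] for i in range(1, k))
        c[2 * s - k] = (b_top[k] - cross) / (2 * c[2 * s])
    return c
```

For instance, with $s = 5$, `b_top[k]` $= 1/(30-k)!$ and `sign=-1`, the recurrence gives $c_9/c_{10} = 15$ and $c_8/c_{10} = 322.5$ up to rounding.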
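Taking again $de_i = d_i + e_i$ for abbreviation, and $de_1 = d_1$, since there is no coefficient $e_1$ in (35), the equations corresponding to the coefficients of powers $A^{3s-k}$, for $k = 0, 1, \ldots, s-1$, are, respectively,
$$
\begin{aligned}
\sum_{j=s-k}^{s} c_{3s-k-j}\,de_j + \sum_{i=1}^{s-k-1} c_{2s-k-i}\,c_{s+i} &= b_{3s-k}, \quad k = 0, 1, \ldots, s-2, \\
\sum_{j=s-k}^{s} c_{3s-k-j}\,de_j &= b_{3s-k}, \quad k = s-1,
\end{aligned} \tag{39}
$$
and using (38) it follows that
$$
\begin{aligned}
de_s &= \Big(b_{3s} - \sum_{i=1}^{s-1} c_{2s-i}\,c_{s+i}\Big)\Big/c_{2s}, \\
de_{s-k} &= \Big(b_{3s-k} - \sum_{j=s+1-k}^{s} c_{3s-k-j}\,de_j - \sum_{i=1}^{s-1-k} c_{2s-k-i}\,c_{s+i}\Big)\Big/c_{2s}, \quad k = 1, 2, \ldots, s-2, \\
d_1 &= \Big(b_{2s+1} - \sum_{j=2}^{s} c_{2s+1-j}\,de_j\Big)\Big/c_{2s},
\end{aligned} \tag{40}
$$
where, if $c_{2s} \neq 0$, each sum $de_i = d_i + e_i$, $i = s, s-1, \ldots, 2$, and the coefficient $d_1$ can be obtained explicitly using the coefficients $c_i$, $i = s+1, s+2, \ldots, 2s$, obtained from (38).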
The equations corresponding to the coefficients of powers $A^{2s-k}$, for $k = 0, 1, \ldots, s-1$, are
$$ \sum_{i=0}^{k} d_{s-i}\,e_{s-k+i} + g_k + e_0c_{2s-k} = b_{2s-k}, \quad k = 0, 1, \ldots, s-1, \tag{41} $$
where
$$ g_k = \sum_{i=1}^{s-1-k} c_{s+i}\,de_{s-i-k}, \quad k = 0, 1, \ldots, s-2, \qquad g_{s-1} = 0, \tag{42} $$
and the coefficients $g_k$ can be computed explicitly using (38) and (40).
Using (41) with $k = 0$ it follows that
$$ e_s\,de_s - e_s^2 + g_0 + e_0c_{2s} = b_{2s} \;\Leftrightarrow\; e_0 = (b_{2s} - g_0 - e_s\,de_s + e_s^2)/c_{2s}, \tag{43} $$
provided that $c_{2s} \neq 0$. Hence, since $de_s$, $g_0$ and $c_{2s}$ can be computed using (38) and (40), the coefficient $e_0$ is a polynomial of second order in the variable $e_s$. Using now (41) with $k = 1$ one gets
$$ e_{s-1}(de_s - 2e_s) + e_s\,de_{s-1} + g_1 + e_0c_{2s-1} = b_{2s-1}, \tag{44} $$
and then if $d_s \neq e_s$ it follows that $de_s - 2e_s = d_s - e_s \neq 0$ and
$$ e_{s-1} = (b_{2s-1} - g_1 - e_0c_{2s-1} - e_s\,de_{s-1})/(de_s - 2e_s), \tag{45} $$
where $e_{s-1}$ is a rational function of $e_s$, since by (43) $e_0$ is a polynomial of $e_s$ of second order, and all the remaining quantities can be computed using (38), (40) and (42). Note that analogously, using (41) with $k = 2$ it follows that
$$ e_{s-2}(de_s - 2e_s) + e_s\,de_{s-2} + e_{s-1}\,de_{s-1} - e_{s-1}^2 + g_2 + e_0c_{2s-2} = b_{2s-2}, \tag{46} $$
and then, again if $d_s \neq e_s$, it follows that
$$ e_{s-2} = (b_{2s-2} - g_2 - e_0c_{2s-2} - e_s\,de_{s-2} - e_{s-1}\,de_{s-1} + e_{s-1}^2)/(de_s - 2e_s), \tag{47} $$
where similarly $e_{s-2}$ is also a rational function of $e_s$ since by (43) and (45) one gets that $e_0$ is a polynomial of $e_s$, and $e_{s-1}$ is a rational function of $e_s$, and all the remaining quantities can be computed using (38), (40) and (42). Note that from (45) and (47) it follows that the rational function $e_{s-2}$ has denominator $(de_s - 2e_s)^3$.
Analogously, it is easy to show that
$$
\begin{aligned}
e_{s-k} = \frac{1}{de_s - 2e_s}\Bigg( & b_{2s-k} - g_k - e_0c_{2s-k} - e_s\,de_{s-k} \\
& - \sum_{i=1}^{\lceil k/2 \rceil - 1} \Big(e_{s-i}\,de_{s-k+i} + e_{s-k+i}\,(de_{s-i} - 2e_{s-i})\Big) \\
& + \begin{cases} 0, & \text{odd } k, \\ -\big(e_{s-k/2}\,de_{s-k/2} - e_{s-k/2}^2\big), & \text{even } k, \end{cases} \Bigg), \quad 2 < k \le s-2,
\end{aligned} \tag{48, 49}
$$
where $e_{s-k}$ is also a rational function of $e_s$ with denominator $(de_s - 2e_s)^{i_{k,s}}$, where $i_{k,s} > 0$ is an integer number depending on $k$ and $s$.
The last equation of this group is
$$
\begin{aligned}
0 = {} & -b_{s+1} + e_0c_{s+1} + e_sd_1 + \sum_{i=1}^{\lceil s/2 \rceil - 1} \Big(e_{s-i}\,de_{1+i} + e_{1+i}\,(de_{s-i} - 2e_{s-i})\Big) \\
& + \begin{cases} 0, & \text{even } s > 2, \\ -\big(e_{\frac{s+1}{2}}\,de_{\frac{s+1}{2}} - e_{\frac{s+1}{2}}^2\big), & \text{odd } s > 2. \end{cases}
\end{aligned} \tag{50}
$$
Using the expressions (45), (47) and (48) obtained for $e_{s-k}$, for $k = 1, 2, \ldots, s-2$, as rational functions of $e_s$, and $e_0$ in (43) as a polynomial of $e_s$, it follows that expression (50) is a rational function of $e_s$, and multiplying it by $(de_s - 2e_s)^{i_s}$, where $i_s$ is an integer number depending on $s$, expression (50) can be written as a polynomial of $e_s$, provided that $de_s - 2e_s = d_s - e_s \neq 0$. Hence, it has as many solutions as the resulting polynomial degree. Substituting these solutions in the expressions (45), (47) and (48) obtained for $e_{s-k}$, $k = 1, 2, \ldots, s-2$, and $e_0$ from (43), the coefficients $e_0$ and $e_{s-k}$, $k = 1, 2, \ldots, s-2$, can be obtained. The coefficients $d_i$, for $i = 1, 2, \ldots, s$, can be obtained using the coefficients $e_i$, for $i = 0, 2, 3, \ldots, s$, and (40). The solution for the coefficients with $s = 3$ and $s = 4$ gives polynomial equations in the variable $e_s$ of degrees 4 and 6, respectively, and for $s \ge 5$ larger degree polynomials are obtained, and then, there are even more solutions for $e_s$.
Finally, from the equations involving $A^i$, for $i = s, s-1, \ldots, 0$, it is easy to show that
$$
\begin{aligned}
f_{s-k} &= b_{s-k} - \sum_{i=1}^{s-k-2} d_i\,e_{s-k-i}, \\
f_i &= b_i, \quad i = 2, 1, 0.
\end{aligned} \tag{51}
$$
Using (36) and Table 1, Table 5 shows the maximum orders that can be achieved for a given cost $C(M)$ in terms of matrix products with Horner and Paterson–Stockmeyer methods and the method given by $y_{1s}(A)$ using (34) and (35). Note that $y_{1s}(A)$ allows evaluating a polynomial of greater degree than the Horner and Paterson–Stockmeyer methods for a cost from $3M$ to $9M$, i.e. polynomial degrees from $d_{y_{1s}} = 8$ to 32 corresponding to $s = 2, 3, \ldots, 8$ in $y_{1s}(A)$. We checked that there were at least 4 real solutions for all the coefficients in (34) and (35) when $y_{1s}(A)$ was equal to the exponential and cosine Taylor approximations of the corresponding degrees $d_{y_{1s}}$, avoiding complex arithmetic if $A$ is a real square matrix.

C(M)      3    4    5    6    7    8    9    10   11   12
d_PS      6    9    12   16   20   25   30   36   42   49
d_y1s     8    12   16   20   24   28   32   36   40   44

Table 5: Order of the approximation $d_{PS}$ that can be achieved using Horner and Paterson–Stockmeyer methods and order $d_{y_{1s}}$ using the method given by (34) and (35) for a given cost $C$ in terms of matrix products.
3.1. Combination of $y_{1s}(A)$ with Horner and Paterson–Stockmeyer methods
The following proposition combines the Horner and Paterson–Stockmeyer evaluation formula (4) with (35) to increase the degree of the resulting polynomial to be evaluated:
Proposition 1. Let $z_{1ps}(x)$ be
$$
\begin{aligned}
z_{1ps}(x) = {} & \Big(\Big(\cdots\Big(y_{1s}(x)\,x^s + a_{p-1}x^{s-1} + a_{p-2}x^{s-2} + \cdots + a_{p-s+1}x + a_{p-s}\Big) \\
& \times x^s + a_{p-s-1}x^{s-1} + a_{p-s-2}x^{s-2} + \cdots + a_{p-2s+1}x + a_{p-2s}\Big) \\
& \times x^s + a_{p-2s-1}x^{s-1} + a_{p-2s-2}x^{s-2} + \cdots + a_{p-3s+1}x + a_{p-3s}\Big) \\
& \;\;\vdots \\
& \times x^s + a_{s-1}x^{s-1} + a_{s-2}x^{s-2} + \cdots + a_1x + a_0,
\end{aligned} \tag{52}
$$
where $p$ is a multiple of $s$ and $y_{1s}(x)$ is computed with (34) and (35). Then the degree of $z_{1ps}(x)$ and its computational cost for $x = A \in \mathbb{C}^{n\times n}$ are
$$ d_{z_{1ps}} = 4s + p, \quad C_{z_{1ps}} = (1 + s + p/s)M. \tag{53} $$
Proof. The value of $d_{z_{1ps}}$ follows from (36) and (52). For the value of $C_{z_{1ps}}$ note that the matrix powers $A^i$, $i = 2, 3, \ldots, s$, to be evaluated for the Horner and Paterson–Stockmeyer evaluation formulas can be reused to compute $y_{1s}(A)$, and note also that one matrix product is needed to compute $y_{1s}(A)A^s$ in (52). Then, if $p$ is a multiple of $s$, the value of $C_{z_{1ps}}$ in (53) follows from (36) and (52).
If we apply the evaluation formula (52) to evaluate a polynomial of degree $m + p$, i.e. $P_{m+p}(A)$, it follows that
$$ z_{1ps}(A) = y_{1s}(A)A^p + \sum_{i=0}^{p-1} a_iA^i = P_{m+p}(A) = \sum_{i=0}^{m+p} b_iA^i. \tag{54} $$
Therefore, the coefficients $a_i$, $i = 0, 1, \ldots, p-1$, are directly the corresponding coefficients $b_i$, $i = 0, 1, \ldots, p-1$, from (54), and the coefficients from $y_{1s}(A)$ can be obtained changing $b_i$ to $b_{i+p}$ in (38), (40), (43), (45), (47), (48), (50), (51).

m            8   12   16   20   20   25   30   30   36   42   42   49   56   56   ···
s            2   3    4    4    5    5    5    6    6    6    7    7    7    8    ···
p            0   0    0    4    0    5    10   6    12   18   14   21   28   24   ···
C_PS(M)      4   5    6    7    7    8    9    9    10   11   11   12   13   13   ···
C_z1ps(M)    3   4    5    6    6    7    8    8    9    10   10   11   12   12   ···

Table 6: Parameters $s$ and $p$ for $z_{1ps}(x)$ from (52) to obtain the same approximation order $m$ as Horner and Paterson–Stockmeyer methods with a saving of 1 matrix product, where $C_{PS}$ is the cost for evaluating (4) and $C_{z_{1ps}}$ is the cost for computing $z_{1ps}(x)$, both costs in terms of matrix products. The first row shows the maximum values of $m$ obtained in $z_{1ps}(x)$ for a given number of matrix products.
Using (53), Table 6 shows the parameters $s$ and $p$ to evaluate a polynomial of maximum degree $m$ for a given cost using $z_{1ps}(A)$ from (52), and it is compared to the cost of the Paterson–Stockmeyer method for the same values of $m$. Except for $m = 8$, all the values are in the set $m^*$ from Table 1, and for all of them one matrix product is saved with respect to using only the Paterson–Stockmeyer method. The evaluation scheme $z_{1ps}(A)$ allows evaluating polynomials of higher degree than the Paterson–Stockmeyer method for a cost greater than or equal to $3M$. Note that for a cost lower than or equal to $5M$ the maximum degree is obtained using
$$ z_{1,p=0,s}(A) = y_{1s}(A), \tag{55} $$
from (35). Therefore, $z_{1ps}(A)$ can be considered as a generalization of $y_{1s}(A)$.
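The parameter choice behind the first row of Table 6 can be sketched by enumerating (53). This is a small illustration, assuming $s \ge 2$ as in (36) and $p$ a multiple of $s$; the function name `best_z1ps` is hypothetical.

```python
def best_z1ps(cost):
    """Maximize the degree 4s + p of z_1ps subject to 1 + s + p/s = cost.

    cost is the number of matrix products (an integer >= 3). Returns
    (degree, [(s, p), ...]) listing every parameter pair reaching the
    maximum degree.
    """
    best_degree, best_params = -1, []
    for s in range(2, cost):              # p >= 0 requires s <= cost - 1
        p = (cost - 1 - s) * s            # from C = 1 + s + p/s in (53)
        degree = 4 * s + p                # d = 4s + p in (53)
        if degree > best_degree:
            best_degree, best_params = degree, [(s, p)]
        elif degree == best_degree:
            best_params.append((s, p))
    return best_degree, best_params
```

For a cost of $8M$ the sketch returns degree 30 with the two parameter pairs $(s, p) = (5, 10)$ and $(6, 6)$ that appear in Table 6.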
In order to evaluate polynomials of degrees different from those given in Table 6, other combinations $z_{1ps}(A)$ of the new method with the Paterson–Stockmeyer method can be used, where $p$ is not a multiple of $s$. For instance, a polynomial of degree $m = 23$ can be written as
$$ P_{23}(x) = z_{1,7,4}(x) = (y_{1,4}(x)x^3 + a_6x^2 + a_5x + a_4)x^4 + a_3x^3 + a_2x^2 + a_1x + a_0, \tag{56} $$
where the coefficients of $y_{1,4}(x)$ can be obtained similarly to those of $y_{1s}(x)$ in (54).
c_10   -6.140022498994532×10^{-17}    e_4   -2.785084196756015×10^{-9}
c_9    -9.210033748491798×10^{-16}    e_3   -4.032817333361947×10^{-8}
c_8    -1.980157255925737×10^{-14}    e_2   -5.100472475630675×10^{-7}
c_7    -4.508311519886735×10^{-13}    e_0   -1.023463999572971×10^{-3}
c_6    -1.023660713518307×10^{-11}    f_5   4.024189993755686×10^{-13}
d_5    -1.227011356117036×10^{-10}    f_4   7.556768134694921×10^{-12}
d_4    -6.770221628797445×10^{-9}     f_3   1.305311326377090×10^{-10}
d_3    -1.502070379373464×10^{-7}     f_2   2.087675698786810×10^{-9}
d_2    -3.013961104055248×10^{-6}     f_1   2.505210838544172×10^{-8}
d_1    -5.893435534477677×10^{-5}     f_0   2.755731922398589×10^{-7}
e_5    -3.294026127901678×10^{-10}

Table 7: One real solution for coefficients from (34) and (35) for computing Taylor approximation of the exponential of order $m = 30$ with (52) taking $s = 5$ and $p = 10$. Note that in this case coefficients in (54) are $b_i = 1/i!$, $i = 0, 1, \ldots, 30$.
Example 3.2. Table 7 presents one solution for the coefficients for an example of $z_{1ps}(x)$ from (52) combining (34) and (35) with Horner and Paterson–Stockmeyer methods with $p = 10$ and $s = 5$ to compute the Taylor approximation of the matrix exponential of order $m = 30$.
From (53) the cost of computing $z_{1,10,5}(A)$ is $C_{z_{1,10,5}} = 8M$, 1 matrix product less than using Horner and Paterson–Stockmeyer methods, see Table 6.
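As a scalar illustration of $z_{1,10,5}$, the sketch below evaluates (34), (35) and (52) at a real number $x$ with the coefficients of Table 7 and $a_i = 1/i!$. It is not the matrix algorithm from [6]; the names `C`, `D`, `E`, `E0`, `F` and `exp_taylor30` are hypothetical, and for a matrix argument each scalar power and product would be replaced by the corresponding matrix operation.

```python
import math

# Coefficients copied from Table 7 (s = 5, p = 10).
C = {10: -6.140022498994532e-17, 9: -9.210033748491798e-16,
     8: -1.980157255925737e-14, 7: -4.508311519886735e-13,
     6: -1.023660713518307e-11}
D = {5: -1.227011356117036e-10, 4: -6.770221628797445e-9,
     3: -1.502070379373464e-7, 2: -3.013961104055248e-6,
     1: -5.893435534477677e-5}
E = {5: -3.294026127901678e-10, 4: -2.785084196756015e-9,
     3: -4.032817333361947e-8, 2: -5.100472475630675e-7}
E0 = -1.023463999572971e-3
F = {5: 4.024189993755686e-13, 4: 7.556768134694921e-12,
     3: 1.305311326377090e-10, 2: 2.087675698786810e-9,
     1: 2.505210838544172e-8, 0: 2.755731922398589e-7}


def exp_taylor30(x):
    """Degree-30 Taylor polynomial of exp at scalar x through z_{1,10,5}."""
    y0 = x ** 5 * sum(C[5 + i] * x ** i for i in range(1, 6))       # (34)
    left = y0 + sum(D[i] * x ** i for i in D)
    right = y0 + sum(E[i] * x ** i for i in E)
    y1 = left * right + E0 * y0 + sum(F[i] * x ** i for i in F)     # (35)
    a = [1 / math.factorial(i) for i in range(10)]                  # a_i = b_i
    z = y1 * x ** 5 + sum(a[5 + i] * x ** i for i in range(5))      # (52), first block
    return z * x ** 5 + sum(a[i] * x ** i for i in range(5))        # (52), last block
```

In the matrix case the same sequence uses 4 products for $A^2, \ldots, A^5$, one each for $y_{05}$ and $y_{15}$, and two for the multiplications by $A^5$ in (52), i.e. 8 products in total.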
Analogously, using $z_{1ps}(x)$ from (52) with (34) and (35), we computed the coefficients from (34) and (35) for computing Taylor exponential and cosine approximation polynomials for all the approximation orders $m$ in Table 6 up to approximation order $m = 81$. This process always gave several real solutions for all the coefficients involved. The maximum degree used in the Taylor approximation of the matrix exponential in double precision arithmetic from [6] is $m = 30$, and in the matrix cosine in [8] is $m = 16$. Note that the values from Table 7 can be directly used to evaluate the Taylor approximation of order $m = 30$ in the algorithm from [6]. We also checked that using $z_{1,p=0,s}(A) = y_{1s}(A)$ from (35) also gave real coefficients for computing Taylor exponential and cosine approximation polynomials with $s = 2, 3, 4$. Hence, if $A$ is a real square matrix, using $z_{1ps}(A)$ we can compute the exponential and cosine approximations using real arithmetic saving $1M$ with respect to the algorithms in [6, 8] for Taylor polynomial degrees $m \in m^*$ from Table 1, $m \ge 12$.
Finally, similarly to Example 3.1 we checked the stability of the solutions of the coefficients in IEEE double precision arithmetic from Table 7, substituting them in the system of equations (37), (39) taking $de_i = d_i + e_i$ where $d_i$ and $e_i$ are the values from Table 7, (41) and (51). Analogously, in all cases the relative error $|b_i - 1/i!|\,i!$, $i = p, p+1, \ldots, m+p$, see (54), was lower than the unit roundoff $u$.
In a similar way we also checked the stability for the computation of the exponential Taylor polynomial approximation for all the degrees $m$ from Table 6 up to $m = 81$, obtaining the following results:
• There were 4 real solutions for all orders except for $m = 25$, with 12 real solutions, $m = 49$, 64, and 56 (with parameters $s = 8$, $p = 24$) with 8 real solutions, and $m = 42$ (with $p = 14$, $s = 7$) with 20 real solutions.
• The solutions for $e_s$ decreased in modulus from $m = 12$ with $|e_s|$ of order $10^{-2}$ to $m = 81$ with $|e_s|$ of order $10^{-44}$.
• In the case $m = 42$ (with $p = 14$, $s = 7$) the 20 solutions all had positive values $e_s \in [2.23 \times 10^{-16}, 8.07 \times 10^{-16}]$. Taking the solutions $e_s$ in double precision arithmetic, from the 20 solutions there were 12 solutions that gave a maximum relative error for all coefficients $b_i$ less than $3u$, being stable. However, 8 solutions showed certain signs of instability, giving a maximum relative error for coefficients $b_i$ between $5.04 \times 10^{-12}$ and $2.99 \times 10^{-10} > u$. Therefore, it is important to select a solution for $e_s$ in double precision arithmetic that gives relative errors for all coefficients $b_i$ of order $u$.
We also checked the stability for the Taylor approximation of the matrix exponential in all the cases from Table 5 and found that the worst case was $m = 28$ with $s = 7$. This is not a case of practical use since, from Table 5, it has a cost $8M$, and from Table 6, using $z_{1ps}(A)$ with $p = 10$ and $s = 5$ gives the greater order $m = 30$ for the same cost, and that option was checked above to be stable. However, we checked its stability as a worst case study. This case gave 3 real solutions, where one of them had multiplicity 10. For the coefficients using the two solutions $e_s$ with multiplicity 1 the maximum relative errors for all coefficients $b_i$ were of order $10^{-15} > u$. We also checked the scalar case $A = 1$, giving relative errors $|\exp(1) - y_{1,s=7}(1)|/\exp(1) = 4.36 \times 10^{-16}$ and $3.70 \times 10^{-15}$, respectively. However, using the solution with multiplicity 10 gave a maximum relative error $10.75 \gg u$ for coefficient $b_8$. For the rest of the coefficients the maximum relative error was $1.49 \times 10^{-14}$, and $|\exp(1) - y_{1,s=7}(1)|/\exp(1) = 9.81 \times 10^{-5}$, so the accuracy was much lower when using the solution of $e_s$ with multiplicity 10.
Therefore, it is necessary to check the stability of the solutions for $e_s$ before using the method to evaluate a given polynomial. In general, we propose to select the solution for $e_s$ in double precision arithmetic that gives the lowest maximum relative error for all coefficients $b_i$. If there is no solution giving relative errors of order $u$ for a given polynomial with degree $m$, a different parameter selection from Tables 6 and 5 should be tested, since in Table 5 for $m > 16$ there are two possibilities for $p$ and $s$ that give each value of $m$.
4. Comparison with existing methods
Using (36), (53) and Tables 1, 2, 3, 5 and 6, we obtain Table 8, which shows the approximation orders that can be obtained for a given cost in terms of matrix products with: Taylor polynomial approximations evaluated using the Horner and Paterson–Stockmeyer methods $PS_m(A)$, $y_{1s}(A)$ from (35) and $z_{1ps}(A)$ from (52); the Padé rational approximation from Section 2.2; and the mixed rational and polynomial approximation from Section 2.3. Each approximation is assumed to reproduce the first terms of the Taylor series of a given function $f$, whenever all the approximations exist. Note that the cost of solving the multiple right-hand side linear system in rational approximations was taken as $4/3\,M$.
Table 8 shows that the highest polynomial approximation order is attained by both $y_{1s}(A)$ and $z_{1ps}(A)$ for $3M \le C \le 6M$, and by $z_{1ps}(A)$ alone for $C \ge 7M$. Note that in Section 3.1 for $C \le 5M$ we took $z_{1ps}(A) = z_{1,p=0,s}(A) = y_{1s}(A)$, see (55). Hence, the approximation orders allowed by $z_{1ps}(A)$ for $C \ge 3M$ are higher than the approximation orders available with both the Paterson–Stockmeyer and the rational Padé methods. The highest order for $C \ge 6M$ is given by the mixed rational and polynomial approximation $t_{ijs}(A)$ (10). In the following section particular examples are given in order to increase the efficiency of polynomial approximations even more.
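The comparison in Table 8 can be cross-checked with a short script. The sketch below copies the polynomial and Padé columns from Table 8; the closed form `floor((C + 2)^2 / 4)` for the Paterson–Stockmeyer column is an observation made here that happens to reproduce the tabulated values, not a formula quoted from this section, and the function names are hypothetical.

```python
# Columns of Table 8, indexed by the polynomial cost C = 3, ..., 11 (in M).
COSTS = list(range(3, 12))
PS = [6, 9, 12, 16, 20, 25, 30, 36, 42]      # Horner / Paterson-Stockmeyer
Y1S = [8, 12, 16, 20, 24, 28, 32, 36, 40]    # y_1s(A)
Z1PS = [8, 12, 16, 20, 25, 30, 36, 42, 49]   # z_1ps(A)
RMM = [6, 8, 12, 16, 20, 24, 30, 36, 42]     # diagonal Pade, at cost C + 1/3


def ps_max_degree(cost):
    """Assumed closed form floor((C + 2)^2 / 4); it reproduces the PS column."""
    return (cost + 2) ** 2 // 4


def y1s_max_degree(cost):
    """Degree 4s at cost (s + 1)M, so s = C - 1; reproduces the y_1s column."""
    return 4 * (cost - 1)


def best_polynomial(index):
    """Return (order, names) of the best polynomial scheme in one table row."""
    row = {"PS": PS[index], "y1s": Y1S[index], "z1ps": Z1PS[index]}
    top = max(row.values())
    return top, sorted(name for name, order in row.items() if order == top)


for i, cost in enumerate(COSTS):
    order, names = best_polynomial(i)
    print(f"C = {cost:2d}M: order {order:2d} via {', '.join(names)}; "
          f"Pade reaches {RMM[i]} at {cost}.33M")
```

In every row the order of $z_{1ps}(A)$ exceeds both the Paterson–Stockmeyer order at the same cost and the Padé order at the slightly higher cost $C + 1/3$.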
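| $C$ ($M$) | $PS_m(A)$ | $y_{1s}(A)$ | $z_{1ps}(A)$ | $C_R$ ($M$) | $r_{mm}(A)$ | $C_{RP}$ ($M$) | $t_{ijs}(A)$ |
|---|---|---|---|---|---|---|---|
| 3 | 6 | 8 | 8 | 3.33 | 6 | 3.33 | 9 |
| 4 | 9 | 12 | 12 | 4.33 | 8 | 4.33 | 12 |
| 5 | 12 | 16 | 16 | 5.33 | 12 | 5.33 | 16 |
| 6 | 16 | 20 | 20 | 6.33 | 16 | 6 | 21 |
| 7 | 20 | 24 | 25 | 7.33 | 20 | 7 | 28 |
| 8 | 25 | 28 | 30 | 8.33 | 24 | 8 | 35 |
| 9 | 30 | 32 | 36 | 9.33 | 30 | 9 | 42 |
| 10 | 36 | 36 | 42 | 10.33 | 36 | 10 | 49 |
| 11 | 42 | 40 | 49 | 11.33 | 42 | 11 | 56 |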
Table 8: Maximum approximation orders if any of the approximations reproduce the first terms of the Taylor series of a given function $f$, for a given cost $C$ for polynomial approximations, $C_R$ for rational approximations and $C_{RP}$ for mixed rational and polynomial approximants, where rational approximations are computed as in Section 2.2 and mixed rational and polynomial approximants are evaluated as in Section 2.3. The polynomial approximations considered are Horner and Paterson–Stockmeyer $PS_m(A)$ from Section 2.1, and $y_{1s}(A)$ and $z_{1ps}(A)$ from Section 3. Bold style is applied to the maximum degrees over all polynomial approximations, and to $t_{ijs}(A)$ when it provides the maximum degree over all approximations with an integer cost.
5. General expressions
This section gives examples that suggest new general expressions for eval-
uating matrix polynomials more efficiently than the evaluation schemes given
in Section 3.
Example 5.1. Consider

$$y_{02}(A) = A^2(c_{16}A^2 + c_{15}A), \tag{57}$$

$$y_{12}(A) = (y_{02}(A) + c_{14}A^2 + c_{13}A)(y_{02}(A) + c_{12}A^2 + c_{11}I) + c_{10}\,y_{02}(A), \tag{58}$$

$$y_{22}(A) = (y_{12}(A) + c_9A^2 + c_8A)(y_{12}(A) + c_7\,y_{02}(A) + c_6A) + c_5\,y_{12}(A) + c_4\,y_{02}(A) + c_3A^2 + c_2A + c_1I, \tag{59}$$

where the coefficients are numbered consecutively and $A^2$ is computed once and stored to be reused in all the computations. It is easy to show that the degree of the polynomial $y_{22}(A)$ is $m = 16$ and that it can be evaluated with a cost $C_{y_{22}} = 4M$.
Using the function solve from the MATLAB Symbolic Math Toolbox, Table 9 gives one solution for the coefficients to compute the exponential Taylor approximation $P_m(A)$ of order $m = 15$, i.e. $b_i = 1/i!$, $i = 0, 1, \ldots, 15$. For the
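| Coefficient | Value | Coefficient | Value |
|---|---|---|---|
| $c_{16}$ | $4.018761610201036 \times 10^{-4}$ | $c_8$ | $2.116367017255747 \times 10^{0}$ |
| $c_{15}$ | $2.945531440279683 \times 10^{-3}$ | $c_7$ | $-5.792361707073261 \times 10^{0}$ |
| $c_{14}$ | $8.712167566050691 \times 10^{-2}$ | $c_6$ | $-1.491449188999246 \times 10^{-1}$ |
| $c_{13}$ | $4.017568440673568 \times 10^{-1}$ | $c_5$ | $1.040801735231354 \times 10^{1}$ |
| $c_{12}$ | $-6.352311335612147 \times 10^{-2}$ | $c_4$ | $-6.331712455883370 \times 10^{1}$ |
| $c_{11}$ | $2.684264296504340 \times 10^{-1}$ | $c_3$ | $3.484665863364574 \times 10^{-1}$ |
| $c_{10}$ | $1.857143141426026 \times 10^{1}$ | $c_2$ | $-1.224230230553340 \times 10^{-1}$ |
| $c_9$ | $2.381070373870987 \times 10^{-1}$ | $c_1$ | $1$ |

Table 9: Coefficients of $y_{02}$, $y_{12}$, $y_{22}$ from (57)-(59) for computing the matrix exponential Taylor approximation of order $m = 15$.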
solution given in Table 9, if we write $y_{22}(A)$ as a polynomial $P_m(A)$ of degree $m = 16$, the relative error for $b_{16}$ with respect to the corresponding Taylor polynomial coefficient is

$$(b_{16} - 1/16!)\,16! = -0.454, \tag{60}$$

showing three significant digits.
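The claims of Example 5.1 can be checked numerically with a short script. The sketch below expands (57)-(59) in powers of a scalar variable, using the coefficients of Table 9 and plain coefficient lists (lowest degree first). The helper names `pmul`, `padd` and `scale` are assumptions introduced for this illustration. For a matrix argument, each call to `pmul` below would correspond to one matrix product, four in total.

```python
from math import factorial


def pmul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def padd(*polys):
    """Sum of several polynomials of possibly different lengths."""
    out = [0.0] * max(len(p) for p in polys)
    for p in polys:
        for i, a in enumerate(p):
            out[i] += a
    return out


def scale(alpha, p):
    return [alpha * a for a in p]


# Coefficients c_1, ..., c_16 copied from Table 9.
c = {16: 4.018761610201036e-4, 15: 2.945531440279683e-3,
     14: 8.712167566050691e-2, 13: 4.017568440673568e-1,
     12: -6.352311335612147e-2, 11: 2.684264296504340e-1,
     10: 1.857143141426026e1, 9: 2.381070373870987e-1,
     8: 2.116367017255747e0, 7: -5.792361707073261e0,
     6: -1.491449188999246e-1, 5: 1.040801735231354e1,
     4: -6.331712455883370e1, 3: 3.484665863364574e-1,
     2: -1.224230230553340e-1, 1: 1.0}

x = [0.0, 1.0]
x2 = pmul(x, x)                                                   # product 1
y02 = pmul(x2, padd(scale(c[16], x2), scale(c[15], x)))           # product 2
y12 = padd(pmul(padd(y02, scale(c[14], x2), scale(c[13], x)),
                padd(y02, scale(c[12], x2), [c[11]])),            # product 3
           scale(c[10], y02))
y22 = padd(pmul(padd(y12, scale(c[9], x2), scale(c[8], x)),
                padd(y12, scale(c[7], y02), scale(c[6], x))),     # product 4
           scale(c[5], y12), scale(c[4], y02),
           scale(c[3], x2), scale(c[2], x), [c[1]])

# Relative errors of b_0, ..., b_15 with respect to 1/i!, and (60) for b_16.
rel_err = [abs(y22[i] * factorial(i) - 1.0) for i in range(16)]
b16_rel = y22[16] * factorial(16) - 1.0
print(len(y22) - 1, max(rel_err), b16_rel)
```

The expansion has degree 16, and the value computed for `b16_rel` can be compared with (60).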
We selected different possibilities for a new coefficient $c_0$ added in (57)-(59), trying to compute the matrix exponential and the matrix cosine Taylor approximations of order 16, for instance changing (58) for

$$y_{12}(A) = (y_{02}(A) + c_{14}A^2 + c_{13}A + c_0I)(y_{02}(A) + c_{12}A^2 + c_{11}I) + c_{10}\,y_{02}(A), \tag{61}$$

and other options. However, sometimes MATLAB could not find an explicit solution for the coefficients, and the other times MATLAB gave solutions with numeric instability.
Note that in Example 5.1 the degree of $y_{k,2}(A)$, $k = 1, 2$, is twice the degree of the polynomial $y_{k-1,2}(A)$, increasing the cost by just $1M$ when computing $y_{k,2}(A)$ using $y_{k-1,2}(A)$. Therefore, the polynomial degree increases exponentially while the cost increases linearly. Following this idea, Proposition 2 gives expressions $y_{ks}(A)$, $k \ge 1$, more general than (34) and (35), where the degree of the polynomial $y_{ks}(A)$ is twice the degree of the polynomial $y_{k-1,s}(A)$, $k \ge 1$, while the cost increases by $1M$ when computing $y_{ks}(A)$ using $y_{k-1,s}(A)$:
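Proposition 2. Let

$$y_{0s}(x) = x^s \sum_{i=1}^{s} c_i^{(0,1)} x^i + \sum_{i=0}^{s} c_i^{(0,2)} x^i, \tag{62}$$

$$y_{1s}(x) = \left( \sum_{i=0}^{0} c_i^{(1,1)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(1,2)} x^i \right) \left( \sum_{i=0}^{0} c_i^{(1,3)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(1,4)} x^i \right) + \sum_{i=0}^{0} c_i^{(1,5)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(1,6)} x^i, \tag{63}$$

$$y_{2s}(x) = \left( \sum_{i=0}^{1} c_i^{(2,1)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(2,2)} x^i \right) \left( \sum_{i=0}^{1} c_i^{(2,3)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(2,4)} x^i \right) + \sum_{i=0}^{1} c_i^{(2,5)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(2,6)} x^i, \tag{64}$$

$$\vdots$$

$$y_{ks}(x) = \left( \sum_{i=0}^{k-1} c_i^{(k,1)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(k,2)} x^i \right) \left( \sum_{i=0}^{k-1} c_i^{(k,3)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(k,4)} x^i \right) + \sum_{i=0}^{k-1} c_i^{(k,5)} y_{is}(x) + \sum_{i=0}^{s} c_i^{(k,6)} x^i, \tag{65}$$

where $y_{ks}(x)$ is a polynomial of $x$. Then, the maximum polynomial degree, denoted by $d_{y_{ks}}$, and the computing cost if $x = A$, $A \in \mathbb{C}^{n \times n}$, in terms of matrix products, denoted by $C_{y_{ks}}$, are given by

$$d_{y_{ks}} = 2^{k+1}s, \qquad C_{y_{ks}} = (s + k)M. \tag{66}$$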
Proof. From (62), the maximum degree of the polynomial $y_{0s}(x)$ is $2s$. Then using (62)-(65) the maximum degree of $y_{is}(x)$, $i \le k$, is $2^{i+1}s$. If $x = A$, $A \in \mathbb{C}^{n \times n}$, then the cost of computing $y_{ks}(A)$ is $s - 1$ matrix products for computing $A^i$, for $i = 2, 3, \ldots, s$, and one matrix product in each iteration from (62)-(65), i.e. $k + 1$. Therefore, $C_{y_{ks}} = (s + k)M$.
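The degree and cost formulas in (66) can be illustrated by running the recursion (62)-(65) on a scalar variable with randomly chosen positive coefficients and counting the products between non-constant polynomials, which are the operations that become matrix products when $x = A$. The sketch below is an assumption-laden illustration: the function name `scheme_degree_and_cost` and the choice of coefficients in $[0.5, 1.5]$ (so that no leading term cancels) are made here and are not part of Proposition 2.

```python
import random


def poly_mul(p, q):
    """Product of two polynomials stored as coefficient lists (lowest degree first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(*polys):
    out = [0.0] * max(len(p) for p in polys)
    for p in polys:
        for i, a in enumerate(p):
            out[i] += a
    return out


def scheme_degree_and_cost(k, s, seed=0):
    """Build y_ks from (62)-(65) with random coefficients; return (degree, products)."""
    rng = random.Random(seed)
    products = 0

    def scaled(p):
        a = rng.uniform(0.5, 1.5)       # positive, so leading terms never cancel
        return [a * v for v in p]

    powers = [[1.0], [0.0, 1.0]]        # x^0 and x^1
    for _ in range(2, s + 1):           # x^2, ..., x^s: s - 1 products
        powers.append(poly_mul(powers[-1], powers[1]))
        products += 1

    def low(first):                     # sum_{i=first}^{s} c_i x^i
        return poly_add(*[scaled(powers[i]) for i in range(first, s + 1)])

    ys = [poly_add(poly_mul(powers[s], low(1)), low(0))]    # (62)
    products += 1

    def mix():                          # sum_i c_i y_is(x) + sum_i c_i x^i
        return poly_add(*[scaled(y) for y in ys], low(0))

    for _ in range(k):                  # (63)-(65): one product per step
        ys.append(poly_add(poly_mul(mix(), mix()), mix()))
        products += 1

    return len(ys[-1]) - 1, products


for k in range(3):
    print(k, [scheme_degree_and_cost(k, s) for s in range(1, 5)])
```

For $k = 2$ and $s = 2$ the script reports degree 16 with 4 products, matching Example 5.1.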
Note that (34) and (35) are particular cases of Proposition 2 where $k = 1$ and some coefficients $c_i^{(l,j)}$, $l = 0, 1$, in (62) and (63) are zero. Similarly, (57)-(59) are particular cases of (62)-(64) where $k = 2$, $s = 2$ and some coefficients $c_i^{(l,j)}$, $l = 0, 1, 2$, are also zero.
If we write (65) in powers of $x$ as

$$y_{ks}(x) = \sum_{i=0}^{m} a_i x^i, \tag{67}$$

then $a_i$, $i = 0, 1, \ldots, m$, are functions of the coefficients $c_i^{(l,j)}$, for all $i, j, l$ in (62)-(65). Hence, it is possible to evaluate the matrix polynomial $P_m(A)$ using (62)-(65) if the system of equations

$$\begin{aligned} a_m(c_i^{(l,j)}) &= b_m, \\ a_{m-1}(c_i^{(l,j)}) &= b_{m-1}, \\ &\;\;\vdots \\ a_0(c_i^{(l,j)}) &= b_0, \end{aligned} \tag{68}$$

for all coefficients $c_i^{(l,j)}$ from (62)-(65) involved in each coefficient $a_i$, $i = 0, 1, \ldots, m$, has at least one solution, where $b_i$ are the polynomial coefficients of $P_m(A)$. We have obtained a general solution for evaluating polynomials using (34) and (35), corresponding to particular cases of (62) and (63), and we obtained one solution for computing the exponential Taylor approximation of order 15 with (57)-(59). Future work will address obtaining general solutions for evaluating matrix polynomials of different degrees using (62)-(65), and studying whether there are at least particular solutions for evaluating polynomials such as the Taylor polynomial approximations of certain degrees for different matrix functions. That is the case of Example 5.1, which provides formulas for computing the exponential Taylor approximation polynomial of order $m = 15$ with a cost $C = 4M$. From Table 8 it follows that with a cost of $4M$ the Paterson–Stockmeyer method allows computing the matrix exponential Taylor approximation polynomial of order only $m = 9$, the Padé rational method $r_{mm}(A)$ allows an order less than 8, the mixed rational and polynomial approximation $t_{ijs}(A)$ allows an order less than 12, and the new method based on (34) and (35) allows an order $m = 12$.
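As a worked illustration of (67)-(68), expanding (57)-(59) by hand gives the following expressions for four of the coefficients $a_i$ of $y_{22}$. This expansion is a sketch added here for illustration, not a formula stated in the text, and it can be checked numerically against Table 9.

```latex
\begin{aligned}
a_{16} &= c_{16}^{4}, \\
a_{15} &= 4\,c_{16}^{3}\,c_{15}, \\
a_{1}  &= c_{5}\,c_{11}\,c_{13} + c_{2}, \\
a_{0}  &= c_{1}.
\end{aligned}
```

Imposing $a_i = 1/i!$ for $i = 0, 1, \ldots, 15$ gives $c_1 = 1$, in agreement with Table 9. Since the 16 coefficients $c_1, \ldots, c_{16}$ are already used to match $b_0, \ldots, b_{15}$, the leading coefficient $a_{16} = c_{16}^4$ is in general not free, which is consistent with the nonzero relative error in (60).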
In the following example we consider the computation of the Taylor ex-
ponential approximation of order 16 by using the product of two polynomials
of degree 8, both evaluated using (14) and (15).
Example 5.2. Let

$$h_{2m_1}(A) = P_{m_1}(A)\,P'_{m_1}(A) + \beta_0 = \sum_{i=0}^{m_1} b_i A^i \sum_{i=0}^{m_1} b'_i A^i + \beta_0, \tag{69}$$
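| Coefficient | Value | Coefficient | Value |
|---|---|---|---|
| $b_8$ | $2.186201576339059 \times 10^{-7}$ | $b'_8$ | $2.186201576339059 \times 10^{-7}$ |
| $b_7$ | $9.839057366529322 \times 10^{-7}$ | $b'_7$ | $2.514016785489562 \times 10^{-6}$ |
| $b_6$ | $1.058964584814256 \times 10^{-5}$ | $b'_6$ | $3.056479369585950 \times 10^{-5}$ |
| $b_5$ | $1.554700173279057 \times 10^{-4}$ | $b'_5$ | $3.197607034851565 \times 10^{-4}$ |
| $b_4$ | $2.256892506343887 \times 10^{-3}$ | $b'_4$ | $2.585006547542889 \times 10^{-3}$ |
| $b_3$ | $2.358987357109499 \times 10^{-2}$ | $b'_3$ | $1.619043970183846 \times 10^{-2}$ |
| $b_2$ | $1.673139636901279 \times 10^{-1}$ | $b'_2$ | $8.092036376147299 \times 10^{-2}$ |
| $b_1$ | $7.723603212944010 \times 10^{-1}$ | $b'_1$ | $3.229486011362677 \times 10^{-1}$ |
| $b_0$ | $3.096467971936040 \times 10^{0}$ | $\beta_0$ | $1$ |

Table 10: Coefficients from (69) for computing the matrix exponential Taylor approximation of order $m = 16$, where coefficient $b'_8 = b_8$ and $b'_0 = 0$.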
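| Coefficient | Value | Coefficient | Value |
|---|---|---|---|
| $c_4$ | $4.675683454147702 \times 10^{-4}$ | $c'_4$ | $4.675683454147702 \times 10^{-4}$ |
| $c_3$ | $1.052151783051235 \times 10^{-3}$ | $c'_3$ | $2.688394980266927 \times 10^{-3}$ |
| $d_2$ | $-3.289442879547955 \times 10^{-2}$ | $d'_2$ | $2.219811707032801 \times 10^{-2}$ |
| $d_1$ | $2.868706220817633 \times 10^{-1}$ | $d'_1$ | $3.968985915411500 \times 10^{-1}$ |
| $e_2$ | $5.317514832355802 \times 10^{-2}$ | $e'_2$ | $2.771400028062960 \times 10^{-2}$ |
| $e_0$ | $7.922322450524197 \times 10^{0}$ | $e'_0$ | $1.930814505527068 \times 10^{0}$ |
| $f_2$ | $1.673139636901279 \times 10^{-1}$ | $f'_2$ | $8.092036376147299 \times 10^{-2}$ |
| $f_1$ | $7.723603212944010 \times 10^{-1}$ | $f'_1$ | $1.614743005681339 \times 10^{-1}$ |
| $f_0$ | $3.096467971936040 \times 10^{0}$ | $f'_0$ | $0$ |

Table 11: Coefficients from system (16)-(24) for evaluating polynomials $y_1(A) = P_{m_1}(A)$ and $y'_1(A) = P'_{m_1}(A)$ from (69) with coefficients given by Table 10. Note that $f'_0 = 0$ since $y'_1(0) = b'_0 = 0$.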
where we took $m_1 = 8$, $b'_8 = b_8$, $b'_0 = 0$ and $h_{2m_1}(0) = \beta_0$, and, therefore, $P_{m_1}(A)$ and $P'_{m_1}(A)$ are both polynomials as (3) of degree 8, and $h_{2m_1}(A)$ can be written as a polynomial of degree 16 with 17 coefficients, i.e. $b_i$, $i = 0, 1, \ldots, 8$, $b'_i$, $i = 1, \ldots, 7$, and $\beta_0$. Using the MATLAB Symbolic Math Toolbox solve function, Table 10 presents one solution for the coefficients of an example where $h_{2m_1}(A) = \sum_{i=0}^{16} A^i/i!$, i.e. the exponential Taylor polynomial approximation of degree $m = 16$.

Note that one can evaluate both polynomials $P_{m_1}(A)$ and $P'_{m_1}(A)$ using an evaluation scheme (14) and (15), see Example 3.1. Finally, from (69) it follows that $\beta_0 = 1$ so that $h_{2m_1}(0) = \exp(0) = 1$. Table 11 shows one solution for the coefficients from (16)-(24) using (25)-(32) taking $y_{1s}(A) = P_{m_1}(A)$, and the coefficients taking $y'_{1s}(A) = P'_{m_1}(A)$, corresponding to $c'_4$, $c'_3$, $d'_2$, $d'_1$, $e'_2$, $e'_0$, $f'_2$, $f'_1$ and $f'_0$.
| $C$ ($M$) | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|
| $d_{PS}$ | 16 | 20 | 25 | 30 | 36 | 42 | 49 |
| $d_{z_{1ps}}$ | 20 | 25 | 30 | 36 | 42 | 49 | 56 |
| $d_{h_{m_1}}$ | 16 | 24 | 32 | 40 | 48 | 56 | 64 |

Table 12: Order of the approximation $d_{PS}$ that can be obtained using the Horner and Paterson–Stockmeyer methods, order $d_{z_{1ps}}$ that can be obtained using $z_{1ps}(A)$ from (52), and order $d_{h_{m_1}}$ that can be obtained using the method given by $h_{m_1}(A)$ from (69), using (34) and (35) for evaluating the polynomials therein, for a given cost $C$ in terms of matrix products, whenever the solutions for the coefficients from (69), (34) and (35) exist.
In general, if we evaluate both polynomials $P_{m_1}(A)$ and $P'_{m_1}(A)$ by using (34) and (35) with $m_1 = 4s$, and if there exists a solution for the coefficients $b_i$ and $b'_i$ for $P_{m_1}(A)$ and $P'_{m_1}(A)$, then using (36) the degree of the matrix polynomial $h_{2m_1}(A)$ and its computing cost are

$$d_{h_{2m_1}} = 8s, \qquad C_{h_{2m_1}} = (s + 4)M. \tag{70}$$
Table 12 shows the comparison of the polynomial degrees that can be obtained by the Horner and Paterson–Stockmeyer methods, $z_{1ps}(A)$ from (52) and $h_{2m_1}(A)$ given by (69) varying $m_1$, for a given cost, whenever a solution for all the coefficients involved in $h_{2m_1}(A)$ exists. Since for $C > 6M$ these schemes would be more efficient than the Paterson–Stockmeyer method and for $C > 7M$ they would be more efficient than the method given by (52), it is worth studying whether evaluation schemes like (69) exist in general, or at least for the polynomial approximation of specific matrix functions or for the evaluation of matrix polynomials in the applications. Moreover, in order to obtain a polynomial degree equal to $2m_1$, note that one can think of other possibilities to have $2m_1 + 1$ coefficients in $h_{2m_1}(A)$ different from selecting $b_{m_1} = b'_{m_1}$ and $b'_0 = 0$ as in Example 5.2.
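The crossover costs quoted above follow directly from (70) and the first two rows of Table 12. The short sketch below recomputes the third row from (70) and lists the costs at which the product scheme would give a higher degree; the names `h_degree`, `beats_ps` and `beats_z1ps` are hypothetical, and the result is conditional on the existence of solutions for the coefficients, as stated in the text.

```python
COSTS = [6, 7, 8, 9, 10, 11, 12]            # cost C in matrix products
D_PS = [16, 20, 25, 30, 36, 42, 49]         # row d_PS of Table 12
D_Z1PS = [20, 25, 30, 36, 42, 49, 56]       # row d_z1ps of Table 12


def h_degree(cost):
    """Degree 8s at cost (s + 4)M from (70), so s = C - 4."""
    return 8 * (cost - 4)


D_H = [h_degree(c) for c in COSTS]

# Costs at which the product scheme (69) strictly beats each alternative.
beats_ps = [c for c, h, p in zip(COSTS, D_H, D_PS) if h > p]
beats_z1ps = [c for c, h, z in zip(COSTS, D_H, D_Z1PS) if h > z]
print(D_H, beats_ps, beats_z1ps)
```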
Note that, similarly to Section 3.1, the Paterson–Stockmeyer method can be combined with any other method proposed above. Analogously to Example 5.2, we can also obtain new methods for evaluating matrix polynomials and matrix polynomial approximations using products of the evaluation schemes proposed above, whenever a solution for all the coefficients involved exists. The same powers $A^i$, $i = 1, 2, \ldots, s$, should be used in each evaluation scheme involved, so that they can be reused in all the computations. It is important to note that even in the case of the well known Padé approximations, for a given function $f$, $k$ and $m$, a $[k/m]$ Padé approximant $r_{k,m}$ might not exist, see Section 2.2. Therefore, the existence of particular cases of the methods proposed in this section for computing matrix functions arising often in the applications is useful if they are more efficient than the existing methods in those concrete cases. That is the case of Example 5.1 with the matrix exponential Taylor approximation of order 15, which can be computed with just $4M$.
6. Conclusions
This paper proposes the new general evaluation schemes for matrix polynomials given by $y_{0s}(A)$ (34), $y_{1s}(A)$ (35) and $z_{1ps}(A)$ (52), and gives a method to check their stability. It was shown that these evaluation schemes allow evaluating polynomials of degree higher than that of the Paterson–Stockmeyer method for the same cost. It was also shown that they provide a greater Taylor approximation order than diagonal Padé approximation for the same cost. Moreover, the new evaluation schemes are more efficient than the recent mixed rational and polynomial approximation from [9] for several orders of approximation.
Through Examples 5.1 and 5.2, we suggest the study of more general poly-
nomial evaluation schemes that can be even more efficient, and applications
to the Taylor approximation of matrix functions were given.
With the proposed methods we can state that the combination of Horner and Paterson–Stockmeyer methods is no longer the most efficient general method for evaluating matrix polynomials, and that Padé approximations are no longer more accurate than polynomial approximations for the same cost either.
Future work is:
• To determine if it is possible to find general solutions for evaluating matrix polynomials using (62)-(65) with $s \ge 2$ and $k \ge 2$, or at least particular solutions for cases of interest as in Example 5.1.
• To study if there are general solutions, or at least particular solutions, for the matrix polynomial evaluation using products of the newly proposed matrix polynomial evaluation schemes, similarly to Example 5.2.
7. Acknowledgements
This work has been supported by Spanish Ministerio de Economía y Competitividad and European Regional Development Fund (ERDF) grant TIN2014-59294-P. We thank the anonymous referee who revised this paper so thoroughly and carefully.
References
[1] M. S. Paterson, L. J. Stockmeyer, On the number of nonscalar multiplications necessary to evaluate polynomials, SIAM J. Comput., 2(1) (1973), pp. 60–66.
[2] N. J. Higham, Functions of Matrices: Theory and Computation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008.
[3] G. H. Golub, C. F. Van Loan, Matrix Computations, 3rd Ed., Johns Hopkins Studies in Math. Sci., The Johns Hopkins University Press, 1996.
[4] C. B. Moler, C. F. Van Loan, Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later, SIAM Rev., 45 (2003), pp. 3–49.
[5] A. H. Al-Mohy, N. J. Higham, A new scaling and squaring algorithm for the matrix exponential, SIAM J. Matrix Anal. Appl., 31(3) (2009), pp. 970–989.
[6] P. Ruiz, J. Sastre, J. Ibáñez, E. Defez, High performance computing of the matrix exponential, J. Comput. Appl. Math., 291 (2016), pp. 370–379.
[7] A. H. Al-Mohy, N. J. Higham, S. Relton, New algorithms for computing the matrix sine and cosine separately or simultaneously, SIAM J. Sci. Comput., 37(1) (2015), pp. A456–A487.
[8] J. Sastre, J. Ibáñez, P. Alonso, J. Peinado, E. Defez, Two algorithms for computing the matrix cosine function, Appl. Math. Comput., 312 (2017), pp. 66–77.
[9] J. Sastre, Efficient mixed rational and polynomial approximation of matrix functions, Appl. Math. Comput., 218(24) (2012), pp. 11938–11946.
[10] S. Blackford, J. Dongarra, Installation guide for LAPACK, LAPACK Working Note 41, Department of Computer Science, University of Tennessee, 1999.