Article
Efficient Evaluation of Matrix Polynomials beyond the
Paterson–Stockmeyer Method
Jorge Sastre 1 and Javier Ibáñez 2,*


Citation: Sastre, J.; Ibáñez, J. Efficient Evaluation of Matrix Polynomials beyond the Paterson–Stockmeyer Method. Mathematics 2021, 9, 1600. https://doi.org/10.3390/math9141600

Academic Editors: Luca Gemignani

Received: 14 May 2021; Accepted: 1 July 2021; Published: 7 July 2021
1 Instituto de Telecomunicaciones y Aplicaciones Multimedia, Universitat Politècnica de València, Camino de Vera s/n, 46022 Valencia, Spain; jsastrem@upv.es
2 Instituto de Instrumentación para Imagen Molecular, Universitat Politècnica de València, Camino de Vera s/n, 46022 Valencia, Spain
* Correspondence: jjibanez@upv.es
Abstract: Recently, two general methods for evaluating matrix polynomials requiring one matrix product less than the Paterson–Stockmeyer method were proposed, where the cost of evaluating a matrix polynomial is given asymptotically by the total number of matrix product evaluations. An analysis of the stability of those methods was given, and the methods have been applied to Taylor-based implementations for computing the exponential, the cosine, and the hyperbolic tangent matrix functions. Moreover, a particular example for the evaluation of the matrix exponential Taylor approximation of degree 15 requiring four matrix products was given, whereas the maximum polynomial degree available using the Paterson–Stockmeyer method with four matrix products is 9. Based on this example, a new family of methods for evaluating matrix polynomials more efficiently than the Paterson–Stockmeyer method was proposed, having the potential to achieve a much higher efficiency, i.e., requiring fewer matrix products for evaluating a matrix polynomial of a certain degree, or increasing the available degree for the same cost. However, the difficulty of this family of methods lies in the calculation of the coefficients involved for the evaluation of general matrix polynomials and approximations. In this paper, we provide a general matrix polynomial evaluation method requiring two matrix products less than the Paterson–Stockmeyer method for degrees higher than 30. Moreover, we provide general methods for evaluating matrix polynomial approximations of degrees 15 and 21 with four and five matrix product evaluations, respectively, whereas the maximum available degrees for the same cost with the Paterson–Stockmeyer method are 9 and 12, respectively. Finally, practical examples for evaluating Taylor approximations of the matrix cosine and the matrix logarithm accurately and efficiently with these new methods are given.
Keywords: efficient; matrix polynomial evaluation; matrix function; Taylor approximation; cosine; logarithm
1. Introduction
The authors of [1] presented a new family of methods for the evaluation of matrix polynomials more efficiently than the state-of-the-art method from [2] by Paterson and Stockmeyer (see [3], Section 4.2). These methods are based on the multiplication of matrix polynomials to get a new matrix polynomial with degree given by the sum of the degrees of the original matrix polynomials. The main difficulty in these methods lies in obtaining the coefficients involved for the evaluation of general matrix polynomials. In this sense, the authors of [1] (Section 3) gave two concrete general methods for evaluating matrix polynomials requiring one less matrix product than the Paterson–Stockmeyer method.
Regarding the cost of evaluating matrix polynomials, since the cost of a matrix product, denoted by M, is O(n³) for n×n matrices, and both the cost of the sum of two matrices and the cost of a product of a matrix by a scalar are O(n²), similarly to [3] (Section 4.2) the overall cost of the evaluation of matrix polynomials will be approximated asymptotically by the total number of matrix products.
The stability of the two methods from [1] (Section 3) was also analyzed, and applications to the Taylor approximation of the exponential and cosine matrix functions were given.
The two general polynomial evaluation methods from [1] (Section 3) named above were applied in [4,5] for the evaluation of Taylor polynomial approximations of the matrix exponential and the matrix cosine more efficiently than the state-of-the-art Padé methods from [6,7].
Moreover, the authors of [1] (Example 5.1) provided a polynomial evaluation formula for computing the matrix exponential Taylor approximation of degree 15 with cost 4M, whereas the maximum available degree for that cost using the two general methods from [1] (Section 3) named above is 12. Based on this example, the authors of [5] (Section 3) proposed other new particular evaluation formulae for computing the matrix exponential Taylor polynomial approximations of degrees 15, 21, 24, and 30 with costs 4M, 5M, 6M, and 7M, respectively. In [8], the authors proposed a particular method for evaluating the matrix exponential Taylor approximations with the lower degrees 12, 18, and 22 with costs 4M, 5M, and 6M, respectively (see [8], Table 3). Note that general methods for the evaluation of matrix polynomials more efficiently than the Paterson–Stockmeyer method are provided in [1], whereas [8] deals with the concrete case of the matrix exponential Taylor approximation. Moreover, (7) from [9] is equivalent to the particular case of taking s = 1 in (62)–(65) from [1]. The work [1] was submitted in October 2016, and evaluation Formulas (62)–(65) were first introduced as (25)–(28) of the early unpublished version [10] of February 2016, whereas [8] is an updated version of the unpublished reference [9] released more than one year later, i.e., in October 2017.
In this paper, we generalize the results from [5] (Section 3), given there for the particular case of the matrix exponential Taylor approximation of degrees 15, 21, 24, and 30. These generalizations consist of giving general procedures for:
- Evaluating polynomial approximations of matrix functions of degrees 15 and 21 with costs 4M and 5M, respectively.
- Evaluating matrix polynomials of degrees 6s with s = 3, 4, ..., with cost (s + 2)M.
- Evaluating matrix polynomials of degrees greater than 30 with two matrix products less than the Paterson–Stockmeyer method.
Finally, examples for computing Taylor approximations of the matrix cosine and the matrix logarithm efficiently and accurately using those evaluation formulae are given.
Regarding Taylor approximations, if

f(X) = ∑_{i≥0} a_i X^i

is the Taylor series of the matrix function f(·), where X ∈ ℂ^{n×n}, then

T_m(X) = ∑_{i=0}^{m} a_i X^i

is its Taylor approximation of order m (for the convergence of matrix Taylor series, see Theorem 4.7 of [3], p. 76).
From [11] (Section 1), a matrix X ∈ ℂ^{n×n} is a logarithm of B ∈ ℂ^{n×n} if e^X = B. Therefore, any nonsingular matrix has infinitely many logarithms, and we will focus on the principal logarithm, denoted by log(B). For a matrix B ∈ ℂ^{n×n} with no eigenvalues on ℝ⁻, the principal logarithm is the unique logarithm whose eigenvalues have imaginary parts lying in the interval (−π, π). Therefore, in the given examples, we will assume that B has no eigenvalues on ℝ⁻, and we will take the logarithm Taylor series

log(B) = log(I − A) = −∑_{i≥1} A^i/i,  where A = I − B. (1)
The exponential matrix function has been studied in numerous papers (see [3] (Chap. 10) and [5,6,8,12] and the references therein). This matrix function can be defined by

exp(X) = ∑_{i≥0} X^i/i!. (2)
The matrix cosine has received attention recently (see [4,7] and the references therein). This matrix function can be defined by

cos(X) = ∑_{i≥0} (−1)^i X^{2i}/(2i)! = ∑_{i≥0} (−1)^i Y^i/(2i)!,  Y = X². (3)

Note that if we truncate the Taylor series on the right-hand side of (3) at the term i = m, then the order of the corresponding cosine Taylor approximation is 2m.
Regarding the cost of matrix rational approximations, note that the multiplication by the corresponding matrix inverse is calculated by solving a multiple right-hand side linear system. From [13] (Appendix C), it follows that the cost of the solution of a multiple right-hand side linear system AX = B, where matrices A and B are n×n, denoted by D (see [14], p. 11940), is

D ≈ 4/3 M. (4)

Therefore, using (4), the cost of computing rational approximations will also be given in terms of M.
In this article, the following notation will be used: ⌈x⌉ denotes the smallest integer greater than or equal to x, and ⌊x⌋ the largest integer less than or equal to x. u denotes the unit roundoff in IEEE double precision arithmetic (see [15], Section 2.2). The set of positive integers is denoted by ℕ. The sets of real and complex n×n matrices are denoted, respectively, by ℝ^{n×n} and ℂ^{n×n}. The identity matrix for both sets is denoted by I. The dependence of a variable y on the variables x1, x2, ..., xn is denoted by y = y(x1, x2, ..., xn).
In Section 2, we recall some results for computing matrix polynomials using the Paterson–Stockmeyer method and summarize the matrix polynomial evaluation methods from [1]. In Section 3, we describe the general methods for computing polynomial approximations of degrees 15, 21, and 6s with s = 3, 4, ..., and give examples for the Taylor approximation of the cosine and logarithm matrix functions. Finally, in Section 4, we give some conclusions. In this paper, we provide a method to evaluate matrix polynomials with two matrix products less than the Paterson–Stockmeyer method and one matrix product less than the methods from [1] (Section 3). Moreover, we provide methods to evaluate polynomial approximations of matrix functions of degrees 15 and 21 with costs 4M and 5M, respectively. These methods are interesting because the maximum available degrees for those costs using the other method proposed in this paper are 12 and 18, respectively. All of the new methods proposed can be used in applications for computing approximations of matrix functions or evaluating matrix polynomials more efficiently than using the state-of-the-art methods.
2. Efficient Evaluation of Matrix Polynomials
2.1. Paterson–Stockmeyer Method
The Paterson–Stockmeyer method [2] for computing a matrix polynomial

P_m(A) = ∑_{i=0}^{m} c_i A^i, (5)

consists of calculating P_m(A) as

PS_m(A) = (((··· (c_m A^s + c_{m−1}A^{s−1} + ... + c_{m−s+1}A + c_{m−s}I)
  × A^s + c_{m−s−1}A^{s−1} + c_{m−s−2}A^{s−2} + ... + c_{m−2s+1}A + c_{m−2s}I)
  × A^s + c_{m−2s−1}A^{s−1} + c_{m−2s−2}A^{s−2} + ... + c_{m−3s+1}A + c_{m−3s}I)
  ···
  × A^s + c_{s−1}A^{s−1} + c_{s−2}A^{s−2} + ... + c_1A + c_0I, (6)
where PS_m(A) denotes the Paterson–Stockmeyer evaluation Formula (6) and s > 0 is an integer that divides m. Given a number of matrix products, the maximum degrees of P_m(A) that are available using the Paterson–Stockmeyer method are the following:
m = s², and m = s(s + 1), (7)
where s ∈ ℕ. These degrees are denoted by m*, m* = {1, 2, 4, 6, 9, 12, ...} [14] (Section 2.1). The costs C_PS for computing (6) for the values of m* are given by [14] (Equation (5)) and appear in [14] (Table 1). In [16], the optimality of the rule m = (C_PS − s + 2)s, where s = ⌊C_PS/2⌋ + 1, was demonstrated. This rule gives the same results as (7), since if C_PS is odd then C_PS = 2s − 1, and in that case m = s(s + 1), and if C_PS is even then C_PS = 2s − 2, and then m = s².
Note that, for positive integers m ∉ m*, P_m(A) = PS_{m′}(A) can be evaluated using (6), taking m′ = min{m1 ∈ m* : m1 ≥ m} and setting some coefficients as zero [1] (Section 2.1).
2.2. General Polynomial Evaluation Methods beyond the Paterson–Stockmeyer Method
The authors of [1] (Example 3.1) give a method to compute P8(A) from (5) with a cost of 3M with the following evaluation formulae:

y02(A) = A²(q4A² + q3A), (8)

y12(A) = (y02(A) + r2A² + r1A)(y02(A) + s2A²) + s0y02(A) + t2A² + t1A + t0I, (9)
where q4, q3, r2, r1, s2, s0, t2, t1, and t0 are complex numbers. In order to compute (5) with m = 8, if we equate y12(A) = Pm(A) from (5), then the system of eight equations with eight coefficients from (16)–(24) of [1] arises. In this system, some coefficients can be obtained directly from the coefficients of the polynomial Pm(A) as

q4 = ±√c8, (10)
q3 = c7/(2q4), (11)
t2 = c2, (12)
t1 = c1, (13)
t0 = c0, (14)
and the remaining equations can be reduced by variable substitution to a quadratic equation in s2. This equation gives two solutions for q4 = √c8 and two more solutions for q4 = −√c8. The remaining coefficients can be obtained from s2, q4, and q3. From (11), one gets q4 ≠ 0, giving the condition

c8 ≠ 0, (15)

for coefficient c8 in Pm(A) from (5).
In order to check the stability of the solutions of qi, ri, and si rounded to IEEE double precision arithmetic, the authors of [1] (Example 3.1) proposed to compute the relative error for each coefficient ci, for i = 3, 4, ..., 8, substituting those solutions into the original system of Equations (16)–(24) from [1]. For instance, from (10), it follows that the relative error for c8 using q4 rounded to IEEE double precision arithmetic is

|c8 − q̃4²|/|c8|,

where q̃4 is the value q4 = ±√c8 rounded to IEEE double precision arithmetic. Then, if the relative errors for all the expressions of the coefficients ci are of the order of the unit roundoff in IEEE double precision arithmetic, i.e., u = 2⁻⁵³ ≈ 1.11 × 10⁻¹⁶, then the solution is stable.
In [1] (Table 4), one of the solutions rounded to IEEE double precision arithmetic for evaluating the Taylor polynomial of the exponential and cosine functions is shown. These solutions were substituted into the original system of equations to calculate the relative error for ci, for i = 3, 4, ..., 8 (see [1], Example 3.1), giving relative errors of order u and thus turning out to be stable solutions. Moreover, the numerical tests from [1] (Example 3.2) and [4,5] also show that if the relative error for each coefficient is O(u), then the polynomial evaluation formulae are accurate, and if the relative errors are O(10u) or greater, then the polynomial evaluation formulae are not so accurate.
The authors of [1] (Section 3) also provided a more general method for computing matrix polynomials Pm(A) from (5) of degree m = 4s, based on the evaluation formulae

y_{0s}(A) = A^s ∑_{i=1}^{s} q_{s+i} A^i, (16)

y_{1s}(A) = (y_{0s}(A) + ∑_{i=1}^{s} r_i A^i)(y_{0s}(A) + ∑_{i=2}^{s} s_i A^i) + s_0 y_{0s}(A) + ∑_{i=0}^{s} t_i A^i, (17)
where s ≥ 2, q_{s+i}, r_i, s_i, and t_i are scalar coefficients, q_{2s} = ±√c_{4s} ≠ 0, and then c_{4s} ≠ 0 for coefficient c_{4s} from Pm(A). Note that A^i, i = 2, 3, ..., s, are computed only once. The degree and computing cost of y_{1s}(A) are given by (36) of [1], i.e., d_{y1s} = 4s and C_{y1s} = (s + 1)M, s = 2, 3, ..., respectively. A general solution for the coefficients in (16) and (17) is given in [1] (Section 3), with the condition

c_{4s} ≠ 0. (18)
Given a cost
C(M)
, the maximum orders that can be reached when using the
Formulae (16) and (17) and the Paterson–Stockmeyer method are shown in [1] (Table 5).
Proposition 1 from [1] (Section 3) shows a method for computing matrix polynomials combining the Paterson–Stockmeyer method with (17) as

z_{kps}(x) = (((··· (y_{ks}(x)) x^s + a_{p−1}x^{s−1} + a_{p−2}x^{s−2} + ... + a_{p−s+1}x + a_{p−s})
  × x^s + a_{p−s−1}x^{s−1} + a_{p−s−2}x^{s−2} + ... + a_{p−2s+1}x + a_{p−2s})
  × x^s + a_{p−2s−1}x^{s−1} + a_{p−2s−2}x^{s−2} + ... + a_{p−3s+1}x + a_{p−3s})
  ···
  × x^s + a_{s−1}x^{s−1} + a_{s−2}x^{s−2} + ... + a_1x + a_0, (19)

where k = 1, p is a multiple of s, and y_{ks}(x) = y_{1s}(x) is evaluated using (16) and (17). This allows one to increase the degree of the polynomial to be evaluated. The degree of z_{1ps}(A) and its computational cost are given by (53) of [1], i.e., d_{z1ps} = 4s + p and C_{z1ps} = (1 + s + p/s)M, respectively. Ref. [1] (Table 6) shows that evaluating a matrix polynomial using (19) requires one less product than using the Paterson–Stockmeyer Formula (6).
Proposition 2 from [1] (Section 5) gives general formulae more efficient than the formulae of the previous methods, whenever at least one solution for the coefficients in (62)–(65) from [1] (Prop. 2) exists so that y_{ks}(x) is equal to the polynomial Pm to evaluate. The maximum polynomial degree and the computing cost if x = A, A ∈ ℂ^{n×n}, are given by (66) of [1], i.e., d_{yks} = 2^{k+1}s and C_{yks} = (s + k)M, where d_{yks} increases exponentially while C_{yks} increases linearly. Note that (17) is a particular case of (65) from [1] where k = 1.
3. Three General Expressions for y2s(A)
This section gives general procedures to obtain the coefficients of y_{2s}(A) from (65) of [1] with k = 2, generalizing the results from [5] (Section 3) for the evaluation of the matrix exponential Taylor approximations of degrees 15, 21, 24, and 30, and also giving formulae for evaluating matrix polynomials of orders 6s, where s = 2, 3, ...
3.1. Evaluation of Matrix Polynomial Approximations of Order 15 with y2s(A), s = 2
The following proposition allows one to compute polynomial approximations of order 15 with cost 4M. Note that, from [1] (Table 8), the maximum available order with cost 4M is 9 for the Paterson–Stockmeyer method and 12 for the method given by (16) and (17).
Proposition 1. Let y12(A) and y22(A) be

y12(A) = ∑_{i=2}^{8} c_i A^i, (20)

y22(A) = (y12(A) + d2A² + d1A)(y12(A) + e0y02(A) + e1A) + f0y12(A) + g0y02(A) + h2A² + h1A + h0I, (21)

and let P15(A) be a polynomial of degree 15 with coefficients bi:

P15(A) = ∑_{i=0}^{15} b_i A^i. (22)

Then,

y22(A) = ∑_{i=0}^{16} a_i A^i, (23)

where the coefficients ai are functions of the following variables:

ai = ai(c8, c7, ..., c2, d2, d1, e1, e0, f0, g0, h2, h1, h0), i = 0, 1, ..., 16,
and there exists at least one set of values of the 16 coefficients c8, c7, ..., c2, d2, d1, e1, e0, f0, g0, h2, h1, h0 so that

ai = bi, i = 0, 1, ..., 15, (24)

and

a16 = c8², (25)

provided the following conditions are fulfilled:

c8 ≠ 0, (26)

3b15² ≠ 8b14c8², (27)

27b15⁶c8^(45/2) + 576b14²b15²c8^(53/2) ≠ 512b14³c8^(57/2) + 216b14b15⁴c8^(49/2). (28)
Proof of Proposition 1. Note that y12(A) from (20) is a matrix polynomial of degree 8. Therefore, if condition (26) holds, then [1] (Example 3.1) gives four possible solutions for evaluating y12(A) using the evaluation Formulas (8) and (9) with cost 3M. Similarly to [5] (Section 3.2), we will denote these four solutions as nested solutions.
Using (10) and (11), one gets that y02(A) from (8) can be written as

y02(A) = ±A²(√c8 A² + c7/(2√c8) A). (29)

Then, taking the positive solution in (29), if we equate y22(A) from (23) to P15(A) from (22), we obtain the following nonlinear system with 16 coefficients ci, i = 2, 3, ..., 8, d2, d1, e1, e0, f0, g0, h2, h1, h0:
a15 = 2c7c8 = b15,
a14 = c7² + 2c6c8 = b14,
a13 = 2c5c8 + 2c6c7 = b13,
a12 = c6² + c8(c4 + √c8 e0) + c4c8 + 2c5c7 = b12,
a11 = c7(c4 + √c8 e0) + c3c8 + c4c7 + 2c5c6 + c8(c3 + c7e0/(2√c8)) = b11,
a10 = c5² + c6(c4 + √c8 e0) + ∑_{i=0}^{2} c_{2+i}c_{8−i} + c7(c3 + c7e0/(2√c8)) + c8(c2 + d2) = b10,
a9 = c5(c4 + √c8 e0) + ∑_{i=0}^{2} c_{2+i}c_{7−i} + c6(c3 + c7e0/(2√c8)) + c7(c2 + d2) + c8(d1 + e1) = b9,
a8 = c4(c4 + √c8 e0) + c2c6 + c3c5 + c5(c3 + c7e0/(2√c8))                        (30)
     + c6(c2 + d2) + c7(d1 + e1) + c8f0 = b8,
a7 = c3(c4 + √c8 e0) + c2c5 + c4(c3 + c7e0/(2√c8)) + c5(c2 + d2) + c6(d1 + e1) + c7f0 = b7,
a6 = c2c4 + c3(c3 + c7e0/(2√c8)) + (c2 + d2)(c4 + √c8 e0) + c5(d1 + e1) + c6f0 = b6,
a5 = d1√c8 e0 + c2c3 + (c2 + d2)(c3 + c7e0/(2√c8)) + c4(d1 + e1) + c5f0 = b5,
a4 = d1c7e0/(2√c8) + c2(c2 + d2) + c3(d1 + e1) + c4f0 + √c8 g0 = b4,
a3 = e1d2 + c2(d1 + e1) + c3f0 + c7g0/(2√c8) = b3,
a2 = d1e1 + c2f0 + h2 = b2,
a1 = h1 = b1,
a0 = h0 = b0.
This system of equations can be solved for a set of given variables bi, i = 1, 2, ..., 15, using variable substitution with the MATLAB Symbolic Math Toolbox using the following MATLAB code fragment (we used MATLAB R2020a in all the computations):

% MATLAB code fragment 4.1: solves coefficient c8 of
% the system of equations (30) for general coefficients bi
1 syms A c2 c3 c4 c5 c6 c7 c8 d1 d2 e0 e1 f0 g0 h2 h1 h0 I
2 syms b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 b10 b11 b12 b13 b14 b15 b16
3 c=[c2;c3;c4;c5;c6;c7;c8];
4 b=[b16;b15;b14;b13;b12;b11;b10;b9;b8;b7;b6;b5;b4;b3;b2;b1;b0];
5 y0=A^2*(sqrt(c8)*A^2+c7/(2*sqrt(c8))*A); % y02 from (29)
6 y1=sum(c.*A.^([2:8]'));
7 y2=(y1+d2*A^2+d1*A)*(y1+e0*y0+e1*A)+f0*y1+g0*y0+h2*A^2+h1*A+h0*I;
8 [cy2,a1]=coeffs(y2,A);
9 cy2=cy2.';
10 v=[cy2 b a1.'] % v shows the coefficients of each power of A
11 cy2=cy2(2:end)-b(2:end); % system of equations
12 c7s=solve(cy2(1),c7,'ReturnConditions',true); % c7s=f(c8,bi)
13 c7s.conditions % c8 ~= 0: condition for the existence of solutions
14 c7s=c7s.c7;
15 cy2=subs(cy2,c7,c7s);
16 c6s=solve(cy2(2), c6); % c6s depends on c8, bi
17 cy2=subs(cy2,c6,c6s);
18 c5s=solve(cy2(3), c5); % c5s depends on c8, bi
19 cy2=simplify(subs(cy2,c5,c5s));
20 symvar(cy2(4)) % cy2(4) depends on c8, c4, e0, bi
21 e0s=solve(cy2(4), e0);
22 cy2=simplify(subs(cy2,e0,e0s));
23 symvar(cy2(5)) % cy2(5) depends on c8, c3, c4, bi
24 c3s=solve(cy2(5), c3);
25 cy2=simplify(subs(cy2,c3,c3s));
26 symvar(cy2(6)) % cy2(6) depends only on c8, c2, d2, bi
27 d2s=solve(cy2(6), d2);
28 cy2=simplify(subs(cy2,d2,d2s));
29 symvar(cy2(7)) % cy2(7) depends only on c8, d1, e1, bi
30 d1s=solve(cy2(7), d1);
31 cy2=simplify(subs(cy2,d1,d1s));
32 symvar(cy2(8)) % cy2(8) depends only on c8, c4, f0, bi
33 f0s=solve(cy2(8), f0);
34 cy2=simplify(subs(cy2,f0,f0s));
35 symvar(cy2(9)) % cy2(9) depends only on c8, b7, b8, ..., b15
36 c8s=solve(cy2(9), c8)
Since cy2(9) from code fragment line 35 depends only on the coefficients bi, for i = 7, 8, ..., 15, and c8, the solutions for c8 are given by the zeros of equation cy2(9) with the condition given by MATLAB code fragment line 13, i.e., condition (26). The solve function gives 16 solutions for c8. They are the roots of a polynomial whose coefficients depend on the variables bi, for i = 7, 8, ..., 15.
Once the 16 solutions for c8 are obtained for concrete values of the coefficients bi, i = 0, 1, ..., 15, the remaining variables can be obtained with the following MATLAB code fragment:
% MATLAB code fragment 4.2: solves coefficient c2 of the
% system of equations (30) for general coefficients bi by
% using the solutions for coefficient c8 obtained with
% MATLAB code fragment 4.1
1 symvar(cy2(10)) % cy2(10) depends on c8, c2, c4, bi
2 c4s=solve(cy2(10), c4) % two solutions depending on c8, c2, bi
3 cy2=simplify(subs(cy2,c4,c4s(1))) % change c4s(1) to c4s(2) for more solutions
4 symvar(cy2(11)) % cy2(11) depends on c8, c2, e1, bi
5 e1s=solve(cy2(11), e1)
6 cy2=simplify(subs(cy2,e1,e1s))
7 symvar(cy2(12)) % cy2(12) depends on c8, c2, g0, bi
8 g0s=solve(cy2(12), g0,'ReturnConditions',true)
9 g0s.conditions % conditions for the existence of solutions:
  % 3*b15^2 ~= 8*b14*c8^2 &
  % 27*b15^6*c8^(45/2) + 576*b14^2*b15^2*c8^(53/2) ~=
  % 512*b14^3*c8^(57/2) + 216*b14*b15^4*c8^(49/2) &
  % c8 ~= 0
10 g0s=g0s.g0
11 cy2=simplify(subs(cy2,g0,g0s))
12 symvar(cy2(13)) % cy2(13) depends on c8, c2, bi
Since cy2(13) depends only on the coefficients bi, for i = 3, 4, ..., 15, c8, and c2, substituting the values obtained previously for c8 in cy2(13), the solutions for c2 are given by the zeros of equation cy2(13), with the conditions obtained when solving c7 and g0 (line 13 of MATLAB code fragment 4.1 and line 9 of MATLAB code fragment 4.2), i.e., conditions (26)–(28). Both code fragments are available at http://personales.upv.es/jorsasma/Software/coeffspolm15plus.m (accessed on 24 June 2021).
All of the coefficients c7, c6, ..., c3, d2, d1, e1, e0, f0, g0 can be obtained from c2, c8, and bi, i = 0, 1, ..., 15, and then the hi can be obtained from the three last equations of system (30) as

h2 = b2 − d1e1 − c2f0, (31)
h1 = b1, (32)
h0 = b0. (33)

Finally, using (20) and (21), coefficient a16 from (23) is given by (25).
Hence, for any values of the coefficients bi, i = 0, 1, ..., 15, of the polynomial (22), there exists at least one solution of system (30) giving a set of values of the coefficients of y22(A) from (20) and (21) so that (24) and (25) hold, provided conditions (26)–(28) are fulfilled.
Given certain coefficients bi, i = 1, 2, ..., 15, for P15(A) from (22), using MATLAB code fragments 4.1 and 4.2, one can typically get more than one solution of system (30). Moreover, if we take the negative sign in (29), another set of solutions fulfilling (24) can be obtained. For each of those solutions, there are also different nested solutions for evaluating (20), given by the solutions from [1] (Example 3.1).
For each of those solutions, coefficient a16 from y22(A) in (23) is given by (25). For the particular case of the matrix exponential Taylor approximation from [5] (p. 209), there were two real solutions of c8, giving

|c8² − 1/16!| · 16! ≈ 0.454, (34)
|c8² − 1/16!| · 16! ≈ 2.510. (35)

Therefore, we selected the first solution (34), since both solutions were stable according to the stability study from Section 2.2 (see [1], p. 243), but (34) had a lower error for a16 with respect to the corresponding Taylor coefficient 1/16!. Then, considering exact arithmetic, one gets that the matrix exponential approximation from y22(A) in evaluation Formulas (10)–(12) from [5] (p. 209) with the coefficients from [5] (Table 3) is more accurate than the exponential Taylor approximation of order 15. For that reason, the corresponding Taylor approximation order was denoted by m = 15+ in [5] (Section 4).
Recently, in [17], an evaluation formula of the type given in Proposition 1 was used to evaluate a Taylor polynomial approximation of degree 15+ of the hyperbolic tangent. However, in this case, all the solutions obtained were complex. We tried different configurations of the evaluation formulae giving degree 15+, but all of them gave complex solutions. Then, we proposed the similar evaluation Formula (11) from [17] (p. 6) with degree 14+ that did give real solutions. Similarly to (34), in the case of the hyperbolic tangent, the relative error of the coefficients ai, i = 15 and 16, was also lower than 1, concretely 0.38 and 0.85, respectively (see [17], p. 6). This method was compared to the Paterson–Stockmeyer method, being noticeably more efficient without affecting the accuracy (see [17], Section 3, for details). Proposition 1 allows us to evaluate polynomial approximations of degree 15 not only for the matrix exponential or the hyperbolic tangent but also for other matrix functions. If all the given solutions are complex, we can modify the formula to evaluate approximation formulae with a lower degree, such as 14+, to check whether they give real solutions.
Example 1. In [4] (Section 2), we showed that the solutions for the coefficients of the polynomial evaluation method similar to [5] (Section 3.2) for the matrix cosine Taylor approximation of order 2m = 30+ were not stable, giving poor accuracy results. Using Proposition 1, this example gives a stable solution for calculating a Taylor-based approximation of the matrix cosine with a combination of formula (21) with the Paterson–Stockmeyer method from (19). Setting k = p = s = 2 in (19) and yks = y22 from (21), one gets

z222(B) = y22(B)B² − B/2 + I = P17(B) + a18B^18 = ∑_{i=0}^{17} b_i B^i + a18 B^18, (36)

where B = A² and z222(B) is a Taylor-based approximation of the matrix cosine (3) of order 2m = 34+, i.e., bi = (−1)^i/(2i)! for i = 0, 1, ..., 17, and coefficient a18 is given by a18 = c8² (see (25)). MATLAB code fragment 4.1 was used for obtaining all the real solutions of c8. Then, MATLAB code fragment 4.2 was used with these solutions, taking solution 1 for coefficient c4 in line 3 of MATLAB code fragment 4.2. Then, we obtain the equation cy2(13) from the code fragment in line 12, depending on c2 and c8. This equation was solved for every real solution of c8 using the MATLAB Symbolic Math Toolbox with variable precision arithmetic. Finally, we obtained the nested solutions for computing (20) with (8) and (9) with q4 > 0 from (10).
The real solutions of system (30) rounded to IEEE double precision arithmetic explored in [4] (Section 2) gave errors of order 10⁻¹⁴, greater than the unit roundoff in IEEE double precision arithmetic u = 2⁻⁵³ ≈ 1.11 × 10⁻¹⁶. Using MATLAB code fragments 4.1 and 4.2, we checked that there is no solution with a lower error. Then, according to the stability check from Section 2.2, those solutions are unstable, and we checked in [4] that they gave poor accuracy results. However, using Proposition 1, for 2m = 34+, we could find two real solutions of system (30) giving a maximum error of order u. For those two solutions, a18 gave

|a18 − 1/36!| · 36! ≈ 0.394, (37)
|a18 − 1/36!| · 36! ≈ 16.591, (38)
respectively. Therefore, the solution (37) giving the lowest error was selected. Table 1 gives the corresponding coefficients in IEEE double precision arithmetic from (8) and (9) for computing (20) with three matrix products, and the rest of the needed coefficients for computing y22(B) from (21) with s = 2, given finally by

y02(B) = B²(q4B² + q3B), (39)

y12(B) = (y02(B) + r2B² + r1B)(y02(B) + s2B²) + s0y02(B) + t2B², (40)

y22(B) = (y12(B) + d2B² + d1B)(y12(B) + e0y02(B) + e1B) + f0y12(B) + g0y02(B) + h2B² + h1B + h0I. (41)
Using (39)–(41) with the coefficients from Table 1 and (36), a matrix cosine Taylor approximation of order 2m = 34+ can be computed in IEEE double precision arithmetic with a cost of six matrix products, i.e., B = A², B², three products for evaluating (39)–(41), and one more for evaluating (36). The maximum available and stable order given in [4] (Section 2) with six matrix products was 2m = 30. The coefficients from Table 1 were computed with variable precision arithmetic with precisions of 32 and 250 decimal digits to check their correctness, giving the same results.
Taking into account (3) and the selection of the solution in (37), in exact arithmetic, one gets

z222(x) = ∑_{i=0}^{17} (−1)^i x^{2i}/(2i)! + a18 x^36, (42)

where, using (39)–(41), one gets a18 = q4⁴.
Table 1. Coefficients of y02, y12, and y22 from (39)–(41) for computing the Taylor-based approximation z222(B) of order 2m = 34+ from (36) of the matrix cosine.

q4 = 3.571998478323090 × 10⁻¹¹    d1 = 2.645687940516643 × 10⁻³
q3 = 1.857982456862233 × 10⁻⁸     e1 = 1.049722718717408 × 10⁻¹
r2 = 3.278753597700932 × 10⁻⁵     e0 = 8.965376033761624 × 10⁻⁴
r1 = 1.148774768780758 × 10⁻²     f0 = 1.859420533601965 × 10⁰
s2 = 2.008741312156575 × 10⁻⁵     g0 = 1.493008139094410 × 10⁻¹
s0 = 1.737292932136998 × 10⁻¹     h2 = 1.570135323717639 × 10⁻⁴
t2 = 6.982819862335600 × 10⁻⁵     h1 = −1/6!
d2 = 5.259287265295055 × 10⁻⁵     h0 = 1/4!
To check if the new evaluation formulae are accurate, we compared the results of computing the matrix cosine with function cosm from [7] against a function using the coefficients from Table 1 in (39)–(41) and (36), with no scaling for simplicity. Since [7] used a relative backward error analysis, we used the values of Θ from [15] (Table 1) corresponding to the backward relative error analysis of the Taylor approximation of the matrix cosine, denoted by Eb. Then, if ||B|| = ||A²|| ≤ Θ, then ||Eb|| ≤ u for the corresponding Taylor approximations. In [15] (Table 1), the value of Θ for the Taylor approximation of order 16 was 9.97 and Θ20 = 10.18, showing two decimal digits. Then, for our test with order 2m = 34+, we used a set of 48 8×8 matrices from the Matrix Computation Toolbox [18] divided by random numbers to give ||B|| between 9 and 10. We compared the forward errors Ef of both functions:

Ef = ||cos(A) − f(A)||, (43)

where function f(A) was cosm and the function using z222(B). The “exact value” of cos(A) was computed using the method in [19]. The total cost of the new matrix cosine computation function z222, summing up the number of matrix products over all the test matrices, is denoted by Cost_z222. Taking into account (4), the cost for the cosm Padé function, summing up the number of matrix products and inversions over all the test matrices, is denoted by Cost_cosm. Then, the following cost comparison was obtained for that set of test matrices:

100 × (Cost_cosm − Cost_z222)/Cost_z222 = 40.78%,

i.e., the cost of z222 is 40.78% lower than the cost of cosm. Moreover, the results were more accurate in 76.60% of the matrices. Therefore, the new formulae are efficient and accurate.
3.2. Evaluation of Matrix Polynomial Approximations of Order 21
In this section, we generalize the results from [5] (Section 3.3) for evaluating polynomial approximations of order m = 21 with cost 5M. Note that, for that cost, from [1] (Table 8), the maximum available orders using the Paterson–Stockmeyer method and the evaluation Formulas (16) and (17) are 12 and 16, respectively. Applying a procedure similar to that in Section 3.1 to obtain the coefficients for evaluating a matrix polynomial approximation of order 21, a system of 22 equations with 22 unknown variables arises. This system can be reduced to three equations with three unknowns using variable substitution with the MATLAB Symbolic Math Toolbox, provided that two of the variables are not zero. The following proposition summarizes the results.
Proposition 2. Let y13(A) and y23(A) be

y13(A) = ∑_{i=2}^{12} c_i A^i, (44)

y23(A) = (y13(A) + d3A³ + d2A² + d1A)(y13(A) + e0y03(A) + e1A) + f0y13(A) + g0y03(A) + h3A³ + h2A² + h1A + h0I, (45)

and let P21(A) be a polynomial of degree 21 with coefficients bi:

P21(A) = ∑_{i=0}^{21} b_i A^i. (46)

Then,

y23(A) = ∑_{i=0}^{24} a_i A^i, (47)

where the coefficients ai = ai(c12, c11, ..., c2, d3, d2, d1, e1, e0, f0, g0, h3, h2, h1, h0), i = 0, 1, ..., 24, and the system of equations arising when equating

ai = bi, i = 0, 1, ..., 21, (48)

can be reduced to a system of three equations in the variables c12, c11, and c10, provided

c12 ≠ 0, e0 ≠ 0, (49)

and then the variables ai, i = 22, 23, and 24, are

a24 = c12²,
a23 = 2c11c12, (50)
a22 = c11² + 2c10c12.
Proof of Proposition 2. The proof of Proposition 2 is similar to the proof of Proposition 1. Analogously, if condition (18) is fulfilled with s = 3, i.e., c12 ≠ 0, then polynomial y13(A) can be evaluated using (16) and (17) with s = 3 and cost 4M, where y03 is given by (21) of [5] (Section 3.3), i.e.,

y03(A) = ±A³(√c12 A³ + c11/(2√c12) A² + (4c10c12 − c11²)/(8c12^(3/2)) A). (51)
If we apply (48), we obtain a system similar to (30). Using variable substitution with the MATLAB Symbolic Math Toolbox, the MATLAB code coeffspolm21plus.m (http://personales.upv.es/jorsasma/Software/coeffspolm21plus.m (accessed on 24 June 2021)), similar to MATLAB code fragments 4.1 and 4.2, is able to reduce the whole nonlinear system of 22 equations to a nonlinear system of three equations with three variables c10, c11, and c12. The MATLAB code coeffspolm21plus.m returns conditions (49) (see the actual code for details).
If there is at least one solution for c10, c11, and c12 fulfilling condition (49), all of the other coefficients can be obtained using the values of c10, c11, and c12. Then, y13(A) from (44) can be evaluated using (16) and (17), giving several possible solutions. Finally, the solutions are rounded to the required precision. Then, according to the stability study from Section 2.2 (see [1], p. 243), the solution giving the least error should be selected.
Similarly to (34) and (35), the degree of y23(A) from (45) is 24, but with the proposed method we can only set the polynomial approximation coefficients of (46) up to order m = 21. The coefficients ai of the powers A^i, i = 22, 23, and 24, are given by (50). The authors of [5] (Section 3.3) give one particular example of this method for calculating a matrix Taylor approximation of the exponential function, where, in exact arithmetic,

y23(A) = T21(A) + a22A^22 + a23A^23 + a24A^24, (52)

where T21 is the Taylor approximation of order m = 21 of the exponential function and

|a22 − 1/22!| · 22! ≈ 0.437, (53)
|a23 − 1/23!| · 23! ≈ 0.270, (54)
|a24 − 1/24!| · 24! ≈ 0.130, (55)

showing three decimal digits. Again, in exact arithmetic, the approximation y23(A) is more accurate than T21(A). Therefore, the order of that approximation was denoted as m = 21+ in [5] (Section 4). The experimental results from [5] showed that this method was more accurate and efficient than the Padé method from [6].
Recently, in [17], an evaluation formula similar to (45) was used to evaluate a Taylor polynomial approximation of the hyperbolic tangent. Similarly to (53), in the case of the hyperbolic tangent, the relative error of the coefficients ai, i = 22, 23, and 24, was also lower than 1, concretely 0.69, 0.69, and 0.70, respectively (see [17], p. 7). This method was compared to the Paterson–Stockmeyer method, being noticeably more efficient without affecting the accuracy (see [17], Section 3, for details).
Proposition 2 allows us to evaluate polynomial approximations of degree 21 not only for the matrix exponential or the hyperbolic tangent but also for other matrix functions. In the following example, we show an application for the evaluation of the Taylor approximation of the matrix logarithm.
Example 2. In this example, we give real coefficients for computing a Taylor-based approximation of the matrix logarithm of order m = 21+ in a stable manner, based on the previous results. Evaluating (44) using (16) and (17) with s = 3, and using (45), the following formulae can be used to compute the approximation of order m = 21+ of the principal logarithm log(B) for a square matrix B = I − A with no eigenvalues on ℝ⁻:

y03(A) = A³(c1A³ + c2A² + c3A), (56)

y13(A) = (y03(A) + c4A³ + c5A² + c6A)(y03(A) + c7A³ + c8A²) + c9y03(A) + c10A³ + c11A², (57)

y23(A) = (y13(A) + c12A³ + c13A² + c14A)(y13(A) + c15y03(A) + c16A) + c17y13(A) + c18y03(A) + c19A³ + c20A² + A, (58)

where the coefficients are numbered correlatively, and, using (1), we take

log(B) = log(I − A) = −∑_{i≥1} A^i/i ≈ −y23(A). (59)
The coefficients can be obtained by solving first the system of equations arising from (48) with bi = 1/i for i = 1, 2, ..., 21, and b0 = 0. We used the vpasolve function (https://es.mathworks.com/help/symbolic/vpasolve.html (accessed on 24 June 2021)) from the MATLAB Symbolic Math Toolbox to solve those equations with variable precision arithmetic. We used the Random option of vpasolve, which allows one to obtain different solutions for the coefficients, running it 100 times. The majority of the solutions were complex, but there were two real stable solutions. Then, we obtained the nested solutions for the coefficients of (16) and (17) with s = 3 for computing polynomial (44) with four matrix products (see [1], Section 3), giving also real and complex solutions.
Again, we selected the real stable solution given in Table 2. This solution avoids complex arithmetic if the matrix A is real. The relative errors of the coefficients of A^22, A^23, and A^24 of y23(A) with respect to the corresponding coefficients of the Taylor approximation of order 24 of the log(I − A) function are:

a22 = 3.205116205918952 × 10⁻²,  |a22 − 1/22| · 22 ≈ 0.295, (60)
a23 = 1.480540983455180 × 10⁻²,  |a23 − 1/23| · 23 ≈ 0.659, (61)
a24 = 3.754613237786792 × 10⁻³,  |a24 − 1/24| · 24 ≈ 0.910, (62)

where a22, a23, and a24 are rounded to double precision arithmetic. Then, considering exact arithmetic, one gets

y23(A) = ∑_{i=1}^{21} A^i/i + a22A^22 + a23A^23 + a24A^24, (63)

which is more accurate than the corresponding Taylor approximation of log(B) of order m = 21. Therefore, similarly to [5] (Section 4), the approximation order of (63) is denoted by m = 21+.
Table 2. Coefficients of y03, y13, and y23 from (56)–(58) for computing a Taylor-based approximation of the function log(B) = log(I − A) of order m = 21+.

c1 = 2.475376717210241 × 10⁻¹     c11 = 1.035631527011582 × 10⁻¹
c2 = 2.440262449961976 × 10⁻¹     c12 = 3.416046999733390 × 10⁻¹
c3 = 1.674278428631194 × 10⁻¹     c13 = 4.544910328432021 × 10⁻²
c4 = 9.742340743664729 × 10⁻²     c14 = 2.741820014945195 × 10⁻¹
c5 = 4.744919764579607 × 10⁻²     c15 = 1.601466804001392 × 10⁰
c6 = 5.071515307996127 × 10⁻¹     c16 = 1.681067607322385 × 10⁻¹
c7 = 2.025389951302878 × 10⁻¹     c17 = 7.526271076306975 × 10⁻¹
c8 = 4.809463272682823 × 10⁻²     c18 = 4.282509402345739 × 10⁻²
c9 = 6.574533191427105 × 10⁻¹     c19 = 1.462562712251202 × 10⁻¹
c10 = 3.236650728737168 × 10⁻¹    c20 = 5.318525879522635 × 10⁻¹
The θ values such that the relative backward errors of the Padé approximations are lower than u are shown in [11] (Table 2.1). The corresponding θ value for the Taylor approximation of log(I − A) of order m = 21+, denoted by θ21+, can be computed similarly (see [11] for details), giving θ21+ = 0.211084493690929, where the value is rounded to IEEE double precision arithmetic.
We compared the results of using (56)–(58) with the coefficient values from Table 2 with the results given by function logm_iss_full from [20]. For that comparison, we used a matrix test set of 43 8×8 matrices of the Matrix Computation Toolbox [18]. We reduced their norms so that they are random with a uniform distribution in [0.2, θ21+] in order to compare the Padé approximations of logm_iss_full with the Taylor-based evaluation Formulas (56)–(58), using no inverse scaling in either of the approximations (see [11]).
The “exact” matrix logarithm was computed using the method from [19]. The error of the implementation using Formula (58) was lower than that of logm_iss_full in 100% of the matrices, with a 19.61% lower relative cost in flops. Therefore, evaluation Formulas (56)–(58) are efficient and accurate for a future Taylor-based implementation for computing the matrix logarithm.
3.3. Evaluation of Matrix Polynomials of Degree m = 6s
The following proposition generalizes the particular cases of the evaluation of the matrix exponential Taylor approximation with degrees m = 24 and 30 from [5] (Section 3.4) to the evaluation of general matrix polynomials of degree m = 6s, s = 2, 3, ...
Proposition 3. Let y0s(A), y1s(A), and y2s(A) be the polynomials

y_{0s}(A) = A^s ∑_{i=1}^{s} e_{s+i} A^i, (64)

y_{1s}(A) = ∑_{i=1}^{4s} c_i A^i, (65)

y_{2s}(A) = y_{1s}(A)(y_{0s}(A) + ∑_{i=1}^{s} e_i A^i) + ∑_{i=0}^{s} f_i A^i, (66)

and let Pm(A) be the polynomial

Pm(A) = ∑_{i=0}^{6s} b_i A^i. (67)

Then,

y_{2s}(A) = ∑_{i=0}^{6s} a_i A^i, (68)

where the coefficients ai = ai(ci, ej, fk), i = 1, 2, ..., 4s, j = 1, 2, ..., 2s, k = 0, 1, ..., s, and if

b_{6s} ≠ 0, (69)

then, if we equate y_{2s}(A) = Pm(A), i.e.,

ai = bi, i = 0, 1, ..., 6s, (70)

then the following relationships between the coefficients of the polynomials y_{0s}(A), y_{1s}(A), y_{2s}(A), and Pm(A) are fulfilled:

a.
c_{4s−k} = c_{4s−k}(b_{6s}, b_{6s−1}, ..., b_{6s−k}), for k = 0, 1, ..., s − 1,
e_{2s−k} = e_{2s−k}(b_{6s}, b_{6s−1}, ..., b_{6s−k}), for k = 0, 1, ..., s − 1. (71)

b.
c_{3s−k} = c_{3s−k}(b_{6s}, b_{6s−1}, ..., b_{5s−k}, e_s, ..., e_{s−k}), k = 0, ..., s − 1. (72)

c.
c_{2s−k} = c_{2s−k}(b_{6s}, ..., b_{4s−k}, e_s, ..., e_1), k = 0, ..., s − 1. (73)

d.
c_{s−k} = c_{s−k}(b_{6s}, ..., b_{3s−k}, e_s, ..., e_1), k = 0, ..., s − 1. (74)
Proof of Proposition 3. Polynomial y_{1s}(A) from (65) can be computed using the general method from [1] (Section 3), reproduced here as (16) and (17), provided condition (18) is fulfilled, i.e., c_{4s} ≠ 0.
a. In the following, we show that (71) holds. Taking (16) and (17) into account, one gets

y_{1s}(A) = y_{0s}²(A) + q(A), (75)

where q(x) is a polynomial of degree lower than 3s + 1. Equating the terms of degree 4s in (75), we obtain e_{2s} = ±√c_{4s}. On the other hand, equating the terms of degree 6s in (66) and taking condition (69) into account, we obtain

c_{4s}e_{2s} = b_{6s},
c_{4s}(±√c_{4s}) = b_{6s},
c_{4s} = ∛(b_{6s}²) ≠ 0, (76)
e_{2s} = ±∛b_{6s} ≠ 0. (77)

Since condition (69) is fulfilled, by (76) one gets that condition (18) is also fulfilled. Then, polynomial y_{1s}(A) from (65) can be effectively computed using (16) and (17), and, by (76) and (77), one gets that c_{4s} and e_{2s} depend on b_{6s}, i.e.,

c_{4s} = c_{4s}(b_{6s}),  e_{2s} = e_{2s}(b_{6s}). (78)

Equating the terms of degree 4s − 1 in (75), we obtain c_{4s−1} = 2e_{2s}e_{2s−1}. Therefore,

e_{2s−1} = c_{4s−1}/(2e_{2s}). (79)

Equating the terms of degree 4s − 2 in (75), we obtain c_{4s−2} = e_{2s}e_{2s−2} + e²_{2s−1} + e_{2s−2}e_{2s}; then

e_{2s−2} = (c_{4s−2} − e²_{2s−1})/(2e_{2s}). (80)

Equating the terms of degree 4s − 3 in (75), we obtain c_{4s−3} = e_{2s}e_{2s−3} + e_{2s−1}e_{2s−2} + e_{2s−2}e_{2s−1} + e_{2s−3}e_{2s}; then

e_{2s−3} = (c_{4s−3} − (e_{2s−1}e_{2s−2} + e_{2s−2}e_{2s−1}))/(2e_{2s}).

Equating the terms of degree 4s − 4 in (75), we obtain c_{4s−4} = e_{2s}e_{2s−4} + e_{2s−1}e_{2s−3} + e_{2s−2}e_{2s−2} + e_{2s−3}e_{2s−1} + e_{2s−4}e_{2s}; then

e_{2s−4} = (c_{4s−4} − ∑_{i=1}^{3} e_{2s−i}e_{2s+i−4})/(2e_{2s}).

Proceeding in an analogous way with e_{2s−k} for k = 5, 6, ..., s − 1, we obtain

e_{2s−k} = (c_{4s−k} − ∑_{i=1}^{k−1} e_{2s−i}e_{2s+i−k})/(2e_{2s}),  k = 1, 2, ..., s − 1. (81)

On the other hand, equating the terms of degree 6s − 1 in (66) and taking (79) into account, we obtain

c_{4s}e_{2s−1} + c_{4s−1}e_{2s} = c_{4s−1}(c_{4s}/(2e_{2s}) + e_{2s}) = b_{6s−1}.

Since

c_{4s}/(2e_{2s}) + e_{2s} = (c_{4s} + 2e²_{2s})/(2e_{2s}) = (∛(b_{6s}²) + 2∛(b_{6s}²))/(2∛b_{6s}) = (3/2)∛b_{6s} ≠ 0, (82)

then

c_{4s−1} = 2b_{6s−1}/(3∛b_{6s}). (83)

Taking into account (77), (79), and (83), we obtain that c_{4s−1} and e_{2s−1} depend on b_{6s} and b_{6s−1}, i.e.,

c_{4s−1} = c_{4s−1}(b_{6s}, b_{6s−1}),  e_{2s−1} = e_{2s−1}(b_{6s}, b_{6s−1}). (84)

Equating the terms of degree 6s − 2 in (66), we obtain

c_{4s}e_{2s−2} + c_{4s−1}e_{2s−1} + c_{4s−2}e_{2s} = b_{6s−2},

and, taking (80) and (82) into account, it follows that

c_{4s−2} = (b_{6s−2} − c_{4s−1}e_{2s−1} + e²_{2s−1}e_{2s}/2)/((3/2)∛b_{6s}). (85)

On the other hand, from (77), (84), and (85), one gets that c_{4s−2} and e_{2s−2} can be computed explicitly depending on b_{6s}, b_{6s−1}, and b_{6s−2}, i.e.,

c_{4s−2} = c_{4s−2}(b_{6s}, b_{6s−1}, b_{6s−2}),  e_{2s−2} = e_{2s−2}(b_{6s}, b_{6s−1}, b_{6s−2}). (86)

Proceeding similarly when equating the terms of degrees 6s − 3, 6s − 4, ..., 5s + 1 in (66), one gets (71).
b. In the following, we show that (72) holds. Equating the terms of degree 5s in (66) and taking condition e_{2s} ≠ 0 from (77) into account, we obtain

c_{4s}e_s + c_{4s−1}e_{s+1} + ... + c_{3s+1}e_{2s−1} + c_{3s}e_{2s} = b_{5s},
c_{3s} = (b_{5s} − (c_{4s}e_s + c_{4s−1}e_{s+1} + ... + c_{3s+1}e_{2s−1}))/e_{2s}.

Hence, taking (71) into account, it follows that

c_{3s} = c_{3s}(b_{6s}, b_{6s−1}, ..., b_{5s+1}, b_{5s}, e_s). (87)

Equating the terms of degree 5s − 1 in (66) and taking condition e_{2s} ≠ 0 from (77) into account, we obtain

c_{4s}e_{s−1} + c_{4s−1}e_s + ... + c_{3s}e_{2s−1} + c_{3s−1}e_{2s} = b_{5s−1},
c_{3s−1} = (b_{5s−1} − (c_{4s}e_{s−1} + c_{4s−1}e_s + ... + c_{3s}e_{2s−1}))/e_{2s}.

Hence, using (87), one gets

c_{3s−1} = c_{3s−1}(b_{6s}, b_{6s−1}, ..., b_{5s}, b_{5s−1}, e_s, e_{s−1}). (88)

Proceeding similarly, equating the terms of degrees 5s − 2, 5s − 3, ..., 4s + 1 in (66), one gets (72).
c. In the following, we show that (73) holds. Equating the terms of degree 4s in (66) and taking condition e_{2s} ≠ 0 from (77) into account, it follows that

c_{4s−1}e_1 + c_{4s−2}e_2 + ... + c_{2s+1}e_{2s−1} + c_{2s}e_{2s} = b_{4s},
c_{2s} = (b_{4s} − (c_{4s−1}e_1 + c_{4s−2}e_2 + ... + c_{2s+1}e_{2s−1}))/e_{2s}.

Taking (71) and (72) into account, we obtain

c_{2s} = c_{2s}(b_{6s}, ..., b_{4s}, e_s, ..., e_1).

Equating the terms of degree 4s − 1 in (66) and using condition e_{2s} ≠ 0, one gets

c_{4s−2}e_1 + c_{4s−3}e_2 + ... + c_{2s}e_{2s−1} + c_{2s−1}e_{2s} = b_{4s−1},
c_{2s−1} = (b_{4s−1} − (c_{4s−2}e_1 + c_{4s−3}e_2 + ... + c_{2s}e_{2s−1}))/e_{2s}.

Taking (71) and (72) into account, we obtain

c_{2s−1} = c_{2s−1}(b_{6s}, ..., b_{4s−1}, e_s, ..., e_1).

Proceeding similarly, equating the terms of degrees 4s − 2, 4s − 3, ..., 3s + 1 in (66) and taking (71), (72), and condition e_{2s} ≠ 0 into account, one gets (73).
d. In the following, we show that (74) holds. Equating the terms of degree 3s in (66) and taking condition e_{2s} ≠ 0 into account, it follows that

c_{3s−1}e_1 + c_{3s−2}e_2 + ... + c_{s+1}e_{2s−1} + c_se_{2s} = b_{3s},
c_s = (b_{3s} − (c_{3s−1}e_1 + c_{3s−2}e_2 + ... + c_{s+1}e_{2s−1}))/e_{2s}.

Hence, from (71)–(73), we obtain

c_s = c_s(b_{6s}, ..., b_{3s}, e_s, ..., e_1).

Equating the terms of degree 3s − 1 in (66) and using condition e_{2s} ≠ 0, we obtain

c_{3s−2}e_1 + c_{3s−3}e_2 + ... + c_se_{2s−1} + c_{s−1}e_{2s} = b_{3s−1},
c_{s−1} = (b_{3s−1} − (c_{3s−2}e_1 + c_{3s−3}e_2 + ... + c_se_{2s−1}))/e_{2s}.

Hence, from (71)–(73), we obtain

c_{s−1} = c_{s−1}(b_{6s}, ..., b_{3s−1}, e_s, ..., e_1).

Proceeding similarly, equating the terms of degrees 3s − 2, 3s − 3, ..., 2s + 1 in (66), one gets (74).
Corollary 1. If condition (69) holds, then the system of 6s + 1 equations with 7s + 1 variables arising from (70) can be reduced, using variable substitution, to a system of s equations with s variables, and, if there exists at least one solution for that system, then all the coefficients from (64)–(66) can be calculated using the solution of the system.

Proof of Corollary 1. If we equate the terms of degree 2s, 2s − 1, ..., s + 1 in (66), we obtain the following system of equations:

c_{2s−1}e_1 + c_{2s−2}e_2 + ... + c_2e_{2s−2} + c_1e_{2s−1} = b_{2s},
c_{2s−2}e_1 + c_{2s−3}e_2 + ... + c_2e_{2s−3} + c_1e_{2s−2} = b_{2s−1},
...
c_se_1 + c_{s−1}e_2 + ... + c_2e_{s−1} + c_1e_s = b_{s+1}. (89)

Taking (71), (73), and (74) into account, it follows that system (89) can be written as a system of s equations with a set of s unknown variables {e_1, e_2, ..., e_s}, where, in general, (89) is a nonlinear system, since the c_k coefficients depend on the e_k coefficients.
Equating the terms of degrees s, s − 1, ..., 0 in (66), one gets

f_s = b_s − c_{s−1}e_1 − c_{s−2}e_2 − ... − c_1e_{s−1},
f_{s−1} = b_{s−1} − c_{s−2}e_1 − c_{s−3}e_2 − ... − c_1e_{s−2},
...
f_2 = b_2 − c_1e_1,
f_1 = b_1,
f_0 = b_0. (90)

Using (71) from Proposition 3, one gets that the values c_{4s−k} and e_{2s−k}, for k = 0, ..., s − 1, can be calculated explicitly depending on the polynomial coefficients b_i for i = 6s, 6s − 1, ..., 5s + 1. If there exists at least one solution of system (89), then the values c_{3s−k}, c_{2s−k}, and c_{s−k} can be calculated for k = 0, ..., s − 1 (see (72)–(74)), and the coefficients f_{s−k} can be calculated for k = 0, ..., s, using (90), allowing one to obtain all the coefficients from (64)–(66).
Using [1] (Table 6), in Table 3 we present the maximum available order for a cost C(M) in the following cases:
- The Paterson–Stockmeyer evaluation formula.
- z_{kps} from (19) with k = 1, denoting the combination of (17) with the Paterson–Stockmeyer formula proposed in [1] (Section 3.1).
- z_{kps} from (19) with k = 2, denoting the combination of (66) with the Paterson–Stockmeyer formula, whenever a solution for the coefficients of z_{2ps} exists.
Table 3. Maximum available approximation order for a cost C using the Paterson–Stockmeyer method, denoted by dPS; maximum order using z1ps from (19), combining (16) and (17) with the Paterson–Stockmeyer method, denoted by dz1s; and maximum order using z2ps from (19), combining (66) with the Paterson–Stockmeyer method, denoted by dz2s, whenever a solution for the coefficients of z2ps exists. The parameters s and p for z2ps(x) are such that s is minimum to obtain the required order, giving a system (89) of s equations with minimum size.

C(M)    3    4    5    6    7    8    9    10   11   12   13
dPS     6    9    12   16   20   25   30   36   42   49   56
dz1s    8    12   16   20   25   30   36   42   49   56   64
dz2s    -    12   18   24   30   36   42   49   56   64   72
s_z2s   -    2    3    4    5    6    6    7    7    8    8
p_z2s   -    0    0    0    0    0    6    7    14   16   24
Table 3 also shows the values of p and s for z2ps(A) such that s is minimum to obtain the required order, giving the minimum size of the system (89) to solve, i.e., s equations with s unknown variables. Note that it makes no sense to use (66) for s = 1 and cost C = 3M, since the order obtained is m = 6s = 6, and for that cost the Paterson–Stockmeyer method obtains the same order. Table 3 shows that evaluation formula z2ps obtains a greater order than z1ps for dz1s > 12. Concretely, for s_z2s ≥ 5, where the available orders with z2ps are dz2s = 30, 36, 42, ..., z2ps allows increments of 10, 11, 12, ... of the available order with respect to using the Paterson–Stockmeyer method, and increments of 5, 6, 6, ... with respect to using z1ps.
In [5], real stable solutions were found for the coefficients of (64)–(66) for the exponential Taylor approximation with degrees 6s with s = 4 and 5, i.e., 24 and 30. The following example deals with the matrix logarithm Taylor approximation.
Example 3. In this example, we provide real coefficients for calculating the Taylor approximation of the principal matrix logarithm log(B) of order m = 6s = 30, s = 5, in a stable manner, based on the results of Proposition 3 and Corollary 1, with the following expressions:

y05(A) = A⁵(c1A⁵ + c2A⁴ + c3A³ + c4A² + c5A), (91)

y15(A) = (y05(A) + c6A⁵ + c7A⁴ + c8A³ + c9A² + c10A) × (y05(A) + c11A⁵ + c12A⁴ + c13A³ + c14A²) + c15y05(A) + c16A⁵ + c17A⁴ + c18A³ + c19A² + c20A, (92)

y25(A) = y15(A)(y05(A) + c21A⁵ + c22A⁴ + c23A³ + c24A² + c25A) + c26A⁵ + c27A⁴ + c28A³ + c29A² + c30A, (93)

where the coefficients were numbered correlatively. The coefficients ci, i = 1, 2, ..., 30, can be obtained following the procedure from Section 3.3, reducing the whole system of 30 equations with 30 unknown variables to the system (89) of s = 5 equations with s unknowns ei, i = 1, 2, ..., 5, corresponding in (93) to e1 = c25, e2 = c24, ..., e5 = c21. Once this was done, we checked that e1 and e2 could be easily solved as functions of e3, e4, and e5, reducing the system to a system of three equations with three unknown variables. To obtain a real solution of the three coefficients, we used the MATLAB Symbolic Math Toolbox function vpasolve, giving a range [−10, 10] for the solutions of the three variables and using 32 decimal digits. The results of the coefficients from (91)–(93), rounded to IEEE double precision arithmetic, are given in Table 4.
Note that, using the evaluation Formulas (91)–(93), the Taylor approximation y25(A) of order m = 30 can be computed with a cost of 7M. For the same order, the cost of the Paterson–Stockmeyer method is 9M, and using z1ps from (19) the cost is 8M (see Table 3). Similarly to [11], we computed the value such that the relative backward error is lower than u for the Taylor approximation of log(I − A) of order m = 30, giving θ30 = 0.329365534847136.
Similarly to Example 2, to check whether y25(A) is competitive, we prepared a new matrix test set with 50 8×8 matrices of the Matrix Computation Toolbox [18], reducing their norms so that they are random with a uniform distribution in [0.3, θ30], and the inverse scaling algorithm is not used in either the Padé or the Taylor algorithms. Then, we compared the results of using (91)–(93) with the results given by function logm_iss_full from [20] for the previous matrix set, computing the “exact” values of the matrix logarithm in the same way. The error of using the evaluation Formulas (91)–(93) was lower than that of logm_iss_full in 97.62% of the matrices, with a 42.40% lower relative cost in flops, being competitive in efficiency and accuracy for future implementations for computing the matrix logarithm.
Table 4. Coefficients of y05, y15, y25 from (91)–(93) for computing the Taylor approximation of log(B) = log(I − A) = y25(A) of order m = 30.

c1 = 3.218297948685432 × 10⁻¹     c16 = 2.231079274704953 × 10⁻¹
c2 = 1.109757913339804 × 10⁻¹     c17 = 3.891001336083639 × 10⁻¹
c3 = 7.667169819995447 × 10⁻²     c18 = 6.539646241763075 × 10⁻¹
c4 = 6.192062222365700 × 10⁻²     c19 = 8.543283349051067 × 10⁻¹
c5 = 5.369406358130299 × 10⁻²     c20 = 1.642222074981266 × 10⁻²
c6 = 2.156719633283115 × 10⁻¹     c21 = 6.179507508449100 × 10⁻²
c7 = 2.827270631646985 × 10⁻²     c22 = 3.176715034213954 × 10⁻²
c8 = 1.299375958233227 × 10⁻¹     c23 = 8.655952402393143 × 10⁻²
c9 = 3.345609833413695 × 10⁻¹     c24 = 3.035900161106295 × 10⁻¹
c10 = 8.193390302418316 × 10⁻¹    c25 = 9.404049154527467 × 10⁻¹
c11 = 1.318571680058333 × 10⁻¹    c26 = 2.182842624594848 × 10⁻¹
c12 = 1.318536866523954 × 10⁻¹    c27 = 5.036471128390267 × 10⁻¹
c13 = 1.718006767617093 × 10⁻¹    c28 = 4.650956099599815 × 10⁻¹
c14 = 1.548174815648151 × 10⁻¹    c29 = 5.154435371157740 × 10⁻¹
c15 = 2.139947460365092 × 10⁻¹    c30 = 1
Note that using the evaluation formulae from Sections 3.1 and 3.2, with costs 4M and 5M, one can get orders of approximation 15+ and 21+, respectively, whereas using z2ps from (19), which combines (66) with the Paterson–Stockmeyer method, the orders that can be obtained are lower, i.e., 12 and 18, respectively (see Table 3). Note that for the approximation 15+, where s = 2 (see Section 3.1), one gets order 15+ = (6s + 3)+ and the total degree of the polynomial obtained is 8s = 16. For the approximation 21+, where s = 3, one gets order 21+ = (6s + 3)+ and the total degree of the polynomial obtained is 8s = 24. The next step in our research is to extend the evaluation formulae from Propositions 1 and 2 to evaluate polynomial approximations of order (6s + 3)+ of the type
y_{0s}(A) = A^s \sum_{i=1}^{s} c_i A^i,                                                           (94)

y_{1s}(A) = \sum_{i=s+1}^{4s} a_i A^i
          = \Big( y_{0s}(A) + \sum_{i=1}^{s} d_i A^i \Big) \Big( y_{0s}(A) + \sum_{i=2}^{s} e_i A^i \Big)
            + f_0 y_{0s}(A) + \sum_{i=3}^{s} f_i A^i,                                             (95)

y_{2s}(A) = \Big( y_{1s}(A) + \sum_{i=1}^{s} g_i A^i \Big) \Big( y_{1s}(A) + h_0 y_{0s}(A) + \sum_{i=1}^{s} h_i A^i \Big)
            + j_0 y_{1s}(A) + k_0 y_{0s}(A) + \sum_{i=0}^{s} l_i A^i.                             (96)
Those formulae correspond to a particular case of Formulas (62)–(65) of [1] (Prop. 2) where k = 2. It is easy to show that the degree of y2s(A) is 8s and that the total number of coefficients of y2s is 6s + 4, i.e., 3s coefficients ai, s coefficients gi, s coefficients hi, s + 1 coefficients li, and the coefficients f0, j0 and k0. Using vpasolve in a similar way as in Example 2, we could find solutions for the coefficients of (94)–(96) and (19) so that y2s(A) and z2ps allow evaluating matrix logarithm Taylor-based approximations of orders from 15+ up to 75+. Similarly, we could also find the coefficients of Formulas (94)–(96) to evaluate matrix hyperbolic tangent Taylor approximations of orders higher than 21. Our next research step is then to show that the evaluation Formulas (94)–(96) and their combination with the Paterson–Stockmeyer method from (19) can be used for general polynomial approximations of matrix functions.
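To make the structure of (94)–(96) concrete, the sketch below evaluates y2s(A) for s = 2 (order 15+), assuming the coefficient vectors c, d, e, g, h, l and the scalars f0, h0, j0, k0 are already known (e.g., obtained with vpasolve); all variable names are illustrative. Each line marked (*) costs one matrix product, giving s + 2 = 4 products in total.

% Evaluation of y2s(A) from (94)-(96) for s = 2; coefficients are assumed
% given. MATLAB indices are shifted where the formulas index from 0
% (e.g., l(1) stores l_0). Lines marked (*) each cost one matrix product.
I  = eye(size(A));
A2 = A*A;                                      % (*) powers A^2, ..., A^s
y0 = A2*(c(1)*A + c(2)*A2);                    % (*) Equation (94)
y1 = (y0 + d(1)*A + d(2)*A2) ...
   * (y0 + e(2)*A2) + f0*y0;                   % (*) Equation (95); the f_i sum starts at i = 3, empty for s = 2
y2 = (y1 + g(1)*A + g(2)*A2) ...
   * (y1 + h0*y0 + h(1)*A + h(2)*A2) ...
   + j0*y1 + k0*y0 + l(1)*I + l(2)*A + l(3)*A2;    % (*) Equation (96)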
4. Conclusions
In this paper, we extend the family of methods for evaluating matrix polynomials from [1], obtaining general solutions for new cases of the general matrix polynomial evaluation Formulas (62)–(65) from Proposition 2 of [1] (Section 5). These cases allow computing matrix polynomial approximations of orders 15 and 21 with a cost of 4M and 5M, respectively, whenever a stable solution for the coefficients exists. Moreover, a general method for computing matrix polynomials of order m = 6s, for s = 3, 4, . . ., more efficiently than the methods provided in [1] was given. Combining this method with the Paterson–Stockmeyer method, polynomials of degree greater than 30 can be evaluated with two matrix products less than using the Paterson–Stockmeyer method alone, as shown in Table 3.
Examples for evaluating Taylor approximations of the matrix cosine and the matrix logarithm were given. The accuracy and efficiency of the proposed evaluation formulae were compared with those of state-of-the-art Padé algorithms, and the new formulae proved competitive for future implementations of both functions.
Future work will deal with the generalization of more efficient evaluation formulae based on the evaluation Formulas (62)–(65) from Proposition 2 of [1] (Section 5), their combination with the Paterson–Stockmeyer method from (19), and, in general, evaluation formulae based on products of matrix polynomials.
Author Contributions: Conceptualization, J.S.; methodology, J.S. and J.I.; software, J.S.; validation, J.S. and J.I.; formal analysis, J.S. and J.I.; investigation, J.S. and J.I.; resources, J.S. and J.I.; writing—original draft preparation, J.S. and J.I.; writing—review and editing, J.S. and J.I. Both authors have read and agreed to the published version of the manuscript.
Funding:
This research was partially funded by the European Regional Development Fund (ERDF)
and the Spanish Ministerio de Economía y Competitividad grant TIN2017-89314-P, and by the
Programa de Apoyo a la Investigación y Desarrollo 2018 of the Universitat Politècnica de València
grant PAID-06-18-SP20180016.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Sastre, J. Efficient evaluation of matrix polynomials. Linear Algebra Appl. 2018, 539, 229–250. [CrossRef]
2. Paterson, M.S.; Stockmeyer, L.J. On the number of nonscalar multiplications necessary to evaluate polynomials. SIAM J. Comput. 1973, 2, 60–66. [CrossRef]
3. Higham, N.J. Functions of Matrices: Theory and Computation; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2008.
4. Sastre, J.; Ibáñez, J.; Alonso, P.; Peinado, J.; Defez, E. Fast Taylor polynomial evaluation for the computation of the matrix cosine. J. Comput. Appl. Math. 2019, 354, 641–650. [CrossRef]
5. Sastre, J.; Ibáñez, J.; Defez, E. Boosting the computation of the matrix exponential. Appl. Math. Comput. 2019, 340, 206–220.
6. Al-Mohy, A.H.; Higham, N.J. A new scaling and squaring algorithm for the matrix exponential. SIAM J. Matrix Anal. Appl. 2009, 31, 970–989. [CrossRef]
7. Al-Mohy, A.H.; Higham, N.J.; Relton, S.D. New algorithms for computing the matrix sine and cosine separately or simultaneously. SIAM J. Sci. Comput. 2015, 37, A456–A487. [CrossRef]
8. Bader, P.; Blanes, S.; Casas, F. Computing the matrix exponential with an optimized Taylor polynomial approximation. Mathematics 2019, 7, 1174. [CrossRef]
9. Bader, P.; Blanes, S.; Casas, F. An improved algorithm to compute the exponential of a matrix. arXiv 2017, arXiv:1710.10989.
10. Sastre, J. On the Polynomial Approximation of Matrix Functions. Available online: http://personales.upv.es/~jorsasma/AMC-S-16-00951.pdf (accessed on 20 April 2020).
11. Al-Mohy, A.H.; Higham, N.J. Improved inverse scaling and squaring algorithms for the matrix logarithm. SIAM J. Sci. Comput. 2012, 34, C153–C169. [CrossRef]
12. Moler, C.B.; Van Loan, C. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Rev. 2003, 45, 3–49. [CrossRef]
13. Blackford, S.; Dongarra, J. Installation Guide for LAPACK, LAPACK Working Note 41. Available online: http://www.netlib.org/lapack/lawnspdf/lawn41.pdf (accessed on 20 April 2020).
14. Sastre, J. Efficient mixed rational and polynomial approximation of matrix functions. Appl. Math. Comput. 2012, 218, 11938–11946. [CrossRef]
15. Sastre, J.; Ibáñez, J.; Alonso, P.; Peinado, J.; Defez, E. Two algorithms for computing the matrix cosine function. Appl. Math. Comput. 2017, 312, 66–77. [CrossRef]
16. Fasi, M. Optimality of the Paterson–Stockmeyer method for evaluating matrix polynomials and rational matrix functions. Linear Algebra Appl. 2019, 574, 182–200. [CrossRef]
17. Ibáñez, J.; Alonso, J.M.; Sastre, J.; Defez, E.; Alonso-Jordá, P. Advances in the approximation of the matrix hyperbolic tangent. Mathematics 2021, 9, 1219. [CrossRef]
18. Higham, N.J. The Matrix Computation Toolbox. Available online: http://www.ma.man.ac.uk/~higham/mctoolbox (accessed on 18 April 2020).
19. Davies, E.B. Approximate diagonalization. SIAM J. Matrix Anal. Appl. 2007, 29, 1051–1064. [CrossRef]
20. Higham, N. Matrix Logarithm. 2020. Available online: https://www.mathworks.com/matlabcentral/fileexchange/33393-matrix-logarithm (accessed on 18 April 2020).