Two algorithms for computing the matrix cosine function
Jorge Sastre, Javier Ibáñez, Pedro Alonso, Jesús Peinado, Emilio Defez
Instituto de Telecomunicaciones y Aplicaciones Multimedia.
Instituto de Instrumentación para Imagen Molecular.
Dept. of Information Systems and Computation.
Instituto de Matemática Multidisciplinar.
Universitat Politècnica de València, Camino de Vera s/n, 46022, Valencia, España.
jsastrem@upv.es, jjibanez@dsic.upv.es, palonso@dsic.upv.es, jpeinado@dsic.upv.es,
edefez@imm.upv.es
Abstract
The computation of matrix trigonometric functions has received remarkable attention in the last decades due to its usefulness in the solution of systems of second order linear differential equations. Several state-of-the-art algorithms have been provided recently for computing these matrix functions. In this work we present two efficient algorithms based on Taylor series with forward and backward error analysis for computing the matrix cosine. A MATLAB implementation of the algorithms is compared to state-of-the-art algorithms, with excellent performance in both accuracy and cost.
Keywords: matrix cosine, scaling and recovering method, Taylor series,
forward error analysis, backward error analysis, MATLAB.
1. Introduction
Many engineering processes are described by second order differential equations, whose solution is given in terms of the trigonometric matrix functions sine and cosine. Examples arise in the spatial semi-discretization of the wave equation or in mechanical systems without damping, where their solutions can be expressed in terms of integrals involving the matrix sine and cosine [1, 2]. Several state-of-the-art algorithms have been provided recently for computing these matrix functions using polynomial and rational approximations with scaling and recovering techniques [3, 4, 5, 6]. In order to reduce computational costs, the Paterson-Stockmeyer method [7] is used to evaluate the matrix polynomials arising in these approximations.
(e-mail: jjibanez@dsic.upv.es. This work has been supported by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF) grant TIN2014-59294-P.)
In the Taylor algorithm proposed in [4] we used sharp absolute forward error bounds. In the Taylor algorithm proposed in [6] we improved the previous algorithm using relative error bounds based on backward error bounds of the matrix exponentials involved in cos(A). Those error bounds do not guarantee that the cosine backward error bound in exact arithmetic is less than the unit roundoff in double precision arithmetic [6, Sec. 2]. However, according to the tests, that algorithm improved the accuracy with respect to the previous Taylor algorithm at the expense of some increase in cost (measured in flops). The algorithm proposed in [6] was also superior in both accuracy and cost to the version of the scaling and recovering Padé state-of-the-art algorithm in [5] not using the Schur decomposition.
Other algorithms based on L∞ approximations for normal and nonnegative matrices have been presented recently in [8]. In this work we focus on general matrices and algorithms using approximations at the origin. We present two algorithms based on Taylor series that use Theorem 1 from [4] for computing the matrix cosine. We provide relative forward and backward error analyses for the matrix cosine Taylor approximation that improve even further the comparison to the algorithm in [5], with and without Schur decomposition, in both accuracy and cost tests.
Throughout this paper C^{n×n} denotes the set of complex matrices of size n×n, I the identity matrix for this set, ρ(X) the spectral radius of matrix X, and N the set of positive integers. In this work we use the 1-norm to compute the actual norms. This paper is organized as follows. Section 2 presents a Taylor algorithm for computing the matrix cosine function. Section 3 deals with numerical tests and, finally, Section 4 gives some conclusions.
2. Algorithms for computing the matrix cosine
The matrix cosine can be defined for all A ∈ C^{n×n} by

cos(A) = ∑_{i≥0} (−1)^i A^{2i} / (2i)!,

and let

T_{2m}(A) = ∑_{i=0}^{m} (−1)^i B^i / (2i)! ≡ P_m(B),    (1)

be the Taylor approximation of order 2m of cos(A), where B = A^2. Since Taylor series are accurate only near the origin, in algorithms that use this approximation the norm of matrix B is reduced by scaling the matrix. Then, a Taylor approximation is computed, and finally the approximation of cos(A) is recovered by means of the double angle formula cos(2X) = 2 cos^2(X) − I. Algorithm 1 shows a general algorithm for computing the matrix cosine based on Taylor approximation. By using the fact that sin(A) = cos(A − (π/2) I), Algorithm 1 can also be easily used to compute the matrix sine.
Algorithm 1 Given a matrix A ∈ C^{n×n}, this algorithm computes C = cos(A) by Taylor series.
1: Select adequate values of m and s    ▷ Phase I
2: B = 4^{−s} A^2
3: C = P_m(B)    ▷ Phase II: Compute Taylor approximation
4: for i = 1:s do    ▷ Phase III: Recovering cos(A)
5:   C = 2C^2 − I
6: end for
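As a minimal illustration of the three phases, the following MATLAB sketch mirrors Algorithm 1; select_m_s and polyvalm_ps are hypothetical helper names for Phases I and II (they are not the names used in cosmtay.m).

```matlab
function C = cos_taylor_sketch(A)
% Minimal sketch of Algorithm 1 (scaling and recovering).  Two hypothetical
% helpers are assumed: select_m_s (Phase I) chooses the Taylor order (encoded
% in the coefficient vector p), the scaling parameter s and the
% Paterson-Stockmeyer parameter q; polyvalm_ps (Phase II) evaluates P_m(B).
B = A*A;
[s, q, p] = select_m_s(B);        % Phase I: choose order and scaling from B = A^2
B = B / 4^s;                      % scale: B <- 4^(-s) A^2
C = polyvalm_ps(p, B, q);         % Phase II: C ~ cos(2^(-s) A)
for i = 1:s                       % Phase III: double angle recovering
    C = 2*C*C - eye(size(A));     % cos(2X) = 2 cos(X)^2 - I
end
end
```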
In Phase I of Algorithm 1, m and s must be calculated so that the Taylor approximation of the scaled matrix is computed accurately and efficiently. In this phase some powers B^i, i ≥ 2, are usually computed for estimating m and s, and if so they are reused in Phase II.
Phase II consists of computing the Taylor approximation (1). For clarity of the exposition we recall some results summarized in [6, Sec. 2]. The Taylor matrix polynomial approximation (1), expressed as P_m(B) = ∑_{i=0}^{m} p_i B^i, B ∈ C^{n×n}, can be computed with optimal cost by the Paterson-Stockmeyer method [7] choosing m from the set

M = {1, 2, 4, 6, 9, 12, 16, 20, 25, 30, 36, 42, ...},
where the elements of M are denoted as m_1, m_2, m_3, ... The algorithm computes first the powers B^i, 2 ≤ i ≤ q, not computed in the previous phase, being q = ⌈√(m_k)⌉ or q = ⌊√(m_k)⌋ an integer divisor of m_k, k ≥ 1, both values giving the same cost in terms of matrix products. Therefore, (1) can be computed efficiently as

P_{m_k}(B) =    (2)
(((p_{m_k} B^q + p_{m_k−1} B^{q−1} + p_{m_k−2} B^{q−2} + ··· + p_{m_k−q+1} B + p_{m_k−q} I) B^q
+ p_{m_k−q−1} B^{q−1} + p_{m_k−q−2} B^{q−2} + ··· + p_{m_k−2q+1} B + p_{m_k−2q} I) B^q
+ p_{m_k−2q−1} B^{q−1} + p_{m_k−2q−2} B^{q−2} + ··· + p_{m_k−3q+1} B + p_{m_k−3q} I) B^q
...
+ p_{q−1} B^{q−1} + p_{q−2} B^{q−2} + ··· + p_1 B + p_0 I.
Table 1 shows the values of q for different values of m. From Table 4.1 of [9, p. 74], the cost of computing (1) with (2) is Π_{m_k} = k matrix products, k = 1, 2, ...
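For concreteness, the following MATLAB sketch evaluates (2); it could serve as the polyvalm_ps helper assumed in the earlier sketch. It assumes the coefficient vector p = [p_0, p_1, ..., p_m] and that m is a multiple of q, as holds for the orders in M (for the matrix cosine the coefficients are p_i = (−1)^i/(2i)! and B = A^2).

```matlab
function P = polyvalm_ps(p, B, q)
% Sketch of the Paterson-Stockmeyer evaluation (2) of
% P(B) = p(1)*I + p(2)*B + ... + p(m+1)*B^m, where m = numel(p)-1 is
% assumed to be a multiple of q.  p(i+1) holds the coefficient p_i of B^i.
n = size(B, 1);
m = numel(p) - 1;
r = m / q;
Bpow = cell(q, 1);                  % store B, B^2, ..., B^q
Bpow{1} = B;
for i = 2:q
    Bpow{i} = Bpow{i-1} * B;
end
P = p(m+1) * Bpow{q};               % highest term p_m * B^q of the top block
for k = r-1:-1:0
    blk = p(k*q + 1) * eye(n);      % p_{kq} * I
    for j = 1:q-1
        blk = blk + p(k*q + j + 1) * Bpow{j};   % p_{kq+j} * B^j
    end
    P = P + blk;
    if k > 0
        P = P * Bpow{q};            % multiply by B^q before the next block
    end
end
end
```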
Finally, Phase III is necessary to obtain the cosine of matrix A from the approximation of cos(2^{−s}A) computed previously in Phase II. If m_k is the order used and s is the scaling parameter, then the computational cost of Algorithm 1 is 2(k+s)n^3 flops, and the storage cost is (2 + q_k)n^2.
The difficulty of Algorithm 1 is to find appropriate values of m_k and s such that cos(A) is computed accurately with minimum cost. For that, in the following sections we will use Theorem 1:
Theorem 1 ([4]). Let h_l(x) = ∑_{i≥l} p_i x^i be a power series with radius of convergence w, h̃_l(x) = ∑_{i≥l} |p_i| x^i, B ∈ C^{n×n} with ρ(B) < w, l ∈ N and t ∈ N with 1 ≤ t ≤ l. If t_0 is the multiple of t such that l ≤ t_0 ≤ l + t − 1 and

β_t = max{ d_j^{1/j} : j = t, l, l+1, ..., t_0 − 1, t_0 + 1, t_0 + 2, ..., l + t − 1 },

where d_j is an upper bound for ‖B^j‖, d_j ≥ ‖B^j‖, then

‖h_l(B)‖ ≤ h̃_l(β_t).
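For illustration, β_t of Theorem 1 can be evaluated in a few MATLAB lines once bounds d_j ≥ ‖B^j‖ are available for the needed indices; the sketch below assumes they are stored in a vector d with d(j) = d_j.

```matlab
function beta = beta_t_bound(d, l, t)
% Sketch of the computation of beta_t in Theorem 1: t0 is the multiple of t
% with l <= t0 <= l+t-1, and beta_t is the maximum of d_j^(1/j) over
% j = t, l, l+1, ..., t0-1, t0+1, ..., l+t-1.
t0  = t * ceil(l / t);                    % multiple of t in [l, l+t-1]
idx = [t, setdiff(l:l+t-1, t0)];          % indices entering the maximum
beta = max(d(idx) .^ (1 ./ idx));
end
```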
2.1. Relative forward error in Taylor approximation
The following proposition gives a sufficient condition for the existence of cos^{−1}(A).
Proposition 1. Let A be a matrix in C^{n×n} and let B = A^2. If ‖B‖ < a^2, where a = arccosh(2), then cos(A) is invertible.

Proof. Since

‖I − cos(A)‖ = ‖ ∑_{k≥1} (−1)^{k+1} A^{2k} / (2k)! ‖ ≤ ∑_{k≥1} ‖B‖^k / (2k)! < ∑_{k≥1} a^{2k} / (2k)! = cosh(a) − 1 = 1,

then, by applying Lemma 2.3.3 from [10, p. 58], we obtain that I − (I − cos(A)) = cos(A) is invertible. □
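A quick numerical illustration of Proposition 1 in MATLAB (funm is used here only to obtain a reference cosine):

```matlab
% Sketch: for a random A scaled so that ||A^2||_1 stays below arccosh(2)^2,
% cos(A) is invertible, as Proposition 1 guarantees.
A = randn(6);
A = A * sqrt(1.7 / norm(A*A, 1));        % force ||A^2||_1 = 1.7 < 1.7343
C = funm(A, @cos);                       % reference cos(A)
fprintf('||I - cos(A)||_1 = %.3f (< 1), rcond(cos(A)) = %.2e\n', ...
        norm(eye(size(A)) - C, 1), rcond(C));
```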
Using Proposition 1, if

‖B‖ = ‖A^2‖ < arccosh^2(2) ≈ 1.7343,    (3)

then cos^{−1}(A) exists and it follows that the relative forward error of computing cos(A) by means of (1), denoted by E_f, is

E_f = ‖cos^{−1}(A)(cos(A) − T_{2m}(A))‖ = ‖ ∑_{i≥m+1} e_i^{(2m)} A^{2i} ‖ = ‖ ∑_{i≥m+1} e_i^{(2m)} B^i ‖,

where the coefficients e_i^{(2m)} depend on the Taylor approximation order 2m. If we define g_{m+1}(x) = ∑_{i≥m+1} e_i^{(2m)} x^i and g̃_{m+1}(x) = ∑_{i≥m+1} |e_i^{(2m)}| x^i, and we apply Theorem 1, then

E_f = ‖g_{m+1}(B)‖ ≤ g̃_{m+1}(β_t^{(m)}),    (4)

for every t, 1 ≤ t ≤ m+1. Following [4, Sec. 5.1], in (4) we denote by β_t^{(m)} the corresponding value of β_t from Theorem 1 for order m, and from now on we will use that nomenclature.
Let Θ_m be

Θ_m = max{ θ ≥ 0 : ∑_{i≥m+1} |e_i^{(2m)}| θ^i ≤ u },    (5)

where u = 2^{−53} is the unit roundoff in double precision floating-point arithmetic. We have used the MATLAB Symbolic Math Toolbox to evaluate ∑_{i≥m+1} |e_i^{(2m)}| θ^i for each m in 250-digit decimal arithmetic, adding the first 250 series terms with the coefficients obtained symbolically. Then, a numerical zero-finder is invoked to determine the highest value of Θ_m such that ∑_{i≥m+1} |e_i^{(2m)}| Θ_m^i ≤ u holds. For this analysis to hold it is necessary that cos(A) is invertible. Hence, if condition (3) holds and β_t^{(m)} ≤ Θ_m, then E_f ≤ u.
Some values of Θ_m are given in Table 1.
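A hedged double-precision sketch of this zero-finding step is shown below; it assumes the coefficient magnitudes |e_i^{(2m)}| have already been obtained symbolically and stored in a vector (the paper performs the evaluation in 250-digit arithmetic, which this sketch does not reproduce).

```matlab
function theta = find_theta(absc, i0, u)
% Sketch of the zero-finding step for Theta_m in (5): absc(k) holds
% |e_i^(2m)| for i = i0+k-1 (i0 = m+1 for the forward bound), and u is the
% unit roundoff.  We look for the largest theta with sum_i |e_i| theta^i = u.
if nargin < 3, u = 2^(-53); end
absc = absc(:).';
idx  = i0:(i0 + numel(absc) - 1);
% terms |e_i|*theta^i are formed through logs to avoid overflow of theta^idx
f = @(theta) sum(exp(log(absc) + idx*log(theta))) - u;
theta = fzero(f, [realmin, 20]);  % bracket assumed to contain Theta_m
end
```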
2.2. Backward error in Taylor approximation
In [5], a backward error analysis is made for computing the sine and cosine matrix functions. For each matrix function, two analyses were made. For the cosine function, the first one is based on considering the function

h_{2m}(x) := arccos(r_m(x)) − x,

where r_m(x) is the [m/m] Padé approximant to the cosine function, and the authors conclude that different restrictions make this analysis unusable [5, Sec. 2.2]. We checked that an error analysis for the matrix cosine Taylor approximation similar to that of [5, Sec. 2.2] yields analogous results. Therefore, in order to calculate the backward error ΔX of approximating cos(X) by the Taylor polynomial T_{2m}(X) such that

T_{2m}(X) = cos(X + ΔX),    (6)

we propose a different approach that holds for any matrix X ∈ C^{r×r} and uses the following result, whose proof is trivial:
Lemma 1. If A and B are matrices in C^{r×r} and AB = BA, then

cos(A + B) = cos(A) cos(B) − sin(A) sin(B).    (7)
Note that the backward error ΔX from (6) is a holomorphic function of X and then X ΔX = ΔX X. Therefore, using (6) and Lemma 1,

T_{2m}(X) = cos(X + ΔX) = cos(X) cos(ΔX) − sin(X) sin(ΔX)
= cos(X) ∑_{i≥0} (−1)^i ΔX^{2i} / (2i)! − sin(X) ∑_{i≥0} (−1)^i ΔX^{2i+1} / (2i+1)!.

Hence,

cos(X) − T_{2m}(X) = ∑_{i≥m+1} (−1)^i X^{2i} / (2i)!    (8)
= sin(X) ∑_{i≥0} (−1)^i ΔX^{2i+1} / (2i+1)! − cos(X) ∑_{i≥1} (−1)^i ΔX^{2i} / (2i)!,
and consequently the backward error ΔX can be expressed by

ΔX = ∑_{i≥m} c_i^{(2m)} X^{2i+1},    (9)

where the coefficients c_i^{(2m)} depend on the Taylor approximation order 2m, and ΔX commutes with X. Note that an expression similar to (8) can be obtained for other approximations of the matrix cosine such as the Padé approximation.
Using (8) and (9) it follows that

sin(X) ΔX + O(X^{4m+2}) = ∑_{i≥m+1} (−1)^i X^{2i} / (2i)!.    (10)

Hence, the coefficients c_i^{(2m)}, i = m, m+1, ..., 2m−1, can be computed by obtaining symbolically the Taylor series of sin(X) ΔX from the left-hand side of (10) and solving the system of equations that arises when equating the coefficients of X^{2i}, i = m+1, m+2, ..., 2m, from both sides of (10), using function solve from the MATLAB Symbolic Math Toolbox.
Analogously, using (8) and (9) it follows that

sin(X) ΔX + cos(X) ΔX^2 / 2! + O(X^{6m+4}) = ∑_{i≥m+1} (−1)^i X^{2i} / (2i)!,    (11)

and the coefficients c_i^{(2m)}, i = 2m, 2m+1, ..., 3m+1, can be calculated by using the coefficients c_i^{(2m)}, i = m, m+1, ..., 2m−1, obtained previously, computing symbolically the Taylor series of sin(X) ΔX + cos(X) ΔX^2 / 2! on the left-hand side of (11) and solving the system of equations that arises when equating the coefficients of X^{2i}, i = 2m+1, 2m+2, ..., 3m+1, from both sides of (11). By proceeding analogously, c_i^{(2m)}, i > 3m+1, can be computed.
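A simplified symbolic sketch of the first of these steps (solving (10) for c_i^{(2m)}, i = m, ..., 2m−1, here for m = 4) is given below; it is only illustrative and not the actual code used by the authors.

```matlab
% Sketch: leading backward error coefficients c_i^(2m) of (9) obtained from
% (10) with the Symbolic Math Toolbox (illustrative; the actual computation
% uses many more terms and higher orders).
m = 4;                                        % Taylor order 2m = 8
syms x
c = sym('c', [1, m]);                         % unknowns c_m, ..., c_{2m-1}
dX = sum(c .* x.^(2*(m:2*m-1) + 1));          % DeltaX = sum_i c_i x^(2i+1)
rhs = sum((-1).^(m+1:2*m) .* x.^(2*(m+1:2*m)) ./ factorial(sym(2*(m+1:2*m))));
expr = sin(x)*dX - rhs;                       % difference of the two sides of (10)
eqs = sym(zeros(1, m));
for k = 1:m
    i = m + k;                                % equate coefficients of x^(2i)
    eqs(k) = subs(diff(expr, x, 2*i), x, 0);
end
sol = solve(eqs, c);                          % sol.c1 = c_m, ..., sol.c4 = c_{2m-1}
```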
Then, we compute the relative backward error of approximating cos(A) by T_{2m}(A), denoted by E_b, as

E_b = ‖ΔA‖ / ‖A‖ = ‖ ∑_{i≥m} c_i^{(2m)} A^{2i+1} ‖ / ‖A‖ ≤ ‖ ∑_{i≥m} c_i^{(2m)} A^{2i} ‖ = ‖ ∑_{i≥m} c_i^{(2m)} B^i ‖.
If we define h_m(x) = ∑_{i≥m} c_i^{(2m)} x^i and h̃_m(x) = ∑_{i≥m} |c_i^{(2m)}| x^i, and we apply Theorem 1, then

E_b ≤ ‖h_m(B)‖ ≤ h̃_m(β_t^{(m)}).    (12)
Let Θ̄_m be

Θ̄_m = max{ θ ≥ 0 : ∑_{i≥m} |c_i^{(2m)}| θ^i ≤ u }.    (13)

For computing Θ̄_m, we have used the MATLAB Symbolic Math Toolbox to evaluate ∑_{i≥m} |c_i^{(2m)}| θ^i in 250-digit decimal arithmetic for each m, adding a different number of series terms depending on m with the coefficients obtained symbolically, and a numerical zero-finder was invoked to determine the highest value of Θ̄_m such that ∑_{i≥m} |c_i^{(2m)}| Θ̄_m^i ≤ u holds. We have checked that for m = 1, 2, 4, 6, 9, 12 the values of Θ̄_m obtained in double precision arithmetic do not vary if we take more than 256 series terms (and even fewer terms for the lower orders).
Note that the values Θ̄_1 = 2.66·10^{−15}, Θ̄_2 = 2.83·10^{−7}, Θ̄_4 = 4.48·10^{−3} and Θ̄_6 = 1.45·10^{−1}, presented with two decimal digits, are lower than the corresponding Θ_m values, m = 1, 2, 4, 6, from Table 1 for the forward error analysis, respectively.
On the other hand, considering 1880 and 1946 series terms for m = 16, the corresponding Θ̄_16 values have a relative difference of 4.0010·10^{−4}. Considering 1880 and 1980 series terms for m = 20, the corresponding Θ̄_20 values have a relative difference of 1.6569·10^{−3}. The process of computing those values for so many terms was very time consuming, and we took as final values the ones shown in Table 1, corresponding to 1946 series terms for m = 16 and 1980 series terms for m = 20.
With the final selected values of Θ̄_m in Table 1 it follows that if β_t^{(m)} ≤ Θ̄_m, then the relative backward error is lower than the unit roundoff in double precision floating-point arithmetic, i.e.

E_b ≤ u for m_k = 9, 12, and E_b ≲ u for m_k = 16, 20.
2.3. Backward error in the double angle formula of the matrix cosine
We are interested in the backward error in Phase III of Algorithm 1. In the previous section we have shown that it is possible to obtain a small backward error in the Taylor approximation of the matrix cosine. As in [5, Sec. 2.3], it can be shown that the backward error propagates linearly through the double angle formula. A result for the backward error similar to Lemma 2.1 from [5] can be obtained for polynomial approximations of the matrix cosine.
Lemma 2. Let A ∈ C^{n×n} and X = 2^{−s}A, with s a nonnegative integer, and suppose that t(X) = cos(X + ΔX) for a polynomial function t. Then the approximation Y obtained by applying the double angle formula satisfies Y = cos(A + ΔA) in exact arithmetic, and hence

‖ΔA‖ / ‖A‖ = ‖ΔX‖ / ‖X‖.

Proof. The proof is similar to that given in Lemma 2.1 from [5].
Lemma 2 shows that if we choose m and s such that ‖ΔX‖/‖X‖ ≤ u, with X = 2^{−s}A, then if s ≥ 1 the total backward error in exact arithmetic after Phase III of Algorithm 1 is bounded by u, producing no error growth.
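A small MATLAB check of this recovering step (funm is used here only to obtain reference cosines):

```matlab
% Sketch: the double angle recovering of Phase III applied to an "exact"
% cosine of the scaled matrix reproduces cos(A) up to roundoff.
A = randn(8) / 4;
s = 3;                               % scaling parameter
C = funm(A / 2^s, @cos);             % reference cos(2^(-s) A)
for i = 1:s
    C = 2*C*C - eye(size(A));        % cos(2X) = 2 cos(X)^2 - I
end
relerr = norm(C - funm(A, @cos), 1) / norm(funm(A, @cos), 1)
```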
2.4. Determining the values of the Taylor approximation order m and the scaling parameter s
Since Θ_6 ≈ 0.1895 < arccosh^2(2) ≈ 1.7343 < Θ̄_9 ≈ 1.7985, see (3), and the values Θ_{m_k} for the forward error analysis and m_k ≤ 6 are greater than the corresponding Θ̄_{m_k} values for the backward error analysis, we use the relative forward analysis for m_k ≤ 6, and the relative backward analysis for m_k ≥ 9. Therefore, by Theorem 1, for m_k = 1, 2, 4, 6, if there exists t such that β_t^{(m_k)} ≤ Θ_{m_k}, one gets that the forward error bound of Section 2.1 holds, and for m_k = 9, 12, 16, 20, if there exists t such that β_t^{(m_k)} ≤ Θ̄_{m_k}, it follows that the backward error bound of Section 2.2 holds. Table 1 shows the values Θ_{m_k}, m_k = 1, 2, 4, 6, and Θ̄_{m_k}, m_k = 9, 12, 16, 20. For simplicity of notation, from now on we will denote by Θ_{m_k} both Θ_{m_k}, m_k ≤ 6, and Θ̄_{m_k}, m_k ≥ 9.
Table 1: Values of Θ_{m_k} (forward analysis), Θ̄_{m_k} (backward analysis) and q_k used to compute (1) by the Paterson-Stockmeyer method (2).

 k  m_k  q_k  Θ_{m_k}                |  k  m_k  q_k  Θ̄_{m_k}
 1   1    1   5.161913593731081e-8   |  5   9    3   1.798505876916759
 2   2    2   4.307691256676447e-5   |  6  12    4   6.752349007371135
 3   4    2   1.319680929892753e-2   |  7  16    4   9.971046342716772
 4   6    3   1.895232414039165e-1   |  8  20    5   10.177842844012551

The selection of the order m and the scaling parameter s is as follows. If there exist t and m_k such that β_t^{(m_k)} ≤ Θ_{m_k}, then it is not necessary to scale B, and the Taylor approximation order m_k with k = min{ k : β_t^{(m_k)} ≤ Θ_{m_k} } is selected, i.e. the order m_k providing the minimum cost. Since in this case no scaling is applied, the double angle formula of Phase III of Algorithm 1 is not applied. Otherwise, we scale the matrix B by the scaling parameter
s = max{ 0, ⌈(1/2) log_2( β_t^{(m_k)} / Θ_{m_k} )⌉ },  m_k ∈ {9, 12, 16},    (14)

such that the matrix cosine is computed with minimum cost. Lemma 2 ensures that the backward error propagates linearly through the double angle formula in exact arithmetic if s ≥ 1. Following [11, Sec. 3.1], the explanation for the minimum and maximum orders m_k to be used in (14) for scaling (giving the minimum cost) is as follows: since Θ_9/4 > Θ_6 and 4·Θ_9 > Θ_12, the minimum order to select for scaling is m = m_5 = 9. On the other hand, since Θ_20/4 < Θ_12 and Θ_16/4 > Θ_9, if ‖B‖ > Θ_9 the maximum order to select for scaling is m = m_7 = 16. Following [11, Sec. 3.1], the final selection of m_k is the maximum order m_k ∈ {9, 12, 16} giving also the minimum cost. This selection provides the minimum scaling parameter s over all selections of m_k that provide the minimum cost.
Then the Taylor approximation of order m_k, P_{m_k}(4^{−s}B), is computed, and if s ≥ 1 the recovering Phase III of Algorithm 1 is applied.
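In MATLAB, (14) is a one-line computation; the sketch below assumes beta holds β_t^{(m_k)} for the chosen order and theta the corresponding Θ_{m_k} from Table 1.

```matlab
% Sketch of (14): smallest nonnegative integer s such that beta/4^s <= theta,
% since scaling B by 4^(-s) divides each d_j^(1/j), and hence beta, by 4^s.
s = max(0, ceil(0.5 * log2(beta / theta)));
```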
For computing the parameters β_t^{(m_k)} for 1 ≤ m_k ≤ 16 from Theorem 1, it is necessary to calculate upper bounds d_k for ‖B^k‖. We have developed two different algorithms to obtain the order m_k and the scaling parameter s. Algorithm 2 uses an estimation β_min^{(m_k)} of the minimum of the values β_t^{(m_k)} from Theorem 1, obtained using Algorithm 3. In order to calculate the upper bounds d_k of ‖B^k‖ for obtaining β_min^{(m_k)}, Algorithm 3 uses only products of norms of matrix powers previously computed in Algorithm 2, i.e. ‖B^i‖, i ≤ 4. For instance, note that in order to compute β_2^{(m)} for the relative forward error bound and m = 2, using (1) we need only the bounds d_2^{1/2} and d_3^{1/3}. From Table 1, for m = 2 one gets q = q_2 = 2 and B^i, i = 1, 2, are available, and we take d_1 = ‖B‖ and d_2 = ‖B^2‖. Since ‖B^2‖^{1/2} ≤ ‖B‖ it follows that β_min^{(2)} = β_2^{(2)} = max{ d_2^{1/2}, (d_2 d_1)^{1/3} } = (d_2 d_1)^{1/3} (Step 3 of Algorithm 3). Something similar happens with β_2^{(4)}, resulting in β_min^{(4)} = β_2^{(4)} = max{ d_2^{1/2}, (d_2^2 d_1)^{1/5} } = (d_2^2 d_1)^{1/5} (Step 5 of Algorithm 3).
Using Table 1, for m = 6 one gets q = q_4 = 3, B^3 being now also available, and we take d_3 = ‖B^3‖. Using (1), if d_2^{1/2} ≤ d_3^{1/3} we select β_min^{(6)} = β_2^{(6)} = max{ d_2^{1/2}, (d_2^2 d_3)^{1/7} } = (d_2^2 d_3)^{1/7} (Step 8 of Algorithm 3). Else, we select

β_min^{(6)} = β_3^{(6)} = max{ d_3^{1/3}, min{ d_2^2 d_3, d_1 d_3^2 }^{1/7}, (d_3^2 d_2)^{1/8} }
= max{ min{ d_2^2 d_3, d_1 d_3^2 }^{1/7}, (d_3^2 d_2)^{1/8} },

(Step 10 of Algorithm 3). The value β_min^{(m_k)} is obtained analogously for m_k = 9, 12, 16 (Steps 12-37 of Algorithm 3).
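As a concrete illustration, the m = 6 case just described can be written in MATLAB as follows, assuming the stored norms d1 = ‖B‖, d2 = ‖B^2‖ and d3 = ‖B^3‖ are available.

```matlab
% Sketch of the m = 6 branch of the beta_min selection (Steps 7-11 of
% Algorithm 3), using only the norms d1 = ||B||, d2 = ||B^2||, d3 = ||B^3||.
if d2^(1/2) <= d3^(1/3)
    bmin6 = min(d2^2*d3, d1*d3^2)^(1/7);
else
    bmin6 = max(min(d2^2*d3, d1*d3^2)^(1/7), (d3^2*d2)^(1/8));
end
```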
Algorithm 2 Given a matrix A ∈ C^{n×n}, this algorithm determines the order m, the scaling parameter s, and the powers of B = A^2 needed for computing the Taylor approximation of cos(A), using no estimation of norms of matrix powers.
1: B_1 = A^2
2: if ‖B_1‖ ≤ Θ_1 then m = 1, s = 0, quit
3: B_2 = B_1^2, obtain β_min^{(2)} using Algorithm 3 with m = 2, q = 2
4: if β_min^{(2)} ≤ Θ_2 then m = 2, s = 0, quit
5: β_min^{(4)} = min{ β_min^{(2)}, β_min^{(4)} using Algorithm 3 with m = 4, q = 2 }
6: if β_min^{(4)} ≤ Θ_4 then m = 4, s = 0, quit
7: B_3 = B_2 B_1
8: β_min^{(6)} = min{ β_min^{(4)}, β_min^{(6)} using Algorithm 3 with m = 6, q = 3 }
9: if β_min^{(6)} ≤ Θ_6 then m = 6, s = 0, quit
10: β_min^{(9)} = min{ β_min^{(6)}, β_min^{(9)} using Algorithm 3 with m = 9, q = 3 }
11: if β_min^{(9)} ≤ Θ_9 then m = 9, s = 0, quit
12: β_min^{(12)} = min{ β_min^{(9)}, β_min^{(12)} using Algorithm 3 with m = 12, q = 3 }
13: if β_min^{(12)} ≤ Θ_12 then m = 12, s = 0, quit
14: s_9 = ⌈(1/2) log_2(β_min^{(9)} / Θ_9)⌉
15: s_12 = ⌈(1/2) log_2(β_min^{(12)} / Θ_12)⌉
16: if s_9 ≤ s_12 then s = s_9, m = 9, quit    ▷ m = 9 used for scaling only if providing less cost than m = 12
17: B_4 = B_3 B_1
18: β_min^{(12)} = min{ β_min^{(12)}, β_min^{(12)} using Algorithm 3 with m = 12, q = 4 }
19: if β_min^{(12)} ≤ Θ_12 then m = 12, s = 0, quit
20: s_12 = ⌈(1/2) log_2(β_min^{(12)} / Θ_12)⌉
21: β_min^{(16)} = min{ β_min^{(12)}, β_min^{(16)} using Algorithm 3 with m = 16, q = 4 }
22: s_16 = max{ 0, ⌈(1/2) log_2(β_min^{(16)} / Θ_16)⌉ }
23: if s_12 ≤ s_16 then m = 12, s = s_12 else m = 16, s = s_16, quit    ▷ m = 12 only used if it provides less cost than m = 16
Algorithm 3 beta_NoNormEst: determines the value β_min^{(m)} = min{β_t^{(m)}} from Theorem 1, given m ∈ {2, 4, 6, 9, 12, 16}, d_i = ‖B^i‖, b_i = ‖B^i‖^{1/i}, i = 1, 2, ..., q, for B ∈ C^{n×n}, using bounds d_i ≥ ‖B^i‖, i > q, based on products of ‖B^i‖, i ≤ q.
1: switch m do
2: case 2    ▷ m = 2
3:   β_min^{(2)} = (d_2 d_1)^{1/3}
4: case 4    ▷ m = 4
5:   β_min^{(4)} = (d_2^2 d_1)^{1/5}
6: case 6    ▷ m = 6
7:   if b_2 ≤ b_3 then
8:     β_min^{(6)} = min{ d_2^2 d_3, d_1 d_3^2 }^{1/7}
9:   else
10:    β_min^{(6)} = max{ min{ d_2^2 d_3, d_1 d_3^2 }^{1/7}, (d_3^2 d_2)^{1/8} }
11:   end if
12: case 9    ▷ m = 9
13:   if b_2 ≤ b_3 then
14:     β_min^{(9)} = (d_2^3 d_3)^{1/9}
15:   else
16:     β_min^{(9)} = max{ min{ d_2^2 d_3^2, d_3^3 d_1 }^{1/10}, (d_3^3 d_2)^{1/11} }
17:   end if
18: case 12    ▷ m = 12
19:   if q = 3 then
20:     if b_2 ≤ b_3 then
21:       β_min^{(12)} = (d_2^5 d_3)^{1/13}
22:     else
23:       β_min^{(12)} = max{ min{ d_3^4 d_1, d_3^3 d_2^2 }^{1/13}, (d_3^4 d_2)^{1/14} }
24:     end if
25:   else if q = 4 then
26:     if b_3 ≤ b_4 then
27:       β_min^{(12)} = max{ (d_3^3 d_4)^{1/13}, min{ d_3^2 d_4^2, d_3^4 d_2 }^{1/14} }
28:     else
29:       β_min^{(12)} = max{ (d_4^2 min{ d_3 d_2, d_4 d_1 })^{1/13}, (d_4^2 min{ d_3^2, d_4 d_2 })^{1/14} }
30:     end if
31:   end if
32: case 16    ▷ m = 16
33:   if b_3 ≤ b_4 then
34:     β_min^{(16)} = max{ (d_3^4 d_4)^{1/16}, min{ d_3^5 d_2, d_3^3 d_4^2 }^{1/17} }
35:   else
36:     β_min^{(16)} = max{ (d_4^3 min{ d_4 d_1, d_3 d_2 })^{1/17}, (d_4^3 min{ d_3^2, d_4 d_2 })^{1/18} }
37:   end if
On the other hand, in order to reduce the value β_min^{(m_k)}, and therefore the scaling parameter s and/or the order m given by Algorithm 2, using (4) and (12), similarly to (16) from [12] we approximated β_min^{(m_k)} = min{ β_t^{(m_k)} } from Theorem 1 by

β_min^{(m_k)} ≈ max{ d_{m_k+1}^{1/(m_k+1)}, d_{m_k+2}^{1/(m_k+2)} },  m_k ≤ 6 (forward bound),    (15)
β_min^{(m_k)} ≈ max{ d_{m_k}^{1/m_k}, d_{m_k+1}^{1/(m_k+1)} },  m_k ≥ 9 (backward bound),    (16)

computing the 1-norms of the corresponding matrix powers B^i with the estimation algorithm from [13] and taking the bounds d_i = ‖B^i‖. Equations (15) and (16) may give values β_min^{(m_k)} lower than the ones given by Algorithm 3, especially for nonnormal matrices, since

‖B^p‖ ≤ ‖B‖^{i_1} ‖B^2‖^{i_2} ‖B^3‖^{i_3} ‖B^4‖^{i_4},  i_1 + 2 i_2 + 3 i_3 + 4 i_4 = p.
Then it is possible to substitute Algorithm 3 by a new algorithm that computes β_min^{(m_k)} using (15) and (16) with norm estimations of matrix powers [13] for the corresponding d_i. For a complete MATLAB implementation see the nested function ms_selectNormEst from function cosmtay.m, available at http://personales.upv.es/jorsasma/Software/cosmtay.m.
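A rough MATLAB sketch of this idea is given below, using the built-in block 1-norm estimator normest1 as a stand-in for the estimator of [13]; B is applied p times to the probe block instead of forming B^p explicitly (function and variable names here are illustrative, not those of cosmtay.m).

```matlab
function d = est_power_norm(B, p)
% Sketch: estimate d ~ ||B^p||_1 without forming B^p, using MATLAB's block
% 1-norm estimator normest1 (a stand-in for the estimator of [13]).
n = size(B, 1);
d = normest1(@afun);
    function y = afun(flag, X)
        switch flag
            case 'dim',      y = n;                               % problem size
            case 'real',     y = isreal(B);                       % real data?
            case 'notransp', y = X; for k = 1:p, y = B  * y; end  % B^p * X
            case 'transp',   y = X; for k = 1:p, y = B' * y; end  % (B^p)' * X
        end
    end
end
```

With such estimates for d_{m_k}, d_{m_k+1} and d_{m_k+2}, the bounds (15) and (16) follow directly.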
Function cosmtay.m also implements the option without norm estimation: the MATLAB implementation of Algorithm 2 can be seen in the nested function ms_selectNoNormEst, and that of Algorithm 3 in beta_NoNormEst from cosmtay.m. Both functions are slightly different from Algorithms 2 and 3 so as to be compatible with the version with norm estimation, ms_selectNormEst.
3. Numerical experiments
In this section we compare the MATLAB functions cosmtay, cosm and costaym:
Function cosmtay(A,NormEst) (http://personales.upv.es/jorsasma/software/cosmtay.m) is the MATLAB implementation of Algorithm 1, with the order m and the scaling parameter s determined using the 1-norm estimator [13] (NormEst=1, corresponding in cosmtay to the nested function ms_selectNormEst). It is compared with cosm [5, Alg. 4.2] (http://github.com/sdrelton/cosm_sinm). The tests using Algorithm 2 for determining m and s (NormEst=0 in cosmtay(A,NormEst), corresponding to the nested function ms_selectNoNormEst in cosmtay) gave similar accuracy results and a relative increase in the number of matrix products of at most 3%, so we omit them. We also noted that for small matrices the cost of the norm estimation algorithm is not negligible compared to a matrix product; therefore, the algorithm using no norm estimation is typically more efficient. For large matrices the cost of the norm estimation is negligible, and then the algorithm with norm estimation is faster whenever it actually saves matrix products.
The MATLAB function cosm has an argument which allows computing cos(A) by means of just Padé approximants, or also using the real Schur decomposition or the complex Schur decomposition. For this function, we have used Padé approximants with and without the real Schur decomposition, denoted by cosmSchur and cosm, respectively.
Finally, function costaym is the MATLAB implementation of Algorithm 1 from [6] (http://personales.upv.es/jorsasma/software/costaym.m).
In the tests we used MATLAB (R2014b) running on an Intel Core 2 Duo processor at 3.00 GHz with 4 GB main memory. The following tests were made:
Test 1: 100 diagonalizable 128×128 real matrices with real and complex eigenvalues and 1-norms varying from 2.32 to 220.04.
Test 2: 100 non-diagonalizable 128×128 real matrices with eigenvalues whose algebraic multiplicity varies between 1 and 128 and 1-norms varying from 5.27 to 21.97.
Test 3: Seventeen matrices with dimensions lower than or equal to 128 from the Eigtool MATLAB package [14], twenty-eight matrices from the matrix function literature with dimensions lower than or equal to 128, fifty-one 128×128 real matrices from the function matrix of the Matrix Computation Toolbox [15], and fifty-two 8×8 real matrices obtained by scaling the matrices from the Matrix Computation Toolbox [15] so that their norms vary between 0.000145 and 0.334780. This last group of 8×8 matrices was used for testing specifically the forward error analysis from Section 2.1, since for those matrices only the lower orders m = 1, 2, 4, 6 were used.
Test 4: Fifty matrices from the semidiscretization of the wave equation from [5, Sec. 7.5], [16, Problem 4].
The "exact" matrix cosine was computed exactly for the matrices of Tests 1 and 2. Following [6, Sec. 4.1], for the other matrices we used MATLAB symbolic versions of a scaled Padé rational approximation from [5] and a scaled Taylor Paterson-Stockmeyer approximation (2), both with 4096 decimal digit arithmetic and several orders m and/or scaling parameters s higher than the ones used by cosm and cosmtay, respectively, checking that their relative difference was small enough. The algorithm accuracy was tested by computing the relative error

E = ‖cos(A) − Ỹ‖_1 / ‖cos(A)‖_1,

where Ỹ is the computed solution and cos(A) is the exact solution.
To compare the relative errors of the functions, we plotted in Figure 1 the performance profiles and the ratios of relative errors E(cosm)/E(cosmtay) and E(cosmSchur)/E(cosmtay) for the three tests. In the performance profile, the α coordinate varies between 1 and 5 in steps equal to 0.1, and the p coordinate is the probability that the considered algorithm has a relative error lower than or equal to α times the smallest error over all the methods. The ratios of relative errors are presented in decreasing order of E(cosm)/E(cosmtay). The results were:
Figure 1 shows the performance profiles and the relative error ratios giving the accuracy of the tested functions. Figures 1a, 1c and 1e show that the most accurate functions in Tests 1, 2 and 3 were costaym from [6] and cosmtay. In Test 1, function costaym was slightly more accurate than cosmtay, and cosmtay was more accurate than cosm for 96 of the 100 matrices of Test 1 and more accurate than cosmSchur for all of them (see Figure 1b). The graph of cosmSchur does not appear in Figure 1a because for all matrices of Test 1 the relative error of this function was greater than 5 times the error of the other functions. In Test 2, cosmtay was the most accurate function, being more accurate than cosm for 93 of the 100 matrices and more accurate than cosmSchur for 98 matrices (see Figure 1d). Finally, in Test 3, costaym from [6] was the most accurate function, and cosmtay was more accurate than cosm for 114 of the 135 matrices of Test 3 and more accurate than cosmSchur for 109 matrices of that test (see Figure 1f).
The ratios of flops from Figure 2 show that the computational costs of cosmtay are always lower than those of cosm, cosmSchur and costaym. In the majority of the test matrices, the ratios of flops of cosmSchur and cosmtay are between 2 and 4 and the ratios of flops of cosm and cosmtay are between 1 and 2 (Figures 2a, 2c and 2e). In the majority of the test matrices, the execution time ratios of cosm and cosmtay are between 1 and 5, the ratios of cosmSchur and cosmtay are greater than 5 (Figures 2b, 2d and 2f), and the execution time ratios of costaym and cosmtay are between 1 and 2. Hence, cosmtay always provided a lower cost than costaym, at the price of a lower accuracy for some test matrices.
Test 4: Section 7.5 from [5] showed that, for the matrices appearing in a wave equation problem, the version of cosm using the Schur decomposition (cosmSchur) was increasingly more accurate than the MATLAB function costay for the biggest matrix dimensions in that test. This function was our first MATLAB implementation for computing the matrix cosine [4]. In Test 4 we compared the MATLAB implementations cosmSchur, cosm, cosmtay and cosmtaySchur for the biggest matrix size given in Section 7.5 from [5]. cosmtaySchur is based on the real Schur decomposition given by a modified implementation of Algorithm 4.2 from [5], where Algorithm 1 is used for computing the cosine of the real Schur factor of matrix A. Figure 3 shows that the accuracy of cosmSchur and cosmtaySchur is similar, and both implementations are more accurate than the other implementations not based on the real Schur decomposition of the matrix. In [5] the authors claimed that costay had signs of instability. Note that cosm without the Schur decomposition also shows signs of instability, even greater than those of cosmtay. We have checked that function cosmtaySchur is more accurate than cosmSchur for 28 of the 50 matrices of Test 4 and less accurate for 22 matrices of that test. In any case, the differences are negligible, and the main result from this test is that the algorithms cosmtay, cosm, cosmtaySchur and cosmSchur had costs of 1100, 1200, 1800 and 1900 matrix products, respectively. Therefore, cosmtay is 8.33% more efficient than cosm [5], and cosmtaySchur is 5.26% more efficient than cosmSchur [5].
We have used cosmtay with the 1-norm estimator (cosmtay(A,NormEst) with NormEst=1) in the four tests above, because the results obtained with the variant that does not use the 1-norm estimator are similar in accuracy, and we found that the computational cost in terms of matrix products for the implementation that uses the 1-norm estimator is only 1.90%, 1.85%, 3.00% and 0% lower than that of the implementation without estimation in Tests 1, 2, 3 and 4, respectively. However, the execution time was greater due to the overhead of using the 1-norm estimator, since the matrix sizes are not large enough for the cost of the estimation (O(n^2) for n×n matrices [13]) to be negligible compared to the cost of matrix products (O(n^3)).
4. Conclusions
In this work two accurate Taylor algorithms have been proposed to compute the matrix cosine. These algorithms use the scaling technique based on the double angle formula of the cosine function, the Paterson-Stockmeyer method for computing the Taylor approximation, and new forward and backward relative error bounds for the matrix cosine Taylor approximation, which allow calculating the optimal scaling parameter and the optimal order of the Taylor approximation. The two algorithms differ only in whether or not they use the 1-norm estimation of norms of matrix powers [13]. The algorithm with no norm estimation applies Theorem 1 with the norms of the matrix powers used for computing the matrix cosine to obtain bounds on the norms of the matrix powers involved, giving in tests small relative cost differences, in terms of matrix products, with respect to the version with norm estimation. The accuracy of both algorithms is similar, and the cost of the norm estimation algorithm is negligible only for large matrices. Therefore, we recommend using the algorithm with no norm estimation for small matrices, and the algorithm with norm estimation for large matrices.
The MATLAB implementation that uses estimation was compared with other state-of-the-art MATLAB implementations for matrices sized up to 128×128 (analogous results were obtained with the other implementation). Numerical experiments show in general that our Taylor implementations have higher accuracy and lower cost than the Padé state-of-the-art implementation cosm from [5] in the majority of tests. In particular, when the real Schur decomposition was used, the ratio of flops between cosmSchur and cosmtay was flops(cosmSchur)/flops(cosmtay) > 5 for some matrices, and using the Schur decomposition in our algorithms gave the same accuracy results as cosmSchur with less cost. Numerical experiments also showed that function cosmtay was slightly less accurate than costaym from [6] in some tests, but cosmtay always provided a lower computational cost.
[Figure 1: Accuracy in Tests 1, 2 and 3. Panels: (a) performance profile, Test 1; (b) ratio of relative errors, Test 1; (c) performance profile, Test 2; (d) ratio of relative errors, Test 2; (e) performance profile, Test 3; (f) ratio of relative errors, Test 3. The performance profiles compare cosm, cosmSchur, costaym and cosmtay; the error-ratio panels show E(cosm)/E(cosmtay), E(cosmSchur)/E(cosmtay) and E(costaym)/E(cosmtay).]
[Figure 2: Computational costs in Tests 1, 2 and 3. Panels: (a) ratio of flops, Test 1; (b) ratio of execution times, Test 1; (c) ratio of flops, Test 2; (d) ratio of execution times, Test 2; (e) ratio of flops, Test 3; (f) ratio of execution times, Test 3. The curves show the flops and execution time ratios of cosm, cosmSchur and costaym with respect to cosmtay.]
[Figure 3: Accuracy in Test 4. Panels: (a) normwise relative errors of cosm, cosmSchur, cosmtay and cosmtaySchur against cond*u; (b) ratios of relative errors E(cosm)/E(cosmtaySchur), E(cosmSchur)/E(cosmtaySchur) and E(cosmtay)/E(cosmtaySchur).]
Acknowledgments
The authors are very grateful to the anonymous referees, whose comments
greatly improved this paper.
[1] S. Serbin, Rational approximations of trigonometric matrices with application to second-order systems of differential equations, Appl. Math. Comput. 5 (1) (1979) 75-92.
[2] S. M. Serbin, S. A. Blalock, An algorithm for computing the matrix
cosine, SIAM J. Sci. Statist. Comput. 1 (2) (1980) 198–204.
[3] E. Defez, J. Sastre, J. J. Ibáñez, P. A. Ruiz, Computing matrix functions arising in engineering models with orthogonal matrix polynomials, Math. Comput. Model. 57 (7-8) (2013) 1738-1743.
[4] J. Sastre, J. Ibáñez, P. Ruiz, E. Defez, Efficient computation of the matrix cosine, Appl. Math. Comput. 219 (2013) 7575-7585.
[5] A. H. Al-Mohy, N. J. Higham, S. D. Relton, New algorithms for com-
puting the matrix sine and cosine separately or simultaneously, SIAM
J. Sci. Comput. 37 (1) (2015) A456–A487.
[6] P. Alonso, J. Ibáñez, J. Sastre, J. Peinado, E. Defez, Efficient and accurate algorithms for computing matrix trigonometric functions, J. Comput. Appl. Math. 309 (2017) 325-332. doi:10.1016/j.cam.2016.05.015.
[7] M. S. Paterson, L. J. Stockmeyer, On the number of nonscalar multi-
plications necessary to evaluate polynomials, SIAM J. Comput. 2 (1)
(1973) 60–66.
[8] C. Tsitouras, V. N. Katsikis, Bounds for variable degree rational L∞ approximations to the matrix cosine, Computer Physics Communications 185 (11) (2014) 2834-2840.
[9] N. J. Higham, Functions of Matrices: Theory and Computation, SIAM,
Philadelphia, PA, USA, 2008.
[10] G. H. Golub, C. F. Van Loan, Matrix Computations, 3rd Edition, Johns Hopkins Studies in Mathematical Sciences, The Johns Hopkins University Press, 1996.
[11] J. Sastre, J. J. Ibáñez, E. Defez, P. A. Ruiz, Efficient scaling-squaring Taylor method for computing matrix exponential, SIAM J. Sci. Comput. 37 (1) (2015) A439-A455.
[12] P. Ruiz, J. Sastre, J. Ibáñez, E. Defez, High performance computing of the matrix exponential, J. Comput. Appl. Math. 291 (2016) 370-379.
[13] N. J. Higham, F. Tisseur, A block algorithm for matrix 1-norm estimation, with an application to 1-norm pseudospectra, SIAM J. Matrix Anal. Appl. 21 (2000) 1185-1201.
[14] T. G. Wright, Eigtool, version 2.1 (2009).
URL web.comlab.ox.ac.uk/pseudospectra/eigtool.
[15] N. J. Higham, The Test Matrix Toolbox for MATLAB, Numerical Anal-
ysis Report No. 237, Manchester, England (Dec. 1993).
[16] J. M. Franco, New methods for oscillatory systems based on ARKN methods, Appl. Numer. Math. 56 (8) (2006) 1040-1053. doi:10.1016/j.apnum.2005.09.005.