New Hermite series expansion for computing the matrix hyperbolic cosine

E. Defez^{a,*}, J. Ibáñez^{a}, J. Peinado^{b}, P. Alonso-Jordá^{b}, José M. Alonso^{c}

Universitat Politècnica de València, Camino de Vera s/n, 46022, Valencia, Spain
^{a} Instituto de Matemática Multidisciplinar
^{b} Departamento de Sistemas Informáticos y Computación
^{c} Instituto de Instrumentación para Imagen Molecular
Abstract

There are, currently, very few implementations to compute the hyperbolic cosine of a matrix. This work tries to fill this gap. To this end, we first introduce both a new rational-polynomial Hermite matrix expansion and a formula for the forward relative error of the Hermite approximation in exact arithmetic, with a sharp bound for the forward error. This matrix expansion yields a new accurate and efficient method for computing the hyperbolic matrix cosine. We present a MATLAB implementation, based on this method, which shows higher efficiency and better accuracy than other state-of-the-art methods. The algorithm developed on the basis of this method is also able to run on an NVIDIA GPU thanks to a MEX file that connects the MATLAB implementation to the CUDA code.
Key words: Hermite matrix approximation, matrix hyperbolic cosine, error analysis, GPU computing
1. Introduction
Functions of a square matrix $A$ frequently arise in many areas of science and technology, especially those that require the resolution of first and second order differential systems [1, pp. 35-37]. In particular, the hyperbolic matrix functions $\cosh(A)$ and $\sinh(A)$, defined in terms of the matrix exponential $e^A$ as
\[
\cosh(A) = \frac{e^A + e^{-A}}{2}, \qquad \sinh(A) = \frac{e^A - e^{-A}}{2},
\]
* Corresponding author
Email addresses: edefez@imm.upv.es (E. Defez), jjibanez@dsic.upv.es (J. Ibáñez), jpeinado@dsic.upv.es (J. Peinado), palonso@upv.es (P. Alonso-Jordá), jmalonso@dsic.upv.es (José M. Alonso)
are involved in the solution of coupled hyperbolic systems of partial differential equations [2]. Moreover, these functions also find application in other fields of science and engineering, e.g., communicability analysis in complex networks [3-6].
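For reference, these definitions translate directly into a few lines of MATLAB. The following sketch (just the defining formulas evaluated via expm, not the method developed in this paper) can serve as a baseline for testing:

% Baseline evaluation of cosh(A) and sinh(A) straight from the definitions.
% Simple but costly: it requires two matrix exponentials per call.
A = [1 2; 0 3];          % any square matrix
E1 = expm(A);            % e^A
E2 = expm(-A);           % e^(-A)
C = (E1 + E2)/2;         % cosh(A)
S = (E1 - E2)/2;         % sinh(A)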
A way of computing these matrix functions is to use the well-known relations:
\[
\cosh(A) = \cos(iA), \qquad \sinh(A) = -i\cos\left(iA - \frac{\pi}{2}I\right), \qquad i^2 = -1,
\]
provided there exists a method to compute $\cos(A)$. Notwithstanding, this approach has the disadvantage of requiring complex arithmetic even when $A$ is a real matrix.
There exist, however, alternative and more practical ways to evaluate these matrix functions. One of them uses Hermite matrix polynomial series expansions [7]. Other methods, based on Taylor series, have been studied to evaluate the action of these functions on vectors [8, 9].
1.1. Notation

Throughout this paper, we denote by $\mathbb{C}^{r\times r}$ the set of all complex square matrices of size $r$. We denote by $\Theta$ and $I$, respectively, the zero and the identity matrix in $\mathbb{C}^{r\times r}$. If $A \in \mathbb{C}^{r\times r}$, we denote by $\sigma(A)$ the set of all the eigenvalues of $A$. For a real number $x$, $\lceil x \rceil$ denotes the ceiling function, that is, the least integer greater than or equal to $x$, and $\lfloor x \rfloor$ denotes the floor function, that is, the largest integer less than or equal to $x$.
If $f(z)$ and $g(z)$ are holomorphic functions in an open set $\Omega$ of the complex plane, and if $\sigma(A) \subset \Omega$, we denote by $f(A)$ and $g(A)$, respectively, the image by the Riesz-Dunford functional calculus of the functions $f(z)$ and $g(z)$ acting on the matrix $A$, being $f(A)g(A) = g(A)f(A)$ [10, p. 558]. We say that the matrix $A$ is positive stable if $\mathrm{Re}(z) > 0$ for every eigenvalue $z \in \sigma(A)$. In this case, let us denote by $\sqrt{A} = A^{1/2} = \exp\left(\frac{1}{2}\log(A)\right)$ the image of the function $z^{1/2} = \exp\left(\frac{1}{2}\log(z)\right)$ by the Riesz-Dunford functional calculus, acting on the matrix $A$, where $\log(z)$ denotes the principal branch of the complex logarithm.
In this paper, we use consistent matrix norms. In particular, $\|A\|_2$ is the 2-norm. In the tests, we use the 1-norm of a matrix $A \in \mathbb{C}^{r\times r}$, defined by
\[
\|A\|_1 = \sup_{x \ne 0} \frac{\|Ax\|_1}{\|x\|_1},
\]
where $\|\cdot\|_1$ denotes the vector 1-norm defined as $\|y\|_1 = |y_1| + \cdots + |y_r|$, $y \in \mathbb{C}^r$ [11, Chapter 2]. When a concrete norm is not indicated in the text, any consistent norm can be applied.
Although originally introduced by Laplace in 1810, Hermite polynomials get their name from the French mathematician Charles Hermite, who wrote about them in 1864. Hermite polynomials and their generalizations are an active research area in the field of special functions, and their applications are numerous; see for example the references [12-14].
For a positive stable matrix $A \in \mathbb{C}^{r\times r}$, the $n$-th Hermite matrix polynomial is defined in [15] by
\[
H_n(x,A) = n! \sum_{k=0}^{\lfloor n/2 \rfloor} \frac{(-1)^k \left(\sqrt{2A}\right)^{n-2k}}{k!\,(n-2k)!}\, x^{n-2k}, \tag{1}
\]
which satisfies the three-term matrix recurrence:
\[
\left.
\begin{aligned}
H_n(x,A) &= x\sqrt{2A}\,H_{n-1}(x,A) - 2(n-1)H_{n-2}(x,A), \quad n \ge 1, \\
H_{-1}(x,A) &= \Theta, \qquad H_0(x,A) = I.
\end{aligned}
\right\} \tag{2}
\]
The following upper bounds for Hermite matrix polynomials,
\[
\left.
\begin{aligned}
\|H_{2n}(x,A)\|_2 &\le g_n(x), && n \ge 1, \\
\|H_{2n+1}(x,A)\|_2 &\le |x| \left\| \left(\frac{A}{2}\right)^{1/2} \right\|_2 \frac{2\,g_n(x)}{n+1}, && n \ge 0,
\end{aligned}
\right\} \tag{3}
\]
were demonstrated in [16], where the function $g_n(x)$ is defined as
\[
g_n(x) = \frac{(2n+1)!\,2^{2n}}{n!} \exp\left( \frac{5}{2}\, \|A\|_2\, x^2 \right), \quad n \ge 0. \tag{4}
\]
The Hermite matrix polynomial sequence $\{H_n(x,A)\}_{n\ge 0}$ has the following generating function [15]:
\[
e^{xt\sqrt{2A}} = e^{t^2} \sum_{n\ge 0} \frac{H_n(x,A)}{n!}\, t^n,
\]
from which we can derive the following expressions for the matrix hyperbolic sine and cosine [17]:
\[
\left.
\begin{aligned}
\cosh\left(xt\sqrt{2A}\right) &= e^{t^2} \sum_{n\ge 0} \frac{H_{2n}(x,A)}{(2n)!}\, t^{2n} \\
\sinh\left(xt\sqrt{2A}\right) &= e^{t^2} \sum_{n\ge 0} \frac{H_{2n+1}(x,A)}{(2n+1)!}\, t^{2n+1}
\end{aligned}
\right\}, \quad x \in \mathbb{R},\ |t| < \infty. \tag{5}
\]
Recent polynomial and rational developments in series of Hermite polynomials were obtained in [18]. These results have led to the development of a method for computing the matrix cosine function that is more accurate and efficient than others proposed in the literature. In this paper, we calculate the exact value of the following Hermite matrix polynomial series:
\[
\sum_{n\ge 0} \frac{H_{2n+1}(x,A)}{(2n)!}\, t^{2n} := A(x,t;A), \tag{6}
\]
\[
\sum_{n\ge 0} \frac{H_{2n+2}(x,A)}{(2n+1)!}\, t^{2n+1} := B(x,t;A), \tag{7}
\]
and
\[
\sum_{n\ge 0} \frac{H_{2n+3}(x,A)}{(2n+1)!}\, t^{2n+1} := C(x,t;A), \tag{8}
\]
which result in a new expansion of the hyperbolic matrix cosine in Hermite matrix polynomials. This method aims to improve the one proposed in [7] and to achieve more accurate results, for a large variety of matrices, at a lower computational cost in terms of matrix products.
The organization of this paper is as follows. Section 2 presents the proofs of the formulas (6)-(8) introduced in this paper. Section 3 deals with the new rational-polynomial Hermite matrix expansions for the hyperbolic matrix cosine. The proposed algorithm and its MATLAB implementation are described in Section 4. The numerical results are presented in Section 5. Some final conclusions are given in Section 6.
2. A proof of formulas (6)-(8)
The aim is to calculate the exact value of the matrix series $A(x,t;A)$, $B(x,t;A)$ and $C(x,t;A)$ defined by (6)-(8). First, we prove that all three matrix series are convergent. Taking into account (3), we have
\[
\left\| \frac{H_{2n+1}(x,A)}{(2n)!}\, t^{2n} \right\|_2 \le |x| \left\| \left(\frac{A}{2}\right)^{1/2} \right\|_2 \frac{2\,g_n(x)}{(n+1)(2n)!}\, |t|^{2n}.
\]
Since, using (4), $\sum_{n\ge 0} \frac{g_n(x)}{(n+1)(2n)!} |t|^{2n}$ is convergent for $|t| < \infty$, the matrix series $A(x,t;A)$ defined by (6) is convergent in any compact real interval. In the same way, we have
\[
\left\| \frac{H_{2n+2}(x,A)}{(2n+1)!}\, t^{2n+1} \right\|_2 \le \frac{g_{n+1}(x)}{(2n+1)!}\, |t|^{2n+1}.
\]
Since, using (4), $\sum_{n\ge 0} \frac{g_{n+1}(x)}{(2n+1)!} |t|^{2n+1}$ is convergent for $|t| < \infty$, the matrix series $B(x,t;A)$ defined by (7) is convergent in any compact real interval. Analogously, taking into account (3) again, we have
\[
\left\| \frac{H_{2n+3}(x,A)}{(2n+1)!}\, t^{2n+1} \right\|_2 \le |x| \left\| \left(\frac{A}{2}\right)^{1/2} \right\|_2 \frac{2\,g_{n+1}(x)}{(n+2)(2n+1)!}\, |t|^{2n+1}.
\]
Since, using (4), $\sum_{n\ge 0} \frac{g_{n+1}(x)}{(n+2)(2n+1)!} |t|^{2n+1}$ is convergent for $|t| < \infty$, the matrix series $C(x,t;A)$ defined by (8) is convergent in any compact real interval.
Using now (2), (6) and the fact that $H_1(x,A) = \sqrt{2A}\,x$, we can write
\begin{align*}
A(x,t;A) &= \left(x\sqrt{2A}\right) \sum_{n\ge 0} \frac{H_{2n}(x,A)}{(2n)!}\, t^{2n} - 2 \sum_{n\ge 1} \frac{(2n)\,H_{2n-1}(x,A)}{(2n)!}\, t^{2n} \\
&= H_1(x,A)\, e^{-t^2} \cosh\left(xt\sqrt{2A}\right) - 2t \sum_{n\ge 1} \frac{H_{2n-1}(x,A)}{(2n-1)!}\, t^{2n-1} \\
&= H_1(x,A)\, e^{-t^2} \cosh\left(xt\sqrt{2A}\right) - 2t \sum_{n\ge 0} \frac{H_{2n+1}(x,A)}{(2n+1)!}\, t^{2n+1} \\
&= H_1(x,A)\, e^{-t^2} \cosh\left(xt\sqrt{2A}\right) - 2t\, e^{-t^2} \sinh\left(xt\sqrt{2A}\right) \\
&= e^{-t^2} \left[ H_1(x,A) \cosh\left(xt\sqrt{2A}\right) - 2t \sinh\left(xt\sqrt{2A}\right) \right].
\end{align*}
The proof of (7) is similar:
\begin{align*}
B(x,t;A) &= \left(x\sqrt{2A}\right) \sum_{n\ge 0} \frac{H_{2n+1}(x,A)}{(2n+1)!}\, t^{2n+1} - 2 \sum_{n\ge 0} \frac{(2n+1)\,H_{2n}(x,A)}{(2n+1)!}\, t^{2n+1} \\
&= H_1(x,A)\, e^{-t^2} \sinh\left(xt\sqrt{2A}\right) - 2t \sum_{n\ge 0} \frac{H_{2n}(x,A)}{(2n)!}\, t^{2n} \\
&= H_1(x,A)\, e^{-t^2} \sinh\left(xt\sqrt{2A}\right) - 2t\, e^{-t^2} \cosh\left(xt\sqrt{2A}\right) \\
&= e^{-t^2} \left[ H_1(x,A) \sinh\left(xt\sqrt{2A}\right) - 2t \cosh\left(xt\sqrt{2A}\right) \right].
\end{align*}
Working analogously for (8):
\begin{align*}
C(x,t;A) &= x\sqrt{2A} \sum_{n\ge 0} \frac{H_{2n+2}(x,A)}{(2n+1)!}\, t^{2n+1} - 2 \sum_{n\ge 0} \frac{(2n+2)\,H_{2n+1}(x,A)}{(2n+1)!}\, t^{2n+1} \\
&= H_1(x,A)\, B(x,t;A) - 2 \left( \sum_{n\ge 0} \frac{(2n+1)\,H_{2n+1}(x,A)}{(2n+1)!}\, t^{2n+1} + \sum_{n\ge 0} \frac{H_{2n+1}(x,A)}{(2n+1)!}\, t^{2n+1} \right) \\
&= H_1(x,A)\, B(x,t;A) - 2 \left( \sum_{n\ge 0} \frac{H_{2n+1}(x,A)}{(2n)!}\, t^{2n+1} + e^{-t^2} \sinh\left(xt\sqrt{2A}\right) \right) \\
&= H_1(x,A)\, B(x,t;A) - 2t\, A(x,t;A) - 2\, e^{-t^2} \sinh\left(xt\sqrt{2A}\right).
\end{align*}
Taking into account the values of $A(x,t;A)$ and $B(x,t;A)$, we get
\begin{align*}
C(x,t;A) &= H_1(x,A)\, B(x,t;A) - 2t\, A(x,t;A) - 2\, e^{-t^2} \sinh\left(xt\sqrt{2A}\right) \\
&= e^{-t^2} \left[ \left( H_1(x,A)^2 + (4t^2-2)I \right) \sinh\left(xt\sqrt{2A}\right) - 4t\, H_1(x,A) \cosh\left(xt\sqrt{2A}\right) \right].
\end{align*}
By (1), we have that $H_1(x,A) = \sqrt{2A}\,x$ and $H_2(x,A) = 2x^2 A - 2I$, so we can rewrite the last expression of $C(x,t;A)$ in the form
\[
C(x,t;A) := e^{-t^2} \left[ \left( H_2(x,A) + 4t^2 I \right) \sinh\left(xt\sqrt{2A}\right) - 4t\, H_1(x,A) \cosh\left(xt\sqrt{2A}\right) \right].
\]
Summarizing, the following result has been established:

Theorem 2.1. Let $A \in \mathbb{C}^{r\times r}$ be a positive stable matrix, $x \in \mathbb{R}$, $|t| < +\infty$. Then
\[
\left.
\begin{aligned}
\sum_{n\ge 0} \frac{H_{2n+1}(x,A)}{(2n)!}\, t^{2n} &= e^{-t^2} \left[ H_1(x,A) \cosh\left(xt\sqrt{2A}\right) - 2t \sinh\left(xt\sqrt{2A}\right) \right], \\
\sum_{n\ge 0} \frac{H_{2n+2}(x,A)}{(2n+1)!}\, t^{2n+1} &= e^{-t^2} \left[ H_1(x,A) \sinh\left(xt\sqrt{2A}\right) - 2t \cosh\left(xt\sqrt{2A}\right) \right], \\
\sum_{n\ge 0} \frac{H_{2n+3}(x,A)}{(2n+1)!}\, t^{2n+1} &= e^{-t^2} \left[ \left( H_2(x,A) + 4t^2 I \right) \sinh\left(xt\sqrt{2A}\right) - 4t\, H_1(x,A) \cosh\left(xt\sqrt{2A}\right) \right].
\end{aligned}
\right\} \tag{9}
\]
Having in mind that the Hermite matrix polynomial $H_n(x,A)$ coincides with the Hermite polynomial $H_n(x)$ when taking $r = 1$ and $A = 2$ (see [15] for more details), we get the following corollary:

Corollary 1. Let $\{H_n(x)\}_{n\ge 0}$ be the sequence of Hermite polynomials, $x \in \mathbb{R}$, $|t| < +\infty$. Then
\[
\left.
\begin{aligned}
\sum_{n\ge 0} \frac{H_{2n+1}(x)}{(2n)!}\, t^{2n} &= e^{-t^2} \left[ H_1(x) \cosh(2xt) - 2t \sinh(2xt) \right], \\
\sum_{n\ge 0} \frac{H_{2n+2}(x)}{(2n+1)!}\, t^{2n+1} &= e^{-t^2} \left[ H_1(x) \sinh(2xt) - 2t \cosh(2xt) \right], \\
\sum_{n\ge 0} \frac{H_{2n+3}(x)}{(2n+1)!}\, t^{2n+1} &= e^{-t^2} \left[ \left( H_2(x) + 4t^2 \right) \sinh(2xt) - 4t\, H_1(x) \cosh(2xt) \right].
\end{aligned}
\right\} \tag{10}
\]

Formulas (10) are new in the literature on Hermite polynomials and special functions.
3. Some new Hermite matrix series expansions for the hyperbolic matrix cosine

Let $A \in \mathbb{C}^{r\times r}$ be a positive stable matrix. Then the matrix polynomial $H_1(x,A) = \sqrt{2A}\,x$ is invertible if $x \ne 0$. Substituting $\sinh\left(xt\sqrt{2A}\right)$ given in (5) into the first expression of (9), we obtain a new rational expression for the hyperbolic matrix cosine in terms of Hermite matrix polynomials:
\[
\cosh\left(xt\sqrt{2A}\right) = e^{t^2} \left( \sum_{n\ge 0} \frac{H_{2n+1}(x,A)}{(2n)!} \left( 1 + \frac{2t^2}{2n+1} \right) t^{2n} \right) \left[ H_1(x,A) \right]^{-1}, \quad x \ne 0,\ |t| < +\infty. \tag{11}
\]

Substituting $\sinh\left(xt\sqrt{2A}\right)$ given in (5) into the second expression of (9) and using the three-term matrix recurrence (2), we recover the expression of $\cosh\left(xt\sqrt{2A}\right)$ given in (5).

On the other hand, replacing the expression of $\sinh\left(xt\sqrt{2A}\right)$ given in (5) into the third expression of (9), we get another new rational expression for the hyperbolic matrix cosine in terms of Hermite matrix polynomials:
\[
\cosh\left(xt\sqrt{2A}\right) = \frac{e^{t^2}}{4} \left[ \left( H_2(x,A) + 4t^2 I \right) \star \left( \sum_{n\ge 0} \frac{H_{2n+1}(x,A)}{(2n+1)!}\, t^{2n} \right) - \sum_{n\ge 0} \frac{H_{2n+3}(x,A)}{(2n+1)!}\, t^{2n} \right] \left[ H_1(x,A) \right]^{-1}, \quad x \ne 0,\ |t| < +\infty. \tag{12}
\]

Comparing (12) with (11), we observe that evaluating (12) always requires one extra matrix product, namely the one indicated by the symbol "$\star$". Given the importance of reducing the number of matrix products (see [19-21] for more details), we will focus mainly on the expansion (11).
From (1), it follows that, for $x \ne 0$:
\[
H_{2n+1}(x,A) \left[ H_1(x,A) \right]^{-1} = (2n+1)! \sum_{k=0}^{n} \frac{(-1)^k\, x^{2(n-k)}\, (2A)^{n-k}}{k!\,(2(n-k)+1)!} = \widetilde{H}_{2n+1}(x,A), \tag{13}
\]
where
\[
\widetilde{H}_n(x,A) = n! \sum_{k=0}^{\lfloor n/2 \rfloor} \frac{(-1)^k \left(\sqrt{2A}\right)^{n-2k-1}}{k!\,(n-2k)!}\, x^{n-2k-1}, \tag{14}
\]
so the right-hand side of (13) is still defined in the case where the matrix $A$ is singular. In this way, we can rewrite the relation (11) in terms of the matrix polynomial $\widetilde{H}_{2n+1}(x,A)$ as
\[
\cosh\left(xt\sqrt{2A}\right) = e^{t^2} \left( \sum_{n\ge 0} \frac{\widetilde{H}_{2n+1}(x,A)}{(2n)!} \left( 1 + \frac{2t^2}{2n+1} \right) t^{2n} \right), \quad x \in \mathbb{R},\ |t| < +\infty. \tag{15}
\]

Replacing the matrix $A$ by $A^2/2$ in (15), we can avoid matrix square roots, and taking $x = \lambda$, $\lambda \ne 0$, $t = 1/\lambda$, we finally obtain the expression
\[
\cosh(A) = e^{1/\lambda^2} \left( \sum_{n\ge 0} \frac{\widetilde{H}_{2n+1}\left(\lambda, \frac{1}{2}A^2\right)}{(2n)!\,\lambda^{2n}} \left( 1 + \frac{2}{(2n+1)\lambda^2} \right) \right), \quad 0 < \lambda < +\infty. \tag{16}
\]
4. Numerical approximations

Truncating the series (16) at order $m$, we obtain the approximation $CH_m(\lambda,A) \approx \cosh(A)$ defined by
\[
CH_m(\lambda,A) = e^{1/\lambda^2} \left( \sum_{n=0}^{m} \frac{\widetilde{H}_{2n+1}\left(\lambda, \frac{1}{2}A^2\right)}{(2n)!\,\lambda^{2n}} \left( 1 + \frac{2}{(2n+1)\lambda^2} \right) \right), \quad 0 < \lambda < +\infty. \tag{17}
\]
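As a sanity check, (17) can be evaluated naively by summing the explicit coefficients of $\widetilde{H}_{2n+1}\left(\lambda, \frac{1}{2}A^2\right)$ in powers of $B = A^2$ (this coefficient form is written out just below). The following MATLAB sketch, with a function name of our own, is for illustration only; the actual algorithm evaluates the same polynomial far more cheaply with the Paterson-Stockmeyer scheme (see Algorithm 1 below):

% Naive reference evaluation of CH_m(lambda, A) from (17), summing the
% coefficients of Htilde_{2n+1}(lambda, A^2/2) in powers of B = A^2.
function C = chm_naive(A, lambda, m)
    r = size(A,1);
    B = A*A;
    C = zeros(r);
    for n = 0:m
        % Htilde_{2n+1}(lambda, A^2/2) = (2n+1)! * sum_{k=0}^{n}
        %   (-1)^k lambda^(2(n-k)) B^(n-k) / (k! (2(n-k)+1)!)
        H = zeros(r);
        for k = 0:n
            j = n - k;
            H = H + (-1)^k * lambda^(2*j) ...
                  / (factorial(k)*factorial(2*j+1)) * B^j;
        end
        H = factorial(2*n+1) * H;
        C = C + H / (factorial(2*n)*lambda^(2*n)) ...
              * (1 + 2/((2*n+1)*lambda^2));
    end
    C = exp(1/lambda^2) * C;
end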
Now, from the definition (13), it follows that $\widetilde{H}_{2n+1}\left(x, \frac{1}{2}A^2\right)$ is given by
\[
\widetilde{H}_{2n+1}\left(x, \frac{1}{2}A^2\right) = (2n+1)! \sum_{k=0}^{n} \frac{(-1)^k\, x^{2(n-k)}\, \left(A^2\right)^{n-k}}{k!\,(2(n-k)+1)!}
\]
and, therefore, considering the 2-norm, we find that
\[
\begin{aligned}
\left\| \widetilde{H}_{2n+1}\left(x, \frac{1}{2}A^2\right) \right\|_2 &\le (2n+1)! \sum_{k=0}^{n} \frac{|x|^{2(n-k)}\, \left\|A^2\right\|_2^{n-k}}{k!\,(2(n-k)+1)!} \\
&= (2n+1)! \sum_{k=0}^{n} \frac{|x|^{2(n-k)} \left( \left\|A^2\right\|_2^{1/2} \right)^{2(n-k)}}{k!\,(2(n-k)+1)!} \\
&= (2n+1)! \sum_{k=0}^{n} \frac{\left( |x|\, \left\|A^2\right\|_2^{1/2} \right)^{2(n-k)}}{k!\,(2(n-k)+1)!}.
\end{aligned} \tag{18}
\]
If $A(k,n)$ is a matrix in $\mathbb{C}^{r\times r}$, for $n \ge 0$, $k \ge 0$, then from [22, p. 57] we get that
\[
\sum_{n\ge 0} \sum_{k\ge 0} A(k,n) = \sum_{n\ge 0} \sum_{k=0}^{n} A(k, n-k). \tag{19}
\]
Indeed, thanks to (19), it can be asserted that
\[
\begin{aligned}
\frac{e \sinh\left( |x|\, \left\|A^2\right\|_2^{1/2} \right)}{|x|\, \left\|A^2\right\|_2^{1/2}}
&= \sum_{n\ge 0} \frac{\left( |x|\, \left\|A^2\right\|_2^{1/2} \right)^{2n}}{(2n+1)!} \sum_{k\ge 0} \frac{1}{k!} \\
&= \sum_{n\ge 0} \sum_{k\ge 0} \underbrace{ \frac{\left( |x|\, \left\|A^2\right\|_2^{1/2} \right)^{2n}}{k!\,(2n+1)!} }_{=A(k,n)}
= \sum_{n\ge 0} \sum_{k=0}^{n} \frac{\left( |x|\, \left\|A^2\right\|_2^{1/2} \right)^{2(n-k)}}{k!\,(2(n-k)+1)!},
\end{aligned}
\]
from which it is deduced that
\[
\sum_{k=0}^{n} \frac{\left( |x|\, \left\|A^2\right\|_2^{1/2} \right)^{2(n-k)}}{k!\,(2(n-k)+1)!} \le \frac{e \sinh\left( |x|\, \left\|A^2\right\|_2^{1/2} \right)}{|x|\, \left\|A^2\right\|_2^{1/2}}. \tag{20}
\]
If expression (20) is multiplied by $(2n+1)!$ and the inequality (18) is applied, we finally conclude that, for $x \ne 0$,
\[
\left\| \widetilde{H}_{2n+1}\left(x, \frac{1}{2}A^2\right) \right\|_2 \le (2n+1)!\, \frac{e \sinh\left( |x|\, \left\|A^2\right\|_2^{1/2} \right)}{|x|\, \left\|A^2\right\|_2^{1/2}}. \tag{21}
\]
Now, the following expression for the approximation error can be obtained:
\[
\begin{aligned}
\left\| \cosh(A) - CH_m(\lambda,A) \right\|_2 &\le e^{1/\lambda^2} \sum_{n\ge m+1} \frac{\left\| \widetilde{H}_{2n+1}\left(\lambda, \frac{1}{2}A^2\right) \right\|_2}{(2n)!\,\lambda^{2n}} \left( 1 + \frac{2}{(2n+1)\lambda^2} \right) \\
&\le e^{1+1/\lambda^2}\, \frac{\sinh\left( \lambda \left\|A^2\right\|_2^{1/2} \right)}{\lambda \left\|A^2\right\|_2^{1/2}} \sum_{n\ge m+1} \frac{2n+1}{\lambda^{2n}} \left( 1 + \frac{2}{(2n+1)\lambda^2} \right).
\end{aligned} \tag{22}
\]
For $\lambda > 1$, it follows that $\frac{2}{(2n+1)\lambda^2} < 1$, and
\[
\sum_{n\ge m+1} \frac{2n+1}{\lambda^{2n}} \left( 1 + \frac{2}{(2n+1)\lambda^2} \right) \le 2 \sum_{n\ge m+1} \frac{2n+1}{\lambda^{2n}} = \frac{4 + (4m+6)(\lambda^2-1)}{\lambda^{2m}\,(\lambda^2-1)^2},
\]
function [Lopt,Zopt] = compute_Lz_cosh(m,Lmin,Lmax,incL,Zmin,Zmax,incZ)
% Search for the pair (lambda, z), with maximum z, such that the forward
% error bound (23) is below the unit roundoff u = eps/2.
Lopt = NaN;   % returned unchanged if no valid pair is found
Zopt = 0;
for L = Lmin:incL:Lmax
    for z = Zmin:incZ:Zmax
        if z > Zopt
            % Right-hand side of the bound (23), with ||A^2||_2 = z
            f = (exp(1+1/L^2)*sinh(L*sqrt(z))*(4+(4*m+6)*(L^2-1))) / ...
                (sqrt(z)*L^(2*m+1)*(L^2-1)^2);
            if f < eps/2
                Zopt = z;
                Lopt = L;
            end
        end
    end
end
end

Figure 1: MATLAB code for computing the optimal values of λ and z.
thus, from (22) we finally deduce that:
\[
\left\| \cosh(A) - CH_m(\lambda,A) \right\|_2 \le e^{1+1/\lambda^2}\, \frac{\sinh\left( \lambda \left\|A^2\right\|_2^{1/2} \right) \left( 4 + (4m+6)(\lambda^2-1) \right)}{\left\|A^2\right\|_2^{1/2}\, \lambda^{2m+1}\, (\lambda^2-1)^2}. \tag{23}
\]
From expression (23), we can derive the optimal values $(\lambda_m, z_m)$ such that
\[
z_m = \max\left\{ z = \left\|A^2\right\|_2 \;:\; e^{1+1/\lambda^2}\, \frac{\sinh\left( \lambda z^{1/2} \right) \left( 4 + (4m+6)(\lambda^2-1) \right)}{z^{1/2}\, \lambda^{2m+1}\, (\lambda^2-1)^2} < u \right\},
\]
where $u$ is the unit roundoff in IEEE double precision arithmetic ($u = 2^{-53}$). The optimal values of $z$ and $\lambda$, for each $m$, have been obtained with the MATLAB code shown in Figure 1. Given the order $m$ of the approximation, this code determines all the pairs $(L, z)$ for which the right-hand side of (23) is lower than $u$, varying $L$ between a minimum value Lmin and a maximum value Lmax in steps equal to incL. In the same way, the code increments the variable $z$ from the minimum value Zmin to the maximum value Zmax by incZ on each iteration. This allows reducing the search time by choosing these parameters appropriately according to the value of $m$. The precision to be achieved is determined by the values of incL and incZ. Once all the possible pairs have been calculated, the chosen pair is the one with the maximum value of $z$. The results are given in Table 1.
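For instance, a call with hypothetical search ranges bracketing the order $m = 9$ entry of Table 1 could look as follows (the step sizes trade search time for precision):

% Example call (illustrative ranges): coarse search of (lambda, z) for m = 9.
[Lopt, Zopt] = compute_Lz_cosh(9, 1.5, 50, 0.1, 0, 5, 0.001);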
The approximation $CH_m$ (see (17)) can be expressed as a polynomial with only even powers of the matrix $A$:
\[
CH_m(\lambda,A) = \sum_{i=0}^{m} p_i^{(\lambda)} A^{2i} = \sum_{i=0}^{m} p_i^{(\lambda)} B^i \equiv P_m^{(\lambda)}(B), \tag{24}
\]
Table 1: Values of $z_{m_k}$, $\lambda_{m_k}$ and $\Theta_{m_k}$ for the matrix function $\cosh(A)$.

m_k   z_{m_k}                   λ_{m_k}        Θ_{m_k}
2     1.085438916 × 10^{-5}     3645.569817    1.8509243149007247 × 10^{-6}
4     7.072119354 × 10^{-2}     130.7978189    3.810252709308867 × 10^{-3}
6     1.024769681 × 10^{-1}     31.00030100    8.9416635239106868 × 10^{-2}
9     1.232994877952250         17.607040100   1.1838963351971854 × 10^{0}
12    4.840136411698479         10.200005000   5.0162962795121144 × 10^{0}
16    16.851353484604754        7.9080200400   1.7588311877511131 × 10^{1}
where $B = A^2$. We present Algorithm 1, which computes the hyperbolic cosine of a matrix $A$ by means of the Paterson-Stockmeyer method [23]. The computational cost of this algorithm is $k+s$ matrix products, i.e., $2(k+s)n^3$ flops, where $k$ represents the position of the chosen order in the vector of values $m_k$.
Algorithm 1 Scaling and recovering algorithm for computing $C = \cosh(A)$, where $A \in \mathbb{C}^{r\times r}$, with $m_M = 16$ the maximum approximation order allowed.
1: $B = A^2$.
2: Choose the optimal order $m_k \in \{2, 4, 6, 9, 12, 16\}$, $m_k \le m_M$, and the scaling parameter $s \in \mathbb{N} \cup \{0\}$.
3: $B = 4^{-s} B$.   ▷ Scaling matrix $B$
4: Choose the corresponding $\lambda_{m_k}$ from Table 1.
5: Compute $C = P_{m_k}^{(\lambda_{m_k})}(B)$ by the Paterson-Stockmeyer method (see [24, Section 2]).
6: for $i = 1{:}s$ do   ▷ Recovering the approximation of $\cosh(A)$
7:   $C = 2C^2 - I$   ▷ Double angle formula of $\cosh(A)$
8: end for
The basic steps of this algorithm are the choice of $m_k$ and $s$ (step 2), the computation of $P_{m_k}^{(\lambda_{m_k})}(B)$ (a complete study of how to compute it can be found in [24, Section 2]) (step 5), and the recovering phase (steps 6-7).
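The following MATLAB sketch may clarify the structure of steps 3 and 5-8. Here p is the coefficient vector of $P_m^{(\lambda)}$ from (24) and s the scaling parameter, both assumed to have been chosen as described next; the block size used below is the common choice $\tau \approx \sqrt{m}$, not necessarily the exact variant of [24]:

% Sketch: evaluate P(B) = sum_{i=0}^{m} p(i+1)*B^i by Paterson-Stockmeyer
% and recover cosh(A) with the double angle formula.
function C = ps_cosh_recover(B, p, s)
    r = size(B,1);  m = numel(p) - 1;
    B = 4^(-s) * B;                   % step 3: scale B = A^2 by 4^(-s)
    tau = ceil(sqrt(m+1));            % Paterson-Stockmeyer block size
    P = cell(tau,1);  P{1} = eye(r);  % P{j+1} = B^j, j = 0..tau-1
    for j = 2:tau, P{j} = P{j-1}*B; end
    Btau = P{tau}*B;                  % B^tau
    C = zeros(r);
    for blk = floor(m/tau):-1:0       % Horner scheme over blocks of width tau
        Q = zeros(r);
        for j = 0:tau-1
            i = blk*tau + j;
            if i <= m, Q = Q + p(i+1)*P{j+1}; end
        end
        C = C*Btau + Q;
    end
    for i = 1:s                       % steps 6-8: recovering
        C = 2*C*C - eye(r);           % cosh(2X) = 2*cosh(X)^2 - I
    end
end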
Next, let us show how to compute the values of $m_k$ and $s$. If $\cosh(A)$ is computed from the Taylor series, then the absolute forward error of the Hermite approximation of $\cosh(A)$, denoted by $E_f$, can be computed as
\[
E_f = \left\| \cosh(A) - P_{m_k}^{(\lambda_k)}(B) \right\| = \left\| \sum_{i\ge \hat{m}_k} f_i B^i \right\|,
\]
where $\hat{m}_k \ge m_k$. If $f_{\tilde{m}_k}$ is the first coefficient of the above series greater than $u$, then we obtain the following approximation:
\[
E_f \cong \left\| \sum_{i\ge \tilde{m}_k} f_i B^i \right\|.
\]
Table 2: Values of $\hat{m}_k$, $\tilde{m}_k$, and $f_{\max}(m_k)$.

             m_1 = 2   m_2 = 4   m_3 = 6          m_4 = 9          m_5 = 12         m_6 = 16
m̂_k          1         2         2                4                6                9
m̃_k          1         2         3                10               13               17
f_max(m_k)   0         0         1.9 × 10^{-17}   5.3 × 10^{-19}   3.1 × 10^{-26}   3.4 × 10^{-39}

Table 2 shows the values $\hat{m}_k$, $\tilde{m}_k$ and $f_{\max}(m_k) = \max\{f_i : \hat{m}_k \le i \le \tilde{m}_k - 1\}$, for each $m_k \in \{2, 4, 6, 9, 12, 16\}$.
The scaling factor $s$ and the order of the Hermite approximation $m_k$ are obtained by applying the following theorem:

Theorem 4.1 ([25]). Let $h_l(x) = \sum_{i\ge l} p_i x^i$ be a power series with radius of convergence $w$, let $\tilde{h}_l(x) = \sum_{i\ge l} |p_i| x^i$, $B \in \mathbb{C}^{n\times n}$ with $\rho(B) < w$, $l \in \mathbb{N}$ and $t \in \mathbb{N}$ with $1 \le t \le l$. If $t_0$ is the multiple of $t$ such that $l \le t_0 \le l+t-1$ and
\[
\beta_t = \max\left\{ d_j^{1/j} : j = t,\, l,\, l+1, \ldots, t_0 - 1,\, t_0 + 1,\, t_0 + 2, \ldots, l+t-1 \right\},
\]
where $d_j$ is an upper bound for $\|B^j\|$, $d_j \ge \|B^j\|$, then
\[
\|h_l(B)\| \le \tilde{h}_l(\beta_t).
\]
If we apply Theorem 4.1 to the series $f_{\tilde{m}_k}(x) = \sum_{i\ge \tilde{m}_k} f_i x^i$ and $\tilde{f}_{\tilde{m}_k}(x) = \sum_{i\ge \tilde{m}_k} |f_i| x^i$, then
\[
E_f = \left\| f_{\tilde{m}_k}(B) \right\| \le \tilde{f}_{\tilde{m}_k}(\beta_t),
\]
for every $t$, $1 \le t \le \tilde{m}_k$. Let $\Theta_{m_k}$ be
\[
\Theta_{m_k} = \max\left\{ \theta > 0 : \tilde{f}_{\tilde{m}_k}(\theta) = \sum_{i\ge \tilde{m}_k} |f_i|\, \theta^i \le u \right\}; \tag{25}
\]
then, using the MATLAB (R2017b) Symbolic Math Toolbox with 200 series terms and a zero finder, we obtained the values $\Theta_{m_k}$ that verify (25) (see Table 1).
The optimal values $m_k$ and $s$ are obtained from the values $\beta_t$ of Theorem 4.1 and from the values $\Theta_{m_k}$ of Table 1. A complete study of this question was developed by the authors in [18, 24]. Next, we reproduce that study.

Let $\beta_{\min}^{(\tilde{m}_k)} = \min_{1\le t\le \tilde{m}_k} \{\beta_t\}$. If there exists a value $m_k \le 16$ such that $\beta_{\min}^{(\tilde{m}_k)} \le \Theta_{m_k}$, then the forward error $E_f$ is lower than $u$. In this case, we choose the lowest order $m_k$ such that $\beta_{\min}^{(\tilde{m}_k)} \le \Theta_{m_k}$, and the scaling factor is $s = 0$. Otherwise, we choose the Hermite approximation of order 12 or 16 providing the lower cost, with
\[
s = \max\left( 0,\; \left\lceil \frac{1}{2} \log_2\left( \frac{\beta_{\min}^{(\tilde{m}_k)}}{\Theta_{m_k}} \right) \right\rceil \right), \quad m_k = 12 \text{ or } 16.
\]
For computing $\beta_{\min}^{(\tilde{m}_k)}$, we have used the following approximation:
\[
\beta_{\min}^{(\tilde{m}_k)} \approx \max\left\{ d_{\tilde{m}_k}^{1/\tilde{m}_k},\; d_{\tilde{m}_k+1}^{1/(\tilde{m}_k+1)} \right\},
\]
where $d_{\tilde{m}_k}$ and $d_{\tilde{m}_k+1}$ are bounds of $\left\| B^{\tilde{m}_k} \right\|$ and $\left\| B^{\tilde{m}_k+1} \right\|$, respectively (see (16) from [26]). The bounds $d_l$, $l = \tilde{m}_k, \tilde{m}_k+1$, can be computed using products of norms of matrix powers previously calculated. For example, for $m_k = 6$ the powers $B^2$ and $B^3$ must be obtained, hence $\beta_{\min}^{(3)}$ ($\tilde{m}_k = 3$) can be computed as follows:
\[
\beta_{\min}^{(3)} = \max\left\{ \left\| B^3 \right\|^{1/3},\; \min\left\{ \left\| B^3 \right\| \|B\|,\; \left\| B^2 \right\|^2 \right\}^{1/4} \right\}.
\]
The algorithm for computing the values $m_k$ and $s$ is analogous to Algorithm 2 from [24].
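As an illustration, for $m_k = 6$ ($\tilde{m}_k = 3$) the previous computation could read as follows in MATLAB, where B, B2 = B^2, B3 = B^3 and Theta_mk are assumed to be already available, and the 1-norm is used merely as an example of a consistent norm:

% Estimate beta_min^(3) from norms of already-computed powers of B, then
% derive the scaling parameter s so that 4^(-s)*beta_min <= Theta_mk.
d3 = norm(B3,1);                               % bound for ||B^3||
d4 = min(norm(B3,1)*norm(B,1), norm(B2,1)^2);  % bound for ||B^4||
beta_min = max(d3^(1/3), d4^(1/4));
s = max(0, ceil(0.5*log2(beta_min/Theta_mk)));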
5. Experiments
In this section, we show the numerical accuracy and the performance of the proposed algorithm to compute the hyperbolic cosine function. By means of a CUDA implementation, we also show its performance on an NVIDIA GPU.
5.1. Numerical experiments

Our MATLAB implementation, named coshmtayher, has been developed by modifying the coshher MATLAB code given in [7], replacing the original Hermite approximation by the new Hermite matrix polynomial approximation developed in this paper and derived from (16). We have compared the new MATLAB function, coshmtayher, with the coshher and funmcosh functions, defined as:
• coshmtayher: Novel code based on the new developments of Hermite matrix polynomials (16).

• coshher: Code based on the Hermite series for the matrix hyperbolic cosine [7].

• funmcosh: The funm MATLAB function to compute matrix functions, such as the matrix hyperbolic cosine.
5.2. Experiments description

The tests have been carried out using MATLAB (R2017b) running on an Apple Macintosh iMac 27" (iMac Retina 5K 27", late 2015) with a quad-core Intel i7-6700K 4 GHz processor and 16 GB of RAM. The following tests were made using different matrices:
a) Test 1: One hundred diagonalizable 128 × 128 randomly generated real matrices with 1-norm varying from 2.32 to 220.04. These matrices have the form $A = VDV^T$, where $D$ is a diagonal matrix with real eigenvalues and $V$ is an orthogonal matrix obtained as $V = H/\sqrt{128}$, where $H$ is the Hadamard matrix. The "exact" matrix hyperbolic cosine was computed as $\cosh(A) = V\cosh(D)V^T$ (see [1, p. 10]), by using the MATLAB Symbolic Math Toolbox with 128 decimal digit arithmetic in all the computations.

b) Test 2: One hundred non-diagonalizable 128 × 128 random real matrices whose 1-norms range from 6.52 to 249.61. These matrices have the form $A = VJV^T$, where $J$ is a Jordan matrix with real eigenvalues with algebraic multiplicity varying between 1 and 4, and $V$ is an orthogonal matrix obtained as $V = H/\sqrt{128}$, where $H$ is the Hadamard matrix. The "exact" matrix hyperbolic cosine was computed as $\cosh(A) = V\cosh(J)V^T$.

c) Test 3: Thirteen test matrices from the Eigtool MATLAB package [27] with size 128 × 128 and thirty-nine matrices from the matrix function literature with dimensions lower than or equal to 128, from the function matrix of the Matrix Computation Toolbox [28]. These matrices have been scaled so that their 1-norm does not exceed 1024. The "exact" matrix hyperbolic cosine was obtained by computing first the eigenvalue decomposition of the matrix $A$, $A = VDV^{-1}$, by using the MATLAB function eig, and then computing $\cosh(A) = V\cosh(D)V^{-1}$, by means of MATLAB's Symbolic Math Toolbox with 128 decimal digit arithmetic in all the computations.
Tables 3 and 4 show, respectively, the computational costs and the accuracy of the functions under comparison, i.e. coshmtayher, coshher, and funmcosh, in the three tests described. The accuracy is tested by computing the normwise relative error (Er) as
\[
\mathrm{Er} = \frac{\left\| \cosh(A) - \widetilde{\cosh}(A) \right\|_1}{\left\| \cosh(A) \right\|_1},
\]
where $\widetilde{\cosh}(A)$ is the computed solution and $\cosh(A)$ is the exact solution. Table 3 shows the computational costs in terms of the number of matrix products (P(·)) of each code, since the cost of the rest of the operations is negligible compared to matrix multiplications for large enough matrices. The funmcosh routine performs no matrix products; its main cost is a real Schur form reduction, with a computational cost of $28n^3$ flops [29] for an $n \times n$ matrix. This can be expressed as a minimum of 14 matrix products, having in mind that the cost of a matrix multiplication is $2n^3$ flops [1, p. 336]. Table 4, on the other hand, shows the percentage of cases in which the relative errors of coshmtayher are lower than, greater than, or equal to the relative errors of coshher and funmcosh.
Table 3: Matrix products (P) for Tests 1, 2, and 3 using the coshmtayher, coshher and funmcosh MATLAB functions.

         P(coshmtayher)   P(coshher)   P(funmcosh)
Test 1   971              973          1400
Test 2   976              988          1400
Test 3   310              317          560
Table 4: Relative error comparison between coshmtayher vs coshher (rows 1 to 3) and coshmtayher vs funmcosh (rows 4 to 6) for Test 1, Test 2 and Test 3. The table entries show the percentage of cases in which the relative errors of coshmtayher (new Hermite) are lower than, greater than, or equal to the relative errors of coshher and funmcosh.

                                  Test 1   Test 2   Test 3
Er(coshmtayher) < Er(coshher)     76%      79%      65%
Er(coshmtayher) > Er(coshher)     21%      20%      25%
Er(coshmtayher) = Er(coshher)     3%       1%       10%
Er(coshmtayher) < Er(funmcosh)    100%     100%     97.5%
Er(coshmtayher) > Er(funmcosh)    0%       0%       2.5%
Er(coshmtayher) = Er(funmcosh)    0%       0%       0%
We have plotted in Figures 2, 3, and 4, for the three tests respectively, the normwise relative errors (a), the performance profiles (b), and the ratios of relative errors (c), to show whether these ratios are significant:
Er(coshher)/Er(coshmtayher), Er(funmcosh)/Er(coshmtayher),
as well as the ratios of matrix products (d):
P(coshher)/P(coshmtayher), P(funmcosh)/P(coshmtayher).
In the performance profile, the $\alpha$ coordinate varies between 1 and 5 in steps equal to 0.1, and the $p$ coordinate is the probability that the considered algorithm has a relative error lower than or equal to $\alpha$ times the smallest error over all the methods. The ratios of relative errors are presented in decreasing order with respect to Er(coshmtayher)/Er(coshher). The solid lines in Figures 2a, 3a and 4a represent the function $k_{\cosh}\, u$, where $k_{\cosh}$ is the condition number of the matrix hyperbolic cosine function [1, Chapter 3] and $u = 2^{-53}$ is the unit roundoff in double precision floating-point arithmetic. Our conclusions are:
• Regarding the normwise relative errors shown in Figures 2a, 3a and 4a, the coshmtayher and coshher functions have, in general, very good numerical stability. This can be appreciated from the distance between each matrix normwise relative error and the cond·u line. In Figures 2a and 3a, the numerical stability is better because the relative errors lie below the cond·u line.
[Figure 2: Experimental results for Test 1. Panels: (a) Normwise relative errors; (b) Performance profile; (c) Ratio of relative errors; (d) Ratio of matrix products.]
• The performance profiles for the first two tests (Figures 2b, 3b) show that the accuracy of the coshmtayher and coshher methods is similar as $\alpha$ increases. Moreover, both of them have much better accuracy than the funmcosh method. For the third test, Figure 4b shows that the accuracy of the coshmtayher function is considerably better than that of the coshher function.
• Quantitatively, Table 4 indicates that the coshmtayher code offers a relative error lower than the coshher function in 76%, 79% and 65% of the cases, respectively, for Tests 1, 2 and 3. Similarly, the coshmtayher method improves on the funmcosh function in terms of the relative error incurred for 100% of the matrices that comprise Tests 1 and 2, and for 97.5% of the ones belonging to Test 3. These numerical values are corroborated by the error ratios depicted in Figures 2c, 3c and 4c. As we can notice, this ratio is greater than one for a percentage of matrices that coincides with the values exposed in Table 4, which indicates that the coshmtayher code is the most accurate and reliable one.
[Figure 3: Experimental results for Test 2. Panels: (a) Normwise relative errors; (b) Performance profile; (c) Ratio of relative errors; (d) Ratio of matrix products.]
Whereas this error ratio takes values not very far from unity between the coshmtayher and coshher functions for the matrices of the first two tests, it reaches values much more distant from unity for many matrices that are part of Test 3. The value of this factor between the coshmtayher and funmcosh codes deserves special mention, since it takes very high values, particularly for the matrices of Test 3. In more detail:

Test 1 (Figure 2c):
Er(coshher) ∈ [0.79 Er(coshmtayher), 3.68 Er(coshmtayher)],
Er(funmcosh) ∈ [22.49 Er(coshmtayher), 523.3 Er(coshmtayher)].

Test 2 (Figure 3c):
Er(coshher) ∈ [0.80 Er(coshmtayher), 2.10 Er(coshmtayher)],
Er(funmcosh) ∈ [19.84 Er(coshmtayher), 238.6 Er(coshmtayher)].

Test 3 (Figure 4c):
Er(coshher) ∈ [0.68 Er(coshmtayher), 199.0 Er(coshmtayher)],
Er(funmcosh) ∈ [0.48 Er(coshmtayher), 6.65e+15 Er(coshmtayher)].
[Figure 4: Experimental results for Test 3. Panels: (a) Normwise relative errors; (b) Performance profile; (c) Ratio of relative errors; (d) Ratio of matrix products.]
• As shown in Table 3, the coshmtayher function has a computational cost significantly lower than the funmcosh code and slightly lower than the coshher function. As expected, the numerical tests also confirm this result:

Test 1 (Figure 2d):
P(coshher) ∈ [0.86 P(coshmtayher), 1.20 P(coshmtayher)],
P(funmcosh) ∈ [1.27 P(coshmtayher), 2.80 P(coshmtayher)].

Test 2 (Figure 3d):
P(coshher) ∈ [0.86 P(coshmtayher), 1.20 P(coshmtayher)],
P(funmcosh) ∈ [1.27 P(coshmtayher), 2.80 P(coshmtayher)].

Test 3 (Figure 4d):
P(coshher) ∈ [0.85 P(coshmtayher), 1.29 P(coshmtayher)],
P(funmcosh) ∈ [1.00 P(coshmtayher), 2.80 P(coshmtayher)].
[Figure 5: Execution times (sec.) to compute the matrix hyperbolic cosine on CPU and GPU by means of the coshmtayher and coshher codes on randomly generated large matrices, for problem dimensions from 1000 to 9000.]
5.3. Results on GPU

We have implemented an "accelerated" version that allows our algorithm for computing the matrix hyperbolic cosine to be executed on NVIDIA GPUs. Current GPUs are computational devices that boost the performance of data-parallel applications, i.e. applications that operate over many independent data. This is the case of matrix multiplication, which is a highly optimized operation for GPUs in its current implementation included in the CUBLAS package [30]. Our GPU algorithms are all based on polynomial evaluations which, in turn, result in an intensive use of matrix products. The basic MATLAB algorithm is used in this case, with the most costly operations (those based on matrix multiplication) offloaded to the GPU through the CUDA language by means of a MEX file.
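The MEX/CUBLAS route is what this work implements. As a rough illustration of the same idea in stock MATLAB (this requires the Parallel Computing Toolbox and is not the MEX code used here), matrix products can also be routed to the GPU with gpuArray:

% Illustration: offload a matrix product to the GPU with gpuArray,
% which also dispatches the multiplication to CUBLAS under the hood.
A  = rand(4000);        % large random test matrix
Ag = gpuArray(A);       % move the data to the GPU once
Bg = Ag*Ag;             % matrix product executed on the GPU
B  = gather(Bg);        % bring the result back to host memory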
We have carried out the experiments for this subsection on a computer equipped with two Intel Xeon E5-2698 v4 @ 2.20 GHz processors featuring 20 cores each. To obtain the algorithm performance on GPU, we used one NVIDIA Tesla P100-SXM2 (Pascal architecture) attached to the PCI bus of this workstation. This GPU features 3584 CUDA cores and 16 GB of memory. Figure 5 shows the reduction in execution time when we use a GPU to accelerate the computations. The figure also indicates that, for large randomly generated matrices, the former version of the matrix hyperbolic cosine (coshher), presented in [7], and the new algorithm (coshmtayher) show similar performance on both the CPU and the GPU subsystems. Experimental results with the funmcosh function are not shown in the figure, since its times are very large in comparison. For instance, for a matrix A of order n = 2000, the execution time to obtain cosh(A) is 83 seconds with funmcosh, whilst it is 0.71 seconds and 0.73 seconds with coshmtayher and coshher, respectively.
6. Conclusions

A new polynomial Hermite matrix algorithm has been developed in this work for computing the matrix hyperbolic cosine. We have implemented a MATLAB routine that is also capable of using a GPU present in the system. This new algorithm has been compared with other MATLAB implementations and, in the light of the tests carried out, we have verified that the new algorithm behaves in a numerically stable manner, showing very good results. One of the main conclusions is that the MATLAB implementations based on the Hermite series (coshher and coshmtayher) have turned out to be much more accurate and efficient than the one built on the funm MATLAB function (funmcosh). In addition, the new implementation based on the Hermite series (coshmtayher), proposed here, offers much better numerical accuracy than the coshher algorithm, which is also based on the Hermite series, with a slightly lower computational cost in terms of matrix products.
Acknowledgements

This work has been partially supported by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF) under grant TIN2017-89314-P, and by the Programa de Apoyo a la Investigación y Desarrollo 2018 of the Universitat Politècnica de València (PAID-06-18) under grant SP20180016.
References

[1] N. J. Higham, Functions of Matrices: Theory and Computation, SIAM, Philadelphia, PA, USA, 2008.
[2] L. Jódar, E. Navarro, A. Posso, M. Casabán, Constructive solution of strongly coupled continuous hyperbolic mixed problems, Applied Numerical Mathematics 47 (3-4) (2003) 477-492.
[3] E. Estrada, D. J. Higham, N. Hatano, Communicability and multipartite structures in complex networks at negative absolute temperatures, Physical Review E 78 (2) (2008) 026102.
[4] E. Estrada, J. A. Rodríguez-Velázquez, Spectral measures of bipartivity in complex networks, Physical Review E 72 (4) (2005) 046105.
[5] E. Estrada, J. Gómez-Gardeñes, Network bipartivity and the transportation efficiency of European passenger airlines, Physica D: Nonlinear Phenomena 323 (2016) 57-63.
[6] J. Kunegis, G. Gröner, T. Gottron, Online dating recommender systems: The split-complex number approach, in: Proceedings of the 4th ACM RecSys Workshop on Recommender Systems and the Social Web, RSWeb '12, Association for Computing Machinery, 2012, pp. 37-44.
[7] E. Defez, J. Sastre, J. Ibáñez, J. Peinado, Solving engineering models using hyperbolic matrix functions, Applied Mathematical Modelling 40 (4) (2016) 2837-2844.
[8] N. J. Higham, P. Kandolf, Computing the action of trigonometric and hyperbolic matrix functions, SIAM Journal on Scientific Computing 39 (2) (2017) A613-A627.
[9] A. H. Al-Mohy, A truncated Taylor series algorithm for computing the action of trigonometric and hyperbolic matrix functions, SIAM Journal on Scientific Computing 40 (3) (2018) A1696-A1713.
[10] N. Dunford, J. T. Schwartz, Linear Operators, Part I: General Theory, John Wiley & Sons, Inc., 1988.
[11] G. H. Golub, C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore, Maryland, USA, 2013.
[12] G. Dattoli, C. Cesarano, On a new family of Hermite polynomials associated to parabolic cylinder functions, Applied Mathematics and Computation 141 (1) (2003) 143-149.
[13] A. Yari, Numerical solution for fractional optimal control problems by Hermite polynomials, Journal of Vibration and Control 27 (5-6) (2021) 698-716.
[14] D. Masoero, P. Roffelsen, Roots of generalised Hermite polynomials when both parameters are large, Nonlinearity 34 (3) (2021) 1663-1732.
[15] L. Jódar, R. Company, Hermite matrix polynomials and second order matrix differential equations, Approximation Theory and its Applications 12 (2) (1996) 20-30.
[16] E. Defez, A. Hervás, L. Jódar, A. Law, Bounding Hermite matrix polynomials, Mathematical and Computer Modelling 40 (1) (2004) 117-125.
[17] E. Defez, L. Jódar, Some applications of the Hermite matrix polynomials series expansions, Journal of Computational and Applied Mathematics 99 (1) (1998) 105-117.
[18] E. Defez, J. Ibáñez, J. Peinado, J. Sastre, P. Alonso-Jordá, An efficient and accurate algorithm for computing the matrix cosine based on new Hermite approximations, Journal of Computational and Applied Mathematics 348 (2019) 1-13.
[19] J. Sastre, J. Ibáñez, E. Defez, P. Ruiz, New scaling-squaring Taylor algorithms for computing the matrix exponential, SIAM Journal on Scientific Computing 37 (1) (2015) A439-A455.
[20] P. Alonso, J. Peinado, J. Ibáñez, J. Sastre, E. Defez, Computing matrix trigonometric functions with GPUs through MATLAB, The Journal of Supercomputing (2018) 1-14.
[21] J. Sastre, Efficient evaluation of matrix polynomials, Linear Algebra and its Applications 539 (2018) 229-250.
[22] E. D. Rainville, Special Functions, Vol. 442, New York, 1960.
[23] M. S. Paterson, L. J. Stockmeyer, On the number of nonscalar multiplications necessary to evaluate polynomials, SIAM Journal on Computing 2 (1) (1973) 60-66.
[24] J. Sastre, J. Ibáñez, P. Alonso, J. Peinado, E. Defez, Two algorithms for computing the matrix cosine function, Applied Mathematics and Computation 312 (2017) 66-77.
[25] J. Sastre, J. Ibáñez, P. Ruiz, E. Defez, Efficient computation of the matrix cosine, Applied Mathematics and Computation 219 (14) (2013) 7575-7585.
[26] P. Ruiz, J. Sastre, J. Ibáñez, E. Defez, High performance computing of the matrix exponential, Journal of Computational and Applied Mathematics 291 (2016) 370-379.
[27] T. G. Wright, Eigtool, version 2.1, URL: http://www.comlab.ox.ac.uk/pseudospectra/eigtool.
[28] N. J. Higham, The Matrix Computation Toolbox, URL: http://www.ma.man.ac.uk/~higham/mctoolbox.
[29] M. I. Smith, A Schur algorithm for computing matrix pth roots, SIAM Journal on Matrix Analysis and Applications 24 (4) (2003) 971-989.
[30] NVIDIA, CUDA. CUBLAS Library (2009).