
New Hermite series expansion for computing the matrix hyperbolic cosine

E. Defez^{a,*}, J. Ibáñez^{a}, J. Peinado^{b}, P. Alonso-Jordá^{b}, José M. Alonso^{c}

Universitat Politècnica de València, Camino de Vera s/n, 46022, Valencia, Spain
^{a} Instituto de Matemática Multidisciplinar
^{b} Departamento de Sistemas Informáticos y Computación
^{c} Instituto de Instrumentación para Imagen Molecular

Abstract

There are currently very few implementations available to compute the hyperbolic cosine of a matrix. This work aims to fill this gap. To this end, we first introduce both a new rational-polynomial Hermite matrix expansion and a formula for the forward relative error of the Hermite approximation in exact arithmetic, together with a sharp bound for the forward error. This matrix expansion yields a new accurate and efficient method for computing the matrix hyperbolic cosine. We present a MATLAB implementation, based on this method, which shows superior efficiency and better accuracy than other state-of-the-art methods. The algorithm developed on the basis of this method is also able to run on an NVIDIA GPU thanks to a MEX file that connects the MATLAB implementation to the CUDA code.

Key words: Hermite matrix approximation, matrix hyperbolic cosine, error analysis, GPU computing

1. Introduction

Functions of a square matrix A frequently arise in many areas of science and technology, especially those that require the resolution of first and second order differential systems [1, pp. 35–37]. In particular, the hyperbolic matrix functions cosh(A) and sinh(A), defined in terms of the matrix exponential e^A as

\cosh(A) = \frac{e^{A} + e^{-A}}{2}, \qquad \sinh(A) = \frac{e^{A} - e^{-A}}{2},

* Corresponding author
Email addresses: edefez@imm.upv.es (E. Defez), jjibanez@dsic.upv.es (J. Ibáñez), jpeinado@dsic.upv.es (J. Peinado), palonso@upv.es (P. Alonso-Jordá), jmalonso@dsic.upv.es (José M. Alonso)


are involved in the solution of coupled hyperbolic systems of partial differential equations [2]. Moreover, these functions also find application in other fields of science and engineering, e.g., communicability analysis in complex networks [3–6].
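As a point of reference, the definitions above can be evaluated directly with MATLAB's built-in expm. This is a minimal sketch (test matrix and names are ours), adequate for modest ‖A‖ but not the method developed in this paper:

A = [1 2; 0 3];                     % any square test matrix
E = expm(A);  Em = expm(-A);
coshA = (E + Em)/2;                 % cosh(A) = (e^A + e^-A)/2
sinhA = (E - Em)/2;                 % sinh(A) = (e^A - e^-A)/2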

One way of computing these matrix functions is to use the well-known relations

\cosh(A) = \cos(iA), \qquad \sinh(A) = -i\,\cos\!\left(iA - \frac{\pi}{2} I\right), \qquad i^2 = -1,

provided there exists a method to compute cos(A). Notwithstanding, this approach has the disadvantage of requiring complex arithmetic even when A is a real matrix.
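A sketch of this cos-based route using MATLAB's funm (which accepts @cos) makes the drawback visible; the test matrix is our own:

A = randn(4);                       % real test matrix
C = funm(1i*A, @cos);               % cosh(A) = cos(iA), computed in complex arithmetic
norm(imag(C), 1)                    % imaginary part is rounding noise only
coshA = real(C);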

There exist, however, alternative and more practical ways to evaluate these matrix functions. One of them uses Hermite matrix polynomial series expansions [7]. Other methods, based on Taylor series, have been studied to evaluate the action of these functions on vectors [8, 9].

1.1. Notation

Throughout this paper, we denote by C^{r×r} the set of all complex square matrices of size r. We denote by Θ and I, respectively, the zero and the identity matrix in C^{r×r}. If A ∈ C^{r×r}, we denote by σ(A) the set of all eigenvalues of A. For a real number x, ⌈x⌉ denotes the ceiling function, that is, the least integer greater than or equal to x, and ⌊x⌋ denotes the floor function, that is, the largest integer less than or equal to x.

If f(z) and g(z) are holomorphic functions on an open set Ω of the complex plane, and if σ(A) ⊂ Ω, we denote by f(A) and g(A), respectively, the images of f(z) and g(z) under the Riesz-Dunford functional calculus acting on the matrix A, and then f(A)g(A) = g(A)f(A) [10, p. 558]. We say that the matrix A is positive stable if Re(z) > 0 for every eigenvalue z ∈ σ(A). In this case, we denote by √A = A^{1/2} = exp((1/2) log(A)) the image of the function z^{1/2} = exp((1/2) log(z)) under the Riesz-Dunford functional calculus acting on the matrix A, where log(z) denotes the principal branch of the complex logarithm.

In this paper, we use consistent matrix norms. In particular, ‖A‖₂ is the 2-norm. In the tests, we use the 1-norm of a matrix A ∈ C^{r×r}, defined by ‖A‖₁ = sup_{x≠0} ‖Ax‖₁/‖x‖₁, where ‖·‖₁ denotes the vector 1-norm defined as ‖y‖₁ = |y₁| + ··· + |y_r|, y ∈ C^r [11, Chapter 2]. When a concrete norm is not indicated in the text, any consistent norm can be applied.
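A small sketch checking A^{1/2} = exp((1/2) log(A)) for a positive stable matrix, using MATLAB's sqrtm, logm and expm (all of which realize the principal branches of this functional calculus); the test matrix is ours:

A = [4 1; 0 9];                          % eigenvalues 4 and 9: positive stable
norm(sqrtm(A) - expm(0.5*logm(A)), 1)    % should be of the order of eps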

Although originally introduced by Laplace in 1810, Hermite polynomials take their name from the French mathematician Charles Hermite, who wrote about them in 1864. Hermite polynomials and their generalizations are an active research area in the field of special functions, and their applications are numerous; see, for example, the references [12–14].


For a positive stable matrix A ∈ C^{r×r}, the n-th Hermite matrix polynomial is defined in [15] by

H_n(x, A) = n! \sum_{k=0}^{\lfloor n/2 \rfloor} \frac{(-1)^k \left(\sqrt{2A}\right)^{n-2k}}{k!\,(n-2k)!} x^{n-2k},   (1)

which satisfies the three-term matrix recurrence

H_n(x, A) = x\sqrt{2A}\, H_{n-1}(x, A) - 2(n-1) H_{n-2}(x, A),  n ≥ 1,
H_{-1}(x, A) = Θ,  H_0(x, A) = I.   (2)
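The recurrence (2) translates directly into MATLAB; this is a sketch (function name is ours), assuming A is positive stable so that sqrtm yields the principal square root:

function H = hermite_matrix_poly(n, x, A)
% Evaluate the Hermite matrix polynomial H_n(x,A) of (1) via the
% three-term recurrence (2).
  r = size(A,1);
  W = sqrtm(2*A);                  % (2A)^(1/2), principal square root
  Hm1 = zeros(r);                  % H_{-1}(x,A) = 0 (the zero matrix)
  H   = eye(r);                    % H_0(x,A) = I
  for k = 1:n
      Hk  = x*W*H - 2*(k-1)*Hm1;   % H_k = x sqrt(2A) H_{k-1} - 2(k-1) H_{k-2}
      Hm1 = H;  H = Hk;
  end
end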

The following upper bounds for Hermite matrix polynomials,

\| H_{2n}(x, A) \|_2 \le g_n(x),  n ≥ 1,
\| H_{2n+1}(x, A) \|_2 \le |x| \left\| \left( \frac{A}{2} \right)^{1/2} \right\|_2 \frac{2\, g_n(x)}{n+1},  n ≥ 0,   (3)

were demonstrated in [16], where the function g_n(x) is defined as

g_n(x) = \frac{(2n+1)!\, 2^{2n}}{n!} \exp\left( \frac{5}{2} \|A\|_2 x^2 \right),  n ≥ 0.   (4)

The Hermite matrix polynomial sequence {H_n(x, A)}_{n≥0} has the following generating function [15]:

e^{xt\sqrt{2A}} = e^{t^2} \sum_{n\ge 0} \frac{H_n(x, A)}{n!} t^n,

from which we can derive the following expressions for the matrix hyperbolic sine and cosine [17]:

\cosh\left( xt\sqrt{2A} \right) = e^{t^2} \sum_{n\ge 0} \frac{H_{2n}(x, A)}{(2n)!} t^{2n},
\sinh\left( xt\sqrt{2A} \right) = e^{t^2} \sum_{n\ge 0} \frac{H_{2n+1}(x, A)}{(2n+1)!} t^{2n+1},
  x ∈ ℝ, |t| < ∞.   (5)

Recent polynomial and rational expansions in series of Hermite polynomials have been obtained in [18]. These results have led to the development of a method for computing the matrix cosine function that is more accurate and efficient than others proposed in the literature. In this paper, we calculate the exact value of the following Hermite matrix polynomial series:

\sum_{n\ge 0} \frac{H_{2n+1}(x, A)}{(2n)!} t^{2n} := A(x, t; A),   (6)

\sum_{n\ge 0} \frac{H_{2n+2}(x, A)}{(2n+1)!} t^{2n+1} := B(x, t; A),   (7)

and

\sum_{n\ge 0} \frac{H_{2n+3}(x, A)}{(2n+1)!} t^{2n+1} := C(x, t; A),   (8)

which result in a new expansion of the matrix hyperbolic cosine in Hermite matrix polynomials. This method aims to improve the one proposed in [7] and to achieve more accurate results, for a large variety of matrices, with a lower computational cost in terms of matrix products.

The organization of this paper is as follows. Section 2 presents the proofs of the formulas (6)-(8) introduced in this paper. Section 3 deals with the new rational-polynomial Hermite matrix expansions for the matrix hyperbolic cosine. The proposed algorithm and its MATLAB implementation are described in Section 4. The numerical results are presented in Section 5. Some final conclusions are given in Section 6.

2. A proof of formulas (6)-(8)

The aim is to calculate the exact value of the matrix series A(x, t; A), B(x, t; A) and C(x, t; A) defined by (6)-(8). First, we prove that all three matrix series are convergent. Taking into account (3), we have

\left\| \frac{H_{2n+1}(x, A)}{(2n)!} t^{2n} \right\|_2 \le |x| \left\| \left( \frac{A}{2} \right)^{1/2} \right\|_2 \frac{2\, g_n(x)}{(n+1)(2n)!} |t|^{2n}.

Since, using (4), \sum_{n\ge 0} \frac{g_n(x)}{(n+1)(2n)!} |t|^{2n} is convergent for |t| < ∞, the matrix series A(x, t; A) defined by (6) is convergent in any compact real interval. In the same way, we have

\left\| \frac{H_{2n+2}(x, A)}{(2n+1)!} t^{2n+1} \right\|_2 \le \frac{g_{n+1}(x)}{(2n+1)!} |t|^{2n+1}.

Since, using (4), \sum_{n\ge 0} \frac{g_{n+1}(x)}{(2n+1)!} |t|^{2n+1} is convergent for |t| < ∞, the matrix series B(x, t; A) defined by (7) is convergent in any compact real interval. Analogously, taking into account (3) again, we have

\left\| \frac{H_{2n+3}(x, A)}{(2n+1)!} t^{2n+1} \right\|_2 \le |x| \left\| \left( \frac{A}{2} \right)^{1/2} \right\|_2 \frac{2\, g_{n+1}(x)}{(n+2)(2n+1)!} |t|^{2n+1}.

Since, using (4), \sum_{n\ge 0} \frac{g_{n+1}(x)}{(n+2)(2n+1)!} |t|^{2n+1} is convergent for |t| < ∞, the matrix series C(x, t; A) defined by (8) is convergent in any compact real interval.

Using now (2), (6) and the fact that H_1(x, A) = \sqrt{2A}\,x, we can write

A(x, t; A) = \left( x\sqrt{2A} \right) \sum_{n\ge 0} \frac{H_{2n}(x, A)}{(2n)!} t^{2n} - 2 \sum_{n\ge 1} \frac{(2n)\, H_{2n-1}(x, A)}{(2n)!} t^{2n}
 = H_1(x, A)\, e^{-t^2} \cosh\left( xt\sqrt{2A} \right) - 2t \sum_{n\ge 1} \frac{H_{2n-1}(x, A)}{(2n-1)!} t^{2n-1}
 = H_1(x, A)\, e^{-t^2} \cosh\left( xt\sqrt{2A} \right) - 2t \sum_{n\ge 0} \frac{H_{2n+1}(x, A)}{(2n+1)!} t^{2n+1}
 = H_1(x, A)\, e^{-t^2} \cosh\left( xt\sqrt{2A} \right) - 2t\, e^{-t^2} \sinh\left( xt\sqrt{2A} \right)
 = e^{-t^2} \left[ H_1(x, A) \cosh\left( xt\sqrt{2A} \right) - 2t \sinh\left( xt\sqrt{2A} \right) \right].

The proof of (7) is similar:

B(x, t; A) = \left( x\sqrt{2A} \right) \sum_{n\ge 0} \frac{H_{2n+1}(x, A)}{(2n+1)!} t^{2n+1} - 2 \sum_{n\ge 0} \frac{(2n+1)\, H_{2n}(x, A)}{(2n+1)!} t^{2n+1}
 = H_1(x, A)\, e^{-t^2} \sinh\left( xt\sqrt{2A} \right) - 2t \sum_{n\ge 0} \frac{H_{2n}(x, A)}{(2n)!} t^{2n}
 = H_1(x, A)\, e^{-t^2} \sinh\left( xt\sqrt{2A} \right) - 2t\, e^{-t^2} \cosh\left( xt\sqrt{2A} \right)
 = e^{-t^2} \left[ H_1(x, A) \sinh\left( xt\sqrt{2A} \right) - 2t \cosh\left( xt\sqrt{2A} \right) \right].

Working analogously for (8):

C(x, t; A) = \left( x\sqrt{2A} \right) \sum_{n\ge 0} \frac{H_{2n+2}(x, A)}{(2n+1)!} t^{2n+1} - 2 \sum_{n\ge 0} \frac{(2n+2)\, H_{2n+1}(x, A)}{(2n+1)!} t^{2n+1}
 = H_1(x, A)\, B(x, t; A) - 2 \left( \sum_{n\ge 0} \frac{(2n+1)\, H_{2n+1}(x, A)}{(2n+1)!} t^{2n+1} + \sum_{n\ge 0} \frac{H_{2n+1}(x, A)}{(2n+1)!} t^{2n+1} \right)
 = H_1(x, A)\, B(x, t; A) - 2 \left( \sum_{n\ge 0} \frac{H_{2n+1}(x, A)}{(2n)!} t^{2n+1} + e^{-t^2} \sinh\left( xt\sqrt{2A} \right) \right)
 = H_1(x, A)\, B(x, t; A) - 2t\, A(x, t; A) - 2 e^{-t^2} \sinh\left( xt\sqrt{2A} \right).

Taking into account the values of A(x, t; A) and B(x, t; A), we get

C(x, t; A) = H_1(x, A)\, B(x, t; A) - 2t\, A(x, t; A) - 2 e^{-t^2} \sinh\left( xt\sqrt{2A} \right)
 = e^{-t^2} \left[ \left( H_1(x, A)^2 + (4t^2 - 2) I \right) \sinh\left( xt\sqrt{2A} \right) - 4t\, H_1(x, A) \cosh\left( xt\sqrt{2A} \right) \right].

By (1), we have that H_1(x, A) = \sqrt{2A}\,x and H_2(x, A) = 2x^2 A - 2I, so that H_1(x, A)^2 - 2I = 2x^2 A - 2I = H_2(x, A), and we can rewrite the last expression of C(x, t; A) in the form

C(x, t; A) = e^{-t^2} \left[ \left( H_2(x, A) + 4t^2 I \right) \sinh\left( xt\sqrt{2A} \right) - 4t\, H_1(x, A) \cosh\left( xt\sqrt{2A} \right) \right].

Summarizing, the following result has been established:

Theorem 2.1. Let A ∈ C^{r×r} be a positive stable matrix, x ∈ ℝ and |t| < +∞. Then

\sum_{n\ge 0} \frac{H_{2n+1}(x, A)}{(2n)!} t^{2n} = e^{-t^2} \left[ H_1(x, A) \cosh\left( xt\sqrt{2A} \right) - 2t \sinh\left( xt\sqrt{2A} \right) \right],

\sum_{n\ge 0} \frac{H_{2n+2}(x, A)}{(2n+1)!} t^{2n+1} = e^{-t^2} \left[ H_1(x, A) \sinh\left( xt\sqrt{2A} \right) - 2t \cosh\left( xt\sqrt{2A} \right) \right],

\sum_{n\ge 0} \frac{H_{2n+3}(x, A)}{(2n+1)!} t^{2n+1} = e^{-t^2} \left[ \left( H_2(x, A) + 4t^2 I \right) \sinh\left( xt\sqrt{2A} \right) - 4t\, H_1(x, A) \cosh\left( xt\sqrt{2A} \right) \right].   (9)
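A numerical sanity check of the first identity in (9) is straightforward; this is a sketch only (test values are ours), generating the Hermite matrix polynomials by the recurrence (2) and taking the reference cosh and sinh from funm:

A = [2 1; 0 3];  x = 0.7;  t = 0.4;
r = size(A,1);  W = sqrtm(2*A);            % W = (2A)^(1/2)
Hm1 = zeros(r);  H = eye(r);               % H_{-1} and H_0
lhs = zeros(r);
for k = 1:51                               % build H_k by the recurrence (2)
    Hk = x*W*H - 2*(k-1)*Hm1;  Hm1 = H;  H = Hk;
    if mod(k,2) == 1                       % odd index k = 2n+1
        n = (k-1)/2;
        lhs = lhs + H * t^(2*n) / factorial(2*n);
    end
end
rhs = exp(-t^2) * (x*W*funm(x*t*W, @cosh) - 2*t*funm(x*t*W, @sinh));
norm(lhs - rhs, 1)                         % close to roundoff for these x, t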

Bearing in mind that the Hermite matrix polynomial H_n(x, A) coincides with the classical Hermite polynomial H_n(x) when r = 1 and A = 2 (see [15] for more details), we get the following corollary:

Corollary 1. Let {H_n(x)}_{n≥0} be the sequence of Hermite polynomials, x ∈ ℝ and |t| < +∞. Then

\sum_{n\ge 0} \frac{H_{2n+1}(x)}{(2n)!} t^{2n} = e^{-t^2} \left[ H_1(x) \cosh(2xt) - 2t \sinh(2xt) \right],

\sum_{n\ge 0} \frac{H_{2n+2}(x)}{(2n+1)!} t^{2n+1} = e^{-t^2} \left[ H_1(x) \sinh(2xt) - 2t \cosh(2xt) \right],

\sum_{n\ge 0} \frac{H_{2n+3}(x)}{(2n+1)!} t^{2n+1} = e^{-t^2} \left[ \left( H_2(x) + 4t^2 \right) \sinh(2xt) - 4t\, H_1(x) \cosh(2xt) \right].   (10)

Formulas (10) are new in the literature on Hermite polynomials and special functions.

3. Some new Hermite matrix series expansions for the hyperbolic matrix cosine

Let A ∈ C^{r×r} be a positive stable matrix. Then the matrix polynomial H_1(x, A) = \sqrt{2A}\,x is invertible if x ≠ 0. Substituting sinh(xt√(2A)), as given in (5), into the first expression of (9), we obtain a new rational expression for the matrix hyperbolic cosine in terms of Hermite matrix polynomials:

\cosh\left( xt\sqrt{2A} \right) = e^{t^2} \left( \sum_{n\ge 0} \frac{H_{2n+1}(x, A)}{(2n)!} \left( 1 + \frac{2t^2}{2n+1} \right) t^{2n} \right) [H_1(x, A)]^{-1},  x ≠ 0, |t| < +∞.   (11)

Substituting sinh(xt√(2A)), as given in (5), into the second expression of (9) and using the three-term matrix recurrence (2), we recover the expression of cosh(xt√(2A)) given in (5).

On the other hand, replacing the expression of sinh(xt√(2A)) given in (5) in the third expression of (9), we get another new rational expression for the matrix hyperbolic cosine in terms of Hermite matrix polynomials:

\cosh\left( xt\sqrt{2A} \right) = \frac{e^{t^2}}{4} \left[ \left( H_2(x, A) + 4t^2 I \right) \star \left( \sum_{n\ge 0} \frac{H_{2n+1}(x, A)}{(2n+1)!} t^{2n} \right) - \sum_{n\ge 0} \frac{H_{2n+3}(x, A)}{(2n+1)!} t^{2n} \right] [H_1(x, A)]^{-1},  x ≠ 0, |t| < +∞.   (12)

Comparing (12) with (11), we observe that evaluating (12) always requires one extra matrix product, namely the one indicated by the symbol "⋆". Given the importance of reducing the number of matrix products (see [19–21] for more details), we will focus mainly on the expansion (11).

From (1), it follows that, for x ≠ 0,

H_{2n+1}(x, A) [H_1(x, A)]^{-1} = (2n+1)! \sum_{k=0}^{n} \frac{(-1)^k x^{2(n-k)} (2A)^{n-k}}{k!\,(2(n-k)+1)!} = \tilde H_{2n+1}(x, A),   (13)

where

\tilde H_n(x, A) = n! \sum_{k=0}^{\lfloor n/2 \rfloor} \frac{(-1)^k \left( \sqrt{2A} \right)^{n-2k-1}}{k!\,(n-2k)!} x^{n-2k-1},   (14)

so the right-hand side of (13) is still defined in the case where the matrix A is singular. In this way, we can rewrite the relation (11) in terms of the matrix polynomial \tilde H_{2n+1}(x, A) as

\cosh\left( xt\sqrt{2A} \right) = e^{t^2} \left( \sum_{n\ge 0} \frac{\tilde H_{2n+1}(x, A)}{(2n)!} \left( 1 + \frac{2t^2}{2n+1} \right) t^{2n} \right),  x ∈ ℝ, |t| < +∞.   (15)

Replacing the matrix A by A²/2 in (15), we can avoid square roots of matrices; taking x = λ, λ ≠ 0, and t = 1/λ, we finally obtain the expression

\cosh(A) = e^{1/\lambda^2} \left( \sum_{n\ge 0} \frac{\tilde H_{2n+1}\left( \lambda, \frac{1}{2}A^2 \right)}{(2n)!\, \lambda^{2n}} \left( 1 + \frac{2}{(2n+1)\lambda^2} \right) \right),  0 < λ < +∞.   (16)

4. Numerical approximations

Truncating the series (16) at order m, we obtain the approximation CH_m(λ, A) ≈ cosh(A), defined by

CH_m(\lambda, A) = e^{1/\lambda^2} \left( \sum_{n=0}^{m} \frac{\tilde H_{2n+1}\left( \lambda, \frac{1}{2}A^2 \right)}{(2n)!\, \lambda^{2n}} \left( 1 + \frac{2}{(2n+1)\lambda^2} \right) \right),  0 < λ < +∞.   (17)
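The following MATLAB sketch evaluates CH_m(λ, A) by direct summation, expanding H̃_{2n+1}(λ, ½A²) from (13); the function name is ours, and this deliberately slow reference form is intended only for validation, since it recomputes the matrix powers that the production algorithm below avoids:

function C = chm_reference(m, lambda, A)
% Reference (unoptimized) evaluation of CH_m(lambda, A) in (17).
  r = size(A,1);  B = A*A;  C = zeros(r);
  for n = 0:m
      Hn = zeros(r);                     % builds H~_{2n+1}(lambda, A^2/2), cf. (13)
      for k = 0:n
          Hn = Hn + (-1)^k * lambda^(2*(n-k)) * B^(n-k) ...
                  / (factorial(k) * factorial(2*(n-k)+1));
      end
      Hn = factorial(2*n+1) * Hn;
      C  = C + Hn * (1 + 2/((2*n+1)*lambda^2)) / (factorial(2*n) * lambda^(2*n));
  end
  C = exp(1/lambda^2) * C;
end

For matrices with ‖A²‖₂ below the corresponding z value of Table 1, its output should agree with cosh(A) to roughly unit roundoff.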

Now, from the definition (13), it follows that H̃_{2n+1}(x, ½A²) is given by

\tilde H_{2n+1}\left( x, \frac{1}{2}A^2 \right) = (2n+1)! \sum_{k=0}^{n} \frac{(-1)^k x^{2(n-k)} A^{2(n-k)}}{k!\,(2(n-k)+1)!}

and, therefore, considering the 2-norm, we find that

\left\| \tilde H_{2n+1}\left( x, \frac{1}{2}A^2 \right) \right\|_2 \le (2n+1)! \sum_{k=0}^{n} \frac{|x|^{2(n-k)} \|A^2\|_2^{\,n-k}}{k!\,(2(n-k)+1)!}
 = (2n+1)! \sum_{k=0}^{n} \frac{|x|^{2(n-k)} \left( \|A^2\|_2^{1/2} \right)^{2(n-k)}}{k!\,(2(n-k)+1)!}
 = (2n+1)! \sum_{k=0}^{n} \frac{\left( |x|\, \|A^2\|_2^{1/2} \right)^{2(n-k)}}{k!\,(2(n-k)+1)!}.   (18)

If A(k, n) is a matrix in C^{r×r}, for n ≥ 0, k ≥ 0, then from [22, p. 57] we get that

\sum_{n\ge 0} \sum_{k\ge 0} A(k, n) = \sum_{n\ge 0} \sum_{k=0}^{n} A(k, n-k).   (19)

Indeed, thanks to (19), it can be asserted that

\frac{e\, \sinh\left( |x|\, \|A^2\|_2^{1/2} \right)}{|x|\, \|A^2\|_2^{1/2}} = \sum_{n\ge 0} \frac{\left( |x|\, \|A^2\|_2^{1/2} \right)^{2n}}{(2n+1)!} \sum_{k\ge 0} \frac{1}{k!}
 = \sum_{n\ge 0} \sum_{k\ge 0} \underbrace{ \frac{\left( |x|\, \|A^2\|_2^{1/2} \right)^{2n}}{k!\,(2n+1)!} }_{= A(k,n)}
 = \sum_{n\ge 0} \sum_{k=0}^{n} \frac{\left( |x|\, \|A^2\|_2^{1/2} \right)^{2(n-k)}}{k!\,(2(n-k)+1)!},

from which it is deduced that

\sum_{k=0}^{n} \frac{\left( |x|\, \|A^2\|_2^{1/2} \right)^{2(n-k)}}{k!\,(2(n-k)+1)!} \le \frac{e\, \sinh\left( |x|\, \|A^2\|_2^{1/2} \right)}{|x|\, \|A^2\|_2^{1/2}}.   (20)

If expression (20) is multiplied by (2n+1)! and the inequality (18) is applied, we finally conclude that, for x ≠ 0,

\left\| \tilde H_{2n+1}\left( x, \frac{1}{2}A^2 \right) \right\|_2 \le (2n+1)!\, \frac{e\, \sinh\left( |x|\, \|A^2\|_2^{1/2} \right)}{|x|\, \|A^2\|_2^{1/2}}.   (21)

Now, the following expression for the approximation error can be obtained:

\| \cosh(A) - CH_m(\lambda, A) \|_2 \le e^{1/\lambda^2} \sum_{n\ge m+1} \frac{\left\| \tilde H_{2n+1}\left( \lambda, \frac{1}{2}A^2 \right) \right\|_2}{(2n)!\, \lambda^{2n}} \left( 1 + \frac{2}{(2n+1)\lambda^2} \right)   (22)
 \le \frac{e^{1+1/\lambda^2}\, \sinh\left( \lambda \|A^2\|_2^{1/2} \right)}{\lambda \|A^2\|_2^{1/2}} \sum_{n\ge m+1} \frac{2n+1}{\lambda^{2n}} \left( 1 + \frac{2}{(2n+1)\lambda^2} \right).

For λ > 1, it follows that 2/((2n+1)λ²) < 1, and

\sum_{n\ge m+1} \frac{2n+1}{\lambda^{2n}} \left( 1 + \frac{2}{(2n+1)\lambda^2} \right) \le 2 \sum_{n\ge m+1} \frac{2n+1}{\lambda^{2n}} = \frac{4 + (4m+6)(\lambda^2-1)}{\lambda^{2m} (\lambda^2-1)^2};


function [Lopt,Zopt] = compute_Lz_cosh(m,Lmin,Lmax,incL,Zmin,Zmax,incZ)
% Search for the pair (lambda, z), with largest z, for which the forward
% error bound (23) stays below u = eps/2 for the given order m.
Zopt = 0;
for L = Lmin:incL:Lmax
    for z = Zmin:incZ:Zmax
        if z > Zopt              % only test candidates improving the current z
            % Right-hand side of (23) with lambda = L and z = ||A^2||_2:
            f = (exp(1+1/L^2)*sinh(L*sqrt(z))*(4+(4*m+6)*(L^2-1)))/(sqrt(z)*L^(2*m+1)*(L^2-1)^2);
            if f < eps/2
                Zopt = z;
                Lopt = L;
            end
        end
    end
end
end

Figure 1: MATLAB code for computing the optimal values of λ and z.
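A hypothetical invocation (the search ranges below are illustrative guesses, not the authors' settings; note that Lmin must exceed 1 because of the (L²−1)² factor in the bound):

[Lopt, Zopt] = compute_Lz_cosh(16, 1.01, 20, 0.01, 0.1, 20, 0.01);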

thus, from (22), we finally deduce that

\| \cosh(A) - CH_m(\lambda, A) \|_2 \le \frac{e^{1+1/\lambda^2}\, \sinh\left( \lambda \|A^2\|_2^{1/2} \right) \left( 4 + (4m+6)(\lambda^2-1) \right)}{\|A^2\|_2^{1/2}\, \lambda^{2m+1} (\lambda^2-1)^2}.   (23)

From expression (23), we can derive the optimal values (λ_m, z_m) such that

z_m = \max\left\{ z = \|A^2\|_2 \, : \, \frac{e^{1+1/\lambda^2}\, \sinh\left( \lambda z^{1/2} \right) \left( 4 + (4m+6)(\lambda^2-1) \right)}{z^{1/2}\, \lambda^{2m+1} (\lambda^2-1)^2} < u \right\},

where u is the unit roundoff in IEEE double precision arithmetic (u = 2^{-53}).

The optimal values of z and λ, for each m, have been obtained with the MATLAB code shown in Figure 1. Given the order m of the Hermite approximation, this code determines all pairs (L, z) for which the right-hand side of (23) is lower than u, varying L between a minimum value Lmin and a maximum value Lmax in steps equal to incL. In the same way, on each iteration the code increments the variable z from the minimum value Zmin to the maximum value Zmax in steps of incZ. Choosing these parameters appropriately according to the value of m reduces the search time. The precision achieved is determined by the values of incL and incZ. Once all feasible pairs have been computed, the chosen pair is the one with the maximum value of z. The results are given in Table 1.

The approximation CH_m (see (17)) can be expressed as a polynomial with only even powers of the matrix A:

CH_m(\lambda, A) = \sum_{i=0}^{m} p_i^{(\lambda)} A^{2i} = \sum_{i=0}^{m} p_i^{(\lambda)} B^i \equiv P_m^{(\lambda)}(B),   (24)


Table 1: Values of z_{m_k}, λ_{m_k} and Θ_{m_k} for the matrix function cosh(A).

m_k   z_{m_k}                   λ_{m_k}         Θ_{m_k}
2     1.085438916 × 10^{-5}     3645.569817     1.8509243149007247 × 10^{-6}
4     7.072119354 × 10^{-2}     130.7978189     3.810252709308867 × 10^{-3}
6     1.024769681 × 10^{-1}     31.00030100     8.9416635239106868 × 10^{-2}
9     1.232994877952250         17.607040100    1.1838963351971854 × 10^{0}
12    4.840136411698479         10.200005000    5.0162962795121144 × 10^{0}
16    16.851353484604754        7.9080200400    1.7588311877511131 × 10^{1}

where B = A². We present Algorithm 1, which computes the hyperbolic cosine of a matrix A by means of the Paterson-Stockmeyer method [23]. The computational cost of this algorithm is k + s matrix products, i.e., 2(k+s)r³ flops, where k is the position of the chosen order in the vector of values m_k.

Algorithm 1 Scaling and recovering algorithm for computing C = cosh(A), where A ∈ C^{r×r}, with m_M = 16 the maximum approximation order allowed.

1: B = A²
2: Choose the optimal order m_k ∈ {2, 4, 6, 9, 12, 16}, m_k ≤ m_M, and the scaling parameter s ∈ ℕ ∪ {0}.
3: B = 4^{-s} B                                ▷ Scaling of matrix B
4: Choose the corresponding λ_{m_k} from Table 1.
5: Compute C = P_{m_k}^{(λ_{m_k})}(B) by the Paterson-Stockmeyer method (see [24, Section 2]).
6: for i = 1 : s do                            ▷ Recovering the approximation of cosh(A)
7:    C = 2C² − I                              ▷ Double angle formula cosh(2X) = 2 cosh²(X) − I
8: end for

The basic steps of this algorithm are the choice of m_k and s (step 2), the computation of P_{m_k}^{(λ_{m_k})}(B) (step 5; a complete study of how to compute it can be found in Section 2 of [24]), and the recovering phase (steps 6-7).
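For concreteness, this is a generic sketch of Paterson-Stockmeyer evaluation for step 5 (a textbook variant with block size near √m, not the authors' tuned routine), with coefficients p(1:m+1) stored in ascending powers of B:

function C = ps_eval(p, B)
% Paterson-Stockmeyer evaluation of P(B) = sum_{i=0}^{m} p(i+1) B^i.
  m = numel(p) - 1;  r = size(B,1);  I = eye(r);
  s = ceil(sqrt(m));                 % block size
  Bp = cell(s,1);  Bp{1} = B;        % precompute B, B^2, ..., B^s
  for j = 2:s, Bp{j} = Bp{j-1}*B; end
  C = zeros(r);
  for blk = floor(m/s):-1:0          % Horner recursion over blocks of size s
      T = p(blk*s+1)*I;              % constant term of this block
      for j = 1:min(s-1, m-blk*s)
          T = T + p(blk*s+j+1)*Bp{j};
      end
      if blk == floor(m/s), C = T; else, C = C*Bp{s} + T; end
  end
end

After C = ps_eval(p, B) with B = 4^{-s}A², the recovering phase applies C = 2*C*C - eye(size(C)) a total of s times, undoing the scaling via the double angle formula.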

Next, let us show how to compute the values of m_k and s. If cosh(A) is computed from the Taylor series, then the absolute forward error of the Hermite approximation of cosh(A), denoted by E_f, can be expressed as

E_f = \left\| \cosh(A) - P_{m_k}^{(\lambda_{m_k})}(B) \right\| = \left\| \sum_{i \ge \hat m_k} f_i B^i \right\|,

where \hat m_k ≤ m_k. If f_{\tilde m_k} is the first coefficient of the above series greater than u, then we obtain the following approximation:

E_f \cong \left\| \sum_{i \ge \tilde m_k} f_i B^i \right\|.


Table 2: Values of \hat m_k, \tilde m_k, and f_{max}(m_k).

              m_1 = 2   m_2 = 4   m_3 = 6        m_4 = 9        m_5 = 12       m_6 = 16
\hat m_k      1         2         2              4              6              9
\tilde m_k    1         2         3              10             13             17
f_max(m_k)    0         0         1.9 × 10^{-17}  5.3 × 10^{-19}  3.1 × 10^{-26}  3.4 × 10^{-39}

Table 2 shows the values \hat m_k, \tilde m_k and f_{max}(m_k) = max{ f_i : \hat m_k ≤ i ≤ \tilde m_k − 1 } for each m_k ∈ {2, 4, 6, 9, 12, 16}.

The scaling factor s and the order m_k of the Hermite approximation are obtained by applying the following theorem:

Theorem 4.1 ([25]). Let h_l(x) = \sum_{i\ge l} p_i x^i be a power series with radius of convergence w, let \tilde h_l(x) = \sum_{i\ge l} |p_i| x^i, and let B ∈ C^{n×n} with ρ(B) < w, l ∈ ℕ and t ∈ ℕ with 1 ≤ t ≤ l. If t_0 is the multiple of t such that l ≤ t_0 ≤ l + t − 1 and

β_t = \max\left\{ d_j^{1/j} : j = t, l, l+1, \dots, t_0-1, t_0+1, t_0+2, \dots, l+t-1 \right\},

where d_j is an upper bound for ‖B^j‖ (d_j ≥ ‖B^j‖), then

\| h_l(B) \| \le \tilde h_l(\beta_t).

If we apply Theorem 4.1 to the series f_{\tilde m_k}(x) = \sum_{i\ge \tilde m_k} f_i x^i and \tilde f_{\tilde m_k}(x) = \sum_{i\ge \tilde m_k} |f_i| x^i, then

E_f = \| f_{\tilde m_k}(B) \| \le \tilde f_{\tilde m_k}(\beta_t),

for every t with 1 ≤ t ≤ \tilde m_k. Let Θ_{m_k} be

\Theta_{m_k} = \max\left\{ \theta > 0 : \tilde f_{\tilde m_k}(\theta) = \sum_{i\ge \tilde m_k} |f_i| \theta^i \le u \right\};   (25)

then, using the MATLAB (R2017b) Symbolic Math Toolbox with 200 series terms and a zero finder, we obtained the values Θ_{m_k} that verify (25) (see Table 1).
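The zero-finding step can be sketched in double precision as follows; the coefficient magnitudes below are hypothetical placeholders (the true |f_i| come from the Hermite series and the symbolic computation just described):

mtilde = 3;                                  % e.g. the m_k = 6 column of Table 2
f = 1e-17 ./ factorial(mtilde:mtilde+199);   % placeholder values for |f_i|, i >= mtilde
u = 2^(-53);                                 % unit roundoff, IEEE double precision
ftilde = @(theta) sum(f .* theta.^(mtilde:mtilde+199));
Theta = fzero(@(theta) ftilde(theta) - u, 1) % root of ~f(theta) = u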

The optimal values m_k and s are obtained from the values β_t of Theorem 4.1 and from the values Θ_{m_k} of Table 1. A complete study of this question was developed by the authors in [18, 24]; we reproduce it next.

Let β_min^{(\tilde m_k)} = \min_{1 \le t \le \tilde m_k} \{ \beta_t \}. If there exists a value m_k ≤ 16 such that β_min^{(\tilde m_k)} ≤ Θ_{m_k}, then the forward error E_f is lower than u. In this case, we choose the lowest order m_k such that β_min^{(\tilde m_k)} ≤ Θ_{m_k}, and the scaling factor is s = 0. Otherwise, we choose the Hermite approximation of order 12 or 16 providing the lower cost, with

s = \max\left( 0, \left\lceil \frac{1}{2} \log_2 \frac{\beta_{min}^{(\tilde m_k)}}{\Theta_{m_k}} \right\rceil \right),  m_k = 12 or 16.

For computing β_min^{(\tilde m_k)}, we have used the following approximation:

\beta_{min}^{(\tilde m_k)} \approx \max\left\{ d_{\tilde m_k}^{1/\tilde m_k},\; d_{\tilde m_k+1}^{1/(\tilde m_k+1)} \right\},

where d_{\tilde m_k} and d_{\tilde m_k+1} are bounds of ‖B^{\tilde m_k}‖ and ‖B^{\tilde m_k+1}‖, respectively (see (16) from [26]). The bounds d_l, l = \tilde m_k, \tilde m_k + 1, can be computed using products of norms of matrix powers previously calculated. For example, for m_k = 6 the powers B² and B³ must be obtained, hence β_min^{(3)} (\tilde m_k = 3) can be computed as

\beta_{min}^{(3)} = \max\left\{ \|B^3\|^{1/3},\; \min\left\{ \|B^3\| \|B\|,\; \|B^2\|^2 \right\}^{1/4} \right\}.

The algorithm for computing the values m_k and s is analogous to Algorithm 2 from [24].
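The β_min^{(3)} estimate above uses only norms of powers already at hand; a sketch (with the 1-norm as the consistent norm and a random B of our own) reads:

A = randn(8);  B = A*A;                        % B = A^2 (possibly pre-scaled)
B2 = B*B;  B3 = B2*B;                          % powers needed anyway for m_k = 6
d3 = norm(B3,1);                               % d_3 >= ||B^3||
d4 = min(norm(B3,1)*norm(B,1), norm(B2,1)^2);  % bound for ||B^4||
beta_min = max(d3^(1/3), d4^(1/4))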

5. Experiments

In this section, we show the numerical accuracy and performance results of the proposed algorithm for computing the matrix hyperbolic cosine. By means of a CUDA implementation, we also show its performance on an NVIDIA GPU.

5.1. Numerical experiments

Our MATLAB implementation, named coshmtayher, has been developed by modifying the coshher MATLAB code given in [7], replacing the original Hermite approximation with the new Hermite matrix polynomial approximation developed in this paper and derived from (16). We have compared the new MATLAB function, coshmtayher, with the coshher and funmcosh functions, defined as:

• coshmtayher: novel code based on the new developments of Hermite matrix polynomials (16).

• coshher: code based on the Hermite series for the matrix hyperbolic cosine [7].

• funmcosh: the funm MATLAB function for computing matrix functions, here the matrix hyperbolic cosine.

5.2. Experiments description

The tests have been carried out using MATLAB (R2017b) running on an Apple iMac 27" (iMac Retina 5K 27", late 2015) with a quad-core Intel i7-6700K 4 GHz processor and 16 GB of RAM. The following tests were made using different matrices:


a) Test 1: One hundred diagonalizable 128×128 randomly generated real matrices with 1-norms varying from 2.32 to 220.04. These matrices have the form A = V D V^T, where D is a diagonal matrix with real eigenvalues and V is an orthogonal matrix obtained as V = H/√128, with H the Hadamard matrix (a sketch of this construction appears after this list). The "exact" matrix hyperbolic cosine was computed as cosh(A) = V cosh(D) V^T (see [1, p. 10]), using the MATLAB Symbolic Math Toolbox with 128 decimal digit arithmetic in all the computations.

b) Test 2: One hundred non-diagonalizable 128×128 random real matrices whose 1-norms range from 6.52 to 249.61. These matrices have the form A = V J V^T, where J is a Jordan matrix with real eigenvalues of algebraic multiplicity varying between 1 and 4, and V is an orthogonal matrix obtained as V = H/√128, with H the Hadamard matrix. The "exact" matrix hyperbolic cosine was computed as cosh(A) = V cosh(J) V^T.

c) Test 3: Thirteen test matrices from the Eigtool MATLAB package [27] with size 128×128, and thirty-nine matrices from the matrix function literature with dimensions lower than or equal to 128, from the function matrix of the Matrix Computation Toolbox [28]. These matrices have been scaled so that their 1-norms do not exceed 1024. The "exact" matrix hyperbolic cosine was obtained by first computing the eigenvalue decomposition A = V D V^{-1} with the MATLAB function eig, and then computing cosh(A) = V cosh(D) V^{-1} by means of the MATLAB Symbolic Math Toolbox with 128 decimal digit arithmetic in all the computations.
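A sketch of the Test 1 construction follows; the eigenvalue spread is our illustrative choice, and the double-precision reference here stands in for the 128-digit symbolic computation the paper actually uses:

n = 128;
V = hadamard(n)/sqrt(n);           % orthogonal: V*V' = I
d = 20*randn(n,1);                 % real eigenvalues (illustrative spread)
A = V*diag(d)*V';                  % diagonalizable test matrix
coshA_ref = V*diag(cosh(d))*V';    % reference cosh(A) = V cosh(D) V'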

Tables 3 and 4 show, respectively, the computational costs and the accuracy of the functions under comparison, i.e. coshmtayher, coshher, and funmcosh, in the three tests described. The accuracy is assessed by computing the normwise relative error (Er) as

Er = \frac{\| \cosh(A) - \widetilde{\cosh}(A) \|_1}{\| \cosh(A) \|_1},

where \widetilde{\cosh}(A) is the computed solution and cosh(A) is the exact solution.

Table 3 shows the computational costs in terms of the number of matrix products (P(·)) of each code, since the cost of the remaining operations is negligible compared with matrix multiplications for large enough matrices. The funmcosh routine performs no matrix products, but its main cost is a real Schur form reduction, with a computational cost of 28n³ flops [29] for an n×n matrix. This can be expressed as a minimum of 14 matrix products, bearing in mind that the cost of a matrix multiplication is 2n³ flops [1, p. 336].

Table 4, on the other hand, shows the percentage of cases in which the relative errors of coshmtayher are lower than, greater than, or equal to the relative errors of coshher and funmcosh.


Table 3: Matrix products (P) for Tests 1, 2, and 3 using the coshmtayher, coshher and funmcosh MATLAB functions.

         P(coshmtayher)   P(coshher)   P(funmcosh)
Test 1   971              973          1400
Test 2   976              988          1400
Test 3   310              317          560

Table 4: Relative error comparison between coshmtayher and coshher (rows 1 to 3) and between coshmtayher and funmcosh (rows 4 to 6) for Tests 1, 2 and 3. The entries show the percentage of cases in which the relative errors of coshmtayher (new Hermite) are lower than, greater than, or equal to the relative errors of coshher and funmcosh.

                                   Test 1   Test 2   Test 3
Er(coshmtayher) < Er(coshher)      76%      79%      65%
Er(coshmtayher) > Er(coshher)      21%      20%      25%
Er(coshmtayher) = Er(coshher)      3%       1%       10%
Er(coshmtayher) < Er(funmcosh)     100%     100%     97.5%
Er(coshmtayher) > Er(funmcosh)     0%       0%       2.5%
Er(coshmtayher) = Er(funmcosh)     0%       0%       0%

For the three tests, we have plotted in Figures 2, 3, and 4, respectively, the normwise relative errors (a), the performance profiles (b), and, to show whether these ratios are significant, the ratios of relative errors (c):

Er(coshher)/Er(coshmtayher),  Er(funmcosh)/Er(coshmtayher),

and the ratios of matrix products (d):

P(coshher)/P(coshmtayher),  P(funmcosh)/P(coshmtayher).

In the performance profiles, the α coordinate varies between 1 and 5 in steps equal to 0.1, and the p coordinate is the probability that the considered algorithm has a relative error lower than or equal to α times the smallest error over all methods. The ratios of relative errors are presented in decreasing order with respect to Er(coshmtayher)/Er(coshher). The solid lines in Figures 2a, 3a and 4a represent the function κ_cosh·u, where κ_cosh is the condition number of the matrix hyperbolic cosine function [1, Chapter 3] and u = 2^{-53} is the unit roundoff in double precision floating-point arithmetic. Our conclusions are:

• Regarding the normwise relative errors shown in Figures 2a, 3a and 4a, the coshmtayher and coshher functions have, in general, very good numerical stability. This can be appreciated by observing the distance from each matrix normwise relative error to the κ_cosh·u line. In Figures 2a and 3a, the numerical stability is better because the relative errors lie below the κ_cosh·u line.


Figure 2: Experimental results for Test 1: (a) normwise relative errors, (b) performance profile, (c) ratio of relative errors, (d) ratio of matrix products.

• The performance profiles for the first two tests (Figures 2b, 3b) show that the accuracy of the coshmtayher and coshher methods is similar as α increases. Moreover, both of them have much better accuracy than the funmcosh method. For the third test, Figure 4b shows that the accuracy of the coshmtayher function is considerably better than the accuracy of the coshher function.

• Quantitatively, Table 4 indicates that the coshmtayher code offers a relative error lower than the coshher function in 76%, 79% and 65% of the cases, respectively, for Tests 1, 2 and 3. Similarly, the coshmtayher method improves on the funmcosh function in terms of relative error for 100% of the matrices that comprise Tests 1 and 2, and for 97.5% of the ones belonging to Test 3. These numerical values are corroborated by the error ratios depicted in Figures 2c, 3c and 4c. As we can notice, this ratio is greater than one for a percentage of matrices that coincides with the values given in Table 4, which indicates that the coshmtayher code is the most accurate and reliable one.


Figure 3: Experimental results for Test 2: (a) normwise relative errors, (b) performance profile, (c) ratio of relative errors, (d) ratio of matrix products.

Whereas this error ratio takes values not very far from unity between the coshmtayher and coshher functions for the matrices of the first two tests, it reaches values more distant from unity for many matrices of Test 3. The value of this factor between the coshmtayher and funmcosh codes deserves special mention, since it takes very high values, particularly for the matrices of Test 3. In more detail:

Test 1 (Figure 2c):
Er(coshher) ∈ [0.79 Er(coshmtayher), 3.68 Er(coshmtayher)],
Er(funmcosh) ∈ [22.49 Er(coshmtayher), 523.3 Er(coshmtayher)].

Test 2 (Figure 3c):
Er(coshher) ∈ [0.80 Er(coshmtayher), 2.10 Er(coshmtayher)],
Er(funmcosh) ∈ [19.84 Er(coshmtayher), 238.6 Er(coshmtayher)].

Test 3 (Figure 4c):
Er(coshher) ∈ [0.68 Er(coshmtayher), 199.0 Er(coshmtayher)],
Er(funmcosh) ∈ [0.48 Er(coshmtayher), 6.65 × 10^{15} Er(coshmtayher)].


Figure 4: Experimental results for Test 3: (a) normwise relative errors, (b) performance profile, (c) ratio of relative errors, (d) ratio of matrix products.

• As shown in Table 3, the coshmtayher function has a computational cost significantly lower than the funmcosh code and slightly lower than the coshher function. As expected, the numerical tests also confirm this result:

Test 1 (Figure 2d):
P(coshher) ∈ [0.86 P(coshmtayher), 1.20 P(coshmtayher)],
P(funmcosh) ∈ [1.27 P(coshmtayher), 2.80 P(coshmtayher)].

Test 2 (Figure 3d):
P(coshher) ∈ [0.86 P(coshmtayher), 1.20 P(coshmtayher)],
P(funmcosh) ∈ [1.27 P(coshmtayher), 2.80 P(coshmtayher)].

Test 3 (Figure 4d):
P(coshher) ∈ [0.85 P(coshmtayher), 1.29 P(coshmtayher)],
P(funmcosh) ∈ [1.00 P(coshmtayher), 2.80 P(coshmtayher)].


Figure 5: Execution times (sec.) to compute the matrix hyperbolic cosine on CPU and GPU by means of the coshmtayher and coshher codes on randomly generated large matrices.

5.3. Results on GPU

We have implemented an "accelerated" version that makes it possible to execute our algorithm for computing the matrix hyperbolic cosine on NVIDIA GPUs. Current GPUs are computational devices that boost performance in data-parallel applications, i.e. applications that operate over many independent data. This is the case of matrix multiplication, a highly optimized operation on GPUs in its current implementation in the CUBLAS package [30]. Our GPU algorithms are all based on polynomial evaluations which, in turn, result in intensive use of matrix products. The basic MATLAB algorithm is used in this case, with the most costly operations (those based on matrix multiplication) offloaded to the GPU through the CUDA language by means of a MEX file.
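The paper's GPU path goes through a CUDA MEX file; as a rough MATLAB-only stand-in (not the authors' implementation), the Parallel Computing Toolbox gpuArray type offloads the dominant cost, the matrix products, in the same spirit:

A  = randn(4000);                  % large test matrix
Ag = gpuArray(A);                  % move the data to the GPU
Bg = Ag*Ag;                        % B = A^2 computed on the device
B  = gather(Bg);                   % copy the result back to host memory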

We have carried out the experiments of this subsection on a computer equipped with two Intel Xeon CPU E5-2698 v4 processors at 2.20 GHz, featuring 20 cores each. To obtain the algorithm performance on GPU, we used one NVIDIA Tesla P100-SXM2 (Pascal architecture) attached to the PCI bus of this workstation. This GPU features 3584 CUDA cores and 16 GB of memory.

Figure 5 shows the reduction in execution time when a GPU is used to accelerate the computations. The figure also indicates that, for large randomly generated matrices, the former version of the matrix hyperbolic cosine (coshher), presented in [7], and the new algorithm (coshmtayher) show similar performance on both the CPU and GPU subsystems. Experimental results with the funmcosh function are not shown in the figure, since its times are very large in comparison. For instance, for a matrix A of order n = 2000, the execution time to obtain cosh(A) is 83 seconds with funmcosh, whereas it is 0.71 seconds and 0.73 seconds with coshmtayher and coshher, respectively.

6. Conclusions

A new polynomial Hermite matrix algorithm for computing the matrix hyperbolic cosine has been developed in this work. We have implemented a MATLAB routine that is also capable of using a GPU present in the system. This new algorithm has been compared with other MATLAB implementations and, in light of the tests carried out, we have verified that the new algorithm behaves in a numerically stable manner and shows very good results. One of the main conclusions is that the MATLAB implementations based on the Hermite series (coshher and coshmtayher) have turned out to be much more accurate and efficient than the one built on the funm MATLAB function (the funmcosh code). In addition, the new implementation based on the Hermite series (coshmtayher) proposed here offers much better numerical accuracy than the coshher algorithm, which is also based on the Hermite series, with a slightly lower computational cost in terms of matrix products.

Acknowledgements

This work has been partially supported by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF) under grant TIN2017-89314-P, and by the Programa de Apoyo a la Investigación y Desarrollo 2018 of the Universitat Politècnica de València (PAID-06-18) under grant SP20180016.

References

[1] N. J. Higham, Functions of Matrices: Theory and Computation, SIAM, Philadelphia, PA, USA, 2008.

[2] L. Jódar, E. Navarro, A. Posso, M. Casabán, Constructive solution of strongly coupled continuous hyperbolic mixed problems, Applied Numerical Mathematics 47 (3–4) (2003) 477–492.

[3] E. Estrada, D. J. Higham, N. Hatano, Communicability and multipartite structures in complex networks at negative absolute temperatures, Physical Review E 78 (2) (2008) 026102.

[4] E. Estrada, J. A. Rodríguez-Velázquez, Spectral measures of bipartivity in complex networks, Physical Review E 72 (4) (2005) 046105.

[5] E. Estrada, J. Gómez-Gardeñes, Network bipartivity and the transportation efficiency of European passenger airlines, Physica D: Nonlinear Phenomena 323 (2016) 57–63.

[6] J. Kunegis, G. Gröner, T. Gottron, Online dating recommender systems: The split-complex number approach, in: Proceedings of the 4th ACM RecSys Workshop on Recommender Systems and the Social Web, RSWeb '12, Association for Computing Machinery, 2012, pp. 37–44.

[7] E. Defez, J. Sastre, J. Ibáñez, J. Peinado, Solving engineering models using hyperbolic matrix functions, Applied Mathematical Modelling 40 (4) (2016) 2837–2844.

[8] N. J. Higham, P. Kandolf, Computing the action of trigonometric and hyperbolic matrix functions, SIAM Journal on Scientific Computing 39 (2) (2017) A613–A627.

[9] A. H. Al-Mohy, A truncated Taylor series algorithm for computing the action of trigonometric and hyperbolic matrix functions, SIAM Journal on Scientific Computing 40 (3) (2018) A1696–A1713.

[10] N. Dunford, J. T. Schwartz, Linear Operators, Part I: General Theory, John Wiley & Sons, Inc., 1988.

[11] G. H. Golub, C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore, Maryland, USA, 2013.

[12] G. Dattoli, C. Cesarano, On a new family of Hermite polynomials associated to parabolic cylinder functions, Applied Mathematics and Computation 141 (1) (2003) 143–149.

[13] A. Yari, Numerical solution for fractional optimal control problems by Hermite polynomials, Journal of Vibration and Control 27 (5–6) (2021) 698–716.

[14] D. Masoero, P. Roffelsen, Roots of generalised Hermite polynomials when both parameters are large, Nonlinearity 34 (3) (2021) 1663–1732.

[15] L. Jódar, R. Company, Hermite matrix polynomials and second order matrix differential equations, Approximation Theory and its Applications 12 (2) (1996) 20–30.

[16] E. Defez, A. Hervás, L. Jódar, A. Law, Bounding Hermite matrix polynomials, Mathematical and Computer Modelling 40 (1) (2004) 117–125.

[17] E. Defez, L. Jódar, Some applications of the Hermite matrix polynomials series expansions, Journal of Computational and Applied Mathematics 99 (1) (1998) 105–117.

[18] E. Defez, J. Ibáñez, J. Peinado, J. Sastre, P. Alonso-Jordá, An efficient and accurate algorithm for computing the matrix cosine based on new Hermite approximations, Journal of Computational and Applied Mathematics 348 (2019) 1–13.

[19] J. Sastre, J. Ibáñez, E. Defez, P. Ruiz, New scaling-squaring Taylor algorithms for computing the matrix exponential, SIAM Journal on Scientific Computing 37 (1) (2015) A439–A455.

[20] P. Alonso, J. Peinado, J. Ibáñez, J. Sastre, E. Defez, Computing matrix trigonometric functions with GPUs through Matlab, The Journal of Supercomputing (2018) 1–14.

[21] J. Sastre, Efficient evaluation of matrix polynomials, Linear Algebra and its Applications 539 (2018) 229–250.

[22] E. D. Rainville, Special Functions, Vol. 442, New York, 1960.

[23] M. S. Paterson, L. J. Stockmeyer, On the number of nonscalar multiplications necessary to evaluate polynomials, SIAM Journal on Computing 2 (1) (1973) 60–66.

[24] J. Sastre, J. Ibáñez, P. Alonso, J. Peinado, E. Defez, Two algorithms for computing the matrix cosine function, Applied Mathematics and Computation 312 (2017) 66–77.

[25] J. Sastre, J. Ibáñez, P. Ruiz, E. Defez, Efficient computation of the matrix cosine, Applied Mathematics and Computation 219 (14) (2013) 7575–7585.

[26] P. Ruiz, J. Sastre, J. Ibáñez, E. Defez, High performance computing of the matrix exponential, Journal of Computational and Applied Mathematics 291 (2016) 370–379.

[27] T. G. Wright, Eigtool, version 2.1. URL: http://www.comlab.ox.ac.uk/pseudospectra/eigtool.

[28] N. J. Higham, The Matrix Computation Toolbox. URL: http://www.ma.man.ac.uk/~higham/mctoolbox.

[29] M. I. Smith, A Schur algorithm for computing matrix pth roots, SIAM Journal on Matrix Analysis and Applications 24 (4) (2003) 971–989.

[30] NVIDIA, CUDA. CUBLAS Library (2009).
