# Nicholas J. Higham

The University of Manchester · School of Mathematics

## About

- Publications: 126
- Reads: 26,490
- Citations: 5,395 (since 2017)

## Publications


The standard LU factorization-based solution process for linear systems can be enhanced in speed or accuracy by employing mixed precision iterative refinement. Most recent work has focused on dense systems. We investigate the potential of mixed precision iterative refinement to enhance methods for sparse systems based on approximate sparse factoriz...
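
The idea behind mixed precision iterative refinement can be sketched in a few lines of NumPy: factorize once in low precision, then refine with residuals computed in the working precision. The helpers below are illustrative, not the paper's sparse algorithm — in particular, the unpivoted single-precision LU is safe only for matrices such as diagonally dominant ones.

```python
import numpy as np

def lu_factor32(A):
    # Illustrative unpivoted LU computed entirely in float32
    # (production code would pivot; assumes e.g. diagonal dominance).
    LU = A.astype(np.float32)
    n = LU.shape[0]
    for k in range(n - 1):
        LU[k+1:, k] /= LU[k, k]
        LU[k+1:, k+1:] -= np.outer(LU[k+1:, k], LU[k, k+1:])
    return LU  # unit lower L and U packed together

def lu_solve32(LU, b):
    # Forward and back substitution carried out in float32.
    n = LU.shape[0]
    y = b.astype(np.float32)
    for k in range(n):
        y[k+1:] -= LU[k+1:, k] * y[k]
    for k in range(n - 1, -1, -1):
        y[k] = (y[k] - LU[k, k+1:] @ y[k+1:]) / LU[k, k]
    return y

def mixed_precision_refine(A, b, iters=5):
    # Factorize once in single precision; refine with
    # double-precision residuals.
    LU = lu_factor32(A)
    x = lu_solve32(LU, b).astype(np.float64)
    for _ in range(iters):
        r = b - A @ x                  # residual in working (double) precision
        d = lu_solve32(LU, r)          # correction from the cheap factorization
        x = x + d.astype(np.float64)
    return x
```

For a well-conditioned system, a handful of refinement steps recovers full double-precision accuracy from the single-precision factorization.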

Anymatrix is a MATLAB toolbox that provides an extensible collection of matrices with the ability to search the collection by matrix properties. Each matrix is implemented as a MATLAB function and the matrices are arranged in groups. Compared with previous collections, Anymatrix offers three novel features. First, it allows a user to share a collec...

This article is dedicated to Jack Dongarra on the occasion of him receiving the 2021 ACM Turing Award and concentrates primarily on his contributions to numerical linear algebra, particularly on the development of algorithms and software to reliably and efficiently solve linear algebra problems. We shall look at software projects, a number of them...

Today’s floating-point arithmetic landscape is broader than ever. While scientific computing has traditionally used single precision and double precision floating-point arithmetics, half precision is increasingly available in hardware and quadruple precision is supported in software. Lower precision arithmetic brings increased speed and reduced com...

The Wilson matrix, W, is a 4 × 4 unimodular symmetric positive definite matrix of integers that has been used as a test matrix since the 1940s, owing to its mild ill-conditioning. We ask how close W is to being the most ill-conditioned matrix in its class, with or without the requirement of positive definiteness. By exploiting the matrix adjugate...

Stochastic rounding (SR) randomly maps a real number x to one of the two nearest values in a finite precision number system. The probability of choosing either of these two numbers is 1 minus their relative distance to x. This rounding mode was first proposed for use in computer arithmetic in the 1950s and it is currently experiencing a resurgence...
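
A minimal sketch of stochastic rounding on a toy equispaced grid (the function name and the grid spacing `ulp` are illustrative, not a hardware number format):

```python
import numpy as np

def stochastic_round(x, ulp, rng):
    # Round x to the grid {k * ulp} stochastically: the probability of
    # choosing either neighbour is 1 minus its relative distance to x,
    # so each neighbour is picked in proportion to its closeness.
    lo = np.floor(x / ulp) * ulp       # nearest grid point below x
    frac = (x - lo) / ulp              # in [0, 1): distance from lo in ulps
    return lo + ulp * (rng.random(np.shape(x)) < frac)
```

A key consequence of this rule is that stochastic rounding is unbiased: the expected value of the rounded result equals x.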

It is well established that reduced precision arithmetic can be exploited to accelerate the solution of dense linear systems. Typical examples are mixed precision algorithms that reduce the execution time and the energy consumption of parallel solvers for dense linear systems by factorizing a matrix at a precision lower than the working precision....

This article describes a standard API for a set of Batched Basic Linear Algebra Subprograms (Batched BLAS or BBLAS). The focus is on many independent BLAS operations on small matrices that are grouped together and processed by a single routine, called a Batched BLAS routine. The matrices are grouped together in uniformly sized groups, with just one...
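
The fixed (uniform) batch style the abstract describes — one routine call processing many small, same-sized matrices — is mirrored by NumPy's `matmul`, which broadcasts over a leading batch dimension. A toy "batched GEMM", not the BBLAS API itself:

```python
import numpy as np

# One call performs 1000 independent 4x4 matrix multiplications,
# the access pattern a Batched BLAS routine is designed around.
rng = np.random.default_rng(0)
batch, m, k, n = 1000, 4, 4, 4
A = rng.standard_normal((batch, m, k))
B = rng.standard_normal((batch, k, n))
C = np.matmul(A, B)
```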

Block low-rank (BLR) matrices possess a blockwise low-rank property that can be exploited to reduce the complexity of numerical linear algebra algorithms. The impact of these low-rank approximations on the numerical stability of the algorithms in floating-point arithmetic has not previously been analysed. We present rounding error analysis for the...

The efficient utilization of mixed-precision numerical linear algebra algorithms can offer attractive acceleration to scientific computing applications. Especially with the hardware integration of low-precision special-function units designed for machine learning applications, the traditional numerical algorithms community urgently needs to reconsi...

We identify and analyse obstructions to factorisation of integer matrices into products NᵀN or N² of matrices with rational or integer entries. The obstructions arise as quadratic forms with integer coefficients and raise the question of the discrete range of such forms. They are obtained by considering matrix decompositions over a superalgebra. We...

We explore the floating-point arithmetic implemented in the NVIDIA tensor cores, which are hardware accelerators for mixed-precision matrix multiplication available on the Volta, Turing, and Ampere microarchitectures. Using Volta V100, Turing T4, and Ampere A100 graphics cards, we determine what precision is used for the intermediate results, wheth...

Double-precision floating-point arithmetic (FP64) has been the de facto standard for engineering and scientific simulations for several decades. Problem complexity and the sheer volume of data coming from various instruments and sensors motivate researchers to mix and match various approaches to optimize compute resources, including different level...

Evaluating the log-sum-exp function or the softmax function is a key step in many modern data science algorithms, notably in inference and classification. Because of the exponentials that these functions contain, the evaluation is prone to overflow and underflow, especially in low-precision arithmetic. Software implementations commonly use alternat...
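
The standard defence against the overflow the abstract refers to is to shift by the maximum before exponentiating, using the identity log Σ exp(xᵢ) = m + log Σ exp(xᵢ − m) with m = max(x). A minimal sketch (the paper analyses such implementations; this is not its proposed algorithm):

```python
import numpy as np

def logsumexp(x):
    # Shift by max(x) so the largest exponent is exp(0) = 1:
    # no overflow, and underflow of tiny terms is harmless.
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

def softmax(x):
    # The same shift leaves softmax mathematically unchanged,
    # since the factor exp(-m) cancels in the ratio.
    e = np.exp(x - np.max(x))
    return e / e.sum()
```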

Within the past few years, hardware vendors have started designing low-precision special function units in response to the machine learning community's demand for high compute power in low-precision formats. Server-line products are also increasingly featuring low-precision special function units, such as the NVIDIA tensor cor...

Computing units that carry out a fused multiply-add (FMA) operation with matrix arguments, referred to as tensor units by some vendors, have great potential for use in scientific computing. However, these units are inherently mixed precision and existing rounding error analyses do not support them. We consider a mixed precision block FMA that gener...

A number of features of today’s high-performance computers make it challenging to exploit these machines fully for computational science. These include increasing core counts but stagnant clock frequencies; the high cost of data movement; use of accelerators (GPUs, FPGAs, coprocessors), making architectures increasingly heterogeneous; and multipl...

We consider ill-conditioned linear systems Ax = b that are to be solved iteratively, and assume that a low accuracy LU factorization A ≈ L̂Û is available for use in a preconditioner. We have observed that for ill-conditioned matrices A arising in practice, A ⁻¹ tends to be numerically low rank, that is, it has a small number of large singular values...

We present Etymo (https://etymo.io), a discovery engine to facilitate artificial intelligence (AI) research and development. It aims to help readers navigate a large number of AI-related papers published every week by using a novel form of search that finds relevant papers and displays related papers in a graphical interface. Etymo constructs and m...

We derive explicit solutions to the problem of completing a partially specified correlation matrix. Our results apply to several block structures for the unspecified entries that arise in insurance and risk management, where an insurance company with many lines of business is required to satisfy certain capital requirements but may have incomplete...

We propose an adaptive scheme to reduce communication overhead caused by data movement by selectively storing the diagonal blocks of a block-Jacobi preconditioner in different precision formats (half, single, or double). This specialized preconditioner can then be combined with any Krylov subspace method for the solution of sparse linear systems to...

We derive an algorithm for computing the wave-kernel functions cosh √A and sinhc √A for an arbitrary square matrix A, where sinhc(z) = sinh(z)/z. The algorithm is based on Padé approximation and the use of double angle formulas. We show that the backward error of any approximation to cosh √A can be explicitly expressed in terms of a hypergeometr...


A current trend in high-performance computing is to decompose a large linear algebra problem into batches containing thousands of smaller problems, that can be solved independently, before collating the results. To standardize the interface to these routines, the community is developing an extension to the BLAS standard (the batched BLAS), enabling...

Solving large numbers of small linear algebra problems simultaneously is becoming increasingly important in many application areas. Whilst many researchers have investigated the design of efficient batch linear algebra kernels for GPU architectures, the common approach for many/multi-core CPUs is to use one core per subproblem in the batch. When so...

In a wide range of applications it is required to compute the nearest correlation matrix in the Frobenius norm to a given symmetric but indefinite matrix. Of the available methods with guaranteed convergence to the unique solution of this problem the easiest to implement, and perhaps the most widely used, is the alternating projections method. Howe...
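
The alternating projections method mentioned in the abstract can be sketched directly: project back and forth between the positive semidefinite cone and the set of unit-diagonal matrices, with Dykstra's correction to ensure convergence to the nearest point. The function name is illustrative and the sketch omits the acceleration the abstract goes on to discuss.

```python
import numpy as np

def nearest_correlation(A, max_iter=200, tol=1e-10):
    # Alternating projections with Dykstra's correction between
    # the PSD cone and the unit-diagonal matrices.
    Y = A.copy()
    dS = np.zeros_like(A)
    for _ in range(max_iter):
        R = Y - dS                         # apply Dykstra's correction
        w, V = np.linalg.eigh((R + R.T) / 2)
        X = (V * np.maximum(w, 0)) @ V.T   # project onto the PSD cone
        dS = X - R                         # store the new correction
        Y_new = X.copy()
        np.fill_diagonal(Y_new, 1.0)       # project onto unit diagonal
        if np.linalg.norm(Y_new - Y) < tol * np.linalg.norm(Y_new):
            return Y_new
        Y = Y_new
    return Y
```

Convergence is guaranteed but only linear, which is precisely the motivation for the faster methods the abstract alludes to.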

We derive a new algorithm for computing the action $f(A)V$ of the cosine, sine, hyperbolic cosine, and hyperbolic sine of a matrix $A$ on a matrix $V$, without first computing $f(A)$. The algorithm can compute $\cos(A)V$ and $\sin(A)V$ simultaneously, and likewise for $\cosh(A)V$ and $\sinh(A)V$, and it uses only real arithmetic when $A$ is real. T...

Indefinite approximations of positive semidefinite matrices arise in various data analysis applications involving covariance matrices and correlation matrices. We propose a method for restoring positive semidefiniteness of an indefinite matrix M0 that constructs a convex linear combination S(α) = αM1 + (1 - α)M0 of M0 and a positive semidefinite ta...
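
Because the smallest eigenvalue of S(α) = αM₁ + (1 − α)M₀ is a concave function of α, the set of α for which S(α) is positive semidefinite is an interval containing α = 1, and the smallest such α can be located by bisection. A sketch of that idea (the paper also develops more efficient methods; the function name is illustrative):

```python
import numpy as np

def shrink_alpha(M0, M1, tol=1e-8):
    # Smallest alpha in [0, 1] with S(alpha) = alpha*M1 + (1-alpha)*M0
    # positive semidefinite, assuming the target M1 is PSD.
    def is_psd(alpha):
        S = alpha * M1 + (1 - alpha) * M0
        return np.linalg.eigvalsh((S + S.T) / 2).min() >= 0
    lo, hi = 0.0, 1.0          # S(1) = M1 is PSD by assumption
    if is_psd(lo):
        return 0.0             # M0 itself is already PSD
    while hi - lo > tol:       # bisect on the PSD/indefinite boundary
        mid = (lo + hi) / 2
        if is_psd(mid):
            hi = mid
        else:
            lo = mid
    return hi
```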

Theoretical and computational aspects of matrix inverse trigonometric and inverse hyperbolic functions are studied. Conditions for existence are given, all possible values are characterized, and the principal values acos, asin, acosh, and asinh are defined and shown to be unique primary matrix functions. Various functional identities are derived, s...

Pathways-reduced analysis is one of the techniques used by the Fispact-II nuclear activation and transmutation software to study the sensitivity of the computed inventories to uncertainties in reaction cross-sections. Although deciding which pathways are most important is very helpful in, for example, determining which nuclear data would benefit from...

Several existing algorithms for computing the matrix cosine employ polynomial or rational approximations combined with scaling and use of a double angle formula. Their derivations are based on forward error bounds. We derive new algorithms for computing the matrix cosine, the matrix sine, and both simultaneously that are backward stable in exact ar...
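
The scaling and double-angle mechanism the abstract refers to is easy to sketch: scale A down, approximate cos on the scaled matrix, then apply cos(2X) = 2 cos(X)² − I repeatedly. The sketch below uses a plain Taylor approximant; the algorithms in the paper use Padé approximants and backward error analysis instead.

```python
import numpy as np

def cosm_double_angle(A, s=8, terms=10):
    # Scale: cos(A) is recovered from cos(A / 2^s) by s double-angle steps.
    n = A.shape[0]
    X = A / 2**s
    X2 = X @ X
    # Truncated Taylor series cos(X) = sum_k (-1)^k X^(2k) / (2k)!.
    C = np.eye(n)
    term = np.eye(n)
    for k in range(1, terms):
        term = term @ X2 * (-1) / ((2*k - 1) * (2*k))
        C = C + term
    # Double angle: cos(2X) = 2 cos(X)^2 - I.
    for _ in range(s):
        C = 2 * (C @ C) - np.eye(n)
    return C
```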

The Fréchet derivative $L_f$ of a matrix function $f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ controls the sensitivity of the function to small perturbations in the matrix. While much is known about the properties of $L_f$ and how to compute it, little attention has been given to higher order Fréchet derivatives. We derive sufficient conditions for the $k$th Fréchet...

The need to estimate structured covariance matrices arises in a variety of applications and the problem is widely studied in statistics. A new method is proposed for regularizing the covariance structure of a given covariance matrix whose underlying structure has been blurred by random noise, particularly when the dimension of the covariance matrix...

In 2011, version 8.6 of the finite element-based structural analysis package Oasys GSA was released. A new feature in this release was the estimation of the 1-norm condition number $\kappa_1(K)=\|K\|_1\|K^{-1}\|_1$ of the stiffness matrix K of structural models by using a 1-norm estimation algorithm of Higham and Tisseur to estimate $\|K^{-1}\|_1$. Th...

The Fréchet derivative $L_f$ of a matrix function $f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ is used in a variety of applications and several algorithms are available for computing it. We define a condition number for the Fréchet derivative and derive upper and lower bounds for it that differ by at most a factor 2. For a wide class of functions we derive an algorithm for es...

The most popular method for computing the matrix logarithm is the inverse scaling and squaring method, which is the basis of the recent algorithm of Al-Mohy and Higham [SIAM J. Sci. Comput., 34 (2012), pp. C152--C169]. For real matrices we develop a version of the latter algorithm that works entirely in real arithmetic and is twice as fast as and m...

The Schur-Padé algorithm of N. J. Higham and L. Lin [SIAM J. Matrix Anal. Appl. 32, No. 3, 1056–1078 (2011; Zbl 1242.65091)] computes arbitrary real powers $A^t$ of a matrix $A\in\mathbb{C}^{n\times n}$ using the building blocks of Schur decomposition, matrix square roots, and Padé approximants. We improve the algorithm by basing the underlying error analysis on the quan...

A popular method for computing the matrix logarithm is the inverse scaling and squaring method, which essentially carries out the steps of the scaling and squaring method for the matrix exponential in reverse order. Here we make several improvements to the method, putting its development on a par with our recent version [SIAM J. Matrix Anal. Appl.,...

Associated with an n×n matrix polynomial P of a given degree are the eigenvalue problem P(λ)x=0 and the linear system problem P(ω)x=b, where in the latter case x is to be computed for many values of the parameter ω. Both problems can be solved by conversion to an equivalent problem L(λ)z=0 or L(ω)z=c that is linear in the parameter λ or ω. This linearizati...

We present a collection of 46 nonlinear eigenvalue problems in the form of a MATLAB toolbox. The collection contains problems from models of real-life applications as well as ones constructed specifically to have particular properties. A classification is given of polynomial eigenvalue problems according to their structural properties. Identifiers...

A new algorithm is developed for computing $e^{tA}B$, where $A$ is an $n\times n$ matrix and $B$ is $n\times n_0$ with $n_0 \ll n$. The algorithm works for any $A$, its computational cost is dominated by the formation of products of $A$ with $n\times n_0$ matrices, and the only input parameter is a backward error tolerance. The algorithm can return...
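
SciPy's `expm_multiply`, which is based on this Al-Mohy–Higham algorithm, makes the point concrete: it computes $e^{tA}B$ using only products of $A$ with $n\times n_0$ blocks, never forming the $n\times n$ exponential. A small demonstration against a dense reference (the symmetric test matrix is chosen only to make the reference easy):

```python
import numpy as np
from scipy.sparse.linalg import expm_multiply

rng = np.random.default_rng(0)
n, n0 = 300, 4
A = rng.standard_normal((n, n)) / np.sqrt(n)
A = (A + A.T) / 2                 # symmetric, so eigh gives a cheap reference
B = rng.standard_normal((n, n0))

Y = expm_multiply(A, B)           # e^A B without forming e^A (t = 1)

w, V = np.linalg.eigh(A)          # dense eigendecomposition reference
Y_ref = (V * np.exp(w)) @ (V.T @ B)
```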

The need to evaluate a function $f(A)\in\mathbb{C}^{n \times n}$ of a matrix $A\in\mathbb{C}^{n \times n}$ arises in a wide and growing number of applications, ranging from the numerical solution of differential equations to measures of the complexity of networks. We give a survey of numerical methods for evaluating matrix functions, along with a b...

We show that the Fréchet derivative of a matrix function $f$ at $A$ in the direction $E$, where $A$ and $E$ are real matrices, can be approximated by $\Im f(A+ihE)/h$ for some suitably small $h$. This approximation, requiring a single function evaluation at a complex argument, generalizes the complex step approximation known in the scalar case. T...
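
A sketch of the approximation for $f = e^A$, using SciPy's `expm` (this is a demonstration of the complex step idea, not the paper's analysis). Because the imaginary part is extracted rather than obtained by subtraction, there is no cancellation and $h$ can be taken far below the square root of machine epsilon. The reference uses the block-triangular identity that $e^{M}$ for $M = \begin{pmatrix} A & E \\ 0 & A \end{pmatrix}$ has the Fréchet derivative of exp as its (1,2) block.

```python
import numpy as np
from scipy.linalg import expm

def frechet_complex_step(f, A, E, h=1e-20):
    # L_f(A, E) ~ Im f(A + i h E) / h for real A and E: a single
    # complex function evaluation, no subtractive cancellation.
    return np.imag(f(A + 1j * h * E)) / h

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
E = rng.standard_normal((4, 4))
L = frechet_complex_step(expm, A, E)

# Exact reference: the (1,2) block of expm of the block triangular matrix.
M = np.block([[A, E], [np.zeros((4, 4)), A]])
L_ref = expm(M)[:4, 4:]
```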

The polar decomposition of a square matrix has been generalized by several authors to scalar products on $\mathbb{R}^n$ or $\mathbb{C}^n$ given by a bilinear or sesquilinear form. Previous work has focused mainly on the case of square matrices, sometimes with the assumption of a Hermitian scalar product. We introduce the canonical generalized polar...

Hyperbolic matrix polynomials are an important class of Hermitian matrix polynomials that contain overdamped quadratics as a special case. They share with definite pencils the spectral property that their eigenvalues are real and semisimple. We extend the definition of hyperbolic matrix polynomial in a way that relaxes the requirement of definitene...

A 25-year-old and somewhat neglected algorithm of Crawford and Moon attempts to determine whether a given Hermitian matrix pair $(A,B)$ is definite by exploring the range of the function $f(x) = x^*(A+iB)x / | x^*(A+iB)x |$, which is a subset of the unit circle. We revisit the algorithm and show that with suitable modifications and careful attentio...

The scaling and squaring method for the matrix exponential is based on the approximation $e^A \approx (r_m(2^{-s}A))^{2^s}$, where $r_m(x)$ is the $[m/m]$ Padé approximant to $e^x$ and the integers m and s are to be chosen. Several authors have identified a weakness of existing scaling and squaring algorithms termed overscaling, in which a value of s much larger than neces...
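
A bare-bones sketch of the approximation $e^A \approx (r_m(2^{-s}A))^{2^s}$: fixed $m$, a crude norm-based choice of $s$, and none of the safeguards against overscaling that the paper develops. The numerator coefficients come from the standard closed form for the diagonal Padé approximant to $e^x$, with the denominator obtained as $q_m(x) = p_m(-x)$.

```python
import numpy as np
from math import comb, factorial

def expm_scaling_squaring(A, m=6):
    n = A.shape[0]
    nrm = np.linalg.norm(A, 1)
    # Scale so that ||2^{-s} A||_1 <= 1/2.
    s = 0 if nrm <= 0.5 else int(np.ceil(np.log2(nrm))) + 1
    X = A / 2**s
    # Pade numerator coefficients (up to a common factor, which cancels
    # in the ratio); the denominator is q_m(x) = p_m(-x).
    c = [factorial(2*m - j) * comb(m, j) for j in range(m + 1)]
    P = np.zeros((n, n)); Q = np.zeros((n, n)); Xj = np.eye(n)
    for j in range(m + 1):
        P += c[j] * Xj
        Q += (-1)**j * c[j] * Xj
        Xj = Xj @ X
    E = np.linalg.solve(Q, P)     # r_m(X) = q_m(X)^{-1} p_m(X)
    for _ in range(s):
        E = E @ E                 # undo the scaling by repeated squaring
    return E
```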

The most common way of solving the quadratic eigenvalue problem (QEP) $(\lambda^2 M + \lambda D + K)x=0$ is to convert it into a linear problem $(\lambda X + Y)z=0$ of twice the dimension and solve the linear problem by the QZ algorithm or a Krylov method. In doing so, it is important to understand the influence of the linearization process on the accuracy and st...

The matrix exponential is a much-studied matrix function having many applications. The Fréchet derivative of the matrix exponential describes the first-order sensitivity of $e^A$ to perturbations in A and its norm determines a condition number for $e^A$. Among the numerous methods for computing $e^A$ the scaling and squaring method is the most widely used...

The most widely used approach for solving the polynomial eigenvalue problem $P(\lambda)x = \bigl(\sum_{i=0}^m \lambda^i A_i\bigr)x = 0$ in $n \times n$ matrices $A_i$ is to linearize to produce a larger order pencil $L(\lambda) = \lambda X + Y$, whose eigensystem is then found by any method for generalized eigenproblems. For a given polynomial P, infinitely many linearizations L exist and approximate eigen...

Hyperbolic quadratic matrix polynomials $Q(\lambda) = \lambda^2 A + \lambda B + C$ are an important class of Hermitian matrix polynomials with real eigenvalues, among which the overdamped quadratics are those with nonpositive eigenvalues. Neither the definition of overdamped nor any of the standard characterizations provides an efficient way to tes...

We study the nonsymmetric algebraic Riccati equation whose four coefficient matrices are the blocks of a nonsingular $M$-matrix or an irreducible singular $M$-matrix $M$. The solution of practical interest is the minimal nonnegative solution. We show that Newton's method with zero initial guess can be used to find this solution without any further...

A standard way of treating the polynomial eigenvalue problem $P(\lambda)x = 0$ is to convert it into an equivalent matrix pencil---a process known as linearization. Two vector spaces of pencils $\mathbb{L}_1(P)$ and $\mathbb{L}_2(P)$, and their intersection $\mathbb{DL}(P)$, have recently been defined and studied by Mackey, Mackey, Mehl, and Mehrmann. The aim of our work i...

Newton’s method for the inverse matrix $p$th root, $A^{-1/p}$, has the attraction that it involves only matrix multiplication. We show that if the starting matrix is $c^{-1}I$ for $c\in\mathbb{R}^+$ then the iteration converges quadratically to $A^{-1/p}$ if the eigenvalues of $A$ lie in a wedge-shaped convex set containing the disc $\{z: |z-c^p| <...

The standard way of solving the polynomial eigenvalue problem of degree $m$ in $n\times n$ matrices is to ``linearize'' to a pencil in $mn\times mn$ matrices and solve the generalized eigenvalue problem. For a given polynomial, $P$, infinitely many linearizations exist and they can have widely varying eigenvalue condition numbers. We investigate th...

For a matrix function f we investigate how to compute a matrix-vector product f(A)b without explicitly computing f(A). A general method is described that applies quadrature to the matrix version of the Cauchy integral theorem. Methods specific to the logarithm, based on quadrature, and fractional matrix powers, based on solution of an ordinary differ...
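
The general quadrature approach can be sketched directly: discretize the Cauchy integral $f(A)b = \frac{1}{2\pi i}\oint f(z)(zI-A)^{-1}b\,dz$ by the trapezoidal rule on a circle enclosing the spectrum, which for analytic $f$ converges exponentially in the number of nodes. A toy dense implementation (the function name and contour choice are illustrative; the paper's methods are considerably more refined):

```python
import numpy as np

def funm_cauchy(f, A, b, center, radius, N=64):
    # Trapezoidal rule on z(theta) = center + radius * e^{i theta}.
    # With dz = i (z - center) d(theta), the i cancels the 1/(2 pi i),
    # leaving an average of f(z_k) (z_k - center) (z_k I - A)^{-1} b.
    n = A.shape[0]
    theta = 2 * np.pi * np.arange(N) / N
    z = center + radius * np.exp(1j * theta)
    acc = np.zeros(n, dtype=complex)
    for zk in z:
        acc += f(zk) * (zk - center) * np.linalg.solve(zk * np.eye(n) - A, b)
    return (acc / N).real   # for real A and b the imaginary parts cancel
```

Each quadrature node costs one shifted linear solve with A, which is the structure the specialised methods in the paper exploit.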

For any matrix automorphism group $\mathbb{G}$ associated with a bilinear or sesquilinear form, Mackey, Mackey, and Tisseur have recently shown that the matrix sign decomposition factors of $A\in\mathbb{G}$ also lie in $\mathbb{G}$; moreover, the polar factors of $A$ lie in $\mathbb{G}$ if the matrix of the underlying form is unitary. Groups satisfying the latter condition includ...


An algorithm for computing matrix functions is presented. It employs a Schur decomposition with reordering and blocking followed by the block form of a recurrence of Parlett, with functions of the nontrivial diagonal blocks evaluated via a Taylor series. A parameter is used to balance the conflicting requirements of producing small diagonal blocks...

This document describes version 1.0 of the toolbox, dated August 23, 2002

Definitions and characterizations of pseudospectra are given for rectangular matrix polynomials expressed in homogeneous form: $P(\alpha,\beta)=\alpha^d A_d+\alpha^{d-1}\beta A_{d-1}+\cdots+\beta^d A_0$. It is shown that problems with infinite (pseudo)eigenvalues are elegantly treated in this framework. For such problems stereographic projection onto the Riemann sphere is shown to provide a co...

An important class of generalized eigenvalue problems Ax=λBx is those in which A and B are Hermitian and some real linear combination of them is definite. For the quadratic eigenvalue problem (QEP) with Hermitian A, B and C and positive definite A, particular interest focuses on problems in which (x*Bx)² − 4(x*Ax)(x*Cx) is one-signed for all non-zero...

We describe a parallel Fortran 77 implementation, in ScaLAPACK style, of a block matrix 1-norm estimator of Higham and Tisseur. This estimator differs from that underlying the existing ScaLAPACK code, PxLACON, in that it iterates with a matrix with t columns, where t ≥ 1 is a parameter, rather than with a vector, and so the basic computational ke...

Pseudospectra associated with the standard and generalized eigenvalue problems have been widely investigated in recent years. We extend the usual definitions in two respects, by treating the polynomial eigenvalue problem and by allowing structured perturbations of a type arising in control theory. We explore connections between structured pseudospe...

For a symmetric positive definite matrix the relative error in the eigenvalues computed by the two-sided Jacobi method is bounded in terms of the condition number of the matrix scaled to have unit diagonal. Similarly, for a general matrix the relative error in the singular values computed by the one-sided Jacobi method is bounded in terms of the co...

We derive an upper bound on the normwise backward error of an approximate solution to the equality constrained least squares problem $\min_{Bx=d} \|b - Ax\|_2$. Instead of minimizing over the four perturbations to A, b, B and d, we fix those to B and d and minimize over the remaining two; we obtain an explicit solution of this simplified minimization proble...

The null space method is a standard method for solving the linear least squares problem subject to equality constraints (the LSE problem). We show that three variants of the method, including one used in LAPACK that is based on the generalized QR factorization, are numerically stable. We derive two perturbation bounds for the LSE problem: one of st...
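
The null space method itself is short enough to sketch: a complete QR factorization of $B^T$ splits $x$ into a particular solution satisfying the constraints plus a free component in the null space of $B$, which is chosen by an unconstrained least squares solve. A minimal dense NumPy version (illustrative; not the LAPACK generalized-QR variant the abstract mentions):

```python
import numpy as np

def lse_nullspace(A, b, B, d):
    # Solve min ||b - A x||_2 subject to B x = d, for B of full
    # row rank p < n, via a complete QR factorization B^T = Q [R; 0].
    p, n = B.shape
    Q, R = np.linalg.qr(B.T, mode='complete')
    Q1, Q2 = Q[:, :p], Q[:, p:]          # range(Q2) = null(B)
    # Particular solution: B x1 = R^T (Q1^T x1) = d.
    x1 = Q1 @ np.linalg.solve(R[:p, :].T, d)
    # Free component: unconstrained least squares over the null space.
    y2 = np.linalg.lstsq(A @ Q2, b - A @ x1, rcond=None)[0]
    return x1 + Q2 @ y2                  # B (Q2 y2) = 0, so B x = d holds
```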

Applications in constrained optimization (and other areas) produce symmetric matrices with a natural block 2 × 2 structure. An optimality condition leads to the problem of perturbing the (1,1) block of the matrix to achieve a specific inertia. We derive a perturbation of minimal norm, for any unitarily invariant norm, that increases the number of n...

Introduction The effects of rounding errors on algorithms in numerical linear algebra have been much-studied for over fifty years, since the appearance of the first digital computers. The subject continues to occupy researchers, for several reasons. First, not everything is known about established algorithms. Second, new algorithms are continually...

The technique of iterative refinement for improving the computed solution to a linear system was used on desk calculators and computers in the 1940s and has remained popular. In the 1990s iterative refinement is well supported in software libraries, notably in LAPACK. Although the behaviour of iterative refinement in floating point arithmetic is re...