# Scalable Parallel Numerical Methods and Software Tools for Material Design.

**ABSTRACT** A new method of solution to the local spin density approximation to the electronic Schrödinger equation is presented. The method is based on an efficient, parallel adaptive multigrid eigenvalue solver. It is shown that adaptivity is both necessary and sufficient to accurately solve the eigenvalue problem near the singularities at the atomic centers. While preliminary, these results suggest that direct real space methods may provide a much needed method for efficiently computing the forces in complex materials.


arXiv:mtrl-th/9412005v1 8 Dec 1994

Scalable Parallel Numerical Methods and Software Tools for Material Design∗

Eric J. Bylaska†

Scott R. Kohn‡

Scott B. Baden‡

Alan Edelman§

Ryoichi Kawai¶

John H. Weare∗∗

M. Elizabeth G. Ong‖

Abstract

A new method of solution to the local spin density approximation to the electronic Schrödinger equation is presented. The method is based on an efficient, parallel, adaptive multigrid eigenvalue solver. It is shown that adaptivity is both necessary and sufficient to accurately solve the eigenvalue problem near the singularities at the atomic centers. While preliminary, these results suggest that direct real space methods may provide a much needed method for efficiently computing the forces in complex materials.

1 Introduction

To intelligently design materials with specific high performance properties, it is necessary to have an understanding of the underlying atomic structure, reactive sites, and other properties of complex candidate compounds. To achieve the generality and reliability needed to predict these properties, methods based on the first principles solution to the electronic Schrödinger equation are required. For systems of typical size, the most reliable and efficient first principles approach is based on the local density approximation (LDA) of Kohn and Sham [8] to the full many-electron Schrödinger equation. However, current methods of solution scale as $O(N^3)$, where $N$ is the number of atoms. For systems of the size commonly encountered in materials science, such calculations are too large to be practical.

∗This work was supported by ONR contract N00014-93-1-0152, AFOSR contract F49620-94-1-0286, and ONR contract N00014-91-J-1835.

†Department of Chemistry, University of California, San Diego.

‡Department of Computer Science and Engineering, University of California, San Diego.

§Department of Mathematics, MIT.

¶Physics Department, University of Alabama, Birmingham.

‖Department of Mathematics, University of California, San Diego.

∗∗To whom correspondence should be addressed. Department of Chemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0340. Tel: (619) 534-3286. Fax: (619) 534-7244. E-mail: jweare@chem.ucsd.edu.

The goal of our program is to develop methods that can efficiently treat large and complex systems. To be successful, we must address the following computational issues:

• The method must be fast to allow simulations requiring thousands of atomic interaction evaluations.

• The method must be capable of high accuracy: 0.02 eV/atom.

• The method must effectively capture the multiple length scales inherent in the problem.

• The method must scale as $N^2$ or less to allow extension to larger systems.

To address these goals, we are developing the following techniques and software tools:

• A rapidly converging method for the non-linear eigenvalue problem arising in the LDA.

• Adaptive methods for resolving the locality of electronic wavefunctions with multiple length scales.

• A software infrastructure to exploit the high performance parallel architectures capable of providing the throughput and memory we require.

2 The LDA Equations

In the LDA, the electronic wavefunctions are given by the solutions to the eigenvalue problem:

$$H\psi_i = \lambda_i \psi_i \qquad (1)$$

where the Hamiltonian $H$ is given by:

$$H = \left[ -\frac{\nabla^2}{2m} + V_{ext} + V_H + V_{xc} \right] \qquad (2)$$

$\lambda_i$ is an eigenvalue, and the eigenvectors (the wavefunctions $\psi_i$) satisfy the usual orthonormality constraints of a symmetric operator. In general, we require the lowest $N$ eigenvalues, where $N$ is the number of electrons in the system. Electron-electron interaction is included in the Hartree potential, $V_H$, and the exchange correlation potential, $V_{xc}$. Both $V_H$ and $V_{xc}$ are functions of the charge density $\rho(\vec{x}) = \sum_{i}^{occ} \psi_i^2$, where the sum includes only occupied orbitals. $V_H$ is the solution to Poisson's equation in free space with this charge density.

Since $V_H$ and $V_{xc}$ are functionals of the electron density, Eq. (1) must be solved self-consistently. That is, an initial density is input and iterations continue until the input and output densities are the same. The $V_{ext}$ potential term represents the attractive interactions of the electrons to the atomic nuclei and is a function of the positions of the atoms. In our simulations, Eq. (1) must be solved many times as the positions of the atoms change.

There are several length scales in the solution of Eq. (1). The overall dimension of the system is determined by the atomic positions and the associated electron density. However, each atomic center is associated with a length related to the effective charge of its nucleus. For example, sodium has a small atomic charge and, therefore, a fairly long length scale (≈ 2.5 Å). On the other hand, oxygen has a high effective charge and a correspondingly short length scale (≈ 0.5 Å).

The presence of several length scales in Eq. (1) poses significant difficulties for present solution methods, which are based on the FFT. Since increases in the overall dimension of the system and in the resolution of the function in real space (because of a short length scale) both require increases in the size of the basis, the use of a planewave basis requires the retention of a very large number of basis functions. The computational cost of this is somewhat offset by the high parallelism and efficient vectorization of the algorithm. However, because of the steepness of the atomic potentials, we have found that on the order of $10^4$ to $10^6$ Fourier functions may be required to obtain sufficient accuracy. Such calculations are extremely CPU intensive.

The eigenvalue equation for a real system is complicated by details which obscure the essential difficulties of its solution [9]. To develop test problems (see Section 4) which retain the essential singular behavior while removing nonessential details, we replace $V_H$, $V_{ext}$, and $V_{xc}$ by simple potentials located at the atomic sites. The solutions to these eigenvalue problems provide little information as to the convergence properties of the numerical method with respect to self-consistency or the efficiency of the solution to the embedded Poisson problem. However, they enable us to address the critical issues of multiple length scales and the singular behavior of the potential.
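The self-consistency requirement described in this section (input a density, solve the eigenvalue problem, and repeat until the input and output densities agree) can be illustrated on a toy problem. The sketch below is not the solver used in this work. It assumes a one-dimensional grid, a second-order finite-difference kinetic term, a softened attractive well standing in for $V_{ext}$, and a purely local term `g * rho` standing in for $V_H + V_{xc}$. The name `scf_toy`, the coupling `g`, and the linear mixing parameter `mix` are hypothetical choices made for illustration only.

```python
import numpy as np


def scf_toy(n=200, box=20.0, n_occ=1, g=0.5, mix=0.3, tol=1e-8, max_iter=200):
    """Toy self-consistent loop: H depends on rho, and rho depends on H's eigenvectors."""
    x = np.linspace(-box / 2, box / 2, n)
    h = x[1] - x[0]
    # Kinetic term -1/2 d^2/dx^2 (units with m = 1), zero boundary values.
    kinetic = (np.diag(np.full(n, 1.0 / h**2))
               - np.diag(np.full(n - 1, 0.5 / h**2), 1)
               - np.diag(np.full(n - 1, 0.5 / h**2), -1))
    v_ext = -1.0 / np.sqrt(x**2 + 1.0)   # softened attraction (assumed model)
    rho = np.full(n, n_occ / box)        # uniform initial density guess
    for iteration in range(1, max_iter + 1):
        # Density-dependent potential: stand-in for V_H + V_xc (assumption).
        hamiltonian = kinetic + np.diag(v_ext + g * rho)
        eigvals, eigvecs = np.linalg.eigh(hamiltonian)
        # Rescale so that sum(|psi|^2) * h = 1 for every orbital.
        orbitals = eigvecs[:, :n_occ] / np.sqrt(h)
        rho_out = np.sum(orbitals**2, axis=1)  # sum over occupied orbitals only
        if np.max(np.abs(rho_out - rho)) < tol:
            return eigvals[:n_occ], rho_out, x, iteration
        # Linear mixing of input and output densities damps oscillations.
        rho = (1.0 - mix) * rho + mix * rho_out
    raise RuntimeError("density did not reach self-consistency")
```

Calling `scf_toy()` returns the occupied eigenvalues, the converged density, the grid, and the iteration count. The dense diagonalization is used only to keep the example short; it is exactly the step that scales poorly for realistic systems.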

3 Parallel Adaptive Solution to the Eigenvalue Problem

We have developed a parallel adaptive eigenvalue solver (AMG) which integrates adaptive mesh refinement techniques [2, 3] with a novel multigrid eigenvalue algorithm [4]. To our knowledge, this is the first time such methods have been combined to solve materials science problems.

We solve the eigenvalue problem using the multigrid method of Cai et al. [4]. Given the linear eigenvalue problem $H\psi = \lambda\psi$, the following efficiently calculates the lowest eigenvalue and eigenvector:

    let ψ be an initial guess (ψ ≠ 0)
    repeat
        H-normalize ψ: (ψ, Hψ) = 1
        let λ = (ψ, Hψ)/(ψ, ψ)
        perform one multigrid V-cycle on (H − λI)ψ = 0
    until ‖(H − λI)ψ‖ < ε (some error tolerance)

Convergence is rapid; for a typical problem, machine precision is reached within fifteen iterations. As with most iterative methods, a good initial guess can significantly speed convergence. To calculate eigenvalues other than the lowest, we apply the above procedure and, after each V-cycle, orthogonalize the candidate eigenvector against all previously calculated eigenvectors.
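The structure of the iteration above (update the Rayleigh quotient, relax on $(H - \lambda I)\psi = 0$, orthogonalize against known eigenvectors, repeat) can be sketched in a few lines. The sketch below is a simplification and not the method of [4]: the multigrid V-cycle is replaced by a fixed number of Richardson relaxation sweeps on a single grid, Euclidean normalization is used instead of H-normalization, and the test operator is a one-dimensional finite-difference harmonic oscillator. The names `harmonic_hamiltonian`, `lowest_eigenpair`, `found`, and `sweeps` are hypothetical.

```python
import numpy as np


def harmonic_hamiltonian(n=64, half_width=8.0):
    """Finite-difference -1/2 d^2/dx^2 + 1/2 x^2 on a uniform 1D grid (test operator)."""
    x = np.linspace(-half_width, half_width, n)
    h = x[1] - x[0]
    return (np.diag(1.0 / h**2 + 0.5 * x**2)
            - np.diag(np.full(n - 1, 0.5 / h**2), 1)
            - np.diag(np.full(n - 1, 0.5 / h**2), -1))


def lowest_eigenpair(H, guess, found=(), tol=1e-10, sweeps=20, max_iter=5000):
    """Lowest eigenpair of symmetric H in the subspace orthogonal to `found`."""
    # Gershgorin bound on the largest eigenvalue gives a safe relaxation step.
    tau = 1.0 / np.max(np.sum(np.abs(H), axis=1))
    psi = np.array(guess, dtype=float)
    for iteration in range(1, max_iter + 1):
        # Orthogonalize against previously calculated eigenvectors.
        for q in found:
            psi -= (q @ psi) * q
        psi /= np.linalg.norm(psi)
        lam = psi @ H @ psi                       # Rayleigh quotient
        if np.linalg.norm(H @ psi - lam * psi) < tol:
            return lam, psi, iteration
        # Stand-in for one V-cycle: Richardson sweeps on (H - lam I) psi = 0.
        for _ in range(sweeps):
            psi -= tau * (H @ psi - lam * psi)
    raise RuntimeError("eigenpair iteration did not converge")
```

A possible use is `lam0, psi0, _ = lowest_eigenpair(H, np.ones(64))` followed by `lowest_eigenpair(H, np.arange(1.0, 65.0), found=[psi0], tol=1e-6)` for the next state. The looser tolerance in the second call reflects that deflating against an approximate eigenvector limits the attainable residual. Without coarse-grid correction this relaxation needs many more sweeps than the fifteen iterations quoted above, which is the cost the multigrid V-cycle is designed to remove.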

Because of the multiple length scales present in our problems, we cannot efficiently represent the eigenvector $\psi$ using a uniform discretization of space. Uniform grids cannot adapt in response to local changes; thus, the grid spacing is dictated by the shortest length scale present in the entire problem. Instead, we represent $\psi$ as a composite grid (see Figure 1), which enables our solver to locally refine the discretization as required by local phenomena. By exploiting locality, we expend computational resources (flops and memory) in those regions of the solution where they are most needed.

[Figure 1 diagram: a composite grid and its decomposition into Level 0, Level 1, and Level 2.]

Figure 1: Wavefunctions are resolved on a composite grid which represents a non-uniform discretization. In practice, composite grids are implemented as a hierarchy of grid levels.

A composite grid logically consists of a single grid in which the discretization is non-uniform. Such grids are actually represented using a hierarchy of levels (see Figure 1). All grids at the same level have the same mesh spacing, but each successive level has finer spacing than the one preceding it, providing a more accurate representation of the solution. We locally refine the grid hierarchy according to an error estimate calculated at run-time. In general, the location and extent of refinement areas must be computed by the application, as they cannot be predicted a priori.
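The level hierarchy described above can be pictured with a small data structure. The following is a minimal one-dimensional sketch under stated assumptions, not the LPARX representation: each `Patch` is an interval with uniform spacing, refinement halves the spacing, and cells are flagged by a hypothetical error estimate `cusp_estimate` (mesh spacing times the slope of $e^{-Z|x|}$). All names and the threshold value are illustrative.

```python
import math
from dataclasses import dataclass


@dataclass
class Patch:
    level: int
    lo: float
    hi: float
    cells: int

    @property
    def spacing(self):
        return (self.hi - self.lo) / self.cells


def cusp_estimate(x, h, z=1.0):
    """Hypothetical error indicator: mesh spacing times |d/dx exp(-z|x|)|."""
    return h * z * math.exp(-z * abs(x))


def refine(patch, estimate, threshold, ratio=2):
    """Cover each run of flagged cells with one child patch of finer spacing."""
    h = patch.spacing
    flags = [estimate(patch.lo + (i + 0.5) * h, h) > threshold
             for i in range(patch.cells)]
    children, start = [], None
    for i, flagged in enumerate(flags + [False]):   # sentinel closes the last run
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            children.append(Patch(patch.level + 1, patch.lo + start * h,
                                  patch.lo + i * h, (i - start) * ratio))
            start = None
    return children


def build_hierarchy(root, estimate, threshold, max_level):
    """Refine level by level until nothing is flagged or max_level is reached."""
    levels = [[root]]
    while len(levels) <= max_level:
        children = [c for p in levels[-1] for c in refine(p, estimate, threshold)]
        if not children:
            break
        levels.append(children)
    return levels
```

For example, `build_hierarchy(Patch(0, -8.0, 8.0, 16), cusp_estimate, 0.1, 3)` produces patches that shrink toward the cusp at the origin while the spacing halves at each level, so the total cell count stays well below that of a uniform grid at the finest spacing.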

We implemented our solver using the LPARX [6] parallel programming system, which provides efficient run-time support for scientific calculations with dynamic, block structured data. The use of LPARX was essential in facilitating code development; managing the complicated data structures of a composite grid hierarchy would have been a daunting task without LPARX, especially on parallel architectures. LPARX enables us to run the same code on a diversity of high performance parallel architectures, including the CM-5, Paragon, single processor workstations, Cray C-90, SP-1, and networks of workstations. For more details concerning the implementation and performance, refer to [7] in these proceedings.

4 Model Problems

All of the following model problems were solved in 3d; we did not attempt to exploit symmetry. Each AMG solution required approximately one minute running on an IBM RS/6000 model 590.

4.1 The Hydrogen Atom

In this problem, the Hamiltonian has a deceptively simple form with only a single potential term:

$$H = -\frac{\nabla^2}{2m} - \frac{Z}{r}. \qquad (3)$$

While the eigenvalue problem corresponding to Eq. (3) can be solved analytically, the singular behavior at $r = 0$ can cause significant problems for numerical methods. In fact, it cannot be conveniently solved with our present FFT methods. For example, for the lowest eigenvalue, our FFT algorithm with $64^3$ mesh points gives the value -0.69 rather than the correct value of -0.5. The lowest energy solution in our units is an exponential with the form $e^{-Zr}$ and energy $E = -Z^2/2$. Note that the severity of the singularity with increasing $Z$ is reflected in the increasing localization of the solution around the origin. As $Z$ increases, the density of points in an adaptive method will increase near the origin.
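As a check on the analytic solution quoted above, the exponential can be substituted directly into Eq. (3). The short derivation below assumes units with $m = 1$ and uses the radial form of the Laplacian for a spherically symmetric function; the minus sign on the energy is the one consistent with the value -0.5 for $Z = 1$.

```latex
\begin{align*}
\psi(r) &= e^{-Zr}, \qquad
\nabla^2 \psi = \frac{d^2\psi}{dr^2} + \frac{2}{r}\frac{d\psi}{dr}
             = \left( Z^2 - \frac{2Z}{r} \right) \psi, \\
H\psi &= -\frac{1}{2}\left( Z^2 - \frac{2Z}{r} \right)\psi - \frac{Z}{r}\,\psi
       = -\frac{Z^2}{2}\,\psi .
\end{align*}
```

The $Z/r$ terms cancel exactly, which is why the cusp in $\psi$ at $r = 0$ must be resolved accurately: a discretization that misrepresents the slope there fails to cancel the singular potential and produces an eigenvalue error of the kind seen in the FFT result.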

The $Z = 1$ solution corresponds to the hydrogen problem. It is plotted in Figure 2(a). We note that the AMG solution and the exact solution (not plotted) are identical on the scale of the graph. The cusp at the origin is a result of the singular nature of the potential at this point. This behavior is usually difficult to resolve with a numerical method [5].

As the singularity strengthens with increasing charge, the lowest energy scales as $Z^2$. Figure 2(b) illustrates how this behavior is reproduced by the AMG solution. As expected, to obtain the correct scaling, it is necessary to go to higher levels of adaptivity. However, because of increased localization, the total number of points remains roughly the same. To illustrate the efficiency of adaptivity, we note that the resolution at the finest level is equivalent to a uniform grid with $4096^3$ basis elements, as compared to the fewer than $64^3$ points required by the adaptive algorithm.
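The size of this saving is easy to quantify. The arithmetic below is only an illustration: it assumes one double-precision value (8 bytes) per grid point, which is not stated in the text, and the helper names are hypothetical.

```python
def uniform_grid_points(points_per_side, dims=3):
    """Number of points in a uniform grid with the given resolution per side."""
    return points_per_side ** dims


def storage_bytes(points, bytes_per_value=8):
    """Memory for one scalar field, assuming 8-byte values (an assumption)."""
    return points * bytes_per_value


fine = uniform_grid_points(4096)           # uniform grid at the finest resolution
adaptive_bound = uniform_grid_points(64)   # upper bound on adaptive points
ratio = fine // adaptive_bound             # how many times larger the uniform grid is

GIB = 2 ** 30
MIB = 2 ** 20
fine_gib = storage_bytes(fine) / GIB                # 512 GiB per wavefunction
adaptive_mib = storage_bytes(adaptive_bound) / MIB  # 2 MiB per wavefunction
```

Under these assumptions the uniform grid would hold $64^3 = 262{,}144$ times as many points as the adaptive bound, or roughly 512 GiB against 2 MiB for a single wavefunction.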

4.2 The H$_2^+$ Molecule

A problem that is similar to the hydrogen atom problem, but more commonly used as a test problem for chemical methods, is the H$_2^+$ molecule.

[Figure 2 plots. Left panel: "Hydrogen Wavefunction", wavefunction versus distance (au). Right panel: "Eigenvalues for -Z/R Potential", eigenvalue versus Z, with curves labeled Exact, Three Adaptive Levels, Four Adaptive Levels, and Uniform Levels Only.]

Figure 2: The left graph displays the lowest energy eigenvector for the hydrogen atom; graph data was extracted from the 3d volume along the Z axis. Tick marks on the abscissa represent mesh points. The right plot shows the eigenvalues for a $-Z/R$ potential.

In this problem, there is only one electron. However, there are two centers with singularities. The Hamiltonian is:

$$H = -\frac{\nabla^2}{2m} - \frac{1}{\left|\vec{r} + \vec{R}_a/2\right|} - \frac{1}{\left|\vec{r} - \vec{R}_a/2\right|}, \qquad (4)$$

where $\vec{R}_a$ is the atomic separation. This problem can also be solved analytically [1]. Again, it is too stiff for practical solution by FFT. On the other hand, the AMG method does quite well, as illustrated by the binding energy curve in Figure 3(b). (Binding energy is defined as the total energy of the atoms at a specified distance minus the energy at infinite separation.) The wave function is plotted in Figure 3(a). Note the increased density of points in the vicinity of the nuclei.

[Figure 3 plots. Left panel: "H2+ Wavefunction", wavefunction versus distance (au). Right panel: "Morse Plot for H2+", binding energy (Hartrees) versus atomic separation, with curves labeled Exact Energy and Calculated Energy.]

Figure 3: The left graph displays the lowest energy eigenvector for the hydrogen molecular ion; graph data was extracted from the 3d volume along the Z axis. Tick marks on the abscissa represent mesh points. The right plot shows binding energy as a function of atomic separation.

4.3 Adaptive Multigrid vs. FFT

In this test problem, we soften the singularities in the original potential by introducing an error function with a variable cutoff ($r_{cut}$). We replace the $1/r$ potentials of Eq. (4) with the smoothed potentials $\mathrm{erf}(r/r_{cut})/r$. If this potential is sufficiently softened (i.e., $r_{cut}$ is large), the FFT, the uniform grid, and the AMG methods will all converge to the same answer. Results are summarized in Table 1. The exact answer for these parameters and $r_{cut} = 0$ is -0.911. It is clear that both the uniform grid method and the FFT method lose accuracy quickly as $r_{cut}$ approaches 0.
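The effect of the softening is easiest to see by evaluating the smoothed potential directly. The sketch below assumes the two-center form of Eq. (4) with each $1/r$ replaced by $\mathrm{erf}(r/r_{cut})/r$; the function names are hypothetical, and the finite value at the origin is the standard limit $2/(\sqrt{\pi}\, r_{cut})$ of $\mathrm{erf}(r/r_{cut})/r$ as $r \to 0$.

```python
import math


def softened_coulomb(r, r_cut):
    """erf(r / r_cut) / r; reduces to the bare 1/r when r_cut is 0."""
    if r_cut == 0.0:
        return math.inf if r == 0.0 else 1.0 / r
    if r == 0.0:
        # Limit of erf(r / r_cut) / r as r -> 0: the singularity becomes finite.
        return 2.0 / (math.sqrt(math.pi) * r_cut)
    return math.erf(r / r_cut) / r


def h2_plus_potential(point, separation, r_cut):
    """Two attractive centers at (0, 0, +separation/2) and (0, 0, -separation/2)."""
    x, y, z = point
    r_plus = math.sqrt(x * x + y * y + (z + separation / 2.0) ** 2)
    r_minus = math.sqrt(x * x + y * y + (z - separation / 2.0) ** 2)
    return -softened_coulomb(r_plus, r_cut) - softened_coulomb(r_minus, r_cut)
```

Far from a nucleus the softened and bare potentials agree, since the error function saturates at 1. Near a nucleus the depth of the well is capped at a value proportional to $1/r_{cut}$, so shrinking $r_{cut}$ restores the steep feature that uniform and planewave discretizations struggle to resolve.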

Table 1: A comparison of eigenvalues for the FFT, adaptive multigrid solver, and a uniform grid solver. All methods used approximately the same number of basis elements. The known solution for $r_{cut} = 0$ is -0.911.

| $r_{cut}$ | FFT | Adaptive | Uniform |
|---|---|---|---|
| 0.0 | -1.0946 | -0.9005 | -1.2009 |
| 0.1 | -0.9986 | -0.8931 | -1.0353 |
| 0.2 | -0.8998 | -0.8734 | -0.9035 |
| 0.3 | -0.8664 | -0.8551 | -0.8672 |
| 0.4 | -0.8427 | -0.8325 | -0.8430 |

References

[1] D. R. Bates, K. Ledsham, and A. L. Stewart, Wave functions of the hydrogen molecular ion, Phil. Trans. Roy. Soc. London, 246 (1953), pp. 215–240.

[2] M. J. Berger and P. Colella, Local adaptive mesh refinement for shock hydrodynamics, Journal of Computational Physics, 82 (1989), pp. 64–84.

[3] M. J. Berger and J. Oliger, Adaptive mesh refinement for hyperbolic partial differential equations, Journal of Computational Physics, 53 (1984), pp. 484–512.

[4] Z. Cai, J. Mandel, and S. McCormick, Multigrid methods for nearly singular linear equations and eigenvalue problems. (submitted for publication), 1994.

[5] K. Cho, T. A. Arias, J. D. Joannopoulos, and P. K. Lam, Wavelets in electronic structure calculations, Physical Review Letters, 71 (1993), pp. 1808–1811.

[6] S. R. Kohn and S. B. Baden, A robust parallel programming model for dynamic non-uniform scientific computations, in Proceedings of the 1994 Scalable High Performance Computing Conference, May 1994.

[7] S. R. Kohn and S. B. Baden, The parallelization of an adaptive multigrid eigenvalue solver with LPARX, in Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, San Francisco, CA, February 1995.

[8] W. Kohn and L. Sham, Phys. Rev., 140 (1965), p. A1133.

[9] M. W. Sung, Molecular Dynamics Simulation Of Metallic Clusters and Heteroclusters, PhD thesis, University of California, San Diego, 1994.
