
KSSOLV - A MATLAB Toolbox for Solving the Kohn-Sham Equations

Chao Yang, Juan C. Meza, Byounghak Lee, and Lin-Wang Wang

We describe the design and implementation of KSSOLV, a MATLAB toolbox for solving a class of

nonlinear eigenvalue problems known as the Kohn-Sham equations. These types of problems arise

in electronic structure calculations, which are nowadays essential for studying the microscopic

quantum mechanical properties of molecules, solids and other nanoscale materials. KSSOLV is

well suited for developing new algorithms for solving the Kohn-Sham equations and is designed

to enable researchers in computational and applied mathematics to investigate the convergence

properties of the existing algorithms. The toolbox makes use of the object-oriented programming

features available in MATLAB so that the process of setting up a physical system is straightforward
and the amount of coding effort required to prototype, test and compare new algorithms

is significantly reduced. All of these features should also make this package attractive to other

computational scientists and students who wish to study small to medium size systems.

Categories and Subject Descriptors: G.1.10 [Numerical Analysis]: Applications – Electronic
Structure Calculation; G.1.3 [Numerical Analysis]: Numerical Linear Algebra; G.1.6 [Numerical
Analysis]: Optimization; G.4 [Mathematics of Computing]: Mathematical Software –
Algorithm Design and Analysis

General Terms: nonlinear eigenvalue problem, density functional theory (DFT), Kohn-Sham equations,
self-consistent field iteration (SCF), direct constrained minimization (DCM)

Additional Key Words and Phrases: planewave discretization, pseudopotential

1. INTRODUCTION

KSSOLV is a MATLAB toolbox for solving a class of nonlinear eigenvalue problems

known as the Kohn-Sham equations. These types of problems arise in electronic

structure calculations, which are nowadays essential for studying the microscopic

quantum mechanical properties of molecules, solids and other nanoscale materials.

Of the many approaches for studying the electronic structure of molecular systems,

methods based on Density Functional Theory (DFT) [Hohenberg and Kohn 1964]

have been shown to be among the most successful. Through the DFT formalism,

one can reduce the many-body Schrödinger equation used to describe the electron-electron
and electron-nucleus interactions to a set of single-electron equations that

have far fewer degrees of freedom. These equations, which we will describe in more

detail in the next section, were first developed by W. Kohn and L. J. Sham [Kohn

and Sham 1965]. Discretizing the single-electron equations results in a set of nonlinear
equations that resemble algebraic eigenvalue problems presented in standard

...

Permission to make digital/hard copy of all or part of this material without fee for personal

or classroom use provided that the copies are not made or distributed for profit or commercial

advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and

notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish,

to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.

© 2008 ACM 1529-3785/2008/0700-0001 $5.00

ACM Transactions on Mathematical Software, Vol. V, No. N, June 2008, Pages 1–34.


linear algebra textbooks [Demmel 1997; Golub and Van Loan 1989; Trefethen and

Bau III 1997]. The main feature distinguishing the Kohn-Sham equations from the

standard linear eigenvalue problem is that the matrix operator in these equations is

a function of the eigenvectors that must be computed. For this reason, the problem

defined by the Kohn-Sham equations is more accurately described as a nonlinear

eigenvalue problem.

Due to the nonlinear coupling between the matrix operator and its eigenvectors,

the Kohn-Sham equations are more difficult to solve than standard linear eigenvalue

problems. Currently, the most widely used numerical method for solving this type

of problem is the Self Consistent Field (SCF) iteration, which we will examine in

detail in Section 3. The SCF iteration has been implemented in almost all quantum

chemistry and physics software packages. However, the mathematical convergence

properties of SCF are not yet fully understood; for example, it is well known that

the simplest form of SCF iteration often fails to converge to the correct solution

[Le Bris 2005]. Although a number of techniques have been developed by chemists

and physicists to improve the convergence of SCF, these methods are also not well

understood, and they can fail in practice as well.

Clearly, more work is needed to investigate the mathematical properties of the

Kohn-Sham equations, to rigorously analyze the convergence behavior of the SCF

iteration, and to develop improved numerical methods that are both reliable and

efficient. Some progress has recently been made in this direction [Le Bris 2005;

Cancès and Le Bris 2000b; Cancès 2001]. However, many efforts have been hampered
within the larger applied mathematics community by the lack of mathematical
software tools that one can use to quickly grasp the numerical properties of

the Kohn-Sham equations and to perform simple computational experiments on

realistic systems.

The lack of such software tools also makes it difficult to introduce basic DFT

concepts and algorithms into numerical analysis courses, even though these ideas

are relatively well developed in computational chemistry and physics curricula. Al-

though a number of well designed software packages are available for performing

DFT calculations on large molecules and bulk systems [Gonze et al. 2002; Baroni

et al. 2006; Kresse and Furthmüller 1996; Kronik et al. 2006; Andreoni and Curioni

2000; Wang 2008; Shao et al. 2006], it is often a daunting task for researchers and

students with a minimal physics or chemistry background to delve into these codes

to extract mathematical relations from various pieces of the software. Furthermore,

because these codes are usually designed to handle large systems efficiently on paral-

lel computers, the data structures employed to encode basic mathematical objects

such as vectors and matrices are often sophisticated and difficult to understand.

Consequently, standard numerical operations such as fast Fourier transforms, nu-

merical quadrature calculations, and matrix vector multiplications become non-

transparent, making it difficult for a computational mathematician to develop and

test new ideas in such an environment.

The KSSOLV toolbox we developed provides a tool that will enable computa-

tional mathematicians and scientists to study properties of the Kohn-Sham equa-

tions by rapidly prototyping new algorithms and performing computational experi-

ments more easily. It will also allow them to develop and compare numerical methods
for solving these types of problems in a user friendly environment. One of the

main features of KSSOLV is its object-oriented design, which allows users with a

minimal physics or chemistry background to assemble a realistic atomistic system

quickly. The toolbox also allows developers to easily manipulate wavefunctions and

Hamiltonians within a more familiar linear algebra framework.

We will present the main features and capabilities of KSSOLV in this paper. Since

KSSOLV is targeted primarily towards users who are interested in the numerical

analysis aspects of electronic structure calculations, our focus will be on numerical

algorithms and how they can be easily prototyped within KSSOLV. We provide

some background information on the Kohn-Sham equations and their properties in

Section 2. Numerical methods for solving these types of problems are discussed in

Section 3 along with some of the difficulties one may encounter. We then describe

the design features and the implementation details of KSSOLV in Section 4. In

Section 5, we illustrate how an algorithm for solving the Kohn-Sham equations

can be easily implemented in KSSOLV. Several examples are provided in Section 6

to demonstrate how KSSOLV can be used to study the convergence behavior of

different algorithms and visualize the computed results. Throughout this paper,

we will use ‖·‖ to denote the 2-norm of a vector, and ‖·‖_F to denote the Frobenius

norm of a matrix.

2. KOHN-SHAM ENERGY MINIMIZATION

Properties of molecules, solids and other nanoscale materials are largely determined

by the interactions among electrons in the outer shells of their atomic constituents.

These interactions can be characterized quantitatively by the electron density, which

can be viewed as a multi-dimensional probability distribution. The electron density

of a many-atom system can be obtained by solving the well known many-body

Schrödinger equation
$$H\Psi(r_1, r_2, \ldots, r_{n_e}) = \lambda\Psi(r_1, r_2, \ldots, r_{n_e}). \tag{1}$$
Here $\Psi(r_1, r_2, \ldots, r_{n_e})$ ($r_i \in \mathbb{R}^3$ and $n_e$ is the number of electrons) is a many-body
wavefunction whose magnitude squared characterizes an electronic configuration in
a probabilistic sense, i.e., $|\Psi(r_1, r_2, \ldots, r_{n_e})|^2 \, dr_1 dr_2 \cdots dr_{n_e}$ represents the probability
of finding electron 1 in a small volume around $r_1$, electron 2 in a small volume
around $r_2$, etc., and
$$\int_\Omega \Psi^*\Psi \, d\Omega = 1, \tag{2}$$

where $\Omega = \Omega_1 \times \Omega_2 \times \cdots \times \Omega_{n_e}$, and $\Omega_i \subseteq \mathbb{R}^3$. Furthermore, the wavefunction must also
obey the antisymmetry principle, defined by
$$\Psi(r_1, \ldots, r_i, \ldots, r_k, \ldots, r_{n_e}) = -\Psi(r_1, \ldots, r_k, \ldots, r_i, \ldots, r_{n_e}). \tag{3}$$
The differential operator H is a many-body Hamiltonian that relates the electronic
configuration to the energy of the system. When appropriate boundary conditions
are imposed, the energy must be quantized and is denoted here by $\lambda \in \mathbb{R}$.

Using the Born-Oppenheimer approximation, which is to say that we assume the
positions of the nuclei $\hat{r}_j$, $j = 1, 2, \ldots, n_u$, are fixed, where $n_u$ denotes the number


of nuclei, the many-electron Hamiltonian H can be defined (in atomic units) by

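$$H = -\frac{1}{2}\sum_{i=1}^{n_e} \Delta_{r_i} - \sum_{j=1}^{n_u}\sum_{i=1}^{n_e} \frac{z_j}{\|r_i - \hat{r}_j\|} + \sum_{1 \le i < j \le n_e} \frac{1}{\|r_i - r_j\|}, \tag{4}$$
where $\Delta_{r_i}$ is the Laplacian operator associated with the ith electron, and $z_j$ is the
charge of the jth nucleus. The last sum runs over distinct electron pairs, so each
electron-electron repulsion is counted once.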

Equation (1) is clearly an eigenvalue problem. In many cases, we are interested in

the eigenfunction Ψ associated with the smallest eigenvalue λ1, which corresponds

to the minimum (ground state) of the total energy functional

$$E_{total}(\Psi) = \int_\Omega \Psi^* H \Psi \, d\Omega, \tag{5}$$

subject to the normalization and antisymmetry constraints (2) and (3). For atoms

and small molecules that consist of only a few electrons (fewer than three), we can discretize
(1) and solve the eigenvalue problem directly. However, as $n_e$ increases, the
number of degrees of freedom in (1), after it is discretized, increases exponentially,
making the problem computationally intractable. For example, if each $r_i$ is discretized
on an $m \times m \times m$ grid, the dimension of H is $n = m^{3 n_e}$. For $m = 32$ and $n_e = 5$,
n is greater than $3.5 \times 10^{22}$. Thus, it would be infeasible to solve the resulting

eigenvalue problem on even the most powerful computers available today.
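The growth of n can be checked with a few lines of arithmetic. The following is a minimal sketch in Python (not part of KSSOLV); the function name `many_body_dimension` is introduced here only for illustration, and the storage estimate assumes 8 bytes per vector entry.

```python
def many_body_dimension(m: int, n_e: int) -> int:
    """Number of unknowns when each of the n_e electron coordinates
    is sampled on an m x m x m grid: n = m**(3 * n_e)."""
    return m ** (3 * n_e)


n = many_body_dimension(32, 5)  # the case quoted in the text
print(f"n = {n:.3e}")

# Rough storage for a single real vector of length n (assumed 8 bytes per entry).
print(f"one vector needs about {8 * n / 1e21:.0f} zettabytes")
```

Even a single vector of this length is far beyond the memory of any existing machine, which is why the discretized problem (1) is not attacked directly.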

To address the dimensionality curse, several approximation techniques have been

developed to decompose the many-body Schrödinger equation (1) into a set of

single-electron equations that are coupled through the electron density (defined

below). The most successful among these is based on Density Functional The-

ory [Hohenberg and Kohn 1964]. In their seminal work, Hohenberg and Kohn

proved that at the ground-state, the total energy of an electronic system can be

described completely by a function of the 3-D electron density

$$\rho(r) \equiv n_e \int_{\Omega \setminus \Omega_1} |\Psi(r, r_2, r_3, \ldots, r_{n_e})|^2 \, dr_2 dr_3 \cdots dr_{n_e}.$$
Assuming all electrons are indistinguishable, the quantity $\rho(r)\,dr/n_e$ gives the probability
of finding an electron within a small volume around $r \in \mathbb{R}^3$.

Unfortunately the proof given in [Hohenberg and Kohn 1964] is not constructive

and the analytical expression for this density-dependent total energy functional

is unknown. Subsequently, Kohn and Sham [Kohn and Sham 1965] proposed a

practical procedure to approximate the total energy by making use of single-electron

wavefunctions associated with a non-interacting reference system. Using this Kohn-

Sham model, the total energy (5) can be defined as

$$E^{KS}_{total} = \frac{1}{2}\sum_{i=1}^{n_e} \int_\Omega \|\nabla\psi_i(r)\|^2 \, dr + \int_\Omega \rho(r) V_{ion}(r) \, dr + \frac{1}{2}\int_\Omega\int_\Omega \frac{\rho(r)\rho(r')}{\|r - r'\|} \, dr \, dr' + E_{xc}(\rho), \tag{6}$$
where $\psi_i$, $i = 1, 2, \ldots, n_e$, are the single-particle wavefunctions that satisfy the orthonormality
constraint $\int_\Omega \psi_i^* \psi_j = \delta_{i,j}$, and $\Omega \subset \mathbb{R}^3$. Here $\rho(r)$ is the charge density


defined by

$$\rho(r) = \sum_{i=1}^{n_e} \psi_i^*(r)\psi_i(r). \tag{7}$$
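On a sampling grid, (7) amounts to summing the squared magnitudes of the orbital values at each grid point. The sketch below is written in plain Python for illustration only; `charge_density` and the two sample orbitals are hypothetical and are not taken from KSSOLV.

```python
def charge_density(orbitals):
    """rho[k] = sum_i |psi_i[k]|^2 for orbitals sampled on a common grid."""
    n_points = len(orbitals[0])
    return [sum(abs(psi[k]) ** 2 for psi in orbitals) for k in range(n_points)]


# Two hypothetical orbitals sampled at four grid points (complex values allowed).
# They are orthonormal with respect to the plain discrete inner product.
psi_1 = [0.5, 0.5, 0.5, 0.5]
psi_2 = [0.5, 0.5j, -0.5, -0.5j]

rho = charge_density([psi_1, psi_2])
print(rho, sum(rho))
```

With orthonormal orbitals and unit quadrature weights, the entries of `rho` sum to the number of orbitals, which mirrors the fact that the charge density integrates to $n_e$.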

The function $V_{ion}(r) = \sum_{j=1}^{n_u} z_j / \|r - \hat{r}_j\|$ represents the ionic potential induced
by the nuclei, and $E_{xc}(\rho)$ is known as the exchange-correlation energy, which is
a correction term used to account for energy that the non-interacting reference
system fails to capture.

As the analytical form of $E_{xc}(\rho)$ is unknown, several approximations have been
derived semi-empirically [Perdew and Zunger 1981; Perdew and Wang 1992]. In
KSSOLV, we use the local density approximation (LDA) proposed in [Kohn and
Sham 1965]. In particular, $E_{xc}$ is expressed as

$$E_{xc}(\rho) = \int_\Omega \rho(r)\,\epsilon_{xc}[\rho(r)] \, dr, \tag{8}$$
where $\epsilon_{xc}(\rho)$ represents the exchange-correlation energy per particle in a uniform
electron gas of density $\rho$. The analytical expression of $\epsilon_{xc}$ used in KSSOLV is the
widely accepted formula developed in [Perdew and Zunger 1981]. To simplify the
presentation, we have ignored the spin degree of freedom in $\psi_i(r)$, $\rho(r)$ and $E_{xc}$.

For some applications, it is important to include this extra degree of freedom which

gives the local spin-density approximation (LSDA) of Exc.
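To make the structure of (8) concrete, the sketch below evaluates only the exchange part of the LDA energy on a uniform grid. It is a simplified illustration and not the formula used in KSSOLV: the correlation contribution of the Perdew-Zunger parametrization is deliberately omitted, the function names are hypothetical, and Hartree atomic units are assumed. The exchange energy per particle of a uniform electron gas is $\epsilon_x(\rho) = -\frac{3}{4}(3/\pi)^{1/3}\rho^{1/3}$.

```python
import math


def eps_x(rho: float) -> float:
    """Exchange energy per particle of a uniform electron gas (Hartree atomic units)."""
    return -0.75 * (3.0 / math.pi) ** (1.0 / 3.0) * rho ** (1.0 / 3.0)


def lda_exchange_energy(rho_samples, cell_volume: float) -> float:
    """Approximate the integral in (8) by a sum over a uniform grid (exchange part only)."""
    dv = cell_volume / len(rho_samples)  # volume element per grid point
    return sum(r * eps_x(r) for r in rho_samples) * dv


# Hypothetical uniform density sampled at 64 grid points in a cell of volume 8.
rho_samples = [0.1] * 64
e_x = lda_exchange_energy(rho_samples, cell_volume=8.0)
print(e_x)
```

The same quadrature pattern applies to the full $\epsilon_{xc}$; only the per-particle energy function changes.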

It is not difficult to show that the first order necessary condition (Euler-Lagrange

equation) for the constrained minimization problem

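$$\min_{\{\psi_i\}} E^{KS}_{total}(\{\psi_i\}) \quad \text{s.t.} \quad \psi_i^*\psi_j = \delta_{i,j} \tag{9}$$
has the form
$$H(\rho)\psi_i = \lambda_i\psi_i, \quad i = 1, 2, \ldots, n_e, \tag{10}$$
$$\psi_i^*\psi_j = \delta_{i,j}, \tag{11}$$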

where the single-particle Hamiltonian H(ρ) (also known as the Kohn-Sham Hamil-

tonian) is defined by

$$H(\rho) = -\frac{1}{2}\Delta + V_{ion}(r) + \rho(r) \star \frac{1}{\|r\|} + V_{xc}(\rho), \tag{12}$$

where ⋆ denotes the convolution operator. The function Vxc(ρ) in (12) is the deriva-

tive of Exc(ρ) with respect to ρ. Because the Kohn-Sham Hamiltonian is a function

of ρ, which is in turn a function of {ψi}, the set of equations defined by (10) results

in a nonlinear eigenvalue problem. These equations are collectively referred to as

the Kohn-Sham equations. Interested readers can learn more about these equations

from several sources (e.g. [Nogueira et al. 2003]).
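The step from the minimization problem (9) to the equations (10)-(11) can be outlined as follows. This is a sketch under stated assumptions rather than a derivation taken from the cited references: boundary terms are assumed to vanish (e.g., under periodic boundary conditions), $\psi_i$ and $\psi_i^*$ are varied independently, and the multipliers $\lambda_{ij}$ are notation introduced here for illustration.

```latex
\begin{aligned}
\mathcal{L}(\{\psi_i\},\{\lambda_{ij}\})
  &= E^{KS}_{total}(\{\psi_i\})
   - \sum_{i,j=1}^{n_e} \lambda_{ij}
     \Big( \int_\Omega \psi_i^*(r)\psi_j(r)\,dr - \delta_{i,j} \Big), \\
\frac{\delta}{\delta\psi_i^*}\,
  \frac{1}{2}\sum_{k=1}^{n_e}\int_\Omega \|\nabla\psi_k(r)\|^2\,dr
  &= -\frac{1}{2}\Delta\psi_i, \\
\frac{\delta}{\delta\psi_i^*}
  \Big[ \int_\Omega \rho V_{ion}\,dr
      + \frac{1}{2}\int_\Omega\int_\Omega
        \frac{\rho(r)\rho(r')}{\|r-r'\|}\,dr\,dr'
      + E_{xc}(\rho) \Big]
  &= \Big[ V_{ion}(r) + \rho(r)\star\frac{1}{\|r\|} + V_{xc}(\rho) \Big]\psi_i, \\
\frac{\delta\mathcal{L}}{\delta\psi_i^*} = 0
  \quad &\Longrightarrow \quad
  H(\rho)\psi_i = \sum_{j=1}^{n_e} \lambda_{ij}\psi_j .
\end{aligned}
```

The third line uses the chain rule with $\delta\rho(r)/\delta\psi_i^*(r) = \psi_i(r)$, which follows from (7). Because both the energy and the orthonormality constraint are invariant under a unitary rotation of the orbitals, the Hermitian matrix $(\lambda_{ij})$ can be diagonalized, which gives the form (10).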

3. NUMERICAL METHODS

In this section, we will describe the numerical methods employed in KSSOLV to

obtain an approximate solution to the Kohn-Sham equations (10)-(11). We begin


by discussing the planewave discretization scheme that turns the continuous non-

linear problem into a finite-dimensional problem. The finite-dimensional problem is

expressed as a matrix problem in Section 3.2. We present two different approaches

to solving the matrix nonlinear eigenvalue problem in Sections 3.3 and 3.4. Both

of these approaches have been implemented in KSSOLV.

3.1 Planewave Discretization

To solve the minimization problem (9) or the Kohn-Sham equations numerically,

we must first discretize the continuous problem. Standard discretization schemes

such as finite difference, finite elements and other basis expansion (Ritz-Galerkin)

methods [Ritz 1908] all have been used in practice. The discretization scheme we

have implemented in the current version of KSSOLV is a Ritz type of method that

expresses a single electron wavefunction $\psi(r)$ as a linear combination of planewaves
$\{e^{-i g_j^T r}\}$, where $g_j \in \mathbb{R}^3$ ($j = 1, 2, \ldots, n_g$) are frequency vectors arranged in a
lexicographical order. The planewave basis is a natural choice for studying periodic
systems such as solids. It can also be applied to non-periodic structures (e.g.,
molecules) by embedding these structures in a fictitious supercell [Payne et al.
1992] that is periodically extended throughout an open domain. The use of the
planewave basis has the additional advantage of making various energy calculations
in density functional theory easy to implement. It is the most convenient choice for
developing and testing numerical algorithms for solving the Kohn-Sham equations
within the MATLAB environment, partly due to the availability of efficient fast
Fourier transform (FFT) functions.

It is natural to assume that the potential for R-periodic atomistic systems is a
periodic function with a period $R \equiv (R_1, R_2, R_3)$. Consequently, we can restrict
ourselves to one canonical period often referred to as the primitive cell and impose
periodic boundary conditions on the restricted problem. It follows from Bloch's
theorem [Ashcroft and Mermin 1976; Bloch 1928] that eigenfunctions of the restricted
problem $\psi(r)$ can be periodically extended to the entire domain (to form
the eigenfunction of the original Hamiltonian) by using the following formula:

$$\psi(r + R) = e^{i k^T R}\psi(r), \tag{13}$$

where k = (k1,k2,k3) is a frequency or wave vector that belongs to a primitive cell

in the reciprocal space (e.g., the first Brillouin zone (BZ) [Ashcroft and Mermin

1976]). If the R-periodic system spans the entire infinite open domain, the set of

k’s allowed in (13) forms a continuum in the first Brillouin zone. That is, each

ψ(r) generates an infinite number of eigenfunctions for the periodic structure. It

can be shown that the corresponding eigenvalues form a continuous cluster in the

spectrum of the original Hamiltonian [Ashcroft and Mermin 1976]. Such a cluster

is often referred to as an energy band in physics. Consequently, the complete set of

eigenvectors of H can be indexed by the band number i and the Brillouin frequency

vector k (often referred to as a k-point), i.e., $\psi_{i,k}$. In this case, the evaluation of
the charge density must first be performed at each k-point by replacing $\psi_i(r)$ in (7)
with $\psi_{i,k}$ to yield
$$\rho_k(r) = \sum_{i=1}^{n_e} |\psi_{i,k}(r)|^2.$$


The total charge density ρ(r) can then be obtained by integrating over k, i.e.,

$$\rho(r) = \frac{|\Omega|}{(2\pi)^3} \int_{BZ} \rho_k(r) \, dk, \tag{14}$$

where |Ω| denotes the volume of the primitive cell in the first Brillouin zone. Fur-

thermore, an integration with respect to k must also be performed for the kinetic

energy term in (6).

When the primitive cell (or supercell) in real space is sufficiently large, the first

Brillouin zone becomes so small that the integration with respect to k can be

approximated by a single k-point calculation in (6) and (14).

To simplify our exposition, we will, from this point on, assume that a large

primitive cell is chosen in the real space so that no integration with respect to k

is necessary. Hence we will drop the index k in the following discussion and use

ψ(r) to represent an R-periodic single particle wavefunction. The periodic nature

of ψ(r) implies that it can be represented (under some mild assumptions) by a

Fourier series, i.e.,

$$\psi(r) = \sum_{j=-\infty}^{\infty} c_j e^{i g_j^T r}, \tag{15}$$
where $c_j$ is a Fourier coefficient that can be computed from
$$c_j = \int_{-R/2}^{R/2} \psi(r) e^{-i g_j^T r} \, dr.$$

To solve the Kohn-Sham equations numerically, the Fourier series expansion (15)

must be truncated to allow a finite number of terms only. If all electrons are
treated equally, the number of terms required in (15) will be extremely large. This
is because the strong interaction between a nucleus and the
inner electrons of an atom, which can be attributed to the presence of a singularity
in $V_{ion}(r)$ at the nuclear positions $\hat{r}_j$, must be accounted for by high frequency
planewaves. However, because the inner electrons are held tightly to the nuclei, they
are not active in terms of chemical reactions, and they usually do not contribute to
chemical bonding or other types of interaction among different atoms. On the other
hand, the valence electrons (electrons in atomic orbits that are not completely filled)
can be represented by a relatively small number of low frequency planewaves. These
electrons are the most interesting ones to study because they are responsible for
a majority of the physical properties of the atomistic system. Hence, it is natural
to focus only on these valence electrons and treat the inner electrons as part of
an ionic core. An approximation scheme that formalizes this approach is called
the pseudopotential approximation [Phillips 1958; Phillips and Kleinman 1958; Yin
and Cohen 1982]. The details of pseudopotential construction and their theoretical
properties are beyond the scope of this paper. For the purpose of this paper, we
shall just keep in mind that the use of pseudopotentials allows us to

(1) remove the singularity in Vion;

(2) reduce the number of electrons ne in (6) and (7) to the number of valence

electrons;


(3) represent the wavefunction associated with a valence electron by a small number

of low frequency planewaves.

In practice, the exact number of terms used in (15) is determined by a kinetic

energy cutoff $E_{cut}$. Such a cutoff yields an approximation
$$\psi(r) = \sum_{j=1}^{n_g} c_j e^{i g_j^T r}, \tag{16}$$
where $n_g$ is chosen such that
$$\|g_j\|^2 < 2E_{cut}, \tag{17}$$
for all $j = 1, 2, \ldots, n_g$. Although the value of $n_g$ will depend on many parameters
such as the size and type of the system being studied, it is typically an order of
magnitude smaller than $n = n_1 \times n_2 \times n_3$.

Once $E_{cut}$ is chosen, the minimal number of samples of r along each Cartesian
coordinate direction ($n_1$, $n_2$, $n_3$) required to represent $\psi(r)$ (without the aliasing
effect) can be determined from the sampling theorem [Nyquist 1928]. That is, we
must choose $n_k$ ($k = 1, 2, 3$) sufficiently large so that
$$\frac{1}{2}\left(\frac{2\pi n_k}{R_k}\right) > 2\sqrt{2E_{cut}} \tag{18}$$
is satisfied, i.e., $n_k$ must satisfy $n_k > 2R_k\sqrt{2E_{cut}}/\pi$.

We will denote the uniformly sampled $\psi(r)$ by a vector $x \in \mathbb{C}^n$, where $n = n_1 n_2 n_3$,
and the Fourier coefficients $c_j$ in (16) by a vector $c \in \mathbb{C}^n$ with zero paddings used
to ensure the length of c matches that of x. If the elements of x and c are ordered
properly, these two vectors satisfy

$$c = Fx, \tag{19}$$
where $F \in \mathbb{C}^{n \times n}$ is a discrete Fourier transform matrix [Van Loan 1987].
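The relations (17) and (18) can be turned into a small calculation that picks the grid size and counts the planewaves retained by the cutoff. The sketch below is illustrative Python, not KSSOLV code. It assumes an orthorhombic cell with edge lengths $R_1, R_2, R_3$, frequency vectors of the form $g = 2\pi(m_1/R_1, m_2/R_2, m_3/R_3)$ with integer $m_k$, and units in which (17) holds as written; the function names and the sample values of R and the cutoff are hypothetical.

```python
import math


def min_grid_points(R_k, ecut):
    """Smallest integer n_k satisfying n_k > 2 * R_k * sqrt(2 * ecut) / pi, cf. (18)."""
    bound = 2.0 * R_k * math.sqrt(2.0 * ecut) / math.pi
    return math.floor(bound) + 1


def count_planewaves(R, ecut):
    """Count vectors g = 2*pi*(m1/R1, m2/R2, m3/R3) with ||g||^2 < 2*ecut, cf. (17)."""
    limit = 2.0 * ecut
    m_max = [int(math.sqrt(limit) * R_k / (2.0 * math.pi)) + 1 for R_k in R]
    count = 0
    for m1 in range(-m_max[0], m_max[0] + 1):
        for m2 in range(-m_max[1], m_max[1] + 1):
            for m3 in range(-m_max[2], m_max[2] + 1):
                g2 = sum((2.0 * math.pi * m / R_k) ** 2
                         for m, R_k in zip((m1, m2, m3), R))
                if g2 < limit:
                    count += 1
    return count


R = (10.0, 10.0, 10.0)  # hypothetical cubic supercell
ecut = 12.5             # hypothetical kinetic energy cutoff

grid = [min_grid_points(R_k, ecut) for R_k in R]
n = grid[0] * grid[1] * grid[2]
n_g = count_planewaves(R, ecut)
print(grid, n, n_g)
```

For a cubic cell the retained frequencies fill a sphere inscribed well inside the cube of frequencies that the sampling grid can represent, which is why $n_g$ comes out much smaller than n.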

After a sampling grid has been properly defined, the approximation to the to-

tal energy can be evaluated by replacing the integrals in (6) and (8) with simple

summations over the sampling grid.

The use of a planewave discretization makes it easy to evaluate the kinetic energy

of the system. Since

$$\nabla_r e^{i g_j^T r} = i g_j e^{i g_j^T r},$$

the first term in (6) can be computed as
$$\frac{1}{2}\sum_{\ell=1}^{n_e}\sum_{j=1}^{n_g} \|g_j c_j^{(\ell)}\|^2, \tag{20}$$
where $c_j^{(\ell)}$ is the jth Fourier coefficient of the wavefunction associated with the ℓth
valence electron (denoted by $x_\ell$). Here, one can take advantage of the orthogonality
properties of the planewave basis, which allows one to remove the integral from the
equation.
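Since $\|g_j c_j^{(\ell)}\|^2 = \|g_j\|^2 |c_j^{(\ell)}|^2$, the sum (20) is a weighted sum of squared coefficient magnitudes. A minimal sketch in Python follows; the function name and the sample data are hypothetical and serve only to illustrate the formula.

```python
def kinetic_energy(g_vectors, coefficients):
    """Evaluate (20): 0.5 * sum over wavefunctions l and frequencies j of
    ||g_j||^2 * |c_j^(l)|^2.

    g_vectors:    list of n_g frequency vectors, each a 3-tuple
    coefficients: one list of n_g complex Fourier coefficients per wavefunction
    """
    g_norm2 = [gx * gx + gy * gy + gz * gz for gx, gy, gz in g_vectors]
    total = 0.0
    for c in coefficients:
        total += sum(g2 * abs(cj) ** 2 for g2, cj in zip(g_norm2, c))
    return 0.5 * total


# Two hypothetical wavefunctions expanded in two planewaves.
g_vectors = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0)]
coefficients = [[0.0, 1.0], [0.6, 0.8j]]
e_kin = kinetic_energy(g_vectors, coefficients)
print(e_kin)
```

No quadrature over the real-space grid is needed, which is the practical benefit of the orthogonality of the planewave basis.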


3.2 Finite-Dimensional Kohn-Sham Problem

If we let $X \equiv (x_1, x_2, \ldots, x_{n_e}) \in \mathbb{C}^{n \times n_e}$ be a matrix that contains $n_e$ discretized
wavefunctions, the approximation to the kinetic energy term in (6) can also be expressed
by
$$\hat{E}_{kin} = \frac{1}{2}\mathrm{trace}(X^* L X), \tag{21}$$

where L is a finite-dimensional representation of the Laplacian operator in the

planewave basis. Due to the periodic boundary condition imposed in our problem,

L is a block circulant matrix with circulant blocks that can be decomposed as

$$L = F^* D_g F, \tag{22}$$
where F is the discrete Fourier transform matrix used in (19), and $D_g$ is a diagonal
matrix with $\|g_j\|^2$ on the diagonal [Davis 1979]. It follows from (19) and (22) that
(20) and (21) are equivalent.
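The factorization (22) can be illustrated in one dimension with a naive discrete Fourier transform. The sketch below is plain Python written for this purpose, not KSSOLV code, and it makes several assumptions: a unitary scaling of F, frequencies $g_j = 2\pi m_j/R$ stored in the usual FFT ordering, and an $O(n^2)$ transform instead of a fast one. It checks that a sampled planewave is an eigenvector of L with eigenvalue $g^2$, and that the quadratic form $x^* L x$ equals the coefficient sum that appears in (20).

```python
import cmath
import math


def dft(x):
    """Unitary discrete Fourier transform, c = F x (naive O(n^2) version)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * j * k / n) for k in range(n))
            / math.sqrt(n) for j in range(n)]


def idft(c):
    """Inverse transform, x = F^* c."""
    n = len(c)
    return [sum(c[j] * cmath.exp(2j * math.pi * j * k / n) for j in range(n))
            / math.sqrt(n) for k in range(n)]


def frequencies(n, R):
    """Frequencies g_j = 2*pi*m_j/R in FFT ordering (m_j = 0, 1, ..., -2, -1)."""
    return [2.0 * math.pi * (j if j < (n + 1) // 2 else j - n) / R for j in range(n)]


def apply_L(x, R):
    """Compute L x = F^* D_g F x with D_g = diag(g_j^2)."""
    g = frequencies(len(x), R)
    c = dft(x)
    return idft([gj * gj * cj for gj, cj in zip(g, c)])


n, R = 16, 10.0
g3 = 2.0 * math.pi * 3 / R
x = [cmath.exp(1j * g3 * (k * R / n)) for k in range(n)]  # sampled planewave

Lx = apply_L(x, R)
err = max(abs(a - g3 * g3 * b) for a, b in zip(Lx, x))

quad_real_space = sum(xk.conjugate() * yk for xk, yk in zip(x, Lx)).real
quad_fourier = sum(gj * gj * abs(cj) ** 2 for gj, cj in zip(frequencies(n, R), dft(x)))
print(err, quad_real_space, quad_fourier)
```

In three dimensions the same pattern applies with a 3-D transform, and in practice the naive sums would be replaced by FFT calls.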

In the planewave basis, the convolution that appears in the third term of (6)

may be viewed as $L^{-1}\rho(X)$, where $\rho(X) = \mathrm{diag}(XX^*)$. (To simplify notation,

we will drop X in ρ(X) in the following.) However, since L is singular (due to the

periodic boundary condition), its inverse does not exist. Similar singularity issues

appear in the planewave representation of the pseudopotential and the calculation

of the ion-ion interaction energy. However, it can be shown that the net effects of

these singularities cancel out for a system that is electrically neutral [Ihm et al.

1979; Pickett 1989]. Thus, one can simply remove these singularities by replacing

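$L^{-1}\rho$ with $L^\dagger\rho$, where $L^\dagger$ is the pseudo-inverse of L defined as
$$L^\dagger = F^* D_g^\dagger F,$$
where $D_g^\dagger$ is a diagonal matrix whose diagonal entries ($d_j$) are
$$d_j = \begin{cases} \|g_j\|^{-2} & \text{if } g_j \neq 0; \\ 0 & \text{otherwise.} \end{cases}$$
Consequently, the third term in (6), which corresponds to an approximation to the
Coulomb energy, can be evaluated as
$$\hat{E}_{coul} = \frac{1}{2}\rho^T L^\dagger \rho = \frac{1}{2}[F\rho]^* D_g^\dagger [F\rho].$$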

However, removing these singularities results in a constant shift of the total energy,

for which a compensation must be made. It has been shown in [Ihm et al. 1979]

that this compensation can be calculated by adding a term $E_{rep}$ that measures the

degree of repulsiveness of the local pseudopotential with a term that corresponds

to the non-singular part of ion-ion potential energy. Because the second term can

be evaluated efficiently by using a technique originally developed by Ewald [Ewald

1921], it is denoted by $E_{Ewald}$. Both $E_{rep}$ and $E_{Ewald}$ can be computed once and

for all in a DFT calculation. We will not go into further details of how they are

computed since they do not play any role in the algorithms we will examine in this

paper.
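The effect of replacing $L^{-1}$ with the pseudo-inverse $L^\dagger$ can be seen in a one-dimensional sketch. The Python code below is an illustration under assumptions, not KSSOLV code: it uses a naive unitary DFT, frequencies $g_j = 2\pi m_j/R$ in FFT ordering, and the discrete Coulomb term $\frac{1}{2}\rho^T L^\dagger \rho$ as it appears in (23). The helper names and the sample density are hypothetical.

```python
import cmath
import math


def transform(v, sign):
    """Unitary DFT (sign = -1, c = F v) or its inverse (sign = +1, v = F^* c)."""
    n = len(v)
    return [sum(v[k] * cmath.exp(sign * 2j * math.pi * j * k / n) for k in range(n))
            / math.sqrt(n) for j in range(n)]


def apply_L_pinv(rho, R):
    """Compute L^dagger rho = F^* D_g^dagger F rho, where the diagonal entries are
    1/g_j^2 for g_j != 0 and 0 for the zero frequency."""
    n = len(rho)
    g = [2.0 * math.pi * (j if j < (n + 1) // 2 else j - n) / R for j in range(n)]
    c = transform(rho, -1)
    scaled = [cj / (gj * gj) if gj != 0.0 else 0.0 for gj, cj in zip(g, c)]
    return transform(scaled, +1)


def coulomb_energy(rho, R):
    """Discrete Coulomb term 0.5 * rho^T L^dagger rho."""
    v = apply_L_pinv(rho, R)
    return 0.5 * sum(r * vk for r, vk in zip(rho, v)).real


rho = [1.0, 2.0, 4.0, 2.0, 1.0, 0.5, 0.25, 0.5]  # hypothetical sampled density
shifted = [r + 3.0 for r in rho]                 # same density plus a constant

e_rho = coulomb_energy(rho, 10.0)
e_shifted = coulomb_energy(shifted, 10.0)
print(e_rho, e_shifted)
```

Adding a constant to the density changes only its zero-frequency component, which $D_g^\dagger$ discards, so both energies agree. This is the discrete counterpart of dropping the singular term that is compensated by $E_{rep}$ and $E_{Ewald}$.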

To summarize, the use of a planewave basis allows us to define a finite-dimensional


approximation to the total energy functional (6) as

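$$\hat{E}_{total}(X) = \mathrm{trace}\left[X^*\left(\tfrac{1}{2}L + \hat{V}_{ion}\right)X\right] + \frac{1}{2}\rho^T L^\dagger \rho + \rho^T \epsilon_{xc}(\rho) + E_{Ewald} + E_{rep}, \tag{23}$$
where $\hat{V}_{ion}$ denotes the ionic pseudopotentials sampled on the suitably chosen Cartesian
grid of size $n_1 \times n_2 \times n_3$.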

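It is easy to verify that the KKT condition associated with the constrained minimization
problem
$$\min_{X^*X = I} \hat{E}_{total}(X) \tag{24}$$
is
$$H(X)X - X\Lambda_{n_e} = 0, \tag{25}$$
$$X^*X = I,$$
where
$$H(X) = \frac{1}{2}L + \hat{V}_{ion} + \mathrm{Diag}(L^\dagger\rho) + \mathrm{Diag}(\mu_{xc}(\rho)), \tag{26}$$
$\mu_{xc}(\rho) = d[\rho\,\epsilon_{xc}(\rho)]/d\rho$ is the derivative of the exchange-correlation term $\rho^T\epsilon_{xc}(\rho)$
in (23) with respect to $\rho$, and $\Lambda_{n_e}$ is an $n_e \times n_e$ symmetric matrix of Lagrangian multipliers.
For simplicity, we will frequently denote the last three terms in (26) by
$$V_{tot} = \hat{V}_{ion} + \mathrm{Diag}(L^\dagger\rho) + \mathrm{Diag}(\mu_{xc}(\rho)). \tag{27}$$
Because $\hat{E}_{total}(X) = \hat{E}_{total}(XQ)$ for any unitary matrix $Q \in \mathbb{C}^{n_e \times n_e}$, we can
always choose a particular Q such that $\Lambda_{n_e}$ is diagonal. In this case, $\Lambda_{n_e}$ contains
$n_e$ eigenvalues of H(X). We are interested in the $n_e$ smallest eigenvalues and the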

invariant subspace X associated with these eigenvalues.

3.3 The SCF Iteration

Currently, the most widely used algorithm for solving (25) is the self-consistent field

(SCF) iteration which we outline in Figure 1 for completeness.

SCF Iteration
Input:   An initial guess of the wavefunction $X^{(0)} \in \mathbb{C}^{n \times n_e}$, pseudopotential;
Output:  $X \in \mathbb{C}^{n \times n_e}$ such that $X^*X = I_{n_e}$ and columns of X span the invariant
         subspace associated with the smallest $n_e$ eigenvalues of H(X) defined in (26).
1.  for k = 1, 2, ... until convergence
2.      Form $H^{(k)} = H(X^{(k-1)})$;
3.      Compute $X^{(k)}$ such that $H^{(k)}X^{(k)} = X^{(k)}\Lambda^{(k)}$, and $\Lambda^{(k)}$
        contains the $n_e$ smallest eigenvalues of $H^{(k)}$;
4.  end for
Fig. 1. The SCF iteration
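The loop in Figure 1 can be tried on a toy problem small enough to solve by hand. The sketch below is illustrative Python and does not use KSSOLV or a real Kohn-Sham Hamiltonian: it assumes a hypothetical 2 x 2 model $H(x) = H_0 + \alpha\,\mathrm{Diag}(\rho)$ with $\rho = \mathrm{diag}(xx^T)$, a single wavefunction ($n_e = 1$), and a closed-form eigen-solver for symmetric 2 x 2 matrices in place of LOBPCG. Convergence is declared when the density stops changing, which is one of several possible criteria.

```python
import math


def lowest_eigenpair(a, b, d):
    """Smallest eigenvalue and unit eigenvector of [[a, b], [b, d]], assuming b != 0."""
    lam = 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b * b)
    v = (b, lam - a)
    norm = math.hypot(v[0], v[1])
    return lam, (v[0] / norm, v[1] / norm)


def hamiltonian(x, alpha):
    """Toy H(x) = H0 + alpha * Diag(rho), rho = diag(x x^T); returns entries (a, b, d)."""
    rho = (x[0] ** 2, x[1] ** 2)
    return 1.0 + alpha * rho[0], -1.0, 2.0 + alpha * rho[1]


def scf(x0, alpha, tol=1e-12, max_iter=200):
    """Plain SCF iteration following the steps of Fig. 1."""
    x = x0
    for k in range(1, max_iter + 1):
        a, b, d = hamiltonian(x, alpha)          # step 2: H^(k) = H(X^(k-1))
        lam, x_new = lowest_eigenpair(a, b, d)   # step 3: lowest eigenpair of H^(k)
        change = max(abs(x_new[i] ** 2 - x[i] ** 2) for i in range(2))
        x = x_new
        if change < tol:                         # density is self-consistent
            return x, lam, k
    return x, lam, max_iter


def kohn_sham_residual(x, alpha):
    """Norm of H(x) x - (x^T H(x) x) x, which vanishes at a solution of (25)."""
    a, b, d = hamiltonian(x, alpha)
    hx = (a * x[0] + b * x[1], b * x[0] + d * x[1])
    lam = x[0] * hx[0] + x[1] * hx[1]
    return math.hypot(hx[0] - lam * x[0], hx[1] - lam * x[1])


x0 = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
x, lam, iterations = scf(x0, alpha=0.5)
print(iterations, lam, kohn_sham_residual(x, 0.5))
```

For this weak coupling the iteration settles quickly. The toy model says nothing about realistic systems, where the plain iteration may stagnate or oscillate, as discussed in the remainder of this section.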

In [Yang et al. 2007], we viewed the SCF iteration as an indirect way to minimize

$\hat{E}_{total}$ through the minimization of a sequence of quadratic surrogate functions of
the form
$$q(X) = \frac{1}{2}\mathrm{trace}(X^* H^{(k)} X), \tag{28}$$

on the manifold $X^*X = I_{n_e}$. This constrained minimization problem is solved in
KSSOLV by running a small number of locally optimal block preconditioned conjugate
gradient (LOBPCG) iterations [Knyazev 2001].

Since the surrogate functions share the same gradient with $\hat{E}_{total}$ at $X^{(k)}$, i.e.,
$$\nabla\hat{E}_{total}(X)|_{X=X^{(k)}} = H^{(k)}X^{(k)} = \nabla q(X)|_{X=X^{(k)}},$$
moving along a descent direction associated with q(X) is likely to produce a reduction
in $\hat{E}_{total}$. However, because gradient information is local, there is no guarantee
that the minimizer of q(X), which may be far from $X^{(k)}$, will yield a lower $\hat{E}_{total}$

value. This observation partially explains why SCF often fails to converge. It also

suggests at least two ways to improve the convergence of SCF.

One possible improvement is to replace the simple gradient-matching surrogate

q(X) with another quadratic function whose minimizer is more likely to yield a

reduction in $\hat{E}_{total}$. In practice, this alternative quadratic function is often constructed
by replacing the charge density $\rho^{(k)}$ in (26) with a linear combination of
m previously computed charge densities, i.e.,
$$\rho_{mix} = \sum_{j=0}^{m-1} \alpha_j \rho^{(k-j)},$$
where $a = (\alpha_0, \alpha_1, \ldots, \alpha_{m-1})^T$ is chosen as the solution to the following minimization
problem:
$$\min_{a^T e = 1} \|Ra\|^2, \tag{29}$$
where $R = (\Delta\rho^{(k)} \; \Delta\rho^{(k-1)} \; \ldots \; \Delta\rho^{(k-m+1)})$, $\Delta\rho^{(k)} = \rho^{(k)} - \rho^{(k-1)}$ and $e = (1, 1, \ldots, 1)^T$.

This technique is often called charge mixing. The particular mixing scheme defined

by the solution to (29) is called Pulay mixing because it was first proposed by Pu-

lay for Hartree-Fock calculations [Pulay 1980; 1982]. (In computational chemistry,

Pulay mixing is referred to as the method of direct inversion of iterative subspace or

simply DIIS). Other mixing schemes include Kerker mixing [Kerker 1981], Thomas-

Fermi mixing [Raczkowski et al. 2001] and Broyden mixing [Kresse and Furthmüller
1996]. Charge mixing is often quite effective in practice for improving the convergence
of SCF even though its convergence properties are still not well understood. In
some cases, charge mixing may also fail [Cancès and Le Bris 2000a; Yang et al.

2005].
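The constrained least squares problem (29) has a closed-form solution. Writing $G = R^T R$, the Lagrange conditions give $a = G^{-1}e / (e^T G^{-1} e)$. The sketch below implements this formula in plain Python for real-valued residuals; it is an illustration, not the mixing code in KSSOLV. It assumes that the columns of R are linearly independent so that G is nonsingular, and the helper names `solve`, `pulay_coefficients` and `mix` are hypothetical.

```python
def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        y[r] = (M[r][n] - sum(M[r][c] * y[c] for c in range(r + 1, n))) / M[r][r]
    return y


def pulay_coefficients(residuals):
    """Minimize ||R a||^2 subject to sum(a) = 1, where the columns of R are `residuals`.
    Uses a = G^{-1} e / (e^T G^{-1} e) with G = R^T R."""
    m = len(residuals)
    G = [[sum(u * v for u, v in zip(residuals[i], residuals[j])) for j in range(m)]
         for i in range(m)]
    y = solve(G, [1.0] * m)
    s = sum(y)
    return [y_i / s for y_i in y]


def mix(densities, coefficients):
    """Form rho_mix = sum_j alpha_j * rho^(k-j)."""
    n = len(densities[0])
    return [sum(a * rho[k] for a, rho in zip(coefficients, densities)) for k in range(n)]


# Two hypothetical density differences (columns of R) and the matching densities.
residuals = [[2.0, 0.0], [0.0, 1.0]]
densities = [[1.0, 3.0], [2.0, 2.0]]

a = pulay_coefficients(residuals)
rho_mix = mix(densities, a)
print(a, rho_mix)
```

The larger residual receives the smaller weight, which is the intended behavior: densities whose recent change was small are trusted more in the mixture.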

Another way to improve the convergence of the SCF iteration is to impose an

additional constraint to the surrogate minimization problem (28) so that the wave-

function update can be restricted within a small neighborhood of the gradient

matching point X(k), thereby ensuring a reduction of the total energy function as

we minimize the surrogate function. In [Yang et al. 2007], we showed that the

following type of constraint

$$\|XX^* - X^{(k)}X^{(k)*}\|_F^2 \le \Delta,$$

where ∆ > 0 is a suitably chosen parameter, is preferred because it is rotationally

invariant (i.e., post-multiplying X by a unitary matrix does not change the con-

straint) and because adding such a constraint does not increase the complexity of

solving the surrogate minimization problem. It is not difficult to show [Yang et al.

2007] that solving the following constrained minimization problem,

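$$\min \; q(X) \quad \text{s.t.} \quad X^*X = I, \quad \|XX^* - X^{(k)}X^{(k)*}\|_F^2 \le \Delta \tag{30}$$
is equivalent to solving a low rank perturbed linear eigenvalue problem
$$\left[H^{(k)} - \sigma X^{(k)}X^{(k)*}\right]X = X\Lambda, \tag{31}$$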

where σ can be viewed as the Lagrange multiplier for the inequality constraint in

(30), and $\Lambda$ is a diagonal matrix that contains the $n_e$ smallest eigenvalues of the
low-rank perturbed matrix in (31). When $\sigma$ is sufficiently large (which corresponds to
a trust region radius $\Delta$ that is sufficiently small), the solution to (31) is guaranteed
to produce a reduction in $\hat{E}_{total}(X)$.
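The role of $\sigma$ in (31) can be observed on a 2 x 2 example with a single wavefunction. The sketch below is illustrative Python under simplifying assumptions: H is a fixed real symmetric matrix rather than a Kohn-Sham Hamiltonian, the eigenvector is obtained in closed form, and the names are hypothetical. For unit vectors x and y, the constraint quantity satisfies $\|xx^T - yy^T\|_F^2 = 2 - 2(x^Ty)^2$, which the code uses to measure how far the update moves from $X^{(k)}$.

```python
import math


def lowest_eigenvector(a, b, d):
    """Unit eigenvector for the smallest eigenvalue of [[a, b], [b, d]], assuming b != 0."""
    lam = 0.5 * (a + d) - math.sqrt(0.25 * (a - d) ** 2 + b * b)
    v = (b, lam - a)
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)


def trust_region_update(H, x_k, sigma):
    """Lowest eigenvector of H - sigma * x_k x_k^T for a real symmetric 2 x 2 matrix H."""
    a = H[0][0] - sigma * x_k[0] * x_k[0]
    b = H[0][1] - sigma * x_k[0] * x_k[1]
    d = H[1][1] - sigma * x_k[1] * x_k[1]
    return lowest_eigenvector(a, b, d)


def projector_distance(x, y):
    """||x x^T - y y^T||_F^2 for unit vectors x and y."""
    overlap = x[0] * y[0] + x[1] * y[1]
    return 2.0 - 2.0 * overlap * overlap


H = [[1.0, -1.0], [-1.0, 2.0]]  # hypothetical fixed matrix playing the role of H^(k)
x_k = (1.0, 0.0)                # current iterate

distances = [projector_distance(trust_region_update(H, x_k, sigma), x_k)
             for sigma in (0.0, 1.0, 10.0, 100.0)]
print(distances)
```

As $\sigma$ grows, the update stays closer to the current iterate, which corresponds to shrinking the trust region radius $\Delta$ in (30).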

When $n_e$ is relatively small compared to n, the computational complexity of the
SCF iteration is dominated by the floating point operations carried out in the multiplications
of $H^{(k)}$ with discretized wavefunctions in X. These multiplications are

performed repeatedly in an iterative method (e.g., the LOBPCG method or the

Lanczos method) used at Step 3 in Figure 1 to obtain an approximate minimizer

of (28). When a planewave expansion is used to represent X, each multiplication

requires the use of a 3-D FFT operation to convert the Fourier space representa-

tion of each column of X into the real space representation before multiplications

involving local potential terms in (27) can be carried out. An inverse 3-D FFT is

required to convert the product back to the Fourier space. The complexity of each

conversion is O(nlogn). If m LOBPCG iterations are used on average to obtain

an approximate minimizer of (28), the total number 3-D FFTs required per SCF

iteration is 2mne. In addition, each SCF iteration also performs O(n · n2

linear algebra (BLAS) operations. When nebecomes larger, these operations can

become a significant part of the computational cost.

The amount of memory required by the SCF iteration consists of 3ngnedouble

precision and complex arithmetic words that must be allocated to store the current

approximation to the desired wavefunctions, the gradient of the total energy, and

additional workspace required in the LOBPCG or Lanczos algorithm for eigenvector

calculations. An additional γn words are needed to store the various potential

components in the Hamiltonian, the charge density approximation ρ as well as

vectors that must be saved to perform charge mixing in (29), where the value of γ

is typically less than 20.

e) basic
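The FFT-based multiplication described above can be sketched in a few lines of plain MATLAB. This is a minimal illustration under stated assumptions, not KSSOLV code: the grid size, the local potential `Vloc` and the coefficient array `Xhat` are hypothetical, a full (untruncated) Fourier grid is used, and `ifftn` is taken as the Fourier-to-real-space transform (which of the two transforms is called "forward" is a matter of convention).

```matlab
% Apply a real local potential to ne wavefunctions stored as Fourier coefficients.
n1 = 4; n2 = 4; n3 = 4;                   % hypothetical real-space grid
n  = n1*n2*n3;
ne = 3;                                   % number of wavefunctions
[x, y, z] = ndgrid((0:n1-1)/n1, (0:n2-1)/n2, (0:n3-1)/n3);
Vloc = 1 + 0.1*cos(2*pi*x) + 0.2*cos(2*pi*y).*cos(2*pi*z);  % made-up local potential
t    = reshape(1:n*ne, n, ne);
Xhat = sin(t) + 1i*cos(2*t);              % made-up coefficients, one column per wavefunction

VX   = zeros(n, ne);                      % Fourier coefficients of Vloc .* psi
nfft = 0;                                 % counts 3-D transforms
for j = 1:ne
    psihat   = reshape(Xhat(:, j), n1, n2, n3);
    psi      = ifftn(psihat);             % Fourier space -> real space
    vpsi     = fftn(Vloc .* psi);         % pointwise product, then back to Fourier space
    VX(:, j) = vpsi(:);
    nfft     = nfft + 2;                  % two transforms per column
end
fprintf('%d transforms for %d columns\n', nfft, ne);
```

Each column costs two 3-D transforms, so an eigensolver that applies the Hamiltonian $m$ times to all $n_e$ columns accounts for the $2 m n_e$ transforms per SCF iteration quoted above.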

3.4 Direct Constrained Minimization

Instead of focusing on Kohn-Sham equations (25) and minimizing the total energy

indirectly in the SCF iteration, we can minimize the total energy directly in an

iterative procedure that involves finding a sequence of search directions along which

$\hat{E}_{\mathrm{total}}(X)$ decreases and computing an appropriate step length. In most of the


earlier direct minimization methods developed in [Arias et al. 1992; Gillan 1989;
Kresse and Furthmüller 1996; Payne et al. 1992; Teter et al. 1989; VandeVondele
and Hutter 2003; Voorhis and Head-Gordon 2002], the search direction and step
length computations are carried out separately. This separation sometimes results
in slow convergence. We recently developed a new direct constrained minimization
(DCM) algorithm [Yang et al. 2005; 2007] in which the search direction and step
length are obtained simultaneously in each iteration by minimizing the total
energy within a subspace spanned by the columns of

\[
Y = \left( X^{(k)},\; M^{-1} R^{(k)},\; P^{(k-1)} \right),
\]

where $X^{(k)}$ is the approximation to $X$ obtained at the $k$th iteration,
$R^{(k)} = H^{(k)} X^{(k)} - X^{(k)} \Lambda^{(k)}$, $M$ is a Hermitian positive
definite preconditioner, and $P^{(k-1)}$ is the search direction obtained in the
previous iteration. It was shown in [Yang et al. 2005] that solving the subspace
minimization problem is equivalent to computing the eigenvectors $G$ associated
with the $n_e$ smallest eigenvalues of the following nonlinear eigenvalue problem

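\[
\hat{H}(G)\, G = B G \Omega, \qquad G^* B G = I, \tag{32}
\]
where
\[
\hat{H}(G) = Y^* \left[ \frac{1}{2} L + V_{\mathrm{ion}}
  + \mathrm{Diag}\left( L^{\dagger} \rho(YG) \right)
  + \mathrm{Diag}\left( \mu_{xc}(\rho(YG)) \right) \right] Y, \tag{33}
\]
and $B = Y^* Y$.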

Because the dimension of $\hat{H}(G)$ is $3 n_e \times 3 n_e$, which is normally
much smaller than that of $H(X)$, it is relatively easy to solve (32) by, for
example, a trust region enabled SCF (TRSCF) iteration. We should note that it is
not necessary to solve (32) to full accuracy in the early stage of the DCM
algorithm because all we need is a $G$ that yields sufficient reduction in the
objective function.

Once $G$ is obtained, we can update the wave function by
\[
X^{(k+1)} \leftarrow Y G.
\]
The search direction associated with this update is defined, using the MATLAB
submatrix notation, to be
\[
P^{(k)} \equiv Y(:,\, n_e+1 : 3n_e)\; G(n_e+1 : 3n_e,\, :).
\]

A complete description of the constrained minimization algorithm is shown in
Figure 2. We should point out that solving the projected optimization problem in
Step 7 of the algorithm requires us to evaluate the projected Hamiltonian
$\hat{H}(G)$ repeatedly as we search for the best $G$. However, since the first
two terms of $\hat{H}$ do not depend on $G$, they can be computed and stored in
advance. Only the last two terms of (33) need to be updated. These updates
require the charge density, the Coulomb potential and the exchange-correlation
potential to be recomputed.
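The caching strategy can be sketched as follows. This toy MATLAB fragment is only an illustration, and every modeling choice in it is an assumption rather than something taken from KSSOLV: a 1-D finite-difference Laplacian with Dirichlet boundary conditions replaces the planewave Laplacian $L$ (so that `L \ r` can stand in for $L^{\dagger}\rho$), the ionic potential `vion` and the basis `Y` are made up, the density is taken to be $\rho = \sum_j |x_j|^2$, and a Slater-type exchange term $-(3\rho/\pi)^{1/3}$ is used in place of $\mu_{xc}$.

```matlab
% Toy version of (33): cache the G-independent part, update only the rest.
n  = 16;  ne = 2;
e  = ones(n-1, 1);
L  = (n+1)^2 * (2*eye(n) - diag(e, 1) - diag(e, -1));  % 1-D Dirichlet Laplacian (stand-in for L)
vion = -1 ./ (1 + ((1:n)' - (n+1)/2).^2);              % made-up ionic potential
Y  = cos((1:n)' * (1:3*ne));                           % made-up basis of the search subspace
B  = Y' * Y;

T0 = Y' * ((0.5*L + diag(vion)) * Y);     % first two terms of (33): computed once

rho   = @(G) sum(abs(Y*G).^2, 2);         % assumed charge density of Y*G
vhart = @(r) L \ r;                       % stand-in for L^dagger * rho
vxc   = @(r) -(3*r/pi).^(1/3);            % Slater-type exchange, stand-in for mu_xc
Hhat  = @(G) T0 + Y' * (diag(vhart(rho(G)) + vxc(rho(G))) * Y);  % only this part depends on G

C  = chol(B);                             % B = C'*C (Y is assumed to have full column rank)
G  = C \ eye(3*ne, ne);                   % one admissible G with G'*B*G = I
H1 = Hhat(G);                             % projected Hamiltonian for this G
```

Every further call to `Hhat` reuses `T0`; only the density, the Hartree-like term and the exchange-like term are recomputed.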

In each DCM iteration, $n_e$ Hamiltonian-wavefunction multiplications are
performed to obtain the gradient. When an iterative method is used to solve the
projected nonlinear eigenvalue problem (32), the charge density $\rho(YG)$ and
the projected Hamiltonian must be updated repeatedly. The update of the projected
Hartree potential requires us to compute $L^{\dagger} \rho(YG)$. This calculation
makes use of two 3-D FFTs, hence has a complexity of $O(n \log n)$. If $m$ inner
iterations are taken in the DCM algorithm to solve the projected problem, the
total number of 3-D FFTs used per DCM iteration is $2(n_e + m)$. The memory
requirement of the DCM algorithm is similar to that of an SCF iteration.

Algorithm: A Constrained Minimization Algorithm for Total Energy Minimization

Input:   initial set of wave functions X^(0) in C^{n x n_e}; ionic
         pseudopotential; a preconditioner M;
Output:  X in C^{n x n_e} such that the Kohn-Sham total energy functional
         E_total(X) is minimized and X^* X = I_{n_e}.

 1.  Orthonormalize X^(0) such that X^(0)* X^(0) = I_{n_e};
 2.  for k = 0, 1, 2, ... until convergence
 3.      Compute Theta = X^(k)* H^(k) X^(k);
 4.      Compute R = H^(k) X^(k) - X^(k) Theta;
 5.      if (k > 0) then
             Y <- (X^(k), M^{-1} R, P^(k-1));
         else
             Y <- (X^(k), M^{-1} R);
         endif
 6.      B <- Y^* Y;
 7.      Find G in C^{3n_e x n_e} (or C^{2n_e x n_e} when k = 0) that minimizes
         E_total(Y G) subject to the constraint G^* B G = I_{n_e};
 8.      Set X^(k+1) = Y G;
 9.      if (k > 0) then
             P^(k) <- Y(:, n_e+1 : 3n_e) G(n_e+1 : 3n_e, :);
         else
             P^(k) <- Y(:, n_e+1 : 2n_e) G(n_e+1 : 2n_e, :);
         endif
10.  end for

Fig. 2. A Direct Constrained Minimization Algorithm for Total Energy Minimization
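To make the structure of Figure 2 concrete, the loop can be sketched in plain MATLAB for the simplified case of a fixed Hamiltonian, so that the objective is $\mathrm{trace}(X^* H X)$ and Step 7 reduces to a linear Rayleigh-Ritz problem on $\mathrm{span}(Y)$. This simplification is an assumption made for brevity: in the DCM algorithm itself $\hat{H}$ depends on $G$ and Step 7 is solved iteratively. Two further liberties are taken for numerical robustness. The sketch works in an orthonormal basis `Q` of $\mathrm{span}(Y)$ instead of forming $B$ and $G$ explicitly, and it stores the search direction as the component of the update orthogonal to $X^{(k)}$, which together with $X^{(k)}$ spans the same subspace as the $P^{(k)}$ of Figure 2. The test matrix `H`, the diagonal preconditioner `M` and the iteration count are hypothetical, and none of the code is taken from KSSOLV.

```matlab
% DCM-style iteration of Figure 2 for a fixed (density-independent) H.
n  = 30;  ne = 3;
e  = ones(n-1, 1);
H  = diag((1:n).^2 / n) + 0.3*(diag(e, 1) + diag(e, -1));  % made-up symmetric Hamiltonian
M  = diag(diag(H) + 1);                   % made-up positive definite preconditioner
[X, ~] = qr(cos((1:n)' * (1:ne)), 0);     % Step 1: orthonormal starting guess
P  = [];                                  % no previous search direction yet
nsteps = 20;
E  = zeros(nsteps + 1, 1);                % objective value after each step
E(1) = trace(X' * H * X);
for k = 1:nsteps
    Theta = X' * H * X;                   % Step 3
    Rk = H*X - X*Theta;                   % Step 4: residual
    Y  = [X, M \ Rk, P];                  % Step 5: 2*ne columns at first, then 3*ne
    [Q, ~] = qr(Y, 0);                    % orthonormal basis of span(Y), replaces Step 6
    Hq = Q' * H * Q;
    Hq = (Hq + Hq') / 2;                  % remove rounding asymmetry
    [Z, D] = eig(Hq);
    [~, idx] = sort(diag(D));             % ascending Ritz values
    Xnew = Q * Z(:, idx(1:ne));           % Steps 7-8: minimizer over span(Y)
    P  = Xnew - X * (X' * Xnew);          % Step 9 analogue: update outside span(X)
    X  = Xnew;
    E(k+1) = trace(X' * H * X);
end
disp(E');
```

Because $\mathrm{span}(Y)$ always contains $X^{(k)}$, the Rayleigh-Ritz step cannot increase $\mathrm{trace}(X^* H X)$, so the recorded values in `E` should be nonincreasing and bounded below by the sum of the $n_e$ smallest eigenvalues of `H`.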

4. THE OBJECT-ORIENTED DESIGN OF KSSOLV

Both the SCF iteration and the DCM algorithm have been implemented in the

KSSOLV toolbox, which is written entirely in MATLAB. It is designed to be mod-

ular, hierarchical, and extensible so that other algorithms can be easily developed

under the same framework. In addition to taking advantage of efficient linear al-

gebra operations and the 3-D fast Fourier transform (FFT) function available in

MATLAB, the toolbox also makes use of MATLAB’s object-oriented programming

(OOP) features. KSSOLV contains several predefined classes that can be easily

used to build a physical atomistic model in MATLAB and to construct numerical

objects associated with planewave discretized Kohn-Sham equations. These classes

are listed in Table I. The class names that appear in the first column of this table

are treated as keywords in KSSOLV. We will demonstrate how specific instances

of these classes (called objects) are created and used in KSSOLV. The internal
structure of these classes is explained in detail in [Yang 2007].

The use of the object-oriented design allows us to achieve two main objectives:


(1) Simplify the process of setting up a molecular or bulk system and converting

physical attributes of the system to numerical objects that users can work with

easily.

(2) Enable numerical analysts and computational scientists to easily develop, test

and compare different algorithms for solving the Kohn-Sham equation.

Class name   Purpose
-----------  -------------------------------------------------------------------------
Atom         Defines attributes of an atom
Molecule     Defines attributes of a molecule or a basic cell of a periodic system
Hamilt       Defines attributes of a Kohn-Sham Hamiltonian, e.g., potential
Wavefun      Defines one or a set of wavefunctions
FreqMask     Defines a mask used to filter high frequency components of a wavefunction

Table I. Classes defined in KSSOLV

In the following, we will illustrate how to define a molecular or bulk system in

KSSOLV by creating Atom and Molecule objects. We will then show how to set

up a Kohn-Sham Hamiltonian, which is represented as a Hamilt object, associ-

ated with a Molecule object. In KSSOLV, 3-D wavefunctions are represented as

Wavefun objects. Although each Wavefun object stores the Fourier coefficients of

a truncated planewave expansion of one or a few wavefunctions in a compact way,

it can be manipulated as either a vector or a matrix. Both the Hamilt and the

Wavefun objects are used extensively in the KSSOLV implementation of the SCF

and DCM algorithms. As we will see in the following, using these objects sig-

nificantly reduces the coding effort required to implement or prototype numerical

algorithms for solving the Kohn-Sham equations.

4.1 From Atoms to Molecules and Crystals

To solve the Kohn-Sham equations associated with a particular molecular or bulk

system in KSSOLV, we must first construct a Molecule object. Even though a

bulk system (such as a crystal) is physically different from a molecule, we currently

do not make such a distinction in KSSOLV. Both systems are considered periodic.

In the case of a molecule, the periodicity is introduced by placing the molecule in

a fictitious supercell that is periodically extended.

To construct a Molecule object, we use

mol = Molecule();

to first create an empty object called mol (a user-defined variable name). This call

simply sets up the required data structure that is used to describe attributes of

mol.

Before mol can be used in subsequent calculations, we must initialize all of its es-

sential attributes, which include the number and type of atoms in this molecule, the

size and shape of the supercell that contains the molecule, etc. All these attributes

can be defined by using the set method associated with the Molecule class. The

syntax of the set function is

mol = set(mol,attrname,attrvalue);
