Partly occupied Wannier functions: Construction and applications
ABSTRACT We have developed a practical scheme to construct partly occupied, maximally localized Wannier functions (WFs) for a wide range of systems. We explain and demonstrate how the inclusion of selected unoccupied states in the definition of the WFs can improve both their localization and symmetry properties. A systematic selection of the relevant unoccupied states is achieved by minimizing the spread of the resulting WFs. The method is applied to a silicon cluster, a copper crystal, and a Cu(100) surface with nitrogen adsorbed. In all cases we demonstrate the existence of a set of WFs with particularly good localization and symmetry properties, and we show that this set of WFs is characterized by a maximal average localization. Comment: 12 pages, 14 figures
arXiv:cond-mat/0506487v1 [cond-mat.mtrl-sci] 20 Jun 2005
Partly occupied Wannier functions: Construction and applications
K. S. Thygesen, L. B. Hansen, and K. W. Jacobsen
Center for Atomic-scale Materials Physics,
Department of Physics, Technical University of Denmark,
DK - 2800 Kgs. Lyngby, Denmark
(Dated: February 2, 2008)
We have developed a practical scheme to construct partly occupied, maximally localized Wannier
functions (WFs) for a wide range of systems. We explain and demonstrate how the inclusion
of selected unoccupied states in the definition of the WFs can improve both their localization
and symmetry properties. A systematic selection of the relevant unoccupied states is achieved by
minimizing the spread of the resulting WFs. The method is applied to a silicon cluster, a copper
crystal and a Cu(100) surface with nitrogen adsorbed. In all cases we demonstrate the existence of
a set of WFs with particularly good localization and symmetry properties, and we show that this
set of WFs is characterized by a maximal average localization.
PACS numbers: 71.15.Ap, 31.15.Ew, 31.15.Rh
I. INTRODUCTION
A characteristic property of the single-particle eigen-
states of most molecular and solid state systems is their
delocalized nature. For many practical purposes this
property is undesired and the construction of equivalent
representations in terms of localized orbitals becomes an
important issue.
Within the independent-particle approximation the
use of Wannier functions (WFs) allows for an exact de-
scription of the electronic groundstate in terms of a min-
imal set of localized orbitals1. The Wannier basis is truly
minimal in the sense that the number of orbitals is just
enough to accommodate the valence electrons of the system.
Moreover, these localized WFs provide a formal
justification of the widely used tight-binding2 and
Hubbard models3. Being the local analogue of the extended
Bloch states of solid state physics, the WFs formalize
standard chemical concepts such as bonding, coordina-
tion and electron lone pairs. Among the more technical
applications of Wannier functions we mention the connection
to polarization theory5,6 and their use within
so-called “linear scaling” or “order-N” methods to ob-
tain the electronic groundstate4. Very recently numerical
methods for electron transport calculations employing a
Wannier function basis set have been developed7,8.
In the context of molecular systems the analogue of
Wannier functions for finite systems has been studied under
the name “localized molecular orbitals”9,10,11,12,13,14.
These are traditionally defined by an appropriate unitary
transformation of the occupied single-particle eigenstates
and have been used for investigation of chemical bond-
ing. In the following we shall for simplicity use the term
WF to cover also localized molecular orbitals.
In 1997 Marzari and Vanderbilt developed a scheme
to perform practical calculations of maximally localized
Wannier functions for an isolated group of bands, i.e. a
set of bands which is separated by a finite gap from all
higher- and lower-lying bands15. Within this scheme, the
usual arbitrariness inherent in the definition of the Wannier
functions, due to the unspecified set of unitary transformations
of the Bloch states at every wave vector, is
removed by requiring that the sum of second moments
of the resulting WFs is minimal. The method follows
the traditional idea of defining Wannier functions by a
unitary transformation of the occupied (Bloch-) orbitals.
In general, such methods fail to produce well localized
orbitals when applied to metallic systems because the
unoccupied states belonging to the partly filled valence
bands18 are not considered. Of course, in cases where the
partly filled valence bands are separated by a gap from
all higher bands, the method of Marzari and Vanderbilt
still applies. However, in the more general case where
the bands of interest cross and/or hybridize with other
unwanted bands a different approach must be used.
In this paper we demonstrate how the localization and
in some cases also the symmetry of a set of WFs can
be drastically improved by including selected unoccupied
states in the definition of the WFs16. The determina-
tion of the relevant unoccupied states can be viewed as a
bonding-antibonding closing procedure, where occupied
bonding states are paired with their antibonding coun-
terparts to yield localized orbitals. To be more specific,
consider two well-localized atomic orbitals on neighbor-
ing atoms in a molecule. If we allow the two states to hy-
bridize, a bonding and an antibonding combination will
result – combinations which may be less localized than
FIG. 1: Schematic of the bonding-antibonding closure for a
hydrogen molecule. The construction of well-localized atomic
s-orbitals involves a matching of bonding and antibonding
orbitals, independent of their occupation. The sign of the
wave functions is indicated by the shading.
the individual atomic orbitals. To regain the localized
atomic orbitals from the molecular orbitals we need both
the bonding and the antibonding combination indepen-
dent of their occupation, see Fig. 1. In some cases the
antibonding state may have hybridized further with other
states and the state which “matches” the bonding state
will be a linear combination of eigenstates. The prob-
lem we address here is the construction of a method for
systematically identifying the relevant unoccupied states.
We show that this can be achieved by optimizing the lo-
calization of the resulting WFs. The paper gives a more
detailed and extended account of the work previously
published in a Letter.16
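The bonding-antibonding closure of Fig. 1 can be illustrated with a minimal two-site sketch. It is not taken from the paper: the coefficient pairs, the helper names `combine` and `weight_on_one_site`, and the 45° rotation are assumptions chosen for illustration only.

```python
import math

# Hypothetical two-site (H2-like) model: each state is a pair of
# coefficients on the left and right atomic s-orbitals.
s = 1.0 / math.sqrt(2.0)
bonding = (s, s)        # occupied in H2
antibonding = (s, -s)   # unoccupied in H2

def combine(theta, u, v):
    """Rotate two orthonormal states u and v into each other by the angle theta."""
    c, t = math.cos(theta), math.sin(theta)
    first = tuple(c * a + t * b for a, b in zip(u, v))
    second = tuple(-t * a + c * b for a, b in zip(u, v))
    return first, second

def weight_on_one_site(state):
    """Largest single-site weight; 1.0 means fully localized on one atom."""
    return max(x * x for x in state)

# Mixing the occupied and the unoccupied state recovers the atomic orbitals.
left, right = combine(math.pi / 4, bonding, antibonding)
```

In this toy model neither molecular orbital alone can be localized, while the pair together spans both atomic orbitals, regardless of which of the two is occupied.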
For periodic systems the bonding-antibonding closure
can be viewed as a procedure for disentangling the partly
occupied valence bands from higher-lying bands. This
problem has previously been addressed by Souza et al.19
who proposed a disentangling method based on a mini-
mization of the change in character of the Bloch states
across the Brillouin zone (BZ). While this is a natural
strategy for crystalline systems, it is not clear how this
disentanglement procedure applies to non-periodic sys-
tems like isolated molecules, a surface with adsorbates or
a metal with impurities.
The present method is related to that of Souza et al.19;
however, instead of minimizing the dispersion across the
BZ we suggest a disentanglement procedure based exclu-
sively on a minimization of the spread of the WFs. In
this way we omit any reference to the wave-vector and are
therefore not limited to periodic systems. The generality
of the method is demonstrated by application to three
different systems: an isolated Si5 cluster, a copper crystal,
and a Cu(100) surface with nitrogen adsorbed. Our
results for the copper crystal are very similar to those ob-
tained by Souza et al.19, and this indicates the similarity
of the two localization schemes for periodic systems.
The paper is organized as follows: In Sec. II we in-
troduce the spread functional and outline the strategy
behind the localization algorithm. In Sec. III we give the
formal definition of partly occupied WFs in the limiting
case of a large supercell and derive the corresponding ex-
pressions for the gradient of the spread functional. The
extension to periodic systems is discussed in Sec. IV. In
Sec. V we apply the method to a Si5 cluster, a copper
crystal and a Cu(100) surface with adsorbed nitrogen.
II. DESCRIPTION OF THE METHOD
In this section we introduce the spread functional used
to measure the degree of localization of a set of orbitals,
and give an introductory description of the localization
scheme including its relation to the method of Souza et
al.19
A. Spread functional
Within the localization scheme of Marzari and Vanderbilt15
the spread of a set of functions $\{w_n(\mathbf{r})\}_{n=1}^{N}$ is
measured by the sum of second moments
$$S = \sum_{n=1}^{N}\left(\langle w_n|r^2|w_n\rangle - \langle w_n|\mathbf{r}|w_n\rangle^2\right). \tag{1}$$
When periodic boundary conditions are applied, as in the
present study, and the supercell is sufficiently large, the
minimization of S is equivalent to the maximization of21
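$$\Omega = \sum_{n=1}^{N}\sum_{\alpha=1}^{N_G} W_\alpha |Z_{\alpha,nn}|^2, \tag{2}$$
where the matrix $Z_\alpha$ is defined as
$$Z_{\alpha,nm} = \langle w_n|e^{-i\mathbf{G}_\alpha\cdot\mathbf{r}}|w_m\rangle. \tag{3}$$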
The {Gα} is a set of at most six reciprocal lattice vectors
and {Wα} are corresponding weights which account for
the shape of the unit cell. For a definition and discussion
of these quantities we refer to Refs. 13,14.
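As a rough illustration of Eqs. (2) and (3), the sketch below evaluates a single diagonal element $Z_{nn}$ for a one-dimensional Gaussian orbital in a periodic cell. The one-dimensional setting, the single reciprocal vector $G = 2\pi/L$ with unit weight, the grid, and the function names are all assumptions made for this example. In this simplified setting the quantity $-\ln|Z|^2/G^2$ approximates the second-moment spread of Eq. (1), so a narrower orbital gives a larger $|Z|^2$.

```python
import cmath
import math

def z_element(density, cell_length):
    """<w|exp(-i G x)|w> for a density |w(x)|^2 sampled on a uniform grid."""
    n = len(density)
    g = 2.0 * math.pi / cell_length          # smallest reciprocal lattice vector
    norm = sum(density)                      # normalizes the sampled density
    total = sum(rho * cmath.exp(-1j * g * (j * cell_length / n))
                for j, rho in enumerate(density))
    return total / norm

def gaussian_density(sigma, cell_length, n):
    """|w(x)|^2 of a Gaussian orbital centred in the cell (variance sigma^2)."""
    x0 = 0.5 * cell_length
    values = []
    for j in range(n):
        d = j * cell_length / n - x0
        values.append(math.exp(-d * d / (2.0 * sigma ** 2)))
    return values

cell, n = 10.0, 200
omega_narrow = abs(z_element(gaussian_density(0.5, cell, n), cell)) ** 2
omega_wide = abs(z_element(gaussian_density(1.0, cell, n), cell)) ** 2
```

Here `omega_narrow` exceeds `omega_wide`, which is the sense in which maximizing Ω favours localized orbitals.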
B. The localization scheme
The starting point is the set of single-particle eigen-
states, {ψn}, resulting from a conventional electronic
structure calculation. For simplicity we shall assume that
the system is isolated or is contained in a large supercell
such that reference to k-points can be omitted. The aim
is to obtain a set of Nw localized WFs with the prop-
erty that any eigenstate below a specified energy, E0,
can be exactly reproduced as a linear combination of the
WFs. An obvious way to achieve this would be to ap-
ply the method of Marzari and Vanderbilt to compute
the unitary transformation of the Nw lowest eigenstates
leading to the most localized WFs. The problem with
this strategy is, however, that it is in general not possi-
ble to localize all WFs simultaneously, and the problem
cannot be overcome by increasing Nw.
Instead, we define an external localization space as
the space spanned by the Nb lowest-lying eigenstates
(Nb > Nw). Within this space we consider the subspace
spanned by the eigenstates with energy below E0, to-
gether with L extra degrees of freedom (EDF). We shall
refer to this subspace as the active localization space or
simply the localization space. The EDF are assumed to
be orthogonal and L is chosen such that the dimension
of the active localization space equals Nw. We then per-
form a simultaneous optimization of the WFs within the
active localization space and of the active localization
space itself. In practice this is achieved by optimizing an
Nw × Nw unitary matrix together with the coordinates of
the EDF such that the functional Ω becomes maximal.
It is the determination of the EDF that distinguishes
our method from that of Souza et al.19 In the latter,
the spread functional is decomposed into two terms:
$\Omega = \Omega_I + \tilde{\Omega}$, where $\Omega_I$ is related to the k-space dispersion
of the band-projection operator, see Ref. 19. In the
first step, the EDF are determined by maximizing $\Omega_I$,
which depends only on the localization space itself and
not on the internal unitary transformation. In the second
step $\tilde{\Omega}$, or equivalently $\Omega$, is then maximized within
the fixed localization space. It is clear that the separate
maximization of $\Omega_I$ and $\tilde{\Omega}$ does not amount to the
global maximization of $\Omega$ that we propose here. We shall,
however, see that the two methods lead to very similar
results in the case of periodic systems.
III. LARGE SUPERCELLS
In this section we give a detailed description of the
localization scheme in the limiting case of a large super-
cell where a Γ-point sampling of the first Brillouin zone
is a good approximation. For simplicity we discuss this
case separately before extending it to periodic systems,
although the latter contains the former as a special case.
After giving the definition of partly occupied Wannier
functions we derive expressions for the gradients of the
spread functional and discuss how to combine these with
a Lagrange multiplier scheme to determine the maximum
of Ω.
A. Definition of partly occupied Wannier functions
We denote the total number of eigenstates obtained
from the electronic structure calculation by $N_b$ and the
number of eigenstates below the energy $E_0$ by $M$. Our
aim is to construct a set of $N_w$ WFs which span at
least the $M$ lowest-lying eigenstates. The remaining
$L = N_w - M$ degrees of freedom are simply used to improve
the localization of the resulting WFs as much as
possible. We expand the WFs in terms of the $M$ lowest-lying
eigenstates and $L$ extra degrees of freedom, $\{\phi_l\}$,
belonging to the $(N_b - M)$-dimensional space of eigenstates
with energy above $E_0$:
$$w_n = \sum_{m=1}^{M} U_{mn}\psi_m + \sum_{l=1}^{L} U_{M+l,n}\phi_l, \tag{4}$$
where the extra degrees of freedom (EDF) are written as
$$\phi_l = \sum_{m=1}^{N_b-M} c_{ml}\psi_{M+m}. \tag{5}$$
The columns of the matrix c are orthonormal and
represent the coordinates of the EDF with respect
to the eigenstates lying above E0. The matrix U
is unitary and represents a rotation of the functions
{ψ1,...,ψM,φ1,...,φL}.
In order to simplify the notation we introduce the matrices
$$C = \begin{pmatrix} I_{M\times M} & 0 \\ 0 & c \end{pmatrix}, \qquad V = CU = \begin{pmatrix} U_M \\ cU_L \end{pmatrix}, \tag{6}$$
where $U_M$ and $U_L$ denote the $M$ uppermost and $L$ lowermost
rows of $U$, respectively. The ith column of $V$
gives the coordinates of $w_i$ with respect to the full set of
eigenstates $\{\psi_n\}$.
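The block structure of Eq. (6) can be made concrete with a small sketch. The sizes ($M = 1$, $L = 1$, $N_b = 3$), the angles, and the helper names `matmul`, `dagger` and `build_C` are assumptions for illustration. The sketch checks that $V = CU$ has orthonormal columns whenever $c$ has orthonormal columns and $U$ is unitary.

```python
import math

def matmul(a, b):
    """Plain nested-list matrix product (adequate for tiny examples)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a nested-list matrix."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def build_C(M, c):
    """C = [[I_M, 0], [0, c]], where c has shape (Nb - M) x L."""
    rows, L = len(c), len(c[0])
    top = [[1.0 if i == j else 0.0 for j in range(M + L)] for i in range(M)]
    bottom = [[0.0] * M + list(c[i]) for i in range(rows)]
    return top + bottom

M = 1
phi, theta = 0.3, 0.7                        # arbitrary illustrative angles
c = [[math.cos(phi)], [math.sin(phi)]]       # one normalized EDF, Nb - M = 2
U = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]     # Nw x Nw rotation, Nw = M + L = 2
V = matmul(build_C(M, c), U)                 # Nb x Nw coefficient matrix
gram = matmul(dagger(V), V)                  # should equal the identity
```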
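Substituting the expansions (4) and (5) into Eq. (3)
we obtain a compact matrix expression
$$Z_\alpha = V^\dagger Z^{(0)}_\alpha V = U^\dagger C^\dagger Z^{(0)}_\alpha C U, \tag{7}$$
where $Z^{(0)}_\alpha$ is obtained from Eq. (3) by using the eigenstates
$\{\psi_n\}$ in the inner product,
$$Z^{(0)}_{\alpha,nm} = \langle\psi_n|e^{-i\mathbf{G}_\alpha\cdot\mathbf{r}}|\psi_m\rangle. \tag{8}$$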
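To show how Eq. (7) feeds into Eq. (2), the following sketch evaluates Ω for a hypothetical two-site model with perfectly localized site orbitals. The cell length, site positions, single $G$ vector with unit weight, and all function names are assumptions. With no EDF present, $C$ is the identity and $Z = U^\dagger Z^{(0)} U$.

```python
import cmath
import math

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a nested-list matrix."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def omega(z0, u, weight=1.0):
    """Omega = W * sum_n |Z_nn|^2 with Z = U^dagger Z0 U (one G vector)."""
    z = matmul(dagger(u), matmul(z0, u))
    return weight * sum(abs(z[n][n]) ** 2 for n in range(len(z)))

cell = 8.0
g = 2.0 * math.pi / cell
phases = [cmath.exp(-1j * g * x) for x in (3.0, 5.0)]   # assumed site positions
z_site = [[phases[0], 0.0], [0.0, phases[1]]]

# Columns of t: bonding and antibonding eigenstates in the site basis.
t = rotation(math.pi / 4)
z0 = matmul(dagger(t), matmul(z_site, t))    # Z^(0) in the eigenstate basis

omega_eigenstates = omega(z0, rotation(0.0))          # no rotation applied
omega_localized = omega(z0, rotation(-math.pi / 4))   # rotates back to the sites
```

For these parameters Ω doubles when the eigenstates are rotated back to the site orbitals.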
B. Gradient of Ω
Through Eq. (7) the spread functional, Ω, in Eq. (2)
becomes a function of the matrices U and c. The max-
imum of Ω can be found iteratively by updating U and
c in the direction given by the gradient. In the following
we derive expressions for the gradient of Ω.
We write the unitary matrix at iteration n as $U^{(n)} = U^{(n-1)}\exp(-A)$,
where $A$ is an anti-Hermitian matrix.
Since we are only concerned with small variations, we
expand the exponential to first order, i.e. $\exp(-A) \simeq 1 - A$.
Inserting this into Eqs. (7) and (2) we find
$$\frac{\partial\Omega}{\partial A_{ij}} = \sum_{\alpha=1}^{N_G} W_\alpha\left[Z_{\alpha,ji}\left(Z^*_{\alpha,jj} - Z^*_{\alpha,ii}\right) - Z^*_{\alpha,ij}\left(Z_{\alpha,ii} - Z_{\alpha,jj}\right)\right]. \tag{9}$$
All matrices in this expression refer to iteration n − 1.
The new rotation at iteration n is then obtained by multiplying
$U^{(n-1)}$ by $\exp[-d(\nabla_A\Omega)]$, where $d$ is the length
of the steepest-ascent step and $[\nabla_A\Omega]_{ij} = \partial\Omega/\partial A_{ij}$.
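The reason for updating $U$ through the exponential of an anti-Hermitian matrix, rather than through the first-order form $1 - A$, is that the exponential keeps $U$ exactly unitary. The sketch below checks this numerically. The 2×2 matrix `A`, the step length, and the Taylor-series routine `expm_series` are assumptions for illustration; the gradient of Eq. (9) is not evaluated here.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a nested-list matrix."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm_series(x, terms=30):
    """exp(x) from its Taylor series; adequate for small step matrices."""
    n = len(x)
    result, term = identity(n), identity(n)
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def unitarity_error(u):
    """Largest deviation of U^dagger U from the identity."""
    gram = matmul(dagger(u), u)
    n = len(u)
    return max(abs(gram[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

a = 0.3 + 0.1j
A = [[0.0, a], [-a.conjugate(), 0.0]]        # anti-Hermitian: A^dagger = -A
d = 0.5                                      # assumed step length
minus_dA = [[-d * v for v in row] for row in A]

U_exponential = expm_series(minus_dA)        # exp(-d A)
U_first_order = [[identity(2)[i][j] + minus_dA[i][j] for j in range(2)]
                 for i in range(2)]          # 1 - d A
```

The first-order update drifts away from unitarity by a term of order $d^2$, whereas the exponential update does not.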
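We now turn to the problem of determining the steepest
uphill direction of Ω with respect to variations in c.
In general, for a real-valued function $f(z = x + iy)$ the
direction of steepest ascent with respect to $z$ is given by
$$\frac{\partial f}{\partial z^*} \equiv \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right). \tag{10}$$
To calculate the gradient $\partial\Omega/\partial c^*_{ij}$ we use that
$$\frac{\partial|Z_{\alpha,nn}|^2}{\partial c^*_{ij}} = Z_{\alpha,nn}\frac{\partial Z^*_{\alpha,nn}}{\partial c^*_{ij}} + Z^*_{\alpha,nn}\frac{\partial Z_{\alpha,nn}}{\partial c^*_{ij}}. \tag{11}$$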
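Equation (10) can be checked numerically on a simple test function. The sketch below uses central differences and the assumed example $f(z) = |z|^2$, for which the steepest-ascent direction is $z$ itself; the function name and step size are illustrative choices.

```python
def steepest_ascent_direction(f, z, h=1e-6):
    """Central-difference estimate of (1/2)(df/dx + i df/dy) at z."""
    dfdx = (f(z + h) - f(z - h)) / (2.0 * h)
    dfdy = (f(z + 1j * h) - f(z - 1j * h)) / (2.0 * h)
    return 0.5 * (dfdx + 1j * dfdy)

def f(z):
    """Real-valued test function of a complex variable."""
    return abs(z) ** 2

z0 = 0.3 - 0.7j
direction = steepest_ascent_direction(f, z0)   # close to z0 for f = |z|^2
```

A small step from `z0` along `direction` increases `f`, as expected for an ascent direction.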
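From Eq. (7) it follows that
$$\frac{\partial Z_{\alpha,nn}}{\partial c^*_{ij}} = \sum_{abcd} U^\dagger_{na}\frac{\partial C^\dagger_{ab}}{\partial c^*_{ij}} Z^{(0)}_{\alpha,bc} C_{cd} U_{dn} + \sum_{abcd} U^\dagger_{na} C^\dagger_{ab} Z^{(0)}_{\alpha,bc}\frac{\partial C_{cd}}{\partial c^*_{ij}} U_{dn}, \tag{12}$$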
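and from definition (6)
$$\frac{\partial C_{nm}}{\partial c^*_{ij}} = 0, \tag{13}$$
$$\frac{\partial C^\dagger_{nm}}{\partial c^*_{ij}} = \delta_{m,M+i}\,\delta_{n,M+j}. \tag{14}$$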
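It is now easy to establish that
$$\frac{\partial Z_{\alpha,nn}}{\partial c^*_{ij}} = [Z^{(0)}_\alpha V]_{M+i,n}\, U^*_{M+j,n}, \tag{15}$$
$$\frac{\partial Z^*_{\alpha,nn}}{\partial c^*_{ij}} = [(Z^{(0)}_\alpha)^\dagger V]_{M+i,n}\, U^*_{M+j,n}. \tag{16}$$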
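Combining Eq. (11) with (15) and (16) we arrive at the
desired expression
$$\frac{\partial\Omega}{\partial c^*_{ij}} = \sum_{\alpha=1}^{N_G} W_\alpha\left[Z^{(0)}_\alpha V D(Z^*_\alpha) U^\dagger + (Z^{(0)}_\alpha)^\dagger V D(Z_\alpha) U^\dagger\right]_{M+i,M+j}, \tag{17}$$
where $D(Z_\alpha)$ is a diagonal matrix with the elements $Z_{\alpha,nn}$ on the
diagonal.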
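The analytic gradient of Eq. (17) can be compared against finite differences taken according to Eq. (10). The sketch below does this for one $G$ vector with unit weight. The 3×3 matrix `z0`, the sizes ($M = 1$, $L = 1$), the values of `c` and `u`, and all function names are hypothetical inputs chosen only to exercise the formula.

```python
import math

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a nested-list matrix."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def build_v(m, c, u):
    """V = C U with C = [[I_M, 0], [0, c]] as in Eq. (6)."""
    n_w = len(u)
    c_big = [[1.0 if i == j else 0.0 for j in range(n_w)] for i in range(m)]
    c_big += [[0.0] * m + list(row) for row in c]
    return matmul(c_big, u)

def omega(z0, m, c, u):
    """Eq. (2) for a single G vector with unit weight."""
    v = build_v(m, c, u)
    z = matmul(dagger(v), matmul(z0, v))
    return sum(abs(z[n][n]) ** 2 for n in range(len(z)))

def gradient_c(z0, m, c, u):
    """Eq. (17): dOmega/dc*_ij for a single G vector with unit weight."""
    v = build_v(m, c, u)
    z = matmul(dagger(v), matmul(z0, v))
    a = matmul(z0, v)                # Z0 V
    b = matmul(dagger(z0), v)        # Z0^dagger V
    u_dag = dagger(u)
    n_w = len(u)
    return [[sum((a[m + i][n] * z[n][n].conjugate() + b[m + i][n] * z[n][n])
                 * u_dag[n][m + j] for n in range(n_w))
             for j in range(len(c[0]))] for i in range(len(c))]

def numerical_gradient(z0, m, c, u, i, j, h=1e-5):
    """(1/2)(d/dx + i d/dy) of Omega with respect to c_ij, cf. Eq. (10)."""
    def shifted(delta):
        c2 = [list(row) for row in c]
        c2[i][j] = c2[i][j] + delta
        return omega(z0, m, c2, u)
    d_dx = (shifted(h) - shifted(-h)) / (2.0 * h)
    d_dy = (shifted(1j * h) - shifted(-1j * h)) / (2.0 * h)
    return 0.5 * (d_dx + 1j * d_dy)

# Hypothetical inputs: Nb = 3 eigenstates, M = 1 fixed state, L = 1 EDF.
z0 = [[0.9, 0.1j, 0.05],
      [-0.1j, 0.5 + 0.2j, 0.3],
      [0.05, 0.2 - 0.1j, -0.4j]]
m = 1
c = [[0.6], [0.8j]]
u = [[math.cos(0.4), -math.sin(0.4)], [math.sin(0.4), math.cos(0.4)]]
analytic = gradient_c(z0, m, c, u)
```

The comparison treats the entries of `c` as unconstrained variables; the orthonormality constraint on the EDF is handled separately in the text.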
To treat the constraint that the EDF $\{\phi_l\}$ should be
orthonormal during the maximization procedure we introduce
the Lagrange multipliers $\lambda_{ij}$ and perform an
unconstrained maximization of the functional
$$\Omega_L = \Omega - \sum_{ij}\lambda_{ij}\langle\phi_i|\phi_j\rangle. \tag{18}$$
The Lagrange multipliers are initially unknown and must
be estimated at each iteration. At the maximum we have
$\nabla_{c^*}\Omega_L = 0$ which is equivalent to the condition
$$\nabla_{c^*}\Omega - c\,\lambda^T = 0. \tag{19}$$
Multiplying by $c^\dagger$ from the left leads to
$$\lambda^T = c^\dagger\nabla_{c^*}\Omega. \tag{20}$$
This relation can be used to estimate the Lagrange multi-
pliers at each iteration. A step of length d in the steepest
uphill direction is thus accomplished by adding to c the
matrix d(1 − cc†)∇c∗Ω, followed by an orthonormaliza-
tion of the columns of c.
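The constrained step described above can be sketched as follows. The gradient matrix `grad` is an arbitrary placeholder, not a value computed from Eq. (17), and the helper names as well as the use of classical Gram-Schmidt for the orthonormalization are assumptions.

```python
import math

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose of a nested-list matrix."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]

def projected_step(c, grad):
    """(1 - c c^dagger) grad: the gradient minus its part inside span(c)."""
    lam_t = matmul(dagger(c), grad)          # lambda^T estimate of Eq. (20)
    inside = matmul(c, lam_t)
    return [[grad[i][j] - inside[i][j] for j in range(len(grad[0]))]
            for i in range(len(grad))]

def orthonormalize_columns(c):
    """Classical Gram-Schmidt on the columns of c."""
    rows, cols = len(c), len(c[0])
    basis = []
    for j in range(cols):
        v = [complex(c[i][j]) for i in range(rows)]
        for b in basis:
            overlap = sum(bi.conjugate() * vi for bi, vi in zip(b, v))
            v = [vi - overlap * bi for bi, vi in zip(b, v)]
        norm = math.sqrt(sum(abs(vi) ** 2 for vi in v))
        basis.append([vi / norm for vi in v])
    return [[basis[j][i] for j in range(cols)] for i in range(rows)]

def ascent_step(c, grad, d):
    """Add d (1 - c c^dagger) grad to c, then re-orthonormalize the columns."""
    step = projected_step(c, grad)
    moved = [[c[i][j] + d * step[i][j] for j in range(len(c[0]))]
             for i in range(len(c))]
    return orthonormalize_columns(moved)

c = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]               # two orthonormal EDF
grad = [[0.2, 0.1j], [0.3, -0.4], [0.5 + 0.5j, 0.6]]   # placeholder gradient
step = projected_step(c, grad)
c_new = ascent_step(c, grad, d=0.1)
```

The projection removes the component of the gradient that would only rotate the EDF among themselves, and the final orthonormalization corrects the second-order violation of the constraint introduced by the finite step.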
IV. PERIODIC SYSTEMS
We consider a periodic system with a unit cell defined
by basis vectors a1,a2,a3 which in turn define the ba-
sis vectors of the reciprocal lattice b1,b2,b3. The Bloch
states, $\{\psi_{n\mathbf{k}}\}$, resulting from the electronic structure calculation
are characterized by a band index n and a crystal
momentum k. The total number of bands is denoted by
$N_b$ and the number of eigenstates at a given k-point with
energy below $E_0$ is denoted by $M_k$. We assume a uniform
sampling of the first BZ such that any k-point can
be written as
$$\mathbf{k} = \frac{n_1}{N_1}\mathbf{b}_1 + \frac{n_2}{N_2}\mathbf{b}_2 + \frac{n_3}{N_3}\mathbf{b}_3, \tag{21}$$
where Ni is the number of k-points in the direction bi
and ni = 0,...,Ni− 1. Note that the Γ point is al-
ways included. With this convention the Bloch states,
{ψnk}, correspond exactly to the Γ-point eigenstates of
the repeated cell defined by the extended basis vectors
N1a1,N2a2,N3a3. An alternative way of stating this cor-
respondence is to say that the k-points in Eq. (21) fall
on the reciprocal lattice of the repeated cell, see Fig. 2.
As we shall see below, this correspondence allows us to
use the spread functional Ω defined in Eq. (2) also for the
periodic system. We stress that the formalism developed
in the following section contains the Γ-point formalism
described in the preceding sections as a special case.
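The grid of Eq. (21) and its relation to the repeated cell can be sketched in fractional coordinates. The function names and the 3×3×1 example are assumptions. The check uses the relation $\mathbf{a}_i\cdot\mathbf{b}_j = 2\pi\delta_{ij}$, so that $\mathbf{k}\cdot(N_i\mathbf{a}_i)/2\pi = n_i$ is an integer for every grid point.

```python
from fractions import Fraction
from itertools import product

def kpoint_grid(n1, n2, n3):
    """Fractional coordinates (n_i / N_i) of the uniform grid of Eq. (21)."""
    return [(Fraction(a, n1), Fraction(b, n2), Fraction(c, n3))
            for a, b, c in product(range(n1), range(n2), range(n3))]

def phase_windings(k, sizes):
    """k . (N_i a_i) / (2 pi) for each repeated-cell basis vector N_i a_i."""
    return [k_i * n_i for k_i, n_i in zip(k, sizes)]

sizes = (3, 3, 1)
grid = kpoint_grid(*sizes)

# Every Bloch phase exp(i k . N_i a_i) equals 1, so each Bloch state is
# periodic in the repeated cell, i.e. a Gamma-point state of that cell.
all_integer = all(w.denominator == 1
                  for k in grid for w in phase_windings(k, sizes))
```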
A. Definition of partly occupied Wannier functions
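We write the nth Wannier function related to unit cell
i as
$$w_{i,n} = \frac{1}{\sqrt{N_k}}\sum_{\mathbf{k}} e^{-i\mathbf{k}\cdot\mathbf{R}_i}\,\tilde{\psi}_{n\mathbf{k}}, \tag{22}$$
where $N_k$ is the total number of k-points and $\tilde{\psi}_{n\mathbf{k}}$ is a
generalized Bloch state to be defined below15. Each generalized
band, i.e. each set $\{\tilde{\psi}_{n\mathbf{k}}\}$ for fixed n, gives rise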
to one WF per unit cell. These WFs are simply related
by translation, i.e. wi,n(r) = w0,n(r − Ri), and thus it
suffices to consider the WFs of the cell at the origin. In
doing this we can omit the cell index and simply denote
the WFs by {wn}. We denote the number of WFs per
cell by Nw.
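Following the idea behind Eq. (4) we expand the generalized
Bloch state $\tilde{\psi}_{n\mathbf{k}}$ in terms of the $M_k$ lowest-lying
Bloch states and $L_k$ extra degrees of freedom, $\{\phi_{l\mathbf{k}}\}$, from
the remaining $(N_b - M_k)$-dimensional space
$$\tilde{\psi}_{n\mathbf{k}} = \sum_{m=1}^{M_k} U^{\mathbf{k}}_{mn}\psi_{m\mathbf{k}} + \sum_{l=1}^{L_k} U^{\mathbf{k}}_{M_k+l,n}\phi_{l\mathbf{k}}, \tag{23}$$
where the EDF are expanded as
$$\phi_{l\mathbf{k}} = \sum_{m=1}^{N_b-M_k} c^{\mathbf{k}}_{ml}\psi_{M_k+m,\mathbf{k}}. \tag{24}$$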
The number of EDF at a given k-point is determined by
the condition $L_k + M_k = N_w$. If $M_k$ exceeds $N_w$, we
simply put $M_k = N_w$. Due to the exact correspondence
between the Bloch states $\{\psi_{n\mathbf{k}}\}$ and the Γ-point eigenstates
of the repeated cell, we can use the functional (2)
to measure the spread of the Wannier functions. The
matrices $Z_\alpha$ are still defined by Eq. (3) but it should be
remembered that the inner product as well as the reciprocal
lattice vector $\mathbf{G}_\alpha$ now refer to the repeated cell.
From Eqs. (23,24) we find the following generalization of
Eq. (7)
$$Z_\alpha = \sum_{\mathbf{k},\mathbf{k}'} Z^{\mathbf{k}\mathbf{k}'}_\alpha, \tag{25}$$
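where
$$Z^{\mathbf{k}\mathbf{k}'}_\alpha = (U^{\mathbf{k}})^\dagger (C^{\mathbf{k}})^\dagger Z^{(0),\mathbf{k}\mathbf{k}'}_\alpha C^{\mathbf{k}'} U^{\mathbf{k}'}. \tag{26}$$
The matrix $C^{\mathbf{k}}$ is given by the obvious k-point analogue
of Eq. (6) and the matrix $Z^{(0),\mathbf{k}\mathbf{k}'}_\alpha$ is defined by
$$Z^{(0),\mathbf{k}\mathbf{k}'}_{\alpha,nm} = \langle\psi_{n\mathbf{k}}|e^{-i\mathbf{G}_\alpha\cdot\mathbf{r}}|\psi_{m\mathbf{k}'}\rangle. \tag{27}$$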
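The per-k-point bookkeeping behind Eqs. (23) and (24), including the rule that $M_k$ is capped at $N_w$, can be sketched as follows. The band energies, the k-point labels and the function name are hypothetical values introduced only for illustration.

```python
def count_fixed_and_extra(energies, e0, n_w):
    """Return (M_k, L_k) for one k-point.

    M_k counts the eigenstates below e0, capped at n_w, and
    L_k = n_w - M_k is the number of extra degrees of freedom.
    """
    m_k = sum(1 for e in energies if e < e0)
    m_k = min(m_k, n_w)              # if M_k exceeds N_w, put M_k = N_w
    return m_k, n_w - m_k

# Hypothetical band energies (eV) at two k-points, with Nb = 5 bands each.
bands = {
    "k1": [-5.0, -1.0, 0.5, 2.0, 6.0],
    "k2": [-4.0, 1.5, 2.5, 3.0, 7.0],
}
counts = {k: count_fixed_and_extra(e, e0=1.0, n_w=4) for k, e in bands.items()}
```

With these numbers the first k-point keeps three fixed states and one EDF, while the second keeps one fixed state and three EDF, so both contribute $N_w = 4$ generalized Bloch states.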
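Most of the matrices $Z^{(0),\mathbf{k}\mathbf{k}'}_\alpha$ are in fact zero. Writing
the Bloch functions as $\psi_{n\mathbf{k}} = u_{n\mathbf{k}}(\mathbf{r})\exp(i\mathbf{k}\cdot\mathbf{r})$, where
$u_{n\mathbf{k}}$ has the periodicity of the lattice, we get
$$Z^{(0),\mathbf{k}\mathbf{k}'}_{\alpha,nm} = \int u^*_{n\mathbf{k}}(\mathbf{r})\,u_{m\mathbf{k}'}(\mathbf{r})\,e^{i(\mathbf{k}'-\mathbf{k}-\mathbf{G}_\alpha)\cdot\mathbf{r}}\,d\mathbf{r}, \tag{28}$$
which is non-zero only when
$$\mathbf{k}' = \mathbf{k} + \mathbf{G}_\alpha. \tag{29}$$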
Here it is implicit that k and k′ belong to the first BZ
and thus it might be necessary to translate k′ by a reciprocal
lattice vector. The relation between k and k′ is
illustrated in Fig. 2. Note that the condition in Eq. (29)
reduces the double sum in Eq. (25) to a single sum over
k.
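The selection rule of Eq. (29), including the translation of k′ back into the first BZ, can be sketched with integer grid indices. Here a k-point is labelled by $(n_1, n_2, n_3)$ as in Eq. (21), and $\mathbf{G}_\alpha$ is assumed to be given as integer multiples of the repeated-cell reciprocal vectors $\mathbf{b}_i/N_i$; the function names are hypothetical.

```python
from itertools import product

def partner_kpoint(k_index, g_index, sizes):
    """Grid index of k' = k + G_alpha, folded back into the first BZ."""
    return tuple((n + g) % size for n, g, size in zip(k_index, g_index, sizes))

def coupled_pairs(g_index, sizes):
    """All (k, k') pairs allowed by Eq. (29): exactly one partner per k."""
    return [(k, partner_kpoint(k, g_index, sizes))
            for k in product(*(range(size) for size in sizes))]

sizes = (3, 3, 1)
pairs = coupled_pairs((1, 0, 0), sizes)
```

For this 3×3×1 grid only 9 of the 81 possible (k, k′) combinations survive, which is the reduction of the double sum in Eq. (25) to a single sum.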
FIG. 2: Relation between the first BZ of the unit cell, defined
by the reciprocal basis vectors b1,b2,b3 (light gray), and the
first BZ of the repeated unit cell (dark gray). In this case N1
and N2 from Eq. (21) both equal 3. The relation between k
and k′, given in Eq. (29), is indicated.
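The derivation of the gradient of Ω follows closely the
Γ-point case discussed in Sec. IIIB and is therefore omitted.
The result is
$$\frac{\partial\Omega}{\partial A^{\mathbf{k}}_{ij}} = \sum_{\alpha=1}^{N_G} W_\alpha\left[(Z_{\alpha,jj})^* Z^{\mathbf{k}-\mathbf{G}_\alpha,\mathbf{k}}_{\alpha,ji} + Z_{\alpha,jj}\,(Z^{\mathbf{k},\mathbf{k}+\mathbf{G}_\alpha}_{\alpha,ij})^* - (Z_{\alpha,ii})^* Z^{\mathbf{k},\mathbf{k}+\mathbf{G}_\alpha}_{\alpha,ji} - Z_{\alpha,ii}\,(Z^{\mathbf{k}-\mathbf{G}_\alpha,\mathbf{k}}_{\alpha,ij})^*\right], \tag{30}$$
and
$$\frac{\partial\Omega}{\partial (c^{\mathbf{k}}_{ij})^*} = \sum_{\alpha=1}^{N_G} W_\alpha\left[Z^{(0),\mathbf{k},\mathbf{k}+\mathbf{G}_\alpha}_\alpha V^{\mathbf{k}+\mathbf{G}_\alpha} D(Z^*_\alpha)(U^{\mathbf{k}})^\dagger + (Z^{(0),\mathbf{k}-\mathbf{G}_\alpha,\mathbf{k}}_\alpha)^\dagger V^{\mathbf{k}-\mathbf{G}_\alpha} D(Z_\alpha)(U^{\mathbf{k}})^\dagger\right]_{M_k+i,M_k+j}. \tag{31}$$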
We note that these expressions, of course, reduce to
Eqs. (9,17) in the limit of a single k-point. The max-
imization of Ω proceeds along the same lines as for the
Γ-point case, except that Lagrange multipliers are needed
for each k-point. For example the analogue of Eq. (18)
reads
$$\Omega_L = \Omega - \sum_{ij,\mathbf{k}}\lambda_{ij,\mathbf{k}}\langle\phi_{i\mathbf{k}}|\phi_{j\mathbf{k}}\rangle. \tag{32}$$
B. Optimizing the number of extra degrees of freedom
For given values of $N_b$, $N_w$ and $E_0$, the algorithm introduced
above produces the $N_w$ most localized WFs
that can be formed within the external localization space
when all eigenstates below $E_0$ should be exactly reproducible
in terms of the WFs. It remains to determine the
optimal values for $N_b$ and $N_w$ for a given $E_0$. Let us start
by considering the situation where $N_b$ has been fixed at
a value which is large enough to include all anti-bonding
states relevant for the localization. In practice this typically
means including eigenstates up to ∼ 10 eV above the Fermi level.
It seems a natural strategy to choose $N_w$ such that the localization
per orbital is maximal. To quantify this condition we
define the average localization per orbital as
$$\langle\Omega\rangle = \frac{\Omega[E_0,N_b,N_w]}{N_w}, \tag{33}$$
where we have indicated the dependence of Ω on the
three parameters explicitly. We note that since the value
of Ω also depends on the size and shape of the super-
cell, it does not make sense to compare the value of Ω
for systems described in different supercells. Fixing Nw
on the basis of ⟨Ω⟩ represents a completely general criterion
which can be applied in any situation. However,