BAND: part of PhD Thesis G te Velde

advertisement
BAND
a Fortran program for bandstructure calculations
Contents
B AND
1
Flo w cha rt
3
1 Intro d uctio n
fo r malism
ap p licab ility and r esults
b asic char acter istics o f the metho d
6
6
7
8
2 Im p lem enta tio n
sur vey
S o ftwar e S ectio ns
11
11
14
3 Ba sis
gener al
linear d ep end ency
fr o zen co r e
valenc e-co r e d ep end ency
the Stab le State Ap p r o ach (SSA)
16
16
17
17
18
20
4 BZ - integ ra tio n
d ata str uctur e
simp lices and p o ints
integr atio n
22
24
24
25
5 Cha rg e d ensity
evaluatio n
analysis
28
28
30
6 Co ntro l
31
7 Co ulo m b p o tentia l a nd la ttice sum s
fit functio ns and fit p o tentials
lattice sums
p o tential and d ensity, an o ther relatio n
33
35
37
50
8 DOS: d ensity o f sta tes a nd p o p ula tio n a na lysis
53
9 Energ y: to ta l a nd co hesive
fit er r o r
electr o static inter actio n b etween neutral ato ms
electr o static Mad elung energy
56
59
59
60
1 0 Files
62
1 1 Fo rm f a cto rs
65
1 2 Geo m etry
p lanes
o r ientatio n and r o tatio n
lines
p o lygo ns
p o lyhed r a
symmetr y
simp lices
67
67
67
68
68
69
70
71
1 3 Inp ut
72
Acronyms
2
geo metr y
functio n sets
p r int d ir ectives
gener al co ntr o l
keys with b lo ck typ e inp ut
74
75
77
77
78
1 4 Integ ra tio n
inp ut
78
79
1 5 Interp o la tio n a nd b lo ch sum s
inter p o latio n
b lo ch sums
85
85
87
1 6 Itera tio n
o p timized d amp ing
Cheb yshev acceler atio n
88
88
93
1 7 La ttice vecto rs
Or ientatio n
96
96
1 8 Leg end re p o ints
the d istr ib utio n functio n
96
98
1 9 SCF linea riza tio n
ap p r o ximatio n o f the p o tential
ap p r o ximatio n o f the d ensity
98
99
101
2 0 Sym m etry
co mp utatio n o f the o p er ato rs
symmetr y in k -sp ace
integr atio n in k-sp ace
integr atio n in r eal sp ace
sp ecial symmetr y r o utines
symmetr y ad ap ted functio ns
102
104
107
109
110
111
116
2 1 Tem p era ture
120
2 2 Wo rk sp a ce
d ynamical allo catio n o f arrays
o p timized vecto r lengths
123
123
124
2 3 XC: ex cha ng e a nd co rrela tio n
lo cal XC functio ns
Sto ll's co r r ectio n fo r the co rrelatio n
gr ad ient co r r ectio n fo r the exchange
127
127
129
130
So ftwa re Ref ere nce List A: sub ro utines
134
This chapter gives an extensive discussion of the computer program with many details concerning the
software. We start with a list of acronyms and abbreviations that will be used and a global flow chart.
BZ
Brillouin Zone
NAO
numerical atomic orbital
DF
density functional
PW
plane wave
DOS
density of states
SCF
self consistent field
irrep
irreducible representation
SSA
stable state approach
LCAO
Linear combination of atomic
STO
Slater type orbital
XC
exchange and correlation
orbitals
LD
local density
Flow chart
BAND
INIT
GETINP
GEMTRY
POINTS
SYMCRY
ATMSET
SYMPRJ
SYMADD
KPNT
PREPAR
RADIAL
RADFNC
DIRAC
SLTORB
FITRAD
RADMAX
CELLS
CELMAX
FITSYM
NUMGRD
RPNTID
initiation of variables
input analysis
geometry master routine
symmetry unique points and weights for numerical integration
space group operators
division of atoms in sets of symmetry equivalent ones
point group operators in reciprocal space, derived from the space group
operators
inclusion of inversion symmetry into the k-space point group
integration points and related data for integrals over the BZ
master routine for the preparation of crystal functions
master routine for radial parts of one-center function
master routine for generation of radial function tables
numerical solution of DF equations for spherically symmetric free
atoms; output tables (NAOs, atomic densities and coulomb potentials)
STOs, tables
Slater type fit functions (STF's) and potential functions
maximum radial extension of any tabulated function (valence, fit, atomic
density, atomic coulomb potential)
coordinates of lattice points that are relevant for the calculation of bloch
sums
for each radial function table: the number of cells needed in the bloch
sum, derived from its radial extension
number of totally symmetric fit functions
master routine for computation of function values in the crystal
integration points: interpolation from the radial tables, bloch summation
by a loop over the relevant cells
generation of all symmetry equivalent integration points, from the unique
ones
Flow chart
RPNTRE
VMULTI
ATOMIC
ELSTAT
ATMFNC
BASORT
PLANEW
BASPNT
BASCOR
BASOVL
BASTRA
HAMFIX
FITORT
FITPNT
FITOVL
FITTRA
(end of NUMGRD)
4
organization of the file with points: blocks of points with a predetermined maximum length (the vector length)
multipole potentials: potential values in the integration points due to a
Bravais lattice of multipoles; each {lm}-multipole; Bravais lattices
centered on each atom; symmetrized functions by combining symmetry
equivalent atoms
master routine for data related to the free-atom charges and coulomb
potentials
electrostatic interaction between unrelaxed free atoms (energy term in
the cohesive energy)
superposition of the atomic densities and coulomb potentials in the
integration points (reference charge density and potential for the SCF
procedure)
master routine for the crystal core and valence functions
characteristics of the planes waves in the valence basis (lattice points in
reciprocal space)
bloch summations of one-center basis functions (core and valence); PWs
are added to the valence function set
explicit orthogonalization of the valence basis on all core states
overlap matrix of the basis
transformation to an orthonormal basis
fixed part of the hamiltonian matrix (in the orthonormal basis): kinetic
energy and coulomb potential due to the free atoms; matrices (one for
each k-point) are stored on file
master routine for the fit functions
interpolation and bloch summation (k=0); combination into functions
that are totally symmetric with respect to all space group operators
overlap matrix of the (symmetrized) fit set
transformation to an orthonormal basis
(end of PREPAR)
REORGF
reorganization of all files containing function values in the integration
points: fewer blocks of points / more points per block
SCF
master routine for self consistency iterations
test termination conditions of the SCF procedure
charge density in the integration points from the density matrix
(RHOPMT) or
from the eigenstates (RHOPSI)
fit coefficients: expansion of the deformation density in fit functions, to
solve (approximately) the Poisson equation
the potential (coulomb plus XC), computed from the density and the fit
coefficients
evaluation and diagonalization of the hamiltonian matrices in the
respective k-points
master routine for k-space analysis
SCFTST
RHOPMT,
RHOPSI
RHOFIT
RHOPOT
EIGSYS
FERMI
Flow chart
FERMIE
PMATRX
5
fermi energy from the energy bands; occupation numbers for all oneelectron states
the density matrix in the representation of the crystal valence basis
(end of SCF)
PRPRTS
ENERGY
CHARGE
DOS
DOSTRA
DOSCAL
POPANA
FORMFA
PLANEW
CELRED
FORMF1
(end of PRPRTS)
(end of BAND)
master routine for the analysis of results; computation of properties
cohesive and total energy
charge distribution over the atoms by numerical integration
master routine for the (total and partial) density of states
back transformation of the self-consistent wave functions to the original,
non-orthogonal bloch-valence basis: for the partial densities of states and
for the Mulliken population analysis
partial and total density of states for a sequence of energy values
Mulliken population analysis
master routine for X-ray structure factors (form factors)
characteristics of the relevant plane waves
reduction to the symmetry unique set of plane waves
computation of the X-ray factors: fourier transformation of the charge
density by numerical integration
Introduction
1
6
Introduction
BAND is a density functional (DF) program for electronic structure calculations on periodic systems. The
programming language is FORTRAN77. Where in the text explicit reference is made to the software,
subroutine names are in small capitals (SCF, INIT), variables are underlined (pot, overlp) and key-words
used in input are outlined (
,
). Details of the implementation will be discussed in a number of
'software sections'; these are referred to as SS^input, SS^basis, and so on.
Energies and lengths are in atomic units (hartree, bohr), unless explicitly stated differently.
fo rm a lism
In the Kohn-Sham theory of the DF method [Hohenberg and Kohn 1964, Kohn and Sham 1965] the twoparticle coulomb interaction 1/r12 between electrons is replaced by the sum of two one-particle operators.
The first is the coulomb potential due to the average charge distribution. The second is the exchangecorrelation (XC) potential, representing exchange and correlation effects in an average way. The XC
potential is a functional of the charge distribution.
With the usual approximation of motionless point nuclei the hamiltonian equation reads
Hn(k;r)  {T + VC(r) + VXC(r) } n(k;r) = en(k) n(k;r)
(1.1)
T is the kinetic energy operator, /2 in atomic units; VC(r) is the total coulomb potential, due to the
nuclear charges and the electron cloud; VXC(r) is the XC potential. Relativistic effects have not (yet) been
included in BAND.
n(k;r) is a one-electron state with wave vector k; k serves also as a symmetry label, denoting the
irreducible representations (irreps) of the translation group corresponding to the Bravais lattice. It is a
(pseudo) continuous variable that may assume all values in the (first) Brillouin Zone ( BZ).
The solutions {en(k) ,n(k;r)} vary continuously with k and form thus 'bands'; the subscript n, which
enumerates the distinct solutions in k, is called the band index.
The translation properties of irrep k are expressed by (Bloch's theorem)
n(k;r+R) = eik·R n(k;r)
(1.2)
where R is any point of the Bravais lattice. The states n(k;r) are computed as linear combinations of basis
functions . These are chosen of course to belong to the same irrep and are labeled accordingly: i(k;r),
i=1,2,...
The electronic charge density (r) is obtained by a summation over all occupied states, i.e. all states with
energy en(k) below the fermi energy eF . The summation includes the integration in k-space
Introduction
7
(r) =  

n
dk n(k;r)2 (eF en(k) )
(1.3)
BZ
(x) is the Heaviside step function: =1 (x>0) or 0 (x<0).
The fermi energy eF is determined by the total amount of electronic charge Q per unit cell

 (r) dr = Q
(1.4)
unit cell
The potentials VC(r) and VXC(r) in (1.1) are computed from the density (r) and hence they depend on the
solutions n(k;r). The self-consistent solutions are found by an iterative self-consistent field (SCF)
procedure.
a p p lica b ility a nd results
#
Any type of periodic system can be handled. The systems will be called n-dimensional crystals, by
which we understand polyatomic systems with periodicity in n directions; n may be 3 ('normal' bulk
crystals), 2 (slabs) or 1 (polymers).
It seems natural to include also the case n=0 (molecules). BAND is based on the same theoretical model
as the Amsterdam DF molecule program and the two are very similar as regards the general set-up. In
fact they share a substantial portion of their software. A future melting together may well be undertaken
and should not present fundamental problems.
#
The two spins can be treated independently: both spin-restricted and spin-unrestricted calculations are
possible.
#
For VXC(r) several forms advocated in the literature are available in the program: the classical X , the
Gunnarsson-Lundqvist (GL) and the Vosko-Wilk-Nusair (VWN) formulas. Any of these can be elected
in a calculation. Other varieties are easily implemented if desired.
#
Apart from the self consistent solution of (1.1), the program computes the total and cohesive energies,
X-ray factors, total and partial densities of states and it performs a Mulliken population analysis.
#
There are no restrictions on computational parameters such as a maximum number of atoms, basis
functions and so on. Of course computer resources may limit the applicability. To have an indication: a
slab containing 18 transition metal atoms and two first row atoms per 2D unit cell (periodic
chemisorption of CO on a copper layer) is a big system, as regards both CP-time and disc-usage. If the
basis set and other parameters are chosen to achieve high accuracy (0.001a.u. in the cohesive energy),
the program will take ten or more hours on a Cyber205 and handle ~10 9 words of data. Bulk silicon is
computed with fair precision (better than 0.01a.u. in the cohesive energy) in 20 minutes and stores
7107 words on disc during the run.
We will discuss in other sections how the demands on storage facilities and computer time may be cut
down by future developments of the program, primarily by making better use of symmetry properties
than is done currently (SS^symmetry). In the mentioned example of Cu-CO adsorption for instance this
Introduction
8
would reduce costs by roughly an order of magnitude. The gain in the silicon calculation would be even
more, due to the high symmetry.
An improvement in efficiency may also result from a refining of the SCF procedure (the Stable State
Approach, see SS^basis).
b a sic cha ra cteristics o f the m etho d
starting up, one-center function tables
BAND contains a fully numerical Herman-Skillman [Herman and Skillman 1963] type subprogram DIRAC.
DIRAC solves the DF equations for the spherically symmetric free atoms from which the crystal is built up.
The superposition of the atomic densities and the corresponding potential is used to start up the selfconsistency iterations for the crystal. Cohesive energies are computed with respect to these atoms. The
atomic one-electron states are optionally used in the basis set for the crystal.
All data from DIRAC such as the atomic density, the coulomb potential and the orbitals are obtained in the
form of tables f(ri ), i=1,2,.. , giving the values of f(r) for a sequence of radial values. Other one-center
functions (see below) are represented in the same way, as radial tables, even if they could alternatively be
treated analytically.
The radial grid points ri of a table constitute a logarithmic mesh: ri+1 /ri =constant. The gridcharacteristics, i.e. the number of points, the quotient-constant and the smallest value r1 are the same for all
one-center functions associated with a certain type of atoms. The radial meshes may be different for
different types of atoms.
basis
The program employs two types of basis functions (k;r):
a) Bloch functions [Bloch 1928]. These are computed as linear combinations of localized functions (r)
(k;r) =  eik·R (rR)
(1.5)
R
The phase factors eik·R in the summation over the direct lattice points R assure that (k;r) has the correct
translational symmetry (1.2). The localized functions (r) are one-center functions, centered on some atom
. They have the form
(r) = Zlm() P(r )
(1.6)
The subscript  signifies that the coordinates are relative to the position of atom . Zlm() is a (real
valued) spherical harmonic (23.30) and P(r) is a radial function.
Two types of radial functions are used in the program. They are the radial parts of
Introduction
9
a1) Slater type orbitals (STOs): P(r)=rl+n er .
a2) Numerical atomic orbitals (NAOs) from DIRAC.
b) Plane waves (PWs): (k;r)=ei(k+K)·r . K is a point of the reciprocal Bravais lattice. Plane waves are used
only for 3-dimensional crystals. In other systems the asymptotic behaviour away from the crystal is
unsuitable to describe bound electron states.
Note: although the use of a pure PW valence basis is possible (orthogonalized on the core states, as the case
might be, see below), such an application should be undertaken cautiously. All integrals are evaluated
numerically and no use is made of the analytical integrability of PWs. The integration scheme has not been
devised for rapidly oscillating functions and hence it will be less accurate for them, or, if the number of
integration points is accordingly increased, we are confronted with higher costs. A high-precision PW
calculation with large numbers of PWs may therefore be less efficient.
The cause of this is purely historical: BAND has been developed as an LCAO program; the PWs have been
included later and are (currently) treated in the same way as other functions.
frozen core
Some of the NAOs may be specified to be core states. These are not iteratively computed in the crystal SCF
procedure. Each (valence) basis function is orthogonalized on all core states by explicitly projecting out the
core functions.
integrals
Integrals in real space over the crystal unit cell, such as overlaps and hamiltonian matrix elements, are
evaluated numerically. The applied integration scheme is based on product Gauss formulas and a
partitioning of space in specific regions: atomic polyhedra ('cells'), inside each polyhedron an atomic 'core;
sphere, and the 'outer' region far away from the nuclei (not for 3D crystals). Any desired accuracy can be
achieved, but of course with a concomitant number of integration points [chapter III].
k-space
Integrations over the BZ, for instance to compute the density (1.3), are performed numerically, using the
analytic-quadratic method [Wiesenekker et al. 1988, Wiesenekker and Baerends 1990]. Any quantity F is
approximated as
F


n
BZ
dk Fn(k) (eF en(k) ) 

n
k
on(k) Fn(k)
(1.7)
Fn(k) is the contribution to F from the eigenstate n(k;r); in case of the density for instance F=(r), Fn(k)
=n(k;r)2. The integration weights on(k) may obviously be associated with occupation numbers for
the states n(k;r). Henceforth we will refer to them in this way, although the analytic-quadratic method
occasionally results in 'non-physical' negative occupation numbers.
Introduction
10
The analytic-quadratic method implicitly makes a piecewise quadratic expansion in the variable k both of
Fn(k) and of the energy function en(k) . The occupation numbers are then determined such, that these
second order polynomials are integrated exactly. The method is accurate and converges quickly for all types
of systems: isolators, metals and semiconductors [Wiesenekker et al. 1988, Wiesenekker and Baerends
1990].
Poisson's equation
The coulomb potential VC(r) due to the nuclear charges Z and the electronic density (r) is defined in the
program as
VC(r) = 

{(r') + 

Z (r'r )} rr'
1
dr'
(1.8)
The tables from DIRAC give the coulomb potentials V(r) due to the spherically symmetric atomic
densities (r) plus nuclei Z . Defining the deformation density def(r) as the difference between the
crystal charge distribution and the superposition of atomic densities gives
VC(r) =


V(r) + 
 def(r') rr'
1
dr'
(1.9)
The form in which def(r) is obtained precludes an analytical evaluation of the second term and the
singularity of the denumerator makes application of the normal numerical integration scheme unsuitable.
The problem is solved by a fitting procedure as introduced by Baerends et al. [1973]. A set of fit functions
fi(r) is chosen such that a) the density def(r) can accurately be expanded in them and b) the corresponding
c
coulomb potentials fi (r) are easily evaluated. Then, with
c
fi (r)  
 fi(r') rr'
1
dr'
def(r)   ci fi(r)
(1.10)
(1.11)
the coulomb potential is approximated by
VC(r) 


V(r) +

c
ci fi (r)
(1.12)
The fit functions are derived from one-center 'atomic' functions (r)=Zlm() P(r ). This form allows an
easy evaluation of the corresponding potential functions c(r) (SS^coulomb potential).
The density is a symmetric function, invariant under all operators of the space group. The generating
functions (r) are therefore combined into symmetric functions f(r); these are the crystal fit functions. The
same combination coefficients yield the associated fit potentials fc(r) from the potential functions c(r).
Implementation
2
11
Implementation
We deal with the implementation on two levels. First we survey the flow of normal execution. We do this
from a conceptual point of view, neglecting details and simplifying matters for sake of clarity. This gives
insight in the organization of BAND and tells us globally what happens when and where.
In the second stage we look closer by giving attention to special aspects in a series of Software Sections
(SS^basis, SS^input, SS^files, etc). There we examine input specifications, usage of files, treatment of
symmetry, details of some algorithms and so on. Apart from being useful in itself as documentation, this
may be of aid in case of error-tracing and when modifications or extensions of BAND are contemplated. A
few suggestions for improvements of the existing code have been added.
survey
BAND consists conceptually of three parts. The heart is part two, the iterative SCF procedure to find the
self-consistent solution of the hamiltonian equations. Part one is the preparation for this: input reading,
geometry analysis, construction of the valence basis, etc.; it starts with subroutine INIT, where several
variables are initiated. In part three the results are analyzed and various properties are computed.
The global flow-chart at the beginning of this chapter lists the main routines (names in capitals) with a
summary of their purposes. The indentations reflect the hierarchical structure; for instance: INIT, GEMTRY
and PREPAR are called by the main program (BAND); GETINP in turn is called from INIT.
crystal functions
The primary goal of PREPAR is the preparation of various crystal functions (valence, fit, potential, density).
These functions are treated as purely numerical functions, whatever their origin; they are represented by the
values in the crystal integration points. These values are in most cases (the only exception being the PWs in
the valence basis) defined and computed as bloch sums: the contributions from the various cells are added
in a loop over the crystal lattice points.
As said before, all primitive one-center functions, whose contributions are to be added, are represented by
tables containing the radial values for a number of distances from their respective 'origins'. They are stored
on various files, together with the angular quantum numbers which define their angular dependency.
The preparation of crystal functions thus consists roughly of two steps: a) generation of the radial tables
(DIRAC, SLTORB, FITRAD), b) interpolation from the tables to compute them in the crystal integration
points and bloch summation (ATMFNC, BASPNT, FITPNT).
Implementation
12
bloch sums
Depending on the size of the unit cell and the dimensionality of the crystal, the loops over the cells (bloch
summations) may involve several thousands of terms and take a substantial time. To reduce the costs, BAND
attempts to limit these loops. For this purpose each involved radial function table is analyzed to find out at
which distance the function becomes negligible; the corresponding number of cells, different for each
function, is determined (CELMAX) and used in the bloch summation procedures.
As a preliminary the maximum extension of any radial function is determined (RADMAX); this is used to set
up an appropriate list of lattice points (cell coordinates: CELLS).
orthonormal function sets
The valence function set and the fit function set are both transformed to an orthonormal basis via
diagonalization of the overlap matrices (BASOVL, BASTRA, respectively FITOVL, FITTRA). This is
convenient in the subsequent employment: some computational procedures are simplified and memory
usage in the SCF procedure is alleviated substantially (the overlap matrices may be rather large).
Linear dependency is controlled by checking the eigenvalues of the overlap matrix.
symmetry and integration points
Symmetry plays an important role in reducing computational efforts ( SS^symmetry). Furthermore the
crystal density and potential are explicitly symmetrized to preserve the symmetry of the hamiltonian
(symmetry breaking is prevented).
Some symmetry-analyzing routines (SYMCRY, ATMSET) are subordinate to the numerical integration
routine (POINTS) for the following reason:
POINTS is the master routine of an extensive and sophisticated numerical integration package for
polyatomic systems [chapter III]; input are the coordinates of the atoms and the lattice structure, plus
specifications concerning the required precision. In order to generate an integration scheme that reflects the
symmetry of the polyatomic system, the symmetry operators are determined and the atoms are organized in
groups of symmetry equivalent ones. This symmetry information, in fact only a by-product of POINTS, is
subsequently used in various parts of BAND.
In principle it is trivial to derive the point group operators in k-space from the space group operators: the
point group parts of the affine symmetry transformations are symmetry operators in k-space. However, in an
n-dimensional crystal k-space has only n dimensions, which may be lower than 3. Moreover BAND
optionally neglects dispersion in certain directions in k-space. In general the k-space operators are therefore
obtained by projection into the space with the appropriate dimensionality ( SYMPRJ).
Inversion is a symmetry operator in k-space, regardless of the space group. So, if it is not yet present after
the 'projection' procedure, it must be added (SYMADD).
Implementation
13
workspace and vector length
FITSYM determines only the number of symmetric fit functions; this is used for appropriate dimensioning of
some arrays (BAND simulates dynamical allocation: SS^workspace).
The integration points are organized in blocks of points. All related data on files, such as the values of
various crystal functions in the points, are structured accordingly. The processing of these data thus takes
place in a loop over the blocks. The maximum number of points per block, npx, i.e. the block length, is
computed to organize work space in an optimal way (SS^workspace).
In comparison with the preparation part the SCF part has effectively more workspace available (in particular
thanks to the absence of various overlap matrices) to load in core e.g. the valence functions in a block of
integration points. Increasing the lengths of the blocks (REORGF) leads then to longer vector lengths in
vector operations, for instance in the evaluation of hamiltonian matrix elements by numerical integration.
This results in a substantial improvement of efficiency (depending on the machine).
multipole lattice sums
BAND solves the Poisson equation by a fitting procedure for the deformation density. The corresponding fit
potentials are lattice sums of functions that are asymptotically multipole potentials. Naturally these lattice
summations are split in a rapidly converging summation of exponentially decaying functions and the lattice
sum of point-multipole potentials. The first term is treated in the fit section ( FITPNT) by a straightforward
loop over lattice points. The second term gives the familiar solid state problem of slowly (or even
conditionally) convergent lattice sums. VMULTI evaluates the pure multipole lattice sums (by an
unconventional algorithm: SS^coulomb potential); the data are stored on file itvmul, read again in FITPNT
and combined with the other term to the fit potentials.
Finally all fit data (function values and potential values) are stored on file to be used in the SCF procedure.
SCF
When all crystal functions have been prepared, the SCF procedure is started. The initial density and the
corresponding potential are given as the 'sum-of-overlapping-free-atoms'. Iteratively the following steps are
then executed:
a. numerical integration of the potential matrix elements; this is added to the fixed part of the hamiltonian
(kinetic energy). The hamiltonian matrices (one for each k-point) are diagonalized (EIGSYS).
b. analysis of the energy bands to determine the fermi energy and the occupation numbers for the oneelectron states (FERMI).
c. computation of the density matrix in the representation of the basis functions (P-matrix) (PMATRX).
d. construction of the charge density in the integration points from the P-matrix (RHOPMT).
e. expansion of the deformation density in the fit functions, for the solution of Poisson's equation
(RHOFIT).
Implementation
f.
14
calculation of the new potential (RHOPOT) from the fit functions (coulomb potential) and the density
(XC potential).
SCFTST tests various conditions for the termination of the SCF procedure (convergence, insufficient time,
maximum number of iterations).
post-SCF
After the SCF procedure BAND computes various properties. This is organized in the master section
PRPRTS. PRPRTS calls subsequently
a. ENERGY to compute the bonding energy, partitioned in various terms, and the crystal total energy.
b. CHARGE to determine by numerical integration the charge distribution over the atoms.
c. DOS, for an analysis of the density of states (total and partial) and a Mulliken population analysis.
d. FORMFA to compute the X-ray structure factors (form factors).
S o ftwa re S ectio ns
In the remainder of this chapter we focus attention on the implementation. Theoretical remarks and
derivations are included, but some acquaintance with the theory of solid state, quantum chemistry and
numerical mathematics is assumed. The treated subroutines and subjects are listed in the Software
Reference Lists A and B, after the last Software Section.
A program like BAND is apt to be used on several types of computers. Furthermore its size and complexity
make it probable that errors are hidden in it, even after extensive testing. We have adhered therefore to
standard FORTRAN77. The use of 'smart and dirty tricks' and of machine-dependent FORTRAN dialects may
let the program run faster, but it is heavily paid for in human time, when implementation on another
machine is on stage, or when bugs show up.
It is hardly avoidable however that the program be machine-dependent in some respects. The imperfection
of compilers, as regards the generation of the most efficient code, may lead in some cases to unreasonably
bad performance. Although this is in principle a matter of awaiting better compilers and 'not our business', it
is more practical to take measures. In particular we have done so with vectorizable loops. Consider
do 20 j=1,m-1
(2.1)
do 10 i=1,n
10
a(i,j+1)=a(i,j)+b(i)
20 continue
The inner loop should be executed in vector mode, but the compiler may not consider it vectorizable (e.g.
suspicion of recurrence) and generate scalar code. If it is vectorized, memory banking conflicts may occur
when the inner loop length n is very large.
To solve these problems we utilize a number of special subroutines for vector operations. The code above is
replaced by
Implementation
15
do 20 j=1,m–1
(2.2)
call vauvw(n,a(1,j),b,a(1,j+1))
20 continue
Subroutine VAUVW (vector addition: u+v=w) adds two vectors and stores the result into a third. Similar
routines multiply, divide, perform triadic operations etcetera. The disadvantage is the overhead due to the
call; this is relatively small if n is large enough. The advantage is that the problems are transferred from a
very large number of similar pieces of code to a small number of simple routines.
The vectorization problem is solved automatically in this way: the simplicity of the routines implicates
already that any compiler will generate vector code for them. We might even comfortably use explicit
vector statements if the local FORTRAN dialect allows so, because the resettings to standard FORTRAN are
few and straightforward.
Memory banking conflicts are solved by splitting up vectors that are too long. A special constant lveccp, the
maximum vector length for CP-operations, controls this; it is of course machine-dependent and should be
adapted to the situation. lveccp can be set via input, with the key
.
All explicit machine-dependency in BAND is controlled by a few similar constants; these are stored in
common block MACHIN.
BAND frequently tests the state of affairs during execution. Many of these tests are superfluous if the input
is sensible and the code is correct. The tests are there, because neither of these conditions is guaranteed. It
would not be the first time that a bug in the program, showing up in a new type of calculation, is noticed and
traced in this way. Warnings are issued if intermediate results are suspicious and in some cases BAND stops,
to avoid the waste of human time and computer resources.
We have attempted in several ways to make the program flexible and easy to use. In particular input is
largely optional, with reasonable default settings for all omitted specifications; input is governed by keys
(strings, usually single words). This is discussed in SS^input. In many other Software Sections the related
input options and the corresponding keys are mentioned. Keys will be typed outlined (
).
Basis
3
16
Basis
The construction of the crystal basis functions is organized in BASORT. The generating one-center functions
, produced in DIRAC and/or SLTORB for the NAOs and STOs respectively are on file itpsi and the kinetic
energy functions (/2) on itkin.
The plane waves ei(k+K)·r in the basis are characterized by the lattice points K of the reciprocal lattice.
BASORT calls PLANEW to generate these vectors from the input-specified number of stars nwavst
(SS^input). The resulting number of plane waves is nvalwv. The total number of valence basis functions for
each k-point is nbas.
g enera l
BASPNT interpolates the one-center functions, computes their bloch sums and adds the plane waves to the
basis set. This is done for a number of k-points simultaneously (SS^workspace); after BASPNT the k-points
are processed one at a time.
BASOVL calculates and diagonalizes the overlap matrix of the basis {}. From the eigensystem the
transformation to an orthonormal basis is constructed. Let S be the overlap matrix and {E,} the eigen
system: S=EE+ . Define the transformation matrix U as
Uij = Eji i
1/2

(3.1)
The transformed functions '
i' =  Uij j
(3.2)
j
are orthonormal:
i'j' = 
=
 Uik k Ujl l = 
k
l
kl
 ( E*kiEkm)( EljE*lm)
m
k
l
*
Eki Elj (ij)
1/2
Skl =
m(ij)1/2 = imjmm(ij)1/2 = ij
(3.3)
We used here that the eigenvectors of S are orthonormal. BASTRA performs the transformation.
The final orthonormal valence basis and the transformed kinetic energy functions (/2)' are written to
(scratch) files itbas0 and itkbas. These are used (and deleted again) in HAMFIX to construct the matrices T
and H0 . T is the kinetic energy matrix in the orthonormal basis; it is used to calculate the valence kinetic
energy of the crystal after completion of the SCF procedure (SS^energy).
H0 T+∑V is the fixed part of the hamiltonian, consisting of the sum-of-atoms coulomb potential and the
kinetic energy. In the SCF part this is combined with the iteratively computed XC potential and the coulomb
Basis
17
potential of the deformation density. The H0 -matrix is written to file ith0, the kinetic energy matrix T to
itdatm.
HAMFIX copies also the valence function values from itbas0 to itbas, where the data for all k-points are
accumulated for later use in the SCF part.
linea r d ep end ency
The eigenvalues of the overlap matrix are used to check linear dependency of the valence basis. Strictly we
have dependency only if S is singular, that is, if at least one of the eigenvalues is zero. Non-zero but very
small eigenvalues are already hazardous however. The corresponding eigenfunctions are linear
combinations of the original functions with large (and opposite) coefficients. Small numerical errors, for
instance in the kinetic energy of the primitive functions, are blown up and indeed we find in practice that the
results become unreliable.
Therefore BAND stops (in BASOVL) if the smallest eigenvalue is less than some criterium scrval. scrval is
input with the key
,
or
; (first and second) defaults are 104 and 105 .
The coefficients in all eigenvectors with small eigenvalues indicate which of the basis functions cause the
dependency problems. We have summarized this information in 'dependency coefficients', defined as
wi =

j
scrval
j
Eij 2
(3.4)
The summation runs over all eigenvectors; j is the eigenvalue and Eij is the coefficient of the i-th function
in the j-th eigenvector. The dependency coefficients are printed if they exceed a threshold. This threshold
depends on iprntp, the general print option for the preparation part.
fro zen co re
Some of the one-electron states computed in DIRAC may be specified to be core states (SS^input). Their
function tables are written to the files itpsic and the kinetic energy functions to itkinc. Like the valence set
they are interpolated and combined in a bloch sum by BASPNT.
Core states are assumed to remain fixed when the atom is embedded in the crystal. A necessary, though not
sufficient condition is of course that core states of different atoms have no overlap; in particular they must
not extend into neighbouring (Wigner-Seitz) cells. Their bloch sum consists therefore effectively of only
one term for any evaluation point r. So they display no dispersion and the bloch 'sum' has to be constructed
only for the -point k=0. This is the ideal situation. In practice however one may, e.g. for computational
reasons, wish to define larger core spaces, containing states that have non-negligible dispersion. To take
care of that the core is constructed for every k-point separately; the computational overhead is relatively
small because the bloch-sum for core states is obtained always in a very short loop over lattice points. Core
Basis
18
dispersion may be neglected by specifying the input key
. In that case the core bloch sums
are computed only once (for the -point) and used for all k-points.
The overlap matrix of the core is computed (BASOVL) and the transformation performed to an orthonormal
set (BASTRA), in the same way as it is done for the valence basis. Ideally this is trivial: the overlap matrix
should be the unit matrix. It is useful however to check this: the numerical integration may be inaccurate
and, more importantly, some of the core functions may be too diffuse, thereby violating the implicit
assumptions about core states. The eigenvalues of the core-overlap matrix are compared with a criterium
scrcor (input by key
,
or
; defaults are 0.98 and 0.90). BAND terminates
when the test reveals that the frozen core approximation is unjustified.
The core functions are kept frozen and they are not processed in the SCF procedure. The valence basis must
then be orthogonal on the core. This is achieved by explicitly projecting out the core components (before
the valence set is orthonormalized).
valence
i
valence
 i
  j
core
j
core
j
valence
i
valence
 = i
  Sji j
core

(3.5)
j
BASCOR computes the core-valence overlap matrix S and orthogonalizes the valence on the core. The core
function values are stored on files itcor and itkcor. The functions are retrieved again from these files when
the valence basis is constructed; after the orthogonalization of the valence space on the core the core-files
are deleted.
va lence - co re d ep end ency
The frozen core approximation introduces an additional linear dependency problem: the valence space and
the core space may have a vector (almost) in common. Analogous to the situation in the valence set itself
this must be checked as it may lead to the blowing up of numerical errors. (In fact we realized this problem
in a sequence of calculations on diamond; varying a double- STO-basis we found an unexpected lowering
of the total energy by several eV for a particular valence set; indeed the valence basis contained effectively
almost the carbon 1s core orbital.)
BAND deals with this by computing for each k-point the maximum overlap between normalized vectors in
the core and valence space respectively. This is analyzed in BASOVL, where the transformation to an
orthonormal valence basis is to be computed.
The core basis has already been orthonormalized at that moment. A general core state, with normalization
condition, is then
core
fcore = ∑ci i
c† c = 1

(3.6a)
(3.6b)
Basis
19
The valence basis is not (yet) orthonormal. Denote the valence overlap matrix by V; a general valence
function with normalization is
valence
fvalence = ∑dj j

(3.7a)
d† Vd = 1
(3.7b)
core
Let S be the rectangular core-valence overlap matrix, computed in BASCOR, Sij =i
valence
  j
. In
BASOVL the valence set has already been orthogonalized on the core, but the matrices V and S above refer
to the valence set before the core-orthogonalization; the core-orthogonalized functions are denoted

valence and their overlap matrix W.
valence
valence
i
= i
  Sji j
core

(3.8)
j
W is the overlap matrix actually computed in BASOVL. The relation between V and W is
valence
Vij = i
valence
j
= Wij +

kl
valence
+
 = i

k
core
Ski k
valence
+
j

l
core
Slj l
 =
*
Ski Slj kl = Wij + (S† S)ij
(3.9)

where we used the orthonormality of the core set and orthogonality of v on all core functions. The overlap
between a general core and valence function is
core
O  ∑ci i
valence
∑dj j
  c† Sd
(3.10)
O is a complex number. We maximize therefore the real scalar OO* under variation of the coefficients c
and d. The normalization conditions are incorporated with Lagrange multipliers. Define
F = OO*  c† c  d† Vd = c† Sdd† S† c  c† c  d† Vd
(3.11)
From the variational equations we derive
∂F
O*
= 0  SdO*  c = 0  c =
Sd

∂c†
(3.12)
∂F

= 0  OS† c  Vd = 0  S† Sd = OO* Vd
†
∂d
(3.13)
Multiplication from the left by V
(V
1/2
gives
1/2 †
1/2
+1/2
1/2  
S SV
)(V
d) = V
 +1/2 +1/2
OO* V (V d)  Gd' = d'
(3.14)
Basis
20
with
G=V

 = OO*
1/2 †
1/2
S SV
d' = V
+1/2
d
(3.15)
(3.14) is an eigenvalue equation for the hermitian matrix G. Combining (3.10) and (3.12) gives

O  c† Sd = c† c O*  = OO*
(3.16)
Similarly
O
O
O
1/2 †
1/2
O  c† Sd =  d'†S† Sd = d† V
S SV
d' =  d'† d'  = 



(3.17)

So that ===OO*. The maximum overlap between core and valence can then be defined as  max)1/2 .
implementation
In BASCOR, where the original valence set is orthogonalized on the (orthonormal) core, the matrix (S† S) is
computed from the core-valence overlap matrix S. The variables in the program are sdagsr and sdagsi for
the real and imaginary parts respectively
This is added in BASOVL to W (ovlre, ovlim), the overlap matrix of the valence set that has been
orthogonalized on the core: V=W+(S† S). V is then diagonalized to construct V
V = U U†  V
1/2
= U
1/2
from its eigensystem:
1/2 †
U
(3.18)
Next we obtain G by the similarity transformation (3.15) (routine SIMTRF). Finally G is diagonalized to
find the maximum eigenvalue max. (1max) is used as the measure for the core-valence dependency.
(This is analogous to the smallest eigenvalue of the overlap matrix of a set of functions.)  is compared with
the criterium scrcv (input by key
or
; defaults 104 , 105 ).
Like in the analysis of the valence basis itself, dependency coefficients are computed (3.4), which tell us
which valence functions are dominant in the overlap with the core space. If the valence and core are found
to be dependent the program stops (in BASOVL) and prints the dependency coefficients.
the Sta b le Sta te Ap p ro a ch ( SSA)
The Stable State Approach (SSA) is a proposal for (future) improvement of accuracy and efficiency of the
SCF procedure. Conceptually it resembles the frozen core approximation; it is based on the following
consideration: the convergence of the one-particle eigenstates n(k;r) towards the final self consistent
solution will in general be quite different for the various states. It is well conceivable that some states attain
rapidly their final form while others converge slowly. If we keep track of the developments and remove
from the valence basis the states that do not change anymore, the efficiency is enhanced in the remaining
cycles: less data to keep on file, fewer matrix elements to compute, etcetera.
Basis
21
This is (partially) counterbalanced by the necessary transformation of the basis to single out the stable
states. Analogous to the frozen core treatment the new function basis must be orthogonal to the fixed states.
The transformation overhead scales as the basis size squared and may therefore be appreciable. Possibly a
net gain results only if several stable states are removed at a time; such 'details' should of course be
incorporated in a thoughtful implementation.
It may be noted here, as a side remark, that the reduction of the function space may not come only (or not
even predominantly) from finding stable occupied states, but also from the detection and removal of stable
virtual states.
basic set-up
The eigenstates are expressed in the basis functions: i=∑cij j. A state i is stable if its coefficient
vector {cij }j and the number of electrons in it, the occupation number ni are constant within some
tolerance. This is easily checked if we have the coefficient matrices and occupation numbers on file for the
last few cycles.
In this formulation some of the possibilities of the SSA may be missed. Two examples:
1) suppose 1 and 2 are two degenerate occupied states with, of course, equal occupation numbers.
Obviously any orthogonal transformation of {1,2} yields the same physical situation. So, if we have a
stable space, spanned by 1 and 2, we may not detect it when the computed eigenvectors rotate
arbitrarily around from one cycle to another.
2) suppose 1..N are (non-degenerate) virtual states and at the next cycles we find N virtual states
spanned by the same functions. We can then remove these functions as they are apparently irrelevant for the
crystal electrons; the individual virtual eigenstates need not be stable for this, only the space spanned by
them.
Both these cases are detected if we do not examine the individual eigenstates but the P-matrices (density
matrices). Denote by Pn the P-matrix in the n-th cycle and choose as its representation not the employed
n
basis functions but the eigenstates of cycle n0 . P 0 is then diagonal and the diagonal elements are the
occupation numbers. A state i is stable and can be singled out if at the following cycles the corresponding
diagonal element Pii remains the same and its off-diagonal elements Pij , j≠i, are zero.
The criterium for stability has to be determined by experimentation. It must scale somehow with the mixing
parameter parmix employed in the iterative update of the crystal potential (SS^iteration).
the SSA and the frozen core
The frozen core approximation has some fundamental problems. In the first place the core states may not be
strictly orthonormal. Whereas they are explicitly orthonormalized in order to facilitate the projecting out of
the core components from the valence basis, the corresponding energy term is not evaluated and implicates
Basis
22
thus an unknown error in the computed energy. The tests on the eigenvalues of the overlap matrix, as
discussed above, is only a very rough safety guard against extreme cases.
In the second place the relaxation of the crystal potential, especially its change near the nuclei means that
even if the core states were orthonormal from the start, they may not be eigenstates of the final crystal
hamiltonian. This has consequences both for the energy of the core space and for the valence states. This
aspect has usually no severe consequences but nevertheless it is unsatisfactory.
The SSA alleviates this problem because we may include (part of) the core in the valence basis. The
corresponding crystal eigenstates will soon be converged (if the frozen core assumption is reasonable) so
that they have to be processed only during a few cycles. In principle we could thus discard the frozen core
idea altogether in the SSA, but in some situations (heavy atoms, many k-points) the enlarged basis size and
the resulting amount of data on file may be problematic, even if it would be for only two or three cycles.
(Data storage can and should be reduced however by a more sophisticated treatment of symmetry than it is
currently being done in BAND (SS^symmetry)).
4
BZ-integration
Two types of integrals over the Brillouin Zone occur: the 'volume' integral and the 'surface' integral,
respectively
J(E) =


fn(k) (Een(k) ) dk


fn(k) (Een(k) ) dk =
n
I(E) =
n
(4.1a)

f (k)
n

 en(k) dS
(4.1b)
n en(k)=E
en(k) is the dispersion relation of the n-th energy band and fn(k) is the contribution to the integral from the
one-electron state n(k;r). The results depend on the energy parameter E. The surface and volume integral
are related by I(E)=dJ(E)/dE.
Example: the electronic charge density (r) is given by the volume integral
2
(r) =  
 n(k;r) (eF en(k) ) dk
(4.2)
n
eF is the fermi energy.
As discussed in SS^symmetry the integrations can effectively be restricted to integrals over the symmetry
unique region in k-space, the irreducible wedge of the BZ; we will take this for granted here.
BZ-integration
23
The eigensystems (en(k) , n(k;r), and hence any derived quantities fn(k) ) are computed in discrete points
of the BZ. The integrals (4.1a) and (4.1b) are thus evaluated numerically. The presence of the - and
-functions in (4.1) makes 'normal' methods of numerical integration, such as Gauss-Legendre, unsuitable.
BAND employs the (repeated) analytic quadratic procedure [Wiesenekker et al. 1988, Wiesenekker and
Baerends 1990]: the (irreducible wedge of the) BZ is first written as a conjunction of (large) basic
simplices; these are further partitioned into smaller simplices. In each (small) simplex en(k) and fn(k) are
both approximated by a quadratic form; the resulting quadratic integrals are evaluated exactly (analytically).
The precision in the total accumulated integral can be increased by using a finer partitioning of the BZ.
Whereas the presence of the - and -function makes the integrals (4.1) a little complicated, en(k) and
fn(k) themselves are smooth. The piecewise quadratic approximations converge therefore rapidly with the
partitioning (compare the commonly used repeated Simpson integration for 1D line integrals). Since the sub
integrals, employing the quadratic forms, are calculated exactly, the total integral approximation is expected
to be accurate and quickly converging. This is borne out in practice.
For the construction of the quadratic approximations in a given simplex we need the function values of
en(k) and fn(k) in sufficiently many points to solve the defining linear systems of equations. It is
convenient is to use the vertices and the midpoints of the edges of the simplex.
Even without a further partitioning this implies already several k-points for each basic simplex. BAND
optionally uses fewer points, but of course a quadratic approximation is not possible then.
The integration accuracy is determined by the parameter kinteg. Kinteg is the number of sample points on
any edge of a (large) basic simplex. Kinteg=1 implicates the use of only the -point k=0; the numerical
integral over the BZ is then reduced to a single term; we may call this the zero-th order approximation.
Kinteg=2 leads to first order (=linear) approximations: the basic simplices that span the irreducible wedge
of the BZ are generated; the only integration points are the vertices of these simplices, the midpoints of the
edges are not involved. A linear approximation of the functions is then feasible and the ensuing integrals are
evaluated analytically.
Kinteg=3 is the lowest setting that allows a quadratic approximation; the basic simplices are used directly,
i.e. no further partitioning takes place. With kinteg=5 every edge of a basic simplex is bisectioned, yielding
two intervals with 3 points each (one point is shared). Each basic simplex is then divided into 2 n smaller
simplices, in each of which the quadratic method is applied (n is the dimensionality of the BZ). Similarly
with kinteg=7 we get 3n smaller simplices and so on.
When kinteg is chosen even (4,6,..) it is not possible to have a partitioning of non-overlapping intervals
such that each has 3 points; one remains with only two points so that in some part of the BZ we have to
resort to the linear approximation. In such a case the linear method is applied throughout: all edges are
partitioned into intervals with two points only. This is the approach of the commonly employed 'linear
tetrahedron method' [Lehman and Andersen 1972], which is by far inferior in performance to the quadratic
BZ-integration
24
one [Wiesenekker et al. 1988, Wiesenekker and Baerends 1990] . For higher integration accuracies
(kinteg≥3) one should therefore stick to odd values.
Kinteg is input; key:
,
, or
; first and second defaults: 3, 5. A (non-fatal) warning message
is issued by the program when K is chosen even.
d a ta structure
KPNT generates the data structure needed for the BZ integration: a list of kt k-points xyzkpt(3,kt) and a list
of nsimpl simplices, represented as a pointer set ksimpl(nvertk,nsimpl). Nvertk is the number of points
associated with each simplex; this depends on the type of approximation and on the dimensionality; in two
dimensions for instance nvertk is 3 for the linear approximation (vertices only) and 6 for the quadratic
approximation (with the midpoints).
The values ksimpl(i,j),i=1..nvertk indicate which points in the list xyzkpt() belong to the j-th simplex.
The generated simplices span precisely an irreducible wedge of the BZ. Some of the k-points may
nevertheless be symmetry equivalent, e.g. by a Bravais translation of the reciprocal lattice. KPNTEQ, called
from KPNT, checks the symmetry relations and generates a list of equivalence indices kequiv(kt).
Kequiv(k0 ) is the index of a point equivalent to point k0 . For all k: kequiv(k)≤k; the symmetry unique
points have kequiv(k)=k. Only in these points the hamiltonian equation is solved. The other points serve to
describe the simplices and to generate the (linear or quadratic) function approximations. The total number
of symmetry unique k-points is kuniqu.
The data arrays xyzkpt, ksimpl and kequiv are stored on file itbz; the scalar variables kt, kuniqu, nsimpl and
nvertk are in common FIXDAT.
sim p lices a nd p o ints
The dimensionality of the BZ is ndimk and the primitive k-space lattice vectors are stored in the left-upper
ndimkndimk part of array bvec(3,3).
ndimk=0
This trivial case implies the zero-th order approximation: only the -point, one 'simplex', one point per
simplex (nvertk=1).
ndimk=1
1
As inversion is always a k-space symmetry operator, the irreducible wedge is the interval (0, 2 bvec(1,1)).
This is also the basic simplex. Depending on kinteg exceeding 3, the interval is partitioned to obtain smaller
simplices.
The partitioning of a simplex in smaller ones is (for any dimensionality) performed by SIMPLS. SIMPLS
delivers both the pointer structure ksimpl() and the list of points xyzkpt(). In case of more than one basic
BZ-integration
25
simplex (in two or three dimensional regions, see below) SIMPLS is called for each of them. SIMPLS
updates the data arrays at every call, merging the data resulting from the current basic simplex with the
existing structure.
ndimk=2
LATTPT generates a few stars of lattice points (in k-space) around the origin. The bisecting 'planes' (lines)
halfway them define the Wigner-Seitz BZ as the polygon inside these lines. POLYGN removes the redundant
lines to retain only the sides of the polygon. POLYG1 computes then the vertices of the polygon and PLGIRR
calculates the vertices of the irreducible wedge (a polygon again). The basic simplices are now the triangles
spanned by the origin and the edges of the irreducible polygon. For each of these SIMPLS is called for
further partitioning into smaller triangles.
ndimk=3
LATTPT gives again the surrounding lattice points; the planes halfway them define the Wigner-Seitz
polyhedron. POLYHE removes the redundant planes and computes also the vertices of the polygonal faces
of the polyhedron. The symmetry unique faces are subsequently rotated to the xy-plane (PYRROT), reduced
to the irreducible part (PLGIRR) and then back rotated again to the original frame. The basic simplices are
defined by the origin and the triangles that span the irreducible polygons on the faces. SIMPLS may further
partition them.
The geometric routines POLYGN, PLGIRR, POLYHE and SIMPLS are discussed in SS^geometry.
Since SIMPLS only delivers the (ndimk+1) vertices of the small simplices we have to add the midpoints of
the edges later (for the quadratic approximation). This is done in KPNT after the generation of all simplices.
The pointer array ksimpl() is updated with pointers to the midpoints.
integ ra tio n
The BZ integrals are evaluated numerically. The fixed set of integration points xyzkpt() is stored on file
itbz. The weights on(k) , that is, the occupation numbers for the one-electron states n(k;r) are computed
by OCCUPA from the energy bands en(k) and an energy parameter E (4.1).
During the SCF procedure we need only one specific integral over the BZ: the charge density
(r) =  on(k) n(k;r)2
(4.3)
nk
The constraint that the density contain the number of electrons qelec,

nk
on(k) = qelec
defines the fermi energy eF (the energy parameter E in (4.1a)) .
(4.4)
BZ-integration
26
FERMIE repeatedly calls OCCUPA with trial values for the fermi energy until condition (4.4) is satisfied.
The resulting occupation numbers are then passed back to the master routine FERMI, which calls OCCSTA.
OCCSTA writes them to file, together with the corresponding eigenstates for further processing later.
Since the energy bands en(k) change as the SCF iterations proceed the fermi energy and the occupation
numbers may also vary; they are computed anew at every cycle.
OCCUPA sets up a loop over the (small) simplices in which the irreducible BZ has been partitioned and calls
OCCUPS for each of them, to obtain the occupation numbers for that simplex. OCCUPS discriminates
several cases: in the first place whether a surface or a volume integral is requested, in the second place the
order of the interpolation (linear or quadratic) and finally the dimensionality ndimk of the BZ.
linear approximation
The algorithm is discussed in detail by Wiesenekker et al. [1988]. The formulas have been implemented in
BZINTL.
quadratic approximation
The 1D case is more or less obvious (routines VIQD1 and VJQD1 for surface and volume integration
respectively). For the 2D case we refer to Wiesenekker et al. [1988], where all technical aspects are
explained. The formulas have been implemented in QUAD2 and a few auxiliary routines. The 3D case
finally is fairly complicated [Wiesenekker and Baerends 1990]. Implementation (QUAD3) is on its way. A
satisfactory alternative, currently applied in BAND, is the hybrid-quadratic method: routine HYBRID.
hybrid-quadratic approximation
As discussed by Wiesenekker et al. [1988] we need for the quadratic method the integrals
Vh(E) =

 h(k) (Een(k) ) dk
(4.5)
simplex
where the functions h(k) are all monomials up to second order (and similarly for the surface integrals with
the -function replaced by the -function, c.f. (4.1a) and (4.1b)). The occupation numbers are computed
from the monomial integrals Vh as fixed linear combinations of them. The fundamental problem is
therefore to compute the integrals Vh .
In the hybrid method we construct first explicitly the quadratic approximation of the energy dispersion
en(k) in a particular simplex (one of the small simplices in which the BZ has been divided): equad(k) say.
This simplex is then partitioned in still smaller simplices (using the same algorithm as in SIMPLS. For each
of these sub-simplices the energy values en(k) in the vertices are computed from the analytical form
equad(k) and used to calculate the monomial integrals with the linear interpolation method (BZINTL). The
contributions from all small sub-simplices are added and yield finally the required monomial integrals over
the original simplex.
BZ-integration
27
The hybrid method is essentially a two step proces: first a quadratic approximation is made. This is then
further approximated by a piecewise linear form. Obviously the precision of the result depends on the
hybrid partitioning. The number of additional hybrid partition steps is kmesh. This is input by the key
; first and second defaults are kmesh=2 and kmesh=3. Thusfar we have never encountered a system
where the values 2 and 3 (or higher) resulted in significantly different results. Apparently the first default
value is adequate.
remark
The hybrid method is often used in a different form by integrating only the monomials up to order 1 (instead
of 2) over the sub simplices [MacDonald et al. 1979]. The compounded integrals over the large simplex are
then used to compute the occupation numbers, but for lack of information, these correspond of course to an
approximation of the property function f(k) to first order only. If we assume that the hybrid partitioning is
pushed to its limit, then our form approaches the analytic quadratic results, while the simplified form is
equivalent to the linear-quadratic approximation: linear approximation of f(k) and quadratic approximation
of en(k) ; as demonstrated by Wiesenekker et al. [1988] this gives results that are by far inferior to the fully
quadratic approximation.
temperature
BAND attempts to determine the occupation numbers in accordance with a finite temperature (the fermi-
dirac distribution). In routine FERMIE, where the fermi energy and the occupation numbers are determined,
we have to call OCCUPA therefore with a (small) array of energy values instead of only the (trial) fermi
energy. The energy values are distributed around the trial fermi energy and the resulting sets of occupation
numbers, one set for each of the energies, are combined; this amounts to a numerical integration of the
fermi-dirac distribution. See SS^temperature for a more detailed account.
efficiency
The determination of the fermi energy (FERMIE) at every cycle of the SCF procedure could be rather time
consuming (especially in 3D crystals) due to the following inefficiency. For each trial fermi energy all
bands might be analysed, piecewise-quadratically fitted etcetera to determine the occupation numbers. In
general however the majority of the bands are either completely occupied or empty. The occupation
numbers for such bands do not depend on their energy functions en(k) . It is therefore unnecessary to
compute them again and again. This aspect is taken care of as follows
a) the fixed set of occupation numbers for all k-points in a completely occupied band are determined at the
start of the SCF procedure and stored in a separate array.
b) at every cycle, when the fermi energy and the occupation numbers are to be determined, the energy
range of every band is computed first. For each trial energy it is then quickly surveyed which bands are
empty, fully occupied or to be analyzed in more detail. The determination of the band widths occurs in
routine EMNMXB, with three auxiliary routines EMNMX1, 2 and -3 (for 1-, 2- and 3D crystals). The
same piecewise quadratic approximations for the energy bands are applied in fact by EMNMXB, but this
is done only once (per cycle), not for every trial (fermi-)energy.
BZ-integration
28
In one case (at least) the occupation numbers are zero for some specific k-points, even if the band is not
empty. This known special case occurs for the quadratic integration in a 2D crystal. Each fully occupied
simplex has zero occupations for the vertices (and occupations 1/3 for the mid-points of the edges). This is
an accidental effect of the integration scheme [Nooijen et al. 1990, Wiesenekker et al. 1988]. We do not
know whether there are more of such special cases.
Suppose now that for all bands the occupation numbers are zero in a particular k-point. If this remains so
during the whole calculation it must be possible to avoid most of the computational effort associated with
that k-point, such as the evaluation of matrix elements and so on.
Thusfar we have not worked out how to proceed with this possibility. As a preliminary for future work on
this we have implemented an array kzero(kt) in SCF, which is passed on to several sub-ordinated routines.
kzero stores for each k-point whether (and in how many subsequent cycles) the occupation numbers are zero
for all bands. Output messages are issued whenever such k-points are found.
5
Charge density
eva lua tio n
The solution of the hamiltonian equation in each k-point yields (EIGSYS) the one-electron eigenstates as
linear combinations of the basis functions
n(k;r) =  cnj(k) j(k;r)
(5.1)
j
The analysis of the energy bands en(k) gives (FERMIE) the occupation numbers on(k) . There are then two
ways to compute the charge density in the crystal integration points.
1.
From the eigenstates:
(r) =  on(k) n(k;r)
(5.2)
nk
n(k;r) is the crystal orbital density
n(k;r) =
|
j
cnj(k) j(k;r) 
|2
(5.3)
With kt k-points, nbas basis functions and nband occupied states the amount of work per integration point is
proportional to ktnbandnbas.
2.
From the density matrices. First we construct the P-matrix (PMATRX) in each k-point.
Charge density
k
Pij =
29
nband

n=1 ij
*
cni(k) cnj(k)
(5.4)
Since Pk is hermitian we have to store and process only the upper triangle; define
k
k
Pij = 2 Pij
i<j
(5.5)
k
k
Pii = Pii
This gives
(r) = 

k i≤j
k *
Re [Pij i (k;r) j(k;r)]
(5.6)
The amount of work per integration point is proportional to ktnbas(nbas+1)/2.
The construction of the P-matrices themselves can be neglected as they do not depend on the integration
points.
Depending on the number of occupied bands and the number of basis functions one or the other method is
more efficient. The break-even point may be expected to occur for nbandnbas/2, but of course it is also
determined by details of the computational procedures, in particular by the types of (vector) operations
involved and hence it may even be machine dependent.
In big calculations the evaluation of the charge density is a major part of the SCF procedure and it is crucial
to pick the most efficient method. Both ways have been implemented; RHOPSI uses the eigenstates,
RHOPMT employs the density matrices. The times used for them are measured and stored in an array
rhotim(2). The values are compared by SCF to make its choice.
Rhotim is initiated at zero for both and either of the two is used at the first cycle; its timing is then updated.
At the next cycle the other method is applied because it has still zero time-measurement and appears
therefore more efficient. At the third cycle a sensible decision can be made.
remarks
1. The execution times may change somewhat from cycle to cycle, for instance as a consequence of
varying circumstances in the computer during the run. Future developments of BAND (see e.g. SS^basis:
the SSA) may also induce variations in the cycle-times. To keep the decision procedure flexible, SCF
reduces by a small amount at each cycle the stored time-measurement of the not-used alternative.
Eventually this will cause it to appear more efficient than its rival, so that it is applied and clocked
again.
The disadvantage that the less efficient procedure is used from time to time is relatively small because
the artificial reduction of the stored timing is small (5% per cycle).
Charge density
30
2. As discussed in SS^SCF-linearization the charge density is at some cycles evaluated as a linear
combination of previous densities. This is then a completely different situation of course and the timings
of the exact evaluations via RHOPSI and RHOPMT are not updated.
The approximate evaluation does not depend on the choice discussed above: the expansion coefficients
are determined by comparing the density matrices Pk , k=1..kt with those of previous cycles (so these
matrices have to be computed anyway). The corresponding density functions are on file itrstr and have
only to be combined (see SS^charge density). This could be done in either routine, RHOPSI or
RHOPMT. After the exact evaluation however the data on itrstr have to be updated. The required
additional lines of code have therefore been implemented in both routines. On general principles SCF
chooses between the two routines also in the approximation case. The array rhotim has thus four
elements instead of two: rhotim(2,2).
a na lysis
The file with integration points, itpnt, contains for each point the index of the atom nearest to that point:
abs(idatom()); positive and negative values of idatom indicate whether the point is inside the atomic sphere
or in the interstitial region. Routine CHARGE, called from the properties section PRPRTS, applies this
information to calculate the distribution of the final self-consistent density over the atoms. This is done both
for the initial sum-of-atoms charge and for the deformation density. The data are presented separately for
the densities inside and outside the atomic spheres. In spin-polarized calculations the analysis is performed
for each spin.
The file itdatm contains the sum-of-atoms total density and valence density in all integration points; the
crystal valence density is on itrho.
The computed spin and charge polarizations may be compared with the Mulliken populations calculated in
POPANA. Often the agreement is poor. The discrepancies may be attributed to the arbitrariness in both
methods. In the Mulliken analysis the choice of basis functions influences the outcome, whereas in the
numerical integration employed by CHARGE, one may doubt the association of points with the atoms
nearest by: heavy atoms might for instance be attributed more space around them than hydrogen.
Any criterium related to the distances and e.g. the nuclear charges might be used and is easily implemented.
The index numbers idatom() are determined in RPNTID; adaptations of the algorithm should be
straightforward.
further developments
The analysis in CHARGE is very global and one would like to have more detailed information. An important
possibility is to generate plots of the density. This would be an interesting extension of BAND, because such
pictures provide a direct and instructive survey of the charge distribution; they are frequently published and
discussed in the literature. A relatively simple and efficient way to compute the desired values in some plotgrid is via the fit functions. The fit set is usually fairly accurate, but naturally the principal disadvantage
remains that it is not exact and some particular features of the distribution may be missed in this way.
Charge density
31
A general drawback of density plots is that the information is only two-dimensional. A three-dimensional
numerical analysis is useful as support and may also be valuable in itself. We may for instance expand the
density inside the atomic spheres in spherical harmonics. The numerical integration grid is well suited for
this: for a sequence of radial distances an angular integration mesh is present. The maximum possible
angular momentum value for the expansion depends on this mesh, varies with the radial value, but is usually
substantially higher than the values occurring in the fit set. The total density as well as the deformation
charge can then be described in detail. The data may be used to generate one-dimensional radial plots of the
densities for each (lm)-component separately.
6
Control
BAND contains a general control mechanism, which we will refer to as the controller. The purpose of it is
twofold. In the first place many tasks performed by the program are relatively independent. The controller
may be used to manipulate the execution of various operations, e.g. via instructions on the input file. This
will especially be useful when, in the future, restart possibilities are incorporated in a general way.
For instance we may have obtained the self consistent solution of some system and wish to know in what
respects the outcome is influenced when plane waves are added to the valence basis, what happens when the
temperature is increased, whether a spin-unrestricted calculation would make some difference, or we may
be interested in a few particular partial densities of states that have not been computed in the original run. In
all such cases we do not want to duplicate work. The controller should take care of that when supplied with
the appropriate data on file and a few instructions.
In the second place the control mechanism may be of some help to uncover and analyze errors, by checking
relevant data and printing information when a problem is detected.
Many of the data related to the control structure are stored in the common blocks CNTRLC (character-type
data) and CNTRLV (other variables). The controller in its present form is only a first, rather primitive
structure which might be embedded in a more sophisticated set-up.
A summary of the current implementation:
#
The execution of all major subroutines and sections in BAND is enclosed between calls to two special
routines, START and ENDOF.
call start('calc',iopt,execut)
if (execut) then
..
call calc(....)
..
call endof('calc',jopt)
endif
(6.1)
Control
32
START determines whether CALC is to be executed. The decision is output in the logical execut. If
execut is true the name 'calc' is added by START to an 'execution stack', the list of names exstck().
Routine ENDOF removes 'calc' again from the stack and checks whether the run should be stopped at
this point.
Execut is computed in START by calling SKIP('calc'). The logical function SKIP compares its argument
with a list of sections to be skipped, skplst(). The skip-list is empty at the start-up of the program;
entries can be added via input, with the key
(see SS^input).
ENDOF compares its first argument with the character variable exlast in common CNTRLC, which stores
the name of the last section or routine to be executed; if they are equal ENDOF calls STOPIT to
terminate the run. By default exlast equals 'band', implying that the whole program is to be executed.
Via input, with the key
(SS^input), another last-to-be-executed routine or section may be
specified.
Another variable, exfrst, states with what section the program should start. As no restart possibilities
have yet been implemented this variable can effectively not be used and must have the default value
'band'.
The admissible arguments for the input commands
and
must be names that are recognized
by the controller: they must correspond to the arguments of START and ENDOF as implemented in the
program. See SS^input for more comments on these keys.
The integer option-arguments of START and ENDOF, iopt and jopt in example (6.1) above, specify
additional actions to be undertaken by these routines. Currently this aspect is hardly used. START
optionally (iopt=1) copies the name of the section to the day-file. This provides information as to in
which stage the calculation is at a certain moment. ENDOF optionally calls the timing routine SECTIM.
(SECTIM keeps track of how many times a certain section is executed and the amount of CP-time used
there (i.e. the time lapse since the previous call of SECTIM). The data structure of SECTIM is initialized
by ITIMER (called from INIT) and all information is output by TSTAT, called from STOPIT).
#
STOPIT is the routine which terminates the program after executing a few final tasks. STOPIT is called
whenever a fatal problem is detected in some way, and also when the controller decides ( ENDOF) that
the calculation has to be finished. STOPIT outputs
-
the state of affairs in the file manager: the files currently in use by BAND, the pointer structure that
defines the relation between the files and the amount of data on them (SS^files),
-
the situation in the workspace manager: the currently allocated arrays and the markers around them
(SS^workspace),
-
the contents of several common blocks; the variables stored in them represent important aspects of
the calculation and may give a clue to the cause of some problems,
-
the execution stack, telling us in what stage of execution the calculation was terminated.
Control
33
Since STOPIT calls other subroutines, in particular those of the file manager, which may themselves call
STOPIT in case of errors, we have a potential recurrency. The ensuing problems are circumvented with a
variable istop in common CNTRLV. Istop is set to zero in INIT. The first actions in STOPIT are
if (istop.ne.0) stop'recurrency'
(6.2)
istop = istop + 1
This assures that (infinite) recurrency loops are cut short.
#
A few important aspects of BAND's control mechanisms have been organized in specific structures that
are discussed in other sections. These are
7
-
the use of files: SS^files.
-
the utilization of work space: SS^workspace.
Coulomb potential and lattice sums
The charge density in the crystal is given by a sum over occupied orbitals, expressed in basis functions :
 =  ni i2 =  Pij i j
*
i
(7.1)
ij
ni are occupation numbers and the sum includes integration over the Brillouin Zone. Pij is the density
matrix in the representation of the basis functions. The coulomb potential is
V(r) = 
 (r') rr'
1
dr' =

ij
*
Pij 
 i (r') j(r') rr'
1
dr'
(7.2)
The basis functions are (mostly) one-center functions: numerical orbitals from the free atom subprogram
and/or Slater type orbitals. The integrals in the r.h.s. of (7.2) are hard to evaluate when i and j are located
on different atoms. Therefore a set of fit functions fi is introduced such that a) the true density is accurately
c
approximated by a linear combination of them and b) their coulomb potentials fi can be computed.
The crystal density is split in the superposition of atomic densities and the deformation density
cry =   + def
(7.3)

with coulomb potential (including the contribution from the nuclei)
V=


V + Vdef
(7.4)
Coulomb potential and lattice sums
34
The atomic functions  and V are produced in tabular form by DIRAC and interpolated in ATMFNC to
obtain their values in the crystal integration points. The fit functions are used for the deformation part of the
density
def   ci fi
(7.5a)
i
Vdef 

i
c
ci fi
(7.5b)
The coefficients ci are the least squares solution of (7.5a) with the constraint that the fitted density contain
zero charge. Neglect of this condition, though leading to a more accurate description of the density,
generates a potential corresponding to an incorrect amount of charge. The resulting error in the potential is
much larger than the gain from the better density approximation.
The crystal fit functions constitute an orthonormal set (see below). The constraint is applied by the
Lagrange multiplier technique. Define
vi = 
 def fi dr
(7.6a)
ni = 
 fi dr
(7.6b)
vi would be the fit coefficient in absence of the constraint; ni is the charge content of the fit function. The
coefficients ci are obtained from the usual variational equation, with ∑ci ni =0
∂/∂cj
{∫(∑cini)2 dr∑ci ni } = 0  nj =2(vj cj )
(7.7)
Multiply by nj and sum over all j
=2
∑njvj∑cjnj
2
∑nj
=2
∑njvj
2
∑nj
(7.8)
Reinsert into (7.7):
ni
∑njvj
ci = vi  2 = vi ni
2
∑nj
(7.9)
The fit coefficients ci are computed in RHOFIT at every cycle of the self-consistency procedure. The fit
functions, potentials and charge contents are on file itfit.
Coulomb potential and lattice sums
35
fit functio ns a nd f it p o tentia ls
Two types of fit functions may be useful. In the first place atomic one-center functions (r)=Zlm() P(r).
Since the density is totally symmetric the crystal fit functions are symmetric combinations of the one-center
functions (SS^symmetry); the one-center functions themselves are the generating (fit) functions.
The options available in the program for P(r) are Slater type functions P(r)=rl+n er and numerical
functions. The latter are the squares of the free atom (numerical) orbitals; the angular quantum number for
the corresponding fit functions is set to l=0, so that the l0 components in the charge density have to be
represented by the Slater type fit functions. See also SS^input.
To evaluate the potential c(r) corresponding to a one-center function (r) the expansion of rr'
1
in
spherical harmonics is applied
l

c(r) = 
r<
4π
*
dr' Zlm(') P(r')
Z
(

')
Z
(

)
l'm'
l+1 =
2l'+1 l'm'
r>
l'm'

l
∞ r
4π
4π
<
2
= 2l+1 Zlm() 
 l+1 (r') P(r') dr'  2l+1 Zlm() I(r)
0 r>
(7.10)
For the numerical functions P(r), given as a table {P(ri )} I(r) is evaluated in routine COULOM, by
numerical integration.
For an analytical function P(r)=rn er we use the incomplete gamma function. Define
r'
x= r
 = r
(7.11)
and apply
1
 k  i
k ex dx = k! 1

x
  

k+1  i=0 i! 
0
(7.12)
to write
∞ rl
∞
1

<
n+2 e r ' dr'= rn+2  xn+l+2 ex dx +  xnl+1 ex dx =
I(r) = 
(r')


 l+1

0

0 r>
1


n+l+2


i=0
(n+l+2)! 1e
= rn+2
  n+l+3 


nl+1

i=0
 i
(nl+1)! 
i!  +  nl+2 e

 i!  =
 i

Coulomb potential and lattice sums
=


 1 (n+l+2)! 1e
rl+1  n+l+3 


36
n+l+2

i=0

 i
(nl+1)! l 
i!  +  nl+2 r e

nl+1

 i! 
 i
i=0

(7.13)
Some care is necessary in the numerical evaluation when r is small. e ∑(i/i!) equals almost 1.0 then
and subtraction from unity in the first term of the r.h.s. of (7.13) may lead to a loss of significant digits. This
is remedied by using in such a case
m
1e
 i!
i
∞
= e
i=0

i=m+1
i
i!
(7.14)
The infinite series in the r.h.s. converges rapidly (<<1) and can be truncated after a few terms.
The generating fit functions and the potentials are stored on file as tables of function values in a radial
logarithmic mesh, like all other one-center functions in BAND. The radial values are computed in FITRAD,
with auxiliary routines FITRA1 and FITRA2.
In the second place plane waves might be useful as fit functions (in 3D crystals). The potential functions are
straightforward from the Poisson equation V = 4:
f(r) = eiK·r
4π
fc(r) = 2 eiK·r
K
(7.15)
Again these have to be combined into symmetric combinations.
Plane wave fit functions have not (yet) been implemented in BAND and are in fact not necessary. The
atomic one-center functions have proven to be very adequate in practice (see SS^energy for a note on the
errors involved).
FITORT is the master routine for the crystal fit functions and potentials. FITPNT interpolates the radial
tables, constructs the bloch sums and combines them into symmetric functions.
FITOVL computes (from the overlap matrix) the transformation to an orthonormal set. FITTRA performs the
transformation and FITQ determines the charge contents ni (7.6b) of the final orthonormal fit functions. All
data are written to file itfit.
The eigenvalues of the overlap matrix are used to check linear dependency of the fit set. Eigenvectors with
too small eigenvalues are discarded. The smallest eigenvalue allowed is scrfit. Scrfit is input with key
or
; first and second defaults are 105 , 106 .
,
Dependency coefficients, analogous to the valence basis (SS^basis) are computed and printed, depending on
the print option iprntp.
Coulomb potential and lattice sums
37
la ttice sum s
A crystal fit function is a (symmetrized) bloch sum of one-center functions. The potential function contains
the bloch sum of a multipole potential 1/rl+1 . This is exposed by rewriting (7.13) as

(n+l+2)! 1
(nl+1)! rl
I(r) = n+l+3 l+1 + er
  nl+2

r

nl+1

i=0
(  r)i (n+l+2)! 1
i!   n+l+3 rl+1
n+l+2

i=0

(  r)i
i! 

(7.16)
(The numerical fit functions have a similar behaviour).
The second term decays exponentially with the distance and hence its bloch sum can be computed
efficiently in a limited loop over lattice points. The first term gives rise to the familiar lattice summation
problem of solid state theory: we have to calculate sums like
Vlm(r) =

R
Zlm(R)
rRl+1
(7.17)
In a three-dimensional crystal the sum diverges for l=0, is conditionally convergent for l=1,2 and it is
properly defined for higher l-values. But even then convergence is so slow, that a straightforward
summation would be unpractical. We will concentrate on the l=0 lattice sum (for 3D crystals), since that is
the most difficult case.
Let a charge qi be given at position si in the unit cell. The potential due to the associated lattice of charges
is
V(r) = qi
 rR1si
(7.18)
R
The divergence of this sum is not a fundamental physical problem because the net charge in the unit cell is
zero. The addition of the sums (7.18) due to all charges qi leads to a cancelling of the 'infinities', but of
course only when properly combined. Hence the expression for the total Madelung potential
VM(r) =
 qj  rR1sj
j
(7.19a)
R
with the charge neutrality condition

j
qj = 0
is conditionally convergent. The same holds for the related Madelung energy (per unit cell)
(7.19b)
Coulomb potential and lattice sums
1
EM = 2
qi V'M(si)
38

(7.20)
i
where V'M(si) is the Madelung potential at position si given by (7.19), but without the singular term (R=0,
j=i):
V'M(si) =
  qj siR1 sj +  qj si1sj
R≠0
j
(7.21)
j≠i
Conditional convergence implies that we can obtain any answer from the summation, depending on the
order in which the terms are taken. Since the Madelung energy is a well defined physical quantity the
conditional convergence of the sums (7.19a) and (7.21) is a little puzzling at first sight. This is not a purely
mathematical inconvenience here, but it is related to the physical aspect that we cannot have a truly infinite
crystal. It is not difficult to show that for any large, but finite crystal the (Madelung) potential in the interior
does not only depend on the charge distribution {qi , si } in the unit cell and on the lattice structure {R}, but
also on the boundary of the macroscopic crystal. In the limit of a very large crystal it depends on the
average charge density on the surface [Young 1987]. The lattice sum may thus be interpreted as follows: we
take a large but finite crystal of a specific shape, evaluate the finite (and hence well defined) sum and let
then increase the crystal size to infinity preserving the shape. The conditional convergence is now translated
in that the limiting result depends on the assumed shape of the crystal. The physically correct value is
obtained only if the shape has been chosen such that the average surface (and bulk) charge density is zero
[Young 1987].
The evaluation of lattice sums has attracted much attention in the literature (see Tosi [1964] and Glasser
and Zucker [1980] for a discussion; a more recent, but concise survey with references is given by
Bhowmick et al. [1988]). The numerous proposed methods fall in two classes. The first uses some kind of
integral transform to replace the slowly or even conditionally convergent sum by (usually) two series which
converge both rapidly. The most famous exponent of this class is Ewald's method [Ewald 1921]. Part of the
sum (7.18) is transformed then into a fourier series, i.e. a sum over lattice points in reciprocal space, and the
remaining real space part is a sum over rapidly decaying error functions. The above mentioned boundary
condition for expanding shapes is in this case translated in the implicit assumption (fourier series) that the
potential has the same periodicity as the Bravais lattice; a surface charge density in a macroscopic crystal
would invalidate this, also in the interior far from the crystal boundary.
In the second class of methods a direct real space summation is performed. Often some kind of expandingshapes procedure is utilized. The intuitively attractive method of expanding spheres is inappropriate since
charge neutrality is not preserved and the average charge density on a spherical surface is not zero. In
Evjen's method, using expanding cubes, charge neutrality is imposed as an auxiliary condition [Evjen
1932]. For the NaCl structure this gives indeed the correct value for the Madelung constant. Expanding
cubes fail however for the CsCl structure [Evjen 1932, Bhowmick et al. 1988] because in that case the
average charge density on a crystal boundary plane is not zero.
Coulomb potential and lattice sums
39
In the straightforward real-space summation techniques the assumed shape, i.e. the way in which the limit of
an infinite crystal is approached, is thus of crucial importance. The appropriateness of a choice depends of
course on the crystal and has to be determined separately for every new situation. When this is taken care
of, one may have obtained a sequence that converges to the correct limit but only slowly. Various of the
well known convergence accelerating methods for sequences may then be applied to improve the efficiency.
Fortunately these sequence transforms may effectively also alleviate the mentioned shape dependence, as
we will see, and thus they make the (transformed) real space summation much more convenient for practical
use [Bhowmick et al. 1988]. To understand this we return to the definition of the Madelung potential.
A mathematically satisfactory and robust way to solve the apparent arbitrariness of the Madelung lattice
1
sum is provided by the technique of analytic continuation. The coulomb potential r is replaced by rt
with a complex parameter t. The corresponding lattice sum is well defined for Re(t)>3. By analytic
continuation its value for other t, and in particular for t=1 can be derived; the outcome coincides with the
physically correct value and with the value obtained by expanding shapes with the appropriate condition on
the surface charge [Young 1987, Borwein et al. 1985, 1988, Crandall and Buhler 1987].
The situation bears a close resemblence to other, more familiar sequences with problematic convergence.
Consider the power series representation of the logarithm
z2 z 3 z4
ln(1+z) = z  2 + 3  4 + ... = 
∞

m=1
(z)m
m
(7.22)
The sum converges only for z≤1, z≠1 but the l.h.s. is defined for all complex z not on the cut
(conventionally: z real and z≤1). So, even where the sequence of partial sums
n
An = 

m=1
(z)m
m 
(7.23)
does not converge we may assign the value ln(1+z) to the infinite sum, that is, to the limit {An }n∞ .
Putting it another way: the infinite, non-convergent sum is a correct, though somewhat inconvenient
representation of ln(1+z).
Following a discussion by Shanks [1955], many sequences occurring in physics and mathematics can be
written as 'transients'
k
An = B +

i=1
i qin
(qi≠0,1)
(7.24)
The constant B is called the base and the (complex) ratios qi and amplitudes i constitute the spectrum of
the sequence. The number of transient terms, k, is the order of the sequence. The order may be infinite or
Coulomb potential and lattice sums
40
even continuous so that the sum is to be replaced by an integral. (Levin [1973] treats a more general form
than (7.24), which covers more types of sequences and is in fact more related to our lattice sums [Weniger
et al. 1986], but the simpler case of Shanks suffices for the discussion here).
Depending on the qi the sequence may have monotonic and oscillatory, converging and diverging
components. If all qi <1 the sequence converges to B and hence B is the limit of the sequence. If at least
one of the qi >1 the sequence diverges and B is then called the antilimit: the sequence diverges away
from B. In general B is the 'intrinsic part'.
An example. Consider the power series representation of (1z)
1
= 1 + z + z2 + z3 + ...
1z
1
(7.25)
The partial sum can be written in the transient form (7.24):
n
An =

m=0
zm =
1
1
   zn
1z 1z
(7.26)
so that it turns out to be a first order transient with base 1/(1z) and only one ratio, z, with amplitude
1/(1z). For z>1 the sequence An diverges, but the infinite sum is the base 1/(1z). This is an example
of the general phenomenon that the base of a diverging sequence is the 'correct' value for the 'limit', that is,
the intrinsic part equals the analytic continuation of the sums of the convergent series [Shanks 1955].
1
The same applies to our lattice sum. Replacing the coulomb potential r by rt the corresponding lattice
sum
V(r) =
 rR1sit
(7.27)
R
is a series representation of a generalized multidimensional Riemann -function [Glasser and Zucker 1980,
Crandall and Buhler 1987, Borwein et al. 1988]. It has poles for t=0 and t=3 (in the 3-dimensional case)
and it is properly defined everywhere else in the complex plane. Its representation by the lattice sum yields
a sequence (when we take all terms inside a sphere of radius R and let R increase) which converges for
Re(t)>3. For other t, and in particular for the 'coulomb value' t=1 the sequence diverges but the physically
correct value is the intrinsic part.
Obviously it is not possible to compute the intrinsic part of a diverging sequence by just monitoring the
sequence of partial sums. As noted already: the evaluation of lattice sums by the method of 'expanding
spheres' fails. It would be desirable therefore to transform the diverging transient part into a converging
sequence (without affecting the intrinsic part of course). Two closely related methods are often applied in
such cases: 'screening' and 'sequence transformations'. We will shortly discuss both and indicate the
connection between them, because BAND employs a screening form which is heuristically motivated by a
particular type of sequence transformations.
Coulomb potential and lattice sums
41
screening
Screening functions serve to suppress long-range tails of functions (such as the coulomb potential here) that
cause problems by the slowness of their decay. A screening function which is often used because of its
mathematical simplicity is the exponential function. Replace the coulomb potential 1/r by er /r with real
and positive ; finally we will then take the limit 0.
Although for finite  the lattice sum due to one single lattice of charges qi

Vi (r) = qi

R
er
rRsi
(7.28)
is defined, its value diverges smoothly as we diminish  to zero. If we apply the screening however to the
total Madelung potential due to the combination of all charges, which is in fact the quantity of interest,
V(r) =
r
 qi  reRsi
(7.29)
R
i
then the result stays bounded and converges to the correct value in the limit 0. This is caused by the
'cancelling of infinities' from the compensating charges qi . Compare the 1D analogues
lim
0
∞

en = ∞
(7.30a)
n=0
and
lim
0
∞
 ()n
en = 1/2
(7.30b)
n=0
This confirms what may be intuitively obvious: screening may work for conditionally converging sums, but
it fails for diverging sums.
The coulomb screening factor er has been used for instance by Born [1921] in an analysis of the
electrostatic energy of cubic lattices. In BAND we employ a somewhat different screening function, namely
a fermi-dirac function h(r)
1
r
1

1+e
r
1
 h(r) r
(rr0)/d
(7.31)
Like er the function h(r) decays exponentially for large r. If we choose r0 large enough however the
nearest terms in the lattice sum are only negligibly affected by the screening function h(r) and this turns out
to be an advantage in the convergence behaviour. The limiting case is obtained by taking d∞ and
simultaneously r0 /d∞, so that h(r) converges uniformly to unity for all r as the limit is approached. The
Coulomb potential and lattice sums
42
limit is not actually reached in the computational application; instead the calculations are performed with a
finite screening; the results are therefore only (good) approximations of the limiting values.
The motive for this form of screening function originates from a particular type of sequence
transformations.
sequence transformations
A large number of transformations are known that accelerate the convergence for various types of
sequences [Brezinski 1977]. Among the most widely used are the -transforms of Shanks [Shanks 1955] of
which Aitken's 2-procedure [Aitken 1926] is a special case, Levin's u-transformation [Levin 1973], the
Padé approximations [Sarkar and Bhattacharyya 1988] and Euler's transformation. The latter will have our
special attention and is treated in detail by Hardy [1949].
Sequence to sequence transformations are not only used to speed up convergence but also to induce
absolute convergence in diverging or conditionally converging sequences [Shanks 1955, Weniger et al.
1986, Bhowmick et al. 1988]. The motive to do so is of course that we are interested in the intrinsic part, or
the base in Shank's terminology, of an ill-converging sequence. If the transformation leaves the base
unaffected the transformation is justified and may help us to handle the diverging extrinsic part by
transforming it into a converging sequence.
That sequence transformations are capable to transform diverging into converging sequences (with the
correct limit) may be illustrated by applying Shank's 2-transform (i.e. Aitken's method) to the sequence
(7.26)
A
2
n

An1An+1  (An)2
An1 + An+1  2An
 1 2 {(1zn1)(1zn+1)  (1zn)2}
1z
1
= 1
=
1z
  {(1zn1) + (1zn+1)  2(1zn)}
1z
(7.32)
This transformation turns thus out to be extremely well suited for the sequence at hand. In other cases the
transformed sequence may of course not be converged immediately, as it is here, but at least it may be much
more pleasant than the original one. See Shanks [1955] and Weniger et al. [1986] for a more general
discussion of this subject.
A transformation of interest for us is Euler's transformation, defined by simple averaging
1
An  An =
An + An1
2
(7.33)
Euler's transformation is well suited to accelerate convergence in summations of alternating terms. The
reason is easily understood: due to the alternation in the terms the sequence of partial sums {An } oscillates
and averaging two consecutive values will 'damp' the oscillations.
One may repeat the proces by applying the transformation also to the transformed sequence. In this case we
obtain the second-order Euler-transformed sequence
Coulomb potential and lattice sums
43
1
1
2 An + An1 An + 2An1 + An2
An =
=
2
4
(7.34)
and in general
1 
m
An = m 
2 
m

k=0

 (mk) Ank
n=m,m+1,...
(7.35)
The effect of this simple transformation and its iterates is exemplified by the following. Let the terms be
()k
1 1 1
ak = k+1 = {1, 2, 3 , 4,...}
(7.36a)
The partial sums and the transformed sequences are then
0
An = { 1.0000,
1
An = {
2
An = {
3
An = {
0.5000,
0.8333,
0.5833,
0.7833,
0.6167,...}(7.36b)
0.7500,
0.6667,
0.7083,
0.6833,
0.7000,...}
0.7083,
0.6875,
0.6958,
0.6917,...}
0.6979,
0.6917,
0.6938,...}
The exact limit is log(2)=0.693147... and each successive transformation improves the convergence
considerably.
N
If N+1 terms ak are known, the transformation can be repeated at most N times to yield the number AN ; the
expression 'Euler transformation' applies in fact to this ultimate value. For convenience in the discussion we
N
use the phrase 'repeated (Euler) transformation' and we denote the sequence AN , N=0,1,.. as the diagonal
(Euler) transformation. Assuming that each successive Euler transformation yields a faster convergence, the
diagonal sequence represents the best approximation.
An attractive aspect of Euler's transform is that it is a linear transformation. Whereas non-linear transforms,
such as the -transforms, may in comparison be spectacularly more effective for some sequences, they do
not give the correct limit for all sequences. Linear transformations are more reliable in this respect [Hardy
1949, Shanks 1955, Weniger et al. 1986]. So one may be willing to accept a somewhat reduced efficiency
as an insurance premium for correctness of the results.
historical note
Already before the analysis of divergent sequences was given a solid mathematical foundation [Stieltjes
1886, Poincaré 1886] Euler assumed and used in his work that 'summa cujusque seriei est valor expressionis
1
illius finitae, ex cujus evolutione illa series oritur' . Whereas this is strictly speaking not true, in that
1
the sum of the series is the finite numerical value of the arithmetic expression from which the series is
Coulomb potential and lattice sums
44
distinct expressions with different numerical values may yield the same diverging series, it is essentially
correct, with a somewhat stricter definition and his 'repeated averaging' procedure is a very natural method
for the numerical evaluation of the sum of such (alternating) non-convergent series [Hardy 1949].
1.0
1.0
c
0.8
h(r)
0.8
0.6
0.6
0.4
0.4
0.2
0.2
0.0
0.0
0
5
10
15
k
0
20
N
Fig.1. Coefficients ck in the diagonal Euler
10
20
30
r
40
sequences, with interpolating curve, for N=5,10
Fig.2. The fermi dirac screening function
1
h(r)={1+exp((rr0 )/d)} ; the parameters are
and 20.
here arbitrarily taken as: r0 =20, d=2.
The contribution of the primitive terms ak in the Euler sequences can be expressed as sums of binomial
coefficients. Let
m
An =
n

k=0
(mn)
ck
ak
(7.37)
All terms k≤nm are fully present (ck =1), the next coefficients decrease and ck =0 for all k>n. The
fractional coefficients are, from (7.35)
1
(mn)
ck
= m
2
nk
 mi
k=nm, nm+1.....,n
(7.38)
i=0
The coefficients in the diagonal sequence are obtained by putting m=nN:
1
N
ck = 2 N
Nk
 Ni
i=0
derived [Hardy 1949]
k=0,1,...,N
N=0,1,...
(7.39)
Coulomb potential and lattice sums
45
Fig.1 displays these coefficients, with an interpolating curve, for a few values of N.
As shown by the interpolating curves, the coefficients may also be described by continuous functions cN(x)
which reproduce the values (7.39) for the integer arguments x=0,1,2,..., or more generally for a set of
equidistant points xi , i=1,2,.... This suggests already the correspondence between screening and sequence
transformations: we might have started with the (screening) function c(x); then, if the members of the
original sequence An , n=0,1,... correspond to equidistant values of x, we have effectively performed a
sequence transformation by application of the screening.
A second aspect suggested by the curves in fig.1 is that the functions cN(x) , N=0,1,2,.. can also be
interpolated in the variable N. We may thus define a general function c(p;x) which coincides with cN(x) for
p=N and interpolates between these curves for non-integer p. Now we have arrived at a general screening
with a continuous screening parameter p, similar to the traditional exponential screening parameter  as in
(7.29).
The application of the (diagonal) Euler transformation to the lattice summation poses some problems. First:
how do we define the zero-th order sequence to be transformed? It is natural and intuitively attractive to
order the contributions according to their distance from the evaluation point r, and to group together all
terms with the same distance: an 'expanding spheres' picture. For the 'special' evaluation points si , where
the charges qi in the unit cell are located, we obtain then 'stars' of terms. For a general off-center evaluation
point r these stars are scattered and fall apart in distinct terms. In all cases the discrete distance-values for
which we find contributions are not equidistant and, although on the average the terms will be alternating,
this may not be true on the smallest scale and their sizes may be rather irregular. So, whereas we have the
general structure of an alternating sequence, on closer inspection the development of the sequence is far
from smooth and regular. This makes it a little unclear how to apply exactly the Euler transformations.
Secondly, and this is possibly even more problematic from a practical point of view, the grouping of the
terms has to be determined anew for every evaluation point, because the sequence looks different, with
possibly even a different number of terms inside a sphere with fixed radius, and the combination
coefficients (7.39) will have to be computed many times. Considering the large number of evaluation points
(all points of the crystal integration grid) this is computationally unwieldy.
Moreover the definition of the zero-th order sequence is to some extent arbitrary. We may for instance
introduce formally a very dense radial mesh with equidistant points Ri , i=0,1,... and define the sequence as
the potential due to all terms inside the corresponding spheres. Of course, in the limit of a continuous mesh
we will find large subsequences of constant values in the sequence and sudden jumps at a few radial values.
The limiting long-range development of the sequence is not essentially changed by this however and we
may still expect that screening, or equivalently a suitable sequence transformation, can be applied to remove
the divergence and to extract the desired intrinsic part of the sequence. We will use this approach (i.e.
conceptually introducing the dense and equidistant radial mesh) and use thus a continous screening
function.
Coulomb potential and lattice sums
46
The next problem is then how to define that continuous function. Ideally we would like to reproduce the
Euler transformation by it, but we have not succeeded in finding an analytical form c(x) to compute
efficiently the binomial expressions for equidistant arguments xi . Fortunately the precise form of the
screening function is not essential for our final purpose, the calculation of the intrinsic part of the sequence.
Only we expect that, the more it resembles the 'Euler screening', the more effective it will be as regards the
convergence with ever smoother and 'weaker' screening. The fermi-dirac function which we employ in
BAND resembles the Euler function reasonably well and has shown to give satisfactory results; fig.2
displays this function h(r) (7.31) for a comparison with fig.1.
remarks
#
We have treated screening and sequence transformations as essentially different and only accidentally
similar. In fact both can be analyzed more rigorously as methods to 'sum' certain types of nonconvergent series. The screening as presented here is a special case of the Abelian method. In the
chemical and physical literature the phrase 'screening' is often used, which is the reason to denote it here
as such.
#
The interpolation in the variable N (7.39), i.e. the definition of a continuous-order Euler transformation
can also be given a strict meaning by a generalization of (7.39).
#
These and many more fascinating aspects of the summation of non-convergent series are discussed
extensively by Hardy [1949] and the reader is referred to his work for mathematical rigor and more
background information. The intention of our presentation here has been to explain in an intuitively
appealing way (we hope) why and how our screening method works.
the screening function h(r)
Finally we have to determine the two fermi-dirac screening parameters r0 and d. Sticking to the analogy
with the Euler transformations we may say that d corresponds to the order of the transformation (the
parameter m in (7.35)), but now as a continuous parameter, and r0 is similarly the member-index n of the
(transformed) sequence. The diagonal transformation corresponds to tuning d and r0 relative to each other
in such a way that h(r) starts to fall off from unity already at small r (compare (7.38) and (7.39): all
coefficients in the diagonal transform are fractional). The diagonal sequence, that is, increasing N in (7.39)
finally corresponds to increasing d and simultaneously adapting r0 in the manner just discussed.
From our test calculations we learned the following:
#
The 'order of the transformation' d should at least be of the same size as the nearest neighbour distance;
otherwise the oscillations in the sequence, i.e. the oscillations as a function of r0 , are too large to make
the results reliable.
#
The complete interval over which h(r) changes should be used; in particular all terms at large r must be
taken into account until h(r) has become truly negligible; neglect of this condition leads quickly to
unacceptable errors.
#
Since the actual summation over lattice points must be limited as much as possible for computational
reasons, the extension of h(r) has to be restricted, so that d and hence r0 cannot be chosen arbitrarily
large. A good practical compromise is expressed by the following conditions:
a) d0.65DNN
Coulomb potential and lattice sums
47
b) r0 13d
where DNN is the nearest neighbour distance in the crystal.
In the implementation these figures are adapted to the general (integration) accuracy parameter accint (see
SS^integration); higher accuracy induces larger values for d and r0 .
The association of the screening function h(r) with repeated Euler transformations is confirmed by the test
results for the Madelung energy. Let E(r0 ,d) be the computed Madelung energy as a function of the
screening parameters. Then, for fixed d, E oscillates as a function of r0 . The amplitude of these oscillations
depends on d and diminishes if we increase d. The energy values oscillate around the limit, provided r0 is
large enough. This corresponds to the general phenomenon in Euler transformations that each of the series
oscillates around the limit and the higer order transformations display smaller oscillations.
integration of the oscillations
Except for very small d, the oscillations as a function of r0 are found to be fairly regular over a
considerable range in r0 , so that E(r0 ) is reasonably well described by a simple cosine function around the
limit. The periodicity of the cosine depends on the crystal structure, is in the order of the nearest neighbour
distance (slightly larger), but we have not been able to detect a strict relation.
Nevertheless the regularity of the oscillation suggests that we may determine the limit more precisely by
averaging the results for a sequence of different values r0 distributed over a cycle of the oscillation. We do
so in BAND by taking five equidistant points; this corresponds to a numerical integration for periodic
functions if the spacing between the points is one fifth of the period. BAND assumes that the period equals
1.15DNN ; this will usually not be exact and consequently the 'integration' is not optimal. It should be
noted however that some improvement results anyway from the averaging.
remark
It would be possible to set up a standard pilot computation in BAND to determine a priori the periodicity of
the oscillation in the Madelung potential as a function of the parameter r0 . This could then be used to
replace the assumed periodicity, thereby raising the efficiency of the averaging. At first sight this does not
seem very important because the results are fairly accurate anyway. However with an improved averaging
the conditions on the parameters r0 and d may be weakened, both can be reduced and this leads to a smaller
lattice summation loop: increased efficiency. It may be noticed here that in 3-D crystals the construction of
the multipole potentials, though not a bottleneck, takes a significant amount of time (a few percents of the
total calculation).
implementation
VMULTI computes the multipole potentials

Vlm(r) =
Zlm(R+s )

l+1 h(R+s r)
R+s
r
R


(7.40)
Coulomb potential and lattice sums
48
for all l-values occurring in the fit set. s is the position of an atom in the unit cell; h(r) is the screening
function.
The parameters r0 and d are in BAND the variables rmadel and dmadel. One may specify them in input with
the keys
and
. By default they are derived (in RADIAL) from the nearest neighbour
distance dneigh (computed by NEIGHB) and the accuracy parameter accint:
dmadel = (0.50 + 0.05accint) dneigh
(7.41a)
rmadel = (10+accint) dmadel
(7.41b)
The loop over the cells terminates when the tail of h(r) has become negligible. Experience shows that
cutting off the tail of h(r) at large r too quickly leads to unacceptable errors. The maximum cell distance,
rcelx is set at
rcelx = rmadel + (11+accint)  dmadel
The default for rcelx may be overruled via input (key
(7.42c)
).
If one of the variables dmadel, rmadel and rcelx is specified via input, the default determination of the
others is adapted to satisfy as well as possible all conditions discussed here (routine RADIAL).
modifications of the screened lattice sums
1.
The evaluation of (7.29) in a point r takes place in a simple loop over lattice points; no interpolation
is needed, in contrast with the treatment of other one-center functions in BAND. The 'exact' evaluation,
without interpolation, gives rise to an awkward problem in the core regions. The potential of an l=0 fit
function has the long range behaviour 1/r, but for small r it goes to zero: the 1/r multipole term is cancelled
by another term, as can be verified by inspection of (7.13). FITPNT constructs the fit potential, by addition
of the multipole lattice sum (from VMULTI) at one hand and the 'normal' bloch sum of the exponentially
decaying part at the other hand. The second part is obtained by interpolation from one-center function tables
and contains thus small inaccuracies which are not present in the multipole part. Consequently the
cancelling for r0 is not exact and for very small r these numerical errors blow up.
Therefore the multipole term is modified such that for small r it is suppressed: that part is incorporated in
the other term, i.e. in the one-center function table (in routine FITRA2). The modified multipole potential
becomes
r2
Mlm(r)  Zlm() l+3 r
r +e
(7.43)
Coulomb potential and lattice sums
49
For large r this equals
the pure multipole
potential ~1/rl+1 . For
Rock-salt
r0 it behaves as r2 .
Cesium
Zincblende
Fluorite
Chloride
Consequently the
multipole lattice
A=2
1.747 51 41 ..
1.017 65 41 ..
1.637 76 12 ..
2.519 08 40 ..
potentials as computed
A=4
1.747 56 19 ..
1.017 67 95 ..
1.638 06 34 ..
2.519 40 07 ..
exact
1.747 56 45 9.
1.017 68 07 5.
1.638 05 ..
2.519 39 ..
in VMULTI have 'coreholes'. This is
compensated in BAND
in the one-center
tables (FITRA2), but it
Table I. Computed Madelung constants for different accuracies A. The exact
values have been taken from Sarkar and Bhattacharyya [1988] (rock-salt
and cesiumchloride) and Atkins [1987] (zincblende and fluorite).
must be kept in mind
when the results of
VMULTI are to be
applied in another situation.
2.
The numerical integration (over the parameter r0 ) of the lattice sums, as discussed above, is carried
out by a modification of h(r). Let the spacing between the equidistant integration points be  and the five
points be r0 r0 , r0 , r0 +, r0 +2. Then the multipole potential is computed as (we leave the first
modification aside for the moment)
1
Vlm(r) = 5
=
2
Z ( )
1

lm R
=
  R+r
l+1
(Rr(r0+k))/d
1+e
k=2  R

R

2
 Z ( )

1
1
lm R



l+1 5
k e(Rrr0)/d
1+f
R+r

k=2

(7.44)
with f = e/d . f is the variable fermfc in BAND; fermfc is computed in RADIAL. The screening function
actually employed (FITPNT, ATMFNC, VMULTI) is thus
1
h(r) = 5
2
 1+fk e(r1 r0)/d
(7.45)
k=2
In table I we display the computed Madelung constants of a few standard crystal types, for different values
of the accuracy parameter.
Coulomb potential and lattice sums
50
p o tentia l a nd d ensity, a no ther rela tio n
The following equation connects the integral over the potential with an integral over the density
2
∫ V(r) dr = 2π
3 ∫ r (r) dr
(7.46)
In a crystal the region of integration is an 'element of periodicity', such as the crystal unit cell. When divided
by the cell volume (7.46) gives the average potential, sometimes called the 'constant term (in the potential)'.
A discussion of the average potential and/or relation (7.46) turns up from time to time in the literature,
without bearing much relevance, since the zero level of the potential in a 3D crystal is physically arbitrary.
The reason to mention it here is not that we found an interesting application, but rather that in the derivation
of (7.46) I overlooked in first instance an instructive mathematical aspect, which is fairly general and which
may easily be missed.
First we examine the relation in a finite system, where the integrations extend over all space.
The potential due to a charge distribution (r) is defined by
V(r) = 

(r)dr'
rr'
(7.47)
alternating sequences again
We notice first that (7.46) does not hold for an arbitrary charge distribution. Two conditions (at least) must
be fulfilled: the monopole and dipole moments of (r) have to vanish both. The first condition is intuitively
obvious since otherwise the integral over the potential would be unbounded. Both requirements are clearly
exposed when we choose another coordinate frame by displacing the origin, i.e. set rr+r0 . The r.h.s. of
(7.46) becomes (apart from the factor 2π/3)
 r+r0 2 (r) dr = 
 r2 (r) dr + r02 
 (r) dr + 2r0 · 
 r (r) dr =

 r2 (r) dr + r02 Q + 2r0 ·D
=
(7.48)
giving the original result plus a monopole and a dipole term; Q is the total amount of charge, i.e. the
monopole moment of (r), D is the dipole moment. Since the l.h.s. of (7.46) is not affected by the
diplacement of the origin, the integral over the potential is apparently not well defined unless both Q and D
are zero.
Now we will give a derivation of (7.46) in which the dipole condition is erronously juggled away; so we
assume only a vanishing monopole moment and proceed to 'prove' (7.46). Consider a finite system, a
molecule or cluster, such that all charge is confined within a huge but finite sphere with radius R. The
potential outside the sphere has an expansion in spherical harmonics
Coulomb potential and lattice sums
V(r) =

51
Vlm(r) Zlm()
lm
r>R
(7.49)
where the components Vlm(r) are related to the multipole moments of the charge distribution. For a neutral
system the monopole term is zero. Hence, due to the angular integration of Zlm() , l≥1

 V(r) dr = 0
(7.50)
outside
sphere
So for the potential (as well as for the density) the integration over all space in (7.46) can be restricted to
the integral inside the sphere.
To evaluate the integral of the potential (the l.h.s. of 7.46) we use a formal expression for the density
analogous to (7.49), and the expansion of rr'
1
=
rr'
4π
rl
<
 2l+1 rl+1
>
lm
1
in spherical harmonics
*
Zlm(') Zlm()
(7.51)
r< and r> refer to the smaller and larger respectively of r and r'. This gives
R
4π
r<l'
2

 V(r) dr = 
 r dr 
 d 
 (r') dr'  2l'+1 rl'+1 Zl'm'(') Zl'm'() =
all
space
l'm'
0
R
*
>
R
l'
4π r<
*
2
 r dr 
 d 
 (r')2 dr' 
 d '
=
lm(r')Zlm(')
2l'+1 rl'+1 Zl'm'(') Zl'm'() =
>
lm
l'm'
0
0

R

R
l
4π r<
2
2 dr'


=
r
dr
(r')

(r')


lm
2l+1 rl+1  d Zlm() =
>
lm
0
0

R
R
2
2
= 4π 
 (r') dr' 00(r') 
 r dr
0
0
1/2
(4π)
r>
R
R2 (r')2
2
= (4π)3/2 
 (r') dr' 00(r')  2  6  =
0
R
R
2π
2
2
 r dr 00(r) 
 d Z00()  3 
 r4 dr 00(r) 
 d Z00() =
= 2πR 
0
0
=

lm
R
 2R 2

2π
4
2πR 
 r dr lm(r) 
 d Zlm()  3 
 r dr lm(r) 
 d Zlm() =


0
0
2π
2π
2
2
= 2πR2 
 (r) dr  3 
 r (r) dr =  3 
 r (r) dr
(7.52)
Coulomb potential and lattice sums
52
The charge neutrality of the system has been used in the last line of (7.52) and in (7.50), but what happened
to the dipole condition?
The clue is the dipole term in equation (7.50). A dipole moment D gives a potential at a position r relative
to the dipole
Vdipole(r) =
D cos
r2
(7.53)
where  is the angle between D and r. The integral of this potential over the region outside the sphere is
∞ 

 Vdipole(r) dr = D  
 dr  
 d cos
R 
outside
(7.54)
sphere
This was taken to be zero (7.50) because of the last factor, but the unboundedness of the radial integral
makes such an assessment hazardous: infinityzero=??.
Integral (7.54) may be associated with the summation of an alternating sequence, of which the terms do not
go to zero. If we start at the sphere boundary and increase R by small discrete steps, each shell may be
thought to contribute two terms to the integral (7.54), one positive and one negative, corresponding to the
half shells (0≤≤π/2) and (π/2≤≤π). They cancel each other, but the individual terms do not go to zero as R
increases.
The 'outside' integral (7.54) resembles thus a summation like S=11+11+1.... Putting it another way:
assume that the dipole moment is oriented along the z-axis and imagine a displacement of the sphere in the
positive z-direction (a different choice of origin): as regards the integral over the dipole potential a positive
half-shell is drawn into the sphere from the outside region and a similar negative half shell is expelled. The
integral inside the sphere is thus changed by a term which is not negligible however large we took R. An
equal, but opposite change has to take place in the 'outside' integral if the integral over all space (7.46) is
properly defined. Relation (7.50) cannot be true then in both situations, although the monopole condition,
presumably the only requirement for (7.50), may be satisfied.
Conclusion: the integral of the dipole potential over all space, and in particular over the outside region, is
not well defined. In analogy with a discrete series we may call (7.50) conditionally defined.
average potential in a crystal
In a 3D crystal the charge and potential are periodic functions. Divide (r) in localized terms i(r)
(r) = ∑i(r)
(7.55)
such that the i(r) transform into each other under the translation operators of the crystal. This does not
uniquely define i(r). We might take for example i(r) equal to the total charge (r) in cell i, and i(r)=0
otherwise; that is we may view (r) as being build up from 'cells'. We may however also choose i(r) as an
atomic-like distribution so that (r) is described as a 'sum-of-overlapping-atoms'. We demand only that
Coulomb potential and lattice sums
53
i(r) is of finite extension so that it can be contained in a finite sphere. Furthermore we require of course
that its monopole and dipole moments are zero (i.e. the following remarks do not apply if these conditions
cannot be met). Then, following the lines of the derivation (7.40), it is easily found that the average
potential is
1
2π
 V(r) dr =
 r2 i(r) dr
 
3 
unit
cell
(7.56)
all
space
where  is the volume of the unit cell.
A strange aspect of this result is that a different choice of the localized 'element' i(r), e.g. a 'cell' instead of
an atomic-like distribution, may yield a different value for the average potential (we assume of course that
for all choices of i(r) the dipole (and monopole) moments are zero). Indeed, in a truly infinite 3D crystal
the average potential has no physical meaning (we cannot take an electron out of the system) and
mathematically it is not defined.
For this reason BAND prints in a 3D calculation the final one-electron energies not as absolute numbers, but
gives their values relative to the fermi energy.
In a very huge, but finite system, we may also define elements i(r) to build up the system, but the
boundaries of the system now determine the i(r) and the result (7.56) is well defined and gives the average
potential in the interior.
8
DOS: density of states and population analysis
DOS is the master routine for the analysis of the eigenstates and energy bands after the SCF procedure has
been finished. A Mulliken population analysis is performed and optionally the density of states (DOS) is
computed. The evaluation of the necessary integrals over the Brillouin Zone is discussed in
SS^BZ-integration and will not be repeated here.
Apart from the total density of states various partial densities of states may be computed. They are
analogous to the Mulliken populations [Mulliken 1955]. We write the total DOS as
n(E) =

ij
qij(E)
(8.1)
where i and j run over the primitive basis functions: atomic one center functions and/or plane waves in the
current implementation; one might of course define other basic quantities such as fragment orbitals.
An equation for qij(E) is derived from the usual expression for the DOS:
DOS: Density of states and population analysis
n(E) =


n
54
dk (Een(k) )
(8.2)
BZ
We insert the identity (normalization of the one-particle states)
1=
∫  (k;r) dr
2
n
(8.3)
and use the expansion of n(k;r) in the basis functions
n(k;r) =  cni(k) i(k;r)
(8.4)
i
to write for (8.2)
n(E) =
 ∫∫
*
ijn
=
*
dk dr (Een(k) ) cni(k) cnj(k) ni(k;r) nj(k;r) =
∫
dk (Een(k) ) cni(k) cnj(k) Sij(k)
∫
dk (Een(k) ) cni(k) cnj(k) Sij(k)
ijn
*
(8.5)
so that
qij(E) =
n
*
(8.6)
The (surface) BZ integral is evaluated numerically:
∫ dk (Een(k) )  
k
wn(k)
(8.7)
The weights wn(k) depend on the energy E, giving
qij(E) =

n
*
wn(k;E) cni(k) cnj(k) Sij(k)
(8.8)
The off-diagonal overlap density of states is then
nij(E)  qij(E) + qji(E) =
=

nk

*
nk
*
wn(k;E) 2 Re{cni(k) cnj(k) Sij(k) } 
kn
where we defined uij by
*
wn(k;E) [cni(k) cnj(k) Sij(k) + cnj(k) cni(k) Sji(k) ] =

nk
kn
wn(k;E) uij
(i≠j)
(8.9)
DOS: Density of states and population analysis
kn
55
*
kn
uij =uji = 2 Re{cni(k) cnj(k) Sij(k) }
(8.10)
The gross density of states is
1
ni(E) = qii(E) + 2

i≠j
[qij(E) + qji(E) ] =

nk
1
on(k;E) 2

j
kn
uij 

nk
kn
wn(k;E) vi
(8.11)
The diagonal elements uii in this expression are defined by (8.10) (i=j) and are twice the net density of
states.
implementation
DOS is the master routine of this section. The total and partial (gross and overlap) densities of states are
computed for a number of equidistant energy values in the range (edosmn,edosmx). The number of energy
values, nedos is input by the key
or
; first and second defaults are 0 and 100. The value 0
implies that no DOS is computed. The DOS section performs then only the Mulliken population analysis (see
below).
The minimum and maximum energy values can be specified in the same input record as nedos (with
ordering: nedos, edosmn, edosmx), but they may be omitted; defaults are edosmn=4 and edosmx=+1 (a.u.).
These values are relative to the fermi energy.
By default only the total DOS is evaluated and output. Gross and overlap densities of states are generated if
the keys
, respectively
are supplied. Each data record in the corresponding key-input
block (see SS^input) specifies the number of the basis function, (respectively the numbers for a pair of
functions) for which the partial DOS is to be computed. This information is retrieved from the input file by
DOSINP and stored in the arrays idgros() and idover().
The eigenstates n(k;r) and eigenvalues en(k) are on file iteig. The stored expansion coefficients relate
however to the orthonormal basis used in the SCF procedure. The first step is therefore the back
transformation to the original basis, the plane waves and bloch sums of one-center functions. The
transformation matrix has been written on file itdata by BASOVL, together with the overlap matrix of the
original basis (S in (8.8)). DOSTRA performs the back transformation to obtain the required expansion
kn
kn
coefficients and calls then DOSCON. DOSCON computes the quantities uij and vi ((8.10) and (8.11)) and
stores them on file itcon.
DOSCAL finally organizes the calculation of the total and partial DOS The energy bands en(k) are read
from file iteig and the surface integral occupation numbers wn(k;ei) are computed (OCCUPA) for all
required energies ei . The summation over the bands and k-points yields directly the total DOS (DOSTOT).
kn
The gross and overlap DOS are evaluated in DOSPOP by straightforward numerical integration of uij and
kn
vi .
DOS: Density of states and population analysis
56
The DOS output (DOSOUT) is either printed (default) or written to a plotfile. The plot file is generated if the
key
is found in input. The same input record may also contain the unit number for the plot file
itplot (default: 7). The plot file is a formatted file.
POPANA, called by DOS, computes the Mulliken populations. The procedure is completely analogous to
kn
kn
that in DOSPOP: the functions uij and vi are summed over the bands and numerically integrated over the
BZ. The integration weights refer now not to the surface integral for some energy, but to the volume
integration up to the fermi energy: they are the occupation numbers of the one-particle states.
9
Energy: total and cohesive
In the DF formalism, with the customary approximation of motionless point nuclei, the total energy of a
system is
E = ET + EXC + EC
(9.1)
ET is the kinetic energy, EXC the XC energy and EC the coulomb energy.
ET = ∑ni i Ti
(9.2)
The summation includes integration over the BZ; ni are occupation numbers of the one-particle states i; T
is the kinetic energy operator, /2 in atomic units.
EXC = 
 XC(r) dr
(9.3)
XC(r) is the XC energy density. It is a functional of the electronic charge density (r). The precise form
depends on the DF adopted, for example X or Vosko-Wilk-Nusair (see SS^XC).
1
EC = 2
∫∫ dr
1 dr2
{(r1 )+∑Z (r1 R )} r1 {(r2 )+∑Z (r2 R )}
12
(9.4)
The summation runs over all atoms . Z and R are the nuclear charge and position.
There are two sources of error in the computation of the energy. The first is from the evaluation of various
integrals. The applied numerical integration procedure [chapter III], though fairly accurate, is not exact.
The second source of error is the coulomb potential, implicit in the Coulomb energy (9.4). To calculate the
coulomb potential from the charge density BAND employs fit functions {fi } for which the corresponding
coulomb potentials are known (SS^coulomb potential).
 = ∑ci fi + 
(9.5)
Energy: total and cohesive
57
VCoul []  VCoul [∑ci fi ] = ∑ci VCoul [fi ]
(9.6)
The coefficients ci are the least squares solution of the fitting problem, with the constraint that the fit
density fit∑ci fi represent the same total charge as the exact density .  is the difference between the
exact density and the fit density. The presence of  implicates an error in the computed coulomb energy
term.
The cohesive energy is the difference between the total energies of the crystal and the constituting atoms
respectively.
Ecoh = ∑E  Ecrystal
(9.7)
The free atom subprogram DIRAC determines the first term in the r.h.s. of (9.7) almost exactly. The crystal
energy could be calculated by numerical integration in the crystal integration grid. Since Ecoh is often very
small compared with the two terms defining it, care is needed lest a meaningless value for Ecoh is obtained
from (9.7), even when Ecrystal is computed with a relative accuracy of 104 or 105 . Therefore we
rewrite expression (9.7) in a form that allows the calculation of Ecoh directly by numerical integration, with
corresponding precision. The crystal total energy is then defined and computed as
Ecrystal = ∑E  Ecoh
(9.8)
By interpolation from the tables produced in DIRAC, the free atom functions (the charge density, the
coulomb potential and the orbital functions i and Ti) are evaluated in the crystal integration points. The
various energy terms in Ecoh are conceptually the difference of the corresponding terms in the crystal and
atomic total energies respectively. Since these are now computed by integration of the energy densities in
the same grid, the obtained difference term is identical to the integral of the difference function. Hence the
precision of the integration is preserved in Ecoh :
∑wi Ecrystal(ri) ∑wi Eatoms(ri) = ∑wi (Ecrystal(ri)Eatoms(ri)) = ∑wi ∆E(ri )
(9.9)
The summation runs over all integration points.
kinetic energy
For each atom  the kinetic energy E ?????
ET crystal = ∑Tij Pij
(9.10)
This form is equivalent to (9.2).
XC energy
For each atom the free atom charge density (r), evaluted in the crystal points by interpolation, is used
to compute the XC energy density XC,(r). This is then integrated to contribute to the energy term ∑EXC,
. The self consistent crystal charge density is used of course for EXC,crystal .
Energy: total and cohesive
58
coulomb energy
The coulomb energy difference is rewritten to split off the electrostatic coulomb interaction between the
unrelaxed free atoms. This large term can be evaluated separately with high precision without much effort
(see below). The rest, the relaxation part, is computed by 'normal' numerical integration.
Define
atoms = ∑
(9.11a)
crystal = atoms + def
(9.11b)
The deformation density def is approximated with fit functions for the solution of the Poisson equation
(SS^coulomb potential):
def = fit + 
(9.11c)
Vfit = coulomb potential due to fit
(9.11d)
V = coulomb potential of atom , including the nuclear potential Z /r
(9.11e)
Vatoms = ∑V
(9.11f)
Then
2 EC =
∫∫ dr dr
1
...
2
{crystal(r1)+∑Z (r1 R )} r1 {crystal(r2)+∑Z (r2 R )} 
12
 ∫∫ dr1 dr2 {(r1)+Z (r1 R )} r12 {(r2)+Z (r2 R )} =
1

(omitting the spatial arguments)
=
∫∫ dr dr
=
∫ Vatoms {atoms+∑Z +def} dr + ∫ def Vatoms dr +
1
+
=
{atoms+∑Z +def} r1 {atoms+∑Z +def}  ∑∫ V(+Z) dr =
2
12
∫∫ dr dr
1
1
∫
2 def r def ∑ V (+Z) dr =
12
 ∫ V (+Z) dr + 2∫ Vatoms def dr ∑∫ V (+Z) dr +∫∫ dr1

1
dr2 (fit+) r (fit+) =
12
Energy: total and cohesive
=
59
 ∫ V (+Z) dr + 2∫ Vatoms def dr + ∫

Vfit (def+) dr +
∫∫ dr dr
1
2  r (9.12)
12
1
In the last line of (9.12) the first term is the electrostatic interaction between the free atoms, to be treated
below. The second and third term are the relaxation terms, evaluated by numerical integration.
fit erro r
The last term in (9.12) cannot be computed. It is an error term resulting from the inadequacy of the fit
functions to describe the (deformation) density exactly. An upperbound on this error term can be
determined for three-dimensional crystals. Let  have the fourier expansion
(r) = ∑KeiK·r
(9.13)
K runs over the reciprocal lattice sites. def contains zero charge, as does fit (and hence ) because of the
constraint in the fit. Hence 0=0 in (9.13).
The coulomb potential due to  is obtained from the Poisson equation V=4:
V = 4

 KK2 eiK·r
(9.14)
K0
This gives for the error term
=
∫∫ dr dr r1
1
2
∫
 = V  dr = 4π
12
∫ { K
K
K iK·r
2 e

K'
K'eiK'·r
} dr 
4
2
Kmin
∫ 2dr (9.15)
Kmin is the smallest nonzero vector of the reciprocal lattice. The integral ∫2dr is easily computed and has
shown in practice that the fit functions routinely employed are adequate.
This analysis does not include the effect of the inexactness of the fit on the self consistent solution itself. It
seems reasonable however to suppose that this may be neglected as regards the energy: a small deviation in
the ground state density from the true variational minimum has a quadratic and hence a very small effect on
the resulting energy.
electro sta tic intera ctio n b etween neutra l a to m s
The electrostatic interaction between two spherically symmetric atoms A and B at positions RA and RB is
Eelstat =
∫∫ dr dr
(9.16)
1
2
{A+ZA } r1 {B+ZB } = ∫ VA {B+ZB } dr = ZB V(RB)
12
+
∫V
A
B dr
Energy: total and cohesive
60
The last integral is evaluated numerically in prolate spheroidal coordinates. Several other types of elliptic
coordinates [Arfken 1970] have been tried but yielded inferior precision.
Let A and B be located along the z-axis, at positions z=±a. Define coordinates u,v, by
x = a sinh(u) sin(v) cos()
y = a sinh(u) sin(v) sin()
(9.17)
z = a cosh(u) cos(v)
Numerical integration can be set up as a product formula in the variables (u,p,) with pcos(v). The
-integration yields a simple factor 2 because the functions VA(r) and B(r) are invariant for rotation
around the z-axis. So
∫
∞
1
VA B dr = 2
 du 
 dp J(u,p) VA(u,p) B(u,p)
0
(9.18)
1
The Jacobian is
J(u,p) = a3 sinh(u) (cosh2(u)  p2 )
(9.19)
For neutral atoms the functions fall off rapidly as u goes to infinity, so that the upper limit on the
u-integration can be replaced by a suitable umax . Numerical integration is then performed by a GaussLegendre product formula in u and p
∫V
nu np
A
B dr    J(ui ,pj ) wi wj VA(ui,pj) B(ui,pj)
u
p
(9.20)
i=1 j=1
For light atoms up to the first series of transition metals nu =30, np =20 give already accurate results.
Heavier atoms require more points but still the electrostatic interaction energy can be calculated almost
exactly without much effort. The precision of the cohesive energy is determined by the other terms.
electro sta tic Ma d elung energ y
In case the crystal calculation is started up with ions, the electrostatic energy has to be split in two terms: the
Madelung energy due to the effective ionic point charges plus the interaction between the neutral atoms.
The latter has been treated above. The former can be done in any of the standard ways, for instance the
Ewald technique. In BAND it is evaluated by a finite lattice sum in real space. For point charges q at
positions d in the unit cell the Madelung energy per unit cell is computed as
1
EM = 2
'R+dd  h(R+d d 
q q
R
(9.21)
Energy: total and cohesive
61
The prime on the summation signifies omission of the singular terms; h(r) is a screening function. The
formal expression for the Madelung energy has h(r)1, making the sum conditionally convergent. In BAND
h(r) is a fermi-dirac distribution function
h(r) =
1
1+e
(9.22)
(rr0)/d
so that the sum is absolutely convergent. The parameters r0 and d determine the accuracy of the resulting
sum. Typical values used are r0 =40 a.u. and d=3 a.u. For a discussion see SS^coulomb potential.
implementation
ATMFNC interpolates and sums the atomic functions and writes them to file. The coulomb potential is
stored on itvatm and the density (both the valence and the total density) on itdatm. ATMFNC calculates also
the energy terms ET, atoms and EXC, atoms by integration over the crystal grid. The atomic total energies
E are computed in DIRAC. All these energy terms are written to file itdatm after the density values.
Auxiliary routine ELSTAB evaluates the interaction between two neutral atoms (9.20); the elliptic
integration parameters are nuelst and nvelst (in common FIXDAT).They can be assigned values via input
(keys
and
respectively); first and second defaults are for nuelst: 40 and 60, for nvelst: 80 and
120.
ELSTAB is called by ELSTAT where all terms are added to the total electrostatic interaction energy (9.16);
the Madelung part, a simple summation in ELSTAT, is kept separate. Both energy terms, elstt and emadel,
are passed to ATOMIC where they are written to itdatm, together with the sum of atomic total energies
eatoms.
The variables defining the cut-off function (9.22) for the Madelung energy are rmadel (r0 ) and dmadel (d);
see SS^coulomb potential for the determination of their values.
The crystal kinetic energy matrix Tij is constructed in HAMFIX (one matrix for each k-point in the BZ) and
written to itdatm. The final self consistent density matrix Pij is computed in PMATRX and written to file
itpmat. The multiplication (9.10) is performed in ENERGY.
ENERGY integrates also the crystal XC energy (9.3), the coulomb terms vatdef and vdef (the 2nd and 3rd
term in (9.12) respectively) and the fit error integral ∫2dr.
The fit coefficients are computed in RHOFIT and, at the last cycle of the SCF procedure, written by RHOPOT
to file itpot. They are used to compute the fit density and hence the deviation function  (9.11c). The
potential Vfit due to the fit is constructed from the fit coefficients (and the fit potentials on itfit) in RHOPOT
and written to itpot, after the fit coefficients.
Files
10
62
Files
BAND employs a large number of files to store data during the calculation. Some of these files can be fairly
large. This is in particular so for the file that contains for each k-point the values of the basis functions in
the integration points.
On some machines the total amount of data is not the problem, but there may be a severe limit on the
amount per file. This has led us to devise a series of subroutines in which all operations with internal files,
like reading, writing, open and close, are performed. We will denote this set of routines plus the related data
structures as the filemanager. The file manager keeps track of the amount of data per file and switches to
another file if a particular one is full.
Files are represented in the program as integer variables. The names of these variables start with the
characters 'it'; itbas for instance is the file with the basis functions. The value of the variable is the unit
number of the file. This value may be changed by the file manager during input/output (IO) operations. This
happens when the file is full and more data have to be written; the file manager takes another unit to write
on, but in the program we do not notice that: IO is still performed with file itbas.
The implementation of the file manager has not only solved the problem with the maximum file-sizes, but
also it has increased the programming facilities in BAND. We can now freely open and use files, if some
program extension requires so, without bothering which units are available: the file manager knows. One
may for instance insert somewhere
call flnew (itnew,'unformatted')
(10.1)
with the effect that a currently unused unit number is assigned to the variable itnew; the associated file is
opened and rewound.
The file management system is structured as follows. File variables in the program, such as itbas, are not
associated with one single unit, but with a sequence, or string of files. The value of the variable is the unit
number of the currently active member of the string. The members of the string are connected by a pointer
structure, implemented as a two-dimensional array indfil(0:maxfil,2); the constant maxfil is the maximum
number of units that is in principle available to the file manager. Indfil(ii,1) is the successor of unit ii in the
same string and indfil(ii,2) precedes ii. Begin and end of a string have the pointer pointing to itself: the first
unit in a string has indfil(ii,2)=ii and the last unit has indfil(ii,1)=ii.
A special string is the string of free files: the units that are available but currently not in use. At the start-up
of the program all units 1 through maxfil are members of the free string. Indfil(0,1) points to the first free
file; initially indfil(0,1)=1, indfil(i,1)=i+1, i=1,2,... To detect when the last free file is reached, the forward
pointer of that last free file, indfil(maxfil,1), is set to zero (instead of maxfil, as we would do at the end of
Files
63
any other string). Whenever a new file is requested by the program, the file manager takes the first free unit
for it. This is removed from the free string to form a new string on its own; the new string consists of one
member at that moment. The main part of subroutine FLNEW, mentioned above, thus reads
subroutine flnew (itnew,....)
(10.2)
itnew = indfil(0,1)
pick up a free unit
indfil(0,1) = indfil(itnew,1)
first next free unit
indfil(itnew,1) = itnew
pointers of the new string
indifl(itnew,2) = itnew
..
..
end
Apart from the index array indfil, the file manager employs an array lenfil. This is used to know when the
end of a file has been reached. Initially lenfil(ii)=0 for all units ii. When n words are written to file ii,
lenfil(ii) is increased by n.
Actually it is increased by: nlstor+markio. The extra term markio is used to be on the safe side: depending
on the computer the system file manager may write some extra information to file while executing a
FORTRAN write instruction; consequently the file may already be more filled than the unwarned user
expects.
The value of lenfil() is to be compared with the maximum file size, which is given by the constant mxflln in
the program; the factor lstor converts the number of words, n, to the units in which mxflln is specified;
usually this will be in words so that lstor=1.
When lenfil() reaches the maximum file length mxflln, the file manager switches to the next file in the same
string. If necessary the string is extended with a unit from the free string.
The analogous procedure is followed when reading from a file. The counter lenfil() is reset at zero when a
file is rewound. It is increased again with each read operation, so that we know when we have to switch to
the next file to retrieve more data.
All this has the consequence that read and write operations are accompanied by several checks and updates
of the file management information system. The same is true for file operations like rewind; rewind(itbas)
has now to be interpreted as: a) associate with itbas the first unit in its string, b) rewind that unit, c) reset the
file counter(s) lenfil().
To keep the program as transparant as possible all operations with files are performed in the file manager
subroutines; the usual FORTRAN statements are replaced by calls to these routines. This makes it also easier
to adapt the file manager to future demands. We list the involved routines below with a concise explanation.
Read and write are performed in three routines FLIOI, FLIOL and FLIOR. These are completely similar; the
only difference is that they handle respectively integers, logicals and reals. A typical call is
call flior ('write',itbas,n,aa)
(10.3)
Files
64
This is the analogue of the usual
write (itbas) (aa(i),i=1,n)
(10.4)
Aa may be a scalar if n equals unity. The two statements above are also equivalent in that they have to be
interpreted as writing (or reading) exactly one record, whatever may happen 'on the background'. So the
sequence
call flrwnd (itbas)
rewind
(10.5)
call flior ('write',itbas,10,aa)
call flior ('write',itbas,10,bb)
call flrwnd (itbas)
call flior ('read',itbas,5,cc)
call flior ('read',itbas,5,dd)
has the effect that the first five elements of bb are stored into dd(1:5) (i.e. not the last five of aa).
Furthermore the sequence
call flrwnd (itbas)
(10.6)
call flior ('write',itbas,10,aa)
call flior ('write',itbas,10,bb)
call flrwnd (itbas)
call flior ('read',itbas,15,cc)
results in an error, because it requests the reading of a larger record than has been written. On some
machines the usual FORTRAN equivalent is possible and results in the reading of both written records. For
reasons of simplicity and error checking we have chosen not to support this in BAND's file manager.
The read/write routines FLIO- can be used only for IO with unformatted files.
Opening of new files, deleting of superfluous files and rewinding are performed as follows.
-
FLNEW(it,..) supplies a free unit number, opens and rewinds the file and assigns the unit number to the
argument.
-
FLFREE(it) closes, with status 'delete', unit it and all other units in its string. The unit numbers are
inserted into the free string again.
-
FLRWND(it) assigns to it the first unit in the string, rewinds that file and resets the counter lenfil(it).
The subroutines FLIOI, FLIOL, FLIOR, FLNEW, FLFREE and FLRWND are the only ones that are called in
the 'normal' program. A few auxiliary routines are used in the file manager to isolate some specific aspects.
-
FLNEXT(it) picks up and rewinds the next file in the string.
-
FLADD(it) picks up a free unit and adds iut to the string it.
Files
-
65
FLCLOS(it,..) and FLOPEN(it,..) perform the usual FORTRAN open and close operations (i.e. for one
unit)
Finally
-
FLPROT(it) removes unit it from the string of free files, so that it is not available anymore to the file
manager. FLPROT can be activated via input to protect specific units from use by the program; the key
is
, where it is the unit number. Subroutine INIT calls FLPROT to protect the standard input and
output files, units 5 and 6 respectively in the current implementation.
-
FLDUMP(message,action) writes the state of affairs in the file manager to output. message is a string
that will be printed before the information, action specifies what to do with the currently open files:
action may be 'delete' or 'keep'. FLDUMP is called by STOPIT when the program is normally terminated,
to check whether files are still not closed (action='delete'). For debugging purposes this information can
be printed (action='keep') at many places in the program by giving the input instruction
.
remark
The read and write routines of the file manager are used only for unformatted files. A few specific files in
BAND are formatted. These are the plotfile for the density of states data, the input file for the numerical
integration package and the file to which all input for the program is copied. Furthermore the integration
package POINTS delivers two files with information. These have not been written (and hence cannot be
read) by the file manager. The unit numbers are controlled by the file manager but read and write is
performed with the normal FORTRAN statements (in GEMTRY).
11
Form factors
X-ray factors, or form factors are (proportional to) the Fourier coefficients of the charge density. The form
factors are denoted FK .
(r) = ∑FK eiK·r
(11.1)
The summation runs over all lattice points K of the reciprocal lattice.
FK =
1
 (r) eiK·r dr

(11.2)
Integration is over the unit cell with volume . The form factors corresponding to symmetry related
K-vectors are equal, or differ at most by a phase factor. Let {t,R} be a space group operator,
r'={t,R}r=t+Rr, K'=RK, then
FK' =
1
1
 (r') eiK'·r' dr' = 
({t,R}r) e


-i [K'·t + RmnKn Rmprp]


mn

p
dr =
Form factors
66
1
= eiK'·t 
(r) eiK·r dr = eiK'·t FK

(11.3)
where we used the symmetry of (r) and the unitarity of the operator R.
BAND calculates the form factors for the stars of K-vectors 0..N, counting K=0 as the zero-th star. N may be
specified via input by the key
or equivalently
; default N=3.
FORMFA, the master routine, calls first PLANEW to generate the coordinates of N stars of (reciprocal)
lattice points, then CELRED to reduce this set to the subset of symmetry unique lattice points and finally
FORMF1, where the form factors are actually calculated.
Integral (11.2) is evaluated numerically, by summation over all integration points
1
FK = ∑w
j
∑wj eiK·rj
(rj )
(11.4)
remarks
#
Equation (11.2) follows from (11.1) by orthogonality of the plane waves. Let SKK' be the overlap matrix
of the employed plane waves, evaluated numerically
1
SKK' = ∑w
j
∑wj
1
eiK·rj eiK'·rj = ∑w
j
∑wj
ei(K'K)·rj
(11.5)
Starting from (11.1) we may define the form factors alternatively by

FK =

K'
1
SKK' FK'
(11.6)
where FK' is given by (11.4). With a perfect numerical integration SKK' equals KK', the analytical


value, and FK =FK . Both FK and FK are calculated and printed. This gives some indication of the
reliability of the results as regards the numerical integration.
We may expect that the imperfection of the integration shows up more strongly for larger K, so that the
overlap matrix S will resemble the unit matrix better if its size is smaller, i.e. if less stars of K-vectors

are taken into account. Consequently the discrepancies FK FK for a particular K0 may be due to the
0
0
presence of larger K in the set and not necessarily to the difference between FK as computed from
(11.4) and its analytical value (11.2).
#
0
The form factors provide also a means to check the fit functions used for the calculation of the coulomb
potential. Let the latter have a Fourier expansion
V(r) = ∑VK eiK·r
From (11.1) and Poisson's equation V=4 we may compute the form factors as
(11.7)
Form factors
V
FK
67
K2
= 4π VK
(11.8)
VK is determined from the potential values in the integration points, like FK (11.4) . In the program V(r)
V
is computed via the approximate expansion of the density in fit functions. Comparison of FK and FK
gives thus an indication of the adequacy of the fit set. See also the note on the fit error in SS^energy.
V
Relation (11.8) does not hold for K=0: F0 is physically meaningless. The corresponding Fourier
coefficient of the density relates to the total amount of charge; this particular coefficient is therefore not
divided by the volume (∑wj ) in FORMF1.
12
Geometry
The numerical integration schemes in BAND are closely related to geometric concepts: space is divided in
polyhedra, low-dimensional crystals are envelopped in boundary planes, the points are generated in the
irreducible wedge, etc. In this section we discuss the representation and processing of such data.
p la nes
A plane consists of all points x that satisfy
x·v = d (12.1)
and is therefore represented in BAND by the normal vector v and the distance d; the variable names are
usually plane(3) and dplane.
The same plane is obtained when we reverse the signs of both v and d. Often we will be interested in the
part of space at a particular side of the plane. With this in mind we resolve the arbitrariness in the sign of
(v,d) by defining a point to be inside the plane if x·v<d and outside if x·v>d. Reversing the signs means then
that we interchange inside and outside of the plane.
o rienta tio n a nd ro ta tio n
Geometrical analysis in a plane is most convenient in the xy-plane, since we can then neglect the third
coordinate. To achieve this we will usually need a rotation of the plane under consideration. The rotation
matrix is computed by ROTMAT(v1,v2,rmat). Input are the vectors v1(3) and v2(3) and output is rmat(3,3),
the matrix that rotates the direction v1 to the direction v2; the input vectors need not be normalized. In the
mentioned application v1 would be the normal of the plane and v2 the z-axis, i.e. the vector (0,0,1).
The problem at hand does not uniquely define the rotation matrix: infinitely many unitary transformations
v1v2 exist with determinant +1. ROTMAT takes the shortest possible arc of rotation; the fixed vector, the
axis of rotation is the vector product v1v2.
Geometry
68
This particular convention is useful when some particular compounded rotations have to be constructed.
Assume for instance that s1 and s2 are two vectors and we want to rotate them such that s1 becomes the xaxis and s2 lies in the xy-plane. Let then ux and uz be the x-axis (1,0,0) and z-axis (0,0,1) respectively. The
rotation matrix rmat is calculated by
call r3vecp (s1,s2,axis)
vectorproduct of two vectors
call rotmat (s1,ux,rmat1)
rotate s1 to the x-axis
call rotate (rmat1,axis,axis2)
rotate a vector
call rotmat (axis2,uz,rmat2)
rotate (s1,s2) to xy-plane
do 10 i=1,3
10 call rotate (rmat2,rmat1(1,i),rmat(1,i))
product rotation
(12.2)
Since axis is orthogonal to s1 (and s2) , axis2 is orthogonal to the x-axis (the rotated s1 ) and lies thus in the
yz-plane. ROTMAT constructs then rmat2, which rotates axis2 to the z-axis, using the x-axis as the axis of
rotation. So rmat2 leaves ux , the rotated s1 , invariant, hence the compounded rotation transforms s1 into
the x-axis, as required. This procedure is used for example in PYRPT4 where a pyramid is rotated
(PYRROT) to some standard orientation.
lines
Lines in the xy-plane are defined by (12.1), where x and v have now only two components. In analogy with
the planes we define the inside and outside of a line (x·v<d and x·v>d).
By the direction of a line we will understand the angle  that its normal makes with the positive x-axis
(0≤<2π). Any set of lines to be processed is ordered in BAND such that their directions are in increasing
order.
p o lyg o ns
A polygon (in the xy-plane) is defined by its sides, i.e. by a set of lines with their orientations such that the
polygon is inside each line.
To find the vertices of the polygon we order the sides according to their directions. The intersection point x
of an adjacent pair of lines is the solution of a linear 22 system
vi ·x = di
(12.3a)
vi+1 ·x = di+1
(12.3b)
Given a set of lines the polygon defined by them may have fewer sides if one or more of the lines are
redundant; fig.3 shows (part of) a set of lines, of which the numbers 2 and 3 are redundant; they are
excluded, or cut-off one might say, by the lines 1 and 4.
Geometry
69
Considering only the direct
neighbours we see that line 3 is
cut off by 2 and 4, but line 2 is
not excluded by its neighbours
1 and 3. In general we have say
lines i and j, excluding all lines
i+1..j1 between them. At least
one of these excluded lines is
then cut off also by its own
direct neighbours. POLYGN
uses this fact to remove from
an input set of lines all
redundant ones: for each line
exclusion is decided only with
respect to its neighbours.
Having traversed the whole set,
POLYGN restarts the loop if
one or more lines were
removed, until no more
redundant lines are found.
Fig.3. The shaded area is (part of) a polygonal region, defined as
p o lyhed ra
the space inside a set of lines. The lines 2 and 3 are redundant.
A polyhedron can be defined as
the region of space inside a set of planes. Again, given a set of planes some of these may be redundant and
should be removed.
Each of the remaining proper planes defines a 'face' of the polyhedron. The relevant part of that plane is a
polygon, defined by the intersection lines with the other planes. A polyhedron is in BAND represented as a)
a set of planes plane(3,nplane),dplane(nplane) plus b) a list of vertices vertex(3,*) with an index array
index(nplane+1). The vertices corresponding to plane i are the numbers index(i)+1 through index(i+1);
Index(1)=0. Each vertex occurs at least three times in the list, once for each face it belongs to. The subset of
vertices of a particular face are in clock-wise order (viewed from inside the polyhedron), so that, when the
normal on the plane is rotated to the positive z-axis (the standard orientation) the rotated vertices are in
increasing order of their direction angles. This ordering of the vertices is assumed (and checked) for
instance in the integration package POINTS (the routines PYRPT3 and PYRPT4).
Routine POLYHE generates the polyhedron data structure from an input set of planes; redundant planes are
removed. The algorithm is as follows. For each plane the rotation matrix is constructed that rotates it to the
xy-plane. In the rotated frame all intersection lines with the other planes are computed. The resulting
polygon is analyzed (POLYGN): if all sides are redundant, that is, if the 'inside' of the polygon does not
Geometry
70
exist, then the corresponding plane is redundant for the polyhedron and can be removed; otherwise the
vertices are computed, back rotated to the original coordinate frame and added to the list.
The line of intersection of a plane (v,d) and a plane parallel to the xy-plane, with z-coordinate z0 is easily
found: the normal on the line is (apart from a normalizing scale factor) given by the x- and y-components of
v; the distance parameter dxy for the line is (fig.4)
dxy =
d
+ z0 tg()
cos()
(12.4)
sym m etry
The irreducible wedge of a
polygon is constructed by
PLGIRR. Input are the vertices
of the polygon (ordered) and
output the vertices of a
connected irreducible wedge (a
polygon again). In order to
construct this PLGIRR sets up
an index array for the sides of
the polygon. The index is 1 for
a side belonging to the wedge,
Fig.4. A plane, parallel to the xy-plane at z=z0 is cut by a second
plane (v,d). The normal v makes an angle  with the xy-plane. The
distance dxy is given by (12.4); see text.
0 for a side outside (which
must then be symmetry
equivalent to one of sides
belonging to the wedge), and
1 if the boundary of the wedge
cuts the side in two equal parts.
All indices are initiated at 1. A loop over the sides is then executed, starting with the first and counting
upwards. For each of them the equivalent sides are found and assigned index 0 (: to be removed). The loop
is interrupted when the boundary of the wedge is reached. This is the case as soon as a side to be considered
has already index 0, so that it falls outside the wedge, or when one of the symmetry operators interchanges
the two vertices belonging to the side: only half of the side belongs then to the irreducible region (index 1).
Next a second loop is performed, starting with the last side and counting backwards until the other boundary
of the wedge is found. In this way the irreducible wedge is constructed 'around' side 1, by traversing the
circumference of the polygon in both directions until the edge of the wedge is encountered.
The final index array is then used to compute the vertices of the symmetry unique subregion.
The origin is a special point. Assuming the symmetry group not to be trivial, the origin is not one of the
vertices of the original polygon. In many cases however it is a vertex of the irreducible wedge; if this is the
Geometry
71
case PLGIRR permutes cyclically all final vertices such that the origin is the first in the output list of
vertices.
If the original polygon is halved (only one reflection plane for example), one of the sides of the wedge
passes through the origin, but the origin is not one of the vertices. Nevertheless it may be convenient
(SS^BZ integration) to define the origin as a vertex also in this case, spanning an angle π. This is not done
by PLGIRR, but to facilitate this adaptation PLGIRR permutes also in this case the final set of vertices such
that, if the origin is to be added, it fits in between the first and the last of the output vertices.
In three dimensions we may need the irreducible wedge of a polyhedron. Associate with each of the
polygonal faces of the polyhedron the pyramid with the polygon as its base and the origin as the top. The
symmetry analysis is then a two-step proces. First we retain only the symmetry unique pyramids by
checking which of the normal vectors are transformed into each other by the symmetry operators (the
corresponding faces are then necessarily also equivalent). Then each of the remaining polygonal faces is
subsequently rotated to the xy-plane and dealt with by PLGIRR.
sim p lices
Simplices play a role in BAND in connection with the integration method in k-space. Simplices are defined
in any n-dimensional space by n+1 points: the simplex in one dimension is an interval, in two dimensions it
is a triangle, etcetera.
A special problem occurring in the generation of k-space integration points is the subdivision of a simplex
into smaller simplices by repeated bisection of the edges. SIMPLS performs this task. Input is a generating
simplex simplx(ndim,ndim+1) and the number of refinement-steps nmesh; ndim is the dimensionality and
hence the number of coordinates for each of the (ndim+1) points.
Output is a list of distinct points point(ndim,npnt) and an index array idsimp(nrow,nsimpl); nrow is the rowdimension of idsimp (must at least be ndim+1); nsimpl is the generated number of simplices of the specified
refinement. Idsimp(1,i), idsimp(2,i),.. idsimp(ndim+1,i) specify the vertices of the i-th simplex by pointing to
entries in the list of points (point). Each refinement step splits every simplex into nsub=2ndim smaller ones,
so that the total number of smallest simplices will be nsimpl=2ndimnmesh .
The algorithm is basically a multiple nested loop, one for each level of refinement. Each of the loops runs
over all nsub sub-simplices of the one-level-larger simplex.
do i1 = 1,nsub
construct the i1-th subsimplex
..
do i2 = 1,nsub
the i2-th subsub(...)simplex
..
do i3 = 1,nsub
..
(etcetera)
(12.5)
Geometry
72
Since the number of nestings, nmesh, is a variable the loops cannot be implemented explicit; they are
constructed implicitly, with 'goto' statements and tests on loop termination; the loop counters are stored in
array idsub(nmesh); array vertex(ndim,ndim+1,0:nmesh) stores the vertices of the simplices under current
treatment, one for each level; level 0 is the input simplex.
A subsimplex may have one of its vertices in common with its parent simplex; the other vertices are then the
midpoints of the adjacent edges. Other subsimplices may have only a particular set of these midpoints as
vertices. In general the vertices of the subsimplex can then be defined as the averages of particular pairs of
parent vertices, where in some cases the average may be taken of one and the same point. Each of the
2ndim subsimplices is thus characterized as a special set of ndim+1 'pairs'. This data structure is the fixed
data array idpart(2,ndim+1,2ndim ) in SIMPLS. Inspection shows that with an appropriate ordering of the
subsimplices the structures for the lower dimensional cases can conveniently be embedded in those for
higher dimensions. The implemented array idpart is the 3D case; when SIMPLS is called with ndim<3 the
appropriate submatrix is used.
13
Input
This section deals with the processing in BAND of input data and with the ways in which the operation of
the program can be directed via input. A few remarks will be made concerning output.
Input is optional in many respects; omission leads to default settings. Input relating to a large number of
details is dealt with in the corresponding Software Sections and is not discussed here. Reading this section
should supply sufficient information however to run the program.
BAND reads only one input file. Restart possibilities have not yet been implemented.
INIT excludes the unit numbers 5 and 6 from use by BAND's file manager (see SS^files), assuming that these
are associated with the standard input and output channels respectively.
The input file is defined to have two parts: the comment part and the data part. The comment part, which
may be empty, consists of all consecutive first records that have the character 'c' in the first column; the
other records constitute the data part. The data part and hence the input file is defined to end with either the
FORTRAN 'end-of-file'-code, or a specific record (see below), whichever comes first.
INIT copies the complete input file to output. Simultaneously the comment part is copied to a file itcomm
and the data part to itinpt. These formatted files are processed further in HEADIN and GETINP, which are
both called from INIT.
HEADIN prints the 'heading' of the output, directly after the copy of the input file. The heading provides
some general information about the program and the calculation, such as the release number of the program,
the date of the release and the requested CP-time and memory-usage for the current run. Naturally some of
Input
73
this information is picked up by machine-dependent code. This has to be adapted when BAND is
implemented on another computer. The involved routines are SECTIM and HEADIN.
HEADIN retrieves also the comment lines again from itcomm and writes them (without the first-column 'c')
into the output heading.
GETINP analyzes most of the information on itinpt. The rest, for example the characteristics of the basis
functions, is written to a new file unit; this is (at the end of GETINP) assigned to itinpt, replacing the old
value. In various parts of the program the remaining information is extracted from (the new) itinpt.
All (data) input is structured by keys; a key is a (short) string, usually a single word. Each key defines a
separate key section. Two forms are used. In the first form the section consists of only one record; it
contains the key and, depending on the case, additional (numerical) information. The second form is a
sequence of records: the first record is the key; the last contains (only) two asterisks, '**', signifying the end
of the section; the intermediate records provide all information; in one special case the first (key) record has
to contain also additional information.
We will denote the two forms by key record and key block respectively.
The form of the key section is not optional: each admissable key is associated with a particular form.
Generally speaking the block form is employed only for keys that relate potentially to large amounts of data,
such as a list of basis functions or atom coordinates. All keys that have the block form will be mentioned in
this section.
Empty lines in input are allowed and meaningless; INIT omits them when itinpt is written.
All numerical information is 'free format': the absolute positions of numbers and the form in which reals are
specified is irrelevant.
The ordering of the keys in input is free and has no implications (with one obvious exception, as we will
see).
Most of the keys are optional: omission leads to defaults for the corresponding variables or options. For
some keys second defaults are available; these are activated by giving the key without specifying further
information.
GETINP checks the occurrence of all keys that are known to the program. Unknown keys are neglected and
their presence in input has no consequences. As said before, in most cases GETINP extracts all information
and assigns values to the variables and/or sets the options that correspond to the encountered keys; for some
keys the information is only globally surveyed and copied to a new file to be processed later.
The occurrence in input of a few special keys is 'kept in mind' by storing them in a list of keys, the array
keylst. This operation is performed by routine KEYSET, which is called from GETINP whenever such a key
Input
74
is found. The logical function KEY checks the presence of a specified key in the list, i.e. KEY('this key') tells
us whether
has been stored in the list.
This structure has made it easy to implement new options or to adapt the operation of BAND to new
insights, either permanently or temporarily for testing or debugging purposes. Suppose for instance that we
consider the replacement of some algorithm by a possibly more efficient one. To test this we would like to
compare the two alternatives in a number of calculations. Instead of keeping two versions of the program
during the testing period, we just define a new key, say
, and insert a few lines in GETINP to
search for that key and to call KEYSET when it occurs. The alternative algorithm can then be implemented
side by side with the original one, with a simple if-then-else structure:
if (key('algorithm')) then
..
(new algorithm)
..
else
(13.1)
..
(old version)
..
endif
In this way we can keep both possibilities available as long as we wish, without bothering about the
maintenance of two programs instead of one.
Since BAND is a program in perpetual development, the set of keys changes rather often. The keys
mentioned in this section and in other Software Sections may therefore not exhaust the set employed in
BAND; at the other hand some of the keys may not be in use anymore. One should examine the code of
GETINP to ascertain which keys are in fact recognized in the current release.
In the next paragraphs we discuss the input of necessary information, such as the geometry and basis
functions. After that we will examine keys related to output printing and the 'control' keys, which determine
the operation of the program in general.
The keys are typed outlined (
,
). If the block form is to be used for the corresponding data, this
will be indicated between brackets. Keys may have synonyms; the alternatives will be listed (
,
most cases a key is recognized also when it is embedded in a larger string; for instance the string
recognized as denoting the key
.
g eo m etry
Information must be provided concerning the Bravais lattice of the crystal, its dimensionality, the
coordinates of the atoms and the units in which the data are specified on the input file.
). In
is
Input
75
. Lattice vectors and atom coordinates are interpreted as being specified in ångstrøms. Default
(omission of the key): atomic units.
(block). Each record contains a lattice vector: three cartesian coordinates; if fewer coordinates are
specified zeros are supplemented; at least one coordinate per record must be given (empty records are
discarded). The number of records is the dimensionality ndim of the crystal.
The lattice vectors (in atomic units) are stored in array avec(3,3). The inverse-transpose, which
describes the reciprocal lattice (apart from a factor 2π), is stored in bvec(3,3).
. The positions of the atoms are specified in natural units, i.e. in units of the lattice vectors. Default:
cartesian units.
If the dimensionality of the crystal is less than three, only the first ndim coordinates can possibly refer
to the lattice vectors; the others are automatically cartesian.
The (cartesian) coordinate values are stored in array xyzatm(3,natomt). Natomt is the total number of
atoms in the crystal unit cell.
(block). The positions of the atoms. The key has to be specified anew for each (chemical) type of
atom that occurs in the crystal. The number of types of atoms, ntyp, is defined as the number of
occurrences of the key
. Atoms belonging to different types as defined here cannot be symmetry
equivalent.
This key is the exceptional case in which the leading key-record itself must contain additional
information: the atomic number (=the nuclear charge).
Each of the intermediate records in the block gives the coordinates of one atom; zeros are suppplied
when fewer than three coordinates are found.
Note that the meaning of the input coordinate values depends on the absence or presence (anywhere
in the input file) of the keys
and
.
functio n sets
Data have to be supplied concerning the free atoms that make up the crystal (key
valence basis functions (
defined above,
,
), the Slater type
) and the Slater type fit functions ( ). For each of the ntyp types of atoms, as
and
The order in which the keys
are searched for.
,
and
is obligatory, but
and
are optional.
occur is relevant (this is the exception to the rule that the order
of keys in input has no meaning). In the first place: the keys
and
correspond in their order of
appearance. In the second place: for each of the atom types GETINP searches the keys
corresponding
-block but before the next
after the
-block (or the end of input); if they are not found in that
part it is assumed that they have been omitted for that type. In the third place if both the
-block are present for a certain type, the
and
-block must precede the
-block and the
-block.
(block). The first record in the block (following the key) states the number of numerical one-electron
states natorb to be computed and the number of them that are to be interpreted as core states in the
crystal ncore: two integers; omission of the second implies ncore(ityp)=0.
Input
76
The next records give for each of the natorb orbitals n,l,q: the main and angular quantum numbers and
the number of electrons. Example: '3 2 7' implies 7 electrons in the 3d-shell of the spherically
symmetric atom. The occupation q may be omitted; default: fully occupied (q=4l+2).
Anywhere inside the
(or:
-block the following additional keys may be supplied:
). The numerical valence states are incorporated in the crystal valence basis. Default
(omission of the key): no.
Note that the numerical core states are used anyway.
. For each numerical one-electron state which has an angular quantum number l≤lfit, the square of
the orbital (i.e. the orbital density function) is used as a spherically symmetric fit function (a
one-center fit function with l=0) in the crystal. The value of lfit may be specified in the key
record; default: lfit=0.
Absence of the key implies lfit=1: no fit functions are derived from the numerical orbitals.
. All one-center functions, both those from DIRAC and the Slater type valence and fit functions
are represented in BAND as tables f(ri ),i=1..nr. The radial coordinates ri constitute a
logarithmic grid (ri+1 /ri is constant). Key
specifies the number of points in this grid: nr,
the first value r1 and the last value, rnr , in that order. Defaults are used for absent data
(nr=2000, r1 =104 a.u., rnr =40 a.u.).
Remark: the radial grids (one for each type of atom) are stored by BAND in array rad(nrx,ntyp);
nrx is the maximum nr. of radial points in any of the grids. After the setting-up of the tables, the
values in rad are replaced by the reciprocals 1/rad; this is more convenient in the interpolation
routines (ATMFNC, BASPNT, FITPNT). In some places in BAND the array is accordingly
denoted rinv(nrx,ntyp). The array nr(ntyp) stores the number of radial points in each grid.
,
,
,
(block). Each record characterizes one set of Slater type functions for the valence
basis by three variables n,l,: two integers and one real. The corresponding functions are Zlm()
rn1 er , centered on the atoms of the type under consideration.
,
(block). Analogous to the previous; the functions are used in the fit set to describe the (deformation)
charge density in the crystal.
. Apart from the one-center numerical orbitals (from DIRAC) and the Slater type orbitals, the
valence basis may contain plane waves. The plane waves ei(k+K)·r used for each k-point k in the BZ,
are characterized by the lattice points K of the reciprocal lattice. The key record must state the
number of stars of these lattice points to be used in the valence basis, nwavst. Star no.0 consists only
of the central point K=0. Each subsequent star consists of all K with the next higher distance to the
origin, regardless of symmetry relations.
The central 'star' is used only if no other valence functions are employed. Otherwise it is omitted to
prevent dependency problems in the valence set; only the star numeros 1 through nwavst are taken
then. The resulting number of plane waves in the basis (per k-point) is nvalwv.
Input
77
p rint d irectives
The amount of printed output is determined by a number of general print options and may further be
directed by print instructions.
The general print options are encoded by the variables iprntp, iprnti, iprnts, iprnte and iprntr in common
FIXDAT. They determine the general output levels for respectively the preparation part, the numerical
integration package, the SCF-procedure in general, the eigensystems of the iteratively computed hamiltonian
and the results (properties section). The higher their value, the more is printed. The (low) default settings
provide already a fair amount of information; more output is usually only required to test accuracies and to
examine intermediate results. The corresponding keys and defaults are
. defaults (first and second): 1, 2
. defaults: 1, 2.
. defaults: 0, 1.
When the SCF procedure encounters convergence problems, the value of iprnts is automatically
increased in SCFTST (but not higher than 2). This induces the output of information which may clarify
the type of convergence problem.
. first default: 0 : eigenvalues at the first and the last cycle; no eigenvectors.
second default: 1 : complete eigensystems at the first and last cycle.
other values: 2 : (only) eigenvalues at all cycles. 3 : complete eigensystems at all cycles.
. defaults: 0, 1
More specific print instructions are activated by keys that resemble those of the previous set, but that are
processed differently in the program. All these keys have the form
, where 'abc' is some special
string. Whenever such a key is found by GETINP, i.e. when 'abc' does not equal (or contain) 'pre', 'int', 'scf',
'eig' or 'prop', routine PRNTST is called, which adds the string 'abc' to a list of print instructions, stored in
the array prtlst. The occurrence of a specific string in the list is checked by the logical function PRNT-:
PRNT('abc') is true if 'abc' has been added to the print instruction list. This set-up is analogous to that for
the keys (cf. routines KEYSET, KEY) and has been devised to adapt the output easily to new demands. The
print strings recognized by BAND are
: detailed information about the determination of the fermi energy at every cycle and related
data.
: fit coefficients and related data at every cycle.
: the occupation numbers for all one-particle states at every cycle.
g enera l co ntro l
SS .BAND stops after execution of SS; 'ss' must be the name of a subroutine or section that is
recognized by BAND's controller; all major subroutines satisfy this requirement (see SS^control).
SS1 SS2 .. . BAND skips the sections or subroutines SS1, SS2, ... Again 'ss1', etcetera must be
recognized by the controller. Any consequences of the not-executing are not (yet) taken care of in
BAND; the program may even 'crash' because e.g. variables have not been computed. In practice this
Input
78
option is consequently only safe for a few specific sections (e.g. ELSTAT: the electrostatic interaction
energy is not computed then, but this has no further consequences; also various parts of the properties
section may safely be skipped).
The sections to be skipped are stored (by routine SKIPST, called from GETINP) in a 'skip-list', the
array skiplst; the presence of a specific string in the list is checked by the logical function SKIP('ss').
Compare the manipulation of keys and print directives.
The skip-list is printed in the output heading.
Of course it is an easy matter to extend the applicability of the skip-command beyond the standard
sections and routines in BAND. Execution of any part of the program may be subjected to the value of
SKIP('ss'); 'ss' has thereby automatically been made into a recognizable name. We have done so for
instance with the printing of all overlap populations (in POPANA); this output is suppressed now by
the input command:
.
Other keys that influence the operation of BAND in a general way are
,
,
or
. Both spins are treated independently as regards the potential,
charge density, eigensystems. The basis sets are identical (the free atom equations solved in DIRAC
are spin-restricted). Omission of the key implies a spin-restricted crystal calculation.
. The default value of all general print options are set so high that all possible output is generated.
Moreover BAND will not stop, as it would do otherwise, when intermediate results are suspect.
. This incorporates most of the
effects. In addition some (other) defaults attain different values. In
particular integration levels in real space and in k-space are lowered. The default resettings are
performed by TESTST.
k eys with b lo ck typ e inp ut
We conclude this section with the enumeration of all keys that carry with them in the input file a blockstructure, as defined before:
,
,
,
and
have already been mentioned.
provides information for the integration package. See SS^integration.
and
determine which partial densities of states have to be computed.
See SS^DOS.
14
Integration
Integrals over the crystal unit cell are evaluated by numerical integration. The integration formula is based
on a partitioning of space in atomic polyhedra, core-like spheres and (except for 3D crystals) an outer
region. For each of the sub regions efficient product-Gauss rules are generated. The procedure is described
extensively in [chapter III].
Integration
79
In other software sections some specific aspects are dealt with, such as the construction of the atomic
polyhedra and the irreducible wedge (SS^geometry) and the computation of the space group operators and
their application (SS^symmetry). Here we mention a few details that have not been covered yet.
The theory and global set-up of the integration formulas are discussed in chapter III. The integration package
POINTS is suited for periodic systems as well as for molecules; the latter have been used extensively to test
the performance and we will therefore refer to the molecular application from time to time in the discussion
below.
POINTS is called from GEMTRY. GEMTRY supplies the necessary data (atom positions, lattice structure and
integration parameters) to POINTS via file itipnt. POINTS returns a file with points, itpnt, and a file with
geometric data, itgeom. The latter contains in particular the space group operators and the division of the
atoms in sets of symmetry equivalent ones.
The file with points is processed further in RPNTID. Each record written by POINTS contains
precisely one complete set of symmetry equivalent points. After the reorganization in RPNTID the block
structure on the file is such that a) the number of points per block does not exceed a prescribed maximum
length npx and b) each block contains only complete sets of symmetry related points. Also written to this
file is information concerning the symmetry relation between the points. This serves to facilitate in various
parts of BAND symmetry operations based on numerical integration: e.g. the symmetrization of a function
by averaging over the equivalent points (the density) and the expansion over all points of a symmetric
function that has been computed in the unique points only (the potential). Furthermore RPNTID computes
for each point which atom is nearest by (according to some metric); this is used by CHARGE to partition the
self consistent charge density over the atoms.
Each block on the resulting points file (itpnt) consists of the following records:
1
np: the total number of points in the block.
2
xp(np): the x-coordinates of the points.
3,4
yp and zp(np): the y- and z-coordinates.
5
wp(np): the integration weights.
6
npsym: the number of symmetry unique points in the block.
7
nequiv(npsym): the number of equivalent points for each of the unique points.
8
idatom(np): for each point the index of the atom nearest by. Idatom() may have a positive or a
negative value; the indicated atom is abs(idatom()), positive and negative values signify that the
points are inside, respectively outside the atomic sphere.
As discussed in SS^workspace itpnt is reorganized several times (RPNTRE, REORGF). The same structure is
maintained. Only the maximum block length npx may be changed.
inp ut
A file itintg is opened in INIT and used by BAND to collect integration parameters. In GEMTRY this is
combined with the atom coordinates etc. to write the input file for POINTS itipnt. Integration parameters can
be specified via input with the key
. The contents of the associated data block (see SS^input) is
Integration
80
copied to itintg and finally to itipnt. This data block is read again by the input routine of the integration
package, RDINTG. It must consists of a sequence of keys with additional numerical information. RDINTG
knows two types of keys: a) single records containing the key and a number, and b) one record with the key
and a second record with an array of numbers, one for each type of atom in the system. The admissible
keys, their meanings and the default values are listed below; the names of the corresponding variables are
identical to the keys.
type a: single-value keys
1
; default: 3.5. This is the general accuracy parameter. In normal operation only this parameter
should be supplied, if any at all. Almost all other parameters depend by default on accint.
The computed integration formula is intended to give an accuracy of accint significant digits for
integrals that normally occur in electronic structure calculations.
2
; default: accint. Analogous to accint accsph is an accuracy parameter referring to the number
of significant digits of the formula, in this case for the radial integration in the atomic spheres only.
3
; default: accint. Similarly accpyr refers to the atomic polyhedra and the constituting pyramids.
4
; default: accpyr. The integration over the atomic pyramids is a threefold product formula in
variables u,v and w. U and v describe more or less the angular integration and w parametrizes the
radial variation relative to the central atom in the polyhedron. Accpyu specifies of course the accuracy
of the outer integration 'loop' over the variable u.
5,6
and
7
; defaults: accpyr. Similar to accpyu, now for the variables v and w.
; default: accint. The accuracy parameter for the outer region (not relevant for threedimensional crystals). The treatment of the outer regions in POINTS is not very sophisticated
compared with the spheres and polyhedra: notest functions are employed to monitor and tune the
number of points to the specified accuracy. The parameter name accout is therefore a little
misleading; of course it is meant to have the meaning suggested, but the necessary code has not yet
been developed. Fortunately the outer regions are in general not very relevant for the integrals.
Moreover accout does determine the number of points in the outer region (see below) and our
experience thusfar suggests that the resulting accuracy is at least good enough in the large majority of
cases.
8
; default: 2.3rsphx. Rsphx is the radius of the largest atomic sphere (see parameter rspher
below). Dishul is the distance from the outer atomic spheres to the enveloping boundary planes that
constitute the inner limit of the outer region. The atomic spheres have usually radii in the order of 1
a.u., so that the outer region starts at approximately 3 à 3.5 a.u. from the nuclei.
9
; default depends on the nuclear charge and on accout. Frange specifies the distance at which all
one center functions become negligible (not counting the multipole potentials). In the integration
formula it is the position of the outward limit of the outer region, measured from the outermost atoms.
The default depends on the nuclear charge of the heaviest atom, Zx and on the integration parameter
accout. The relation below is based on some trial and error and on common sense. Set fr=10 (Zx ≤2),
12 (Zx ≤18), 15 (Zx ≤54) or 20 (otherwise). Then frange=fr(32e.07accout ). Fig.5 depicts frange as
a function of accout for Zx =30.
10
; default: 1+nint(2.5/dishul).
Integration
81
The outward integration in the outer region is logarithmically subdivided in nouter subintervals. In
most situations two subintervals is optimal: a relatively narrow interval near the outer atoms and a
large interval for the decaying tails of the functions. If dishul attains extreme values, some adaptation
is necessary of course, hence the applied default relation.
11
; default: 1.3+0.9accout.
This paramater governs the 'radial' integration in the outer regions (outward, away from the atoms).
The default relation has been fitted by a number of test calculations on various molecules. Outrad is
the number of integration points per outward subinterval (see nouter above).
12
default: 0.5+1.5accout.
Parameter for the 2D integrations parallel to the boundary planes of the outer region. The default
relation has again been determined from test calculations.
13
; default: lintgx+1. This parameter is relevant only for systems with an axis of infinite rotational
symmetry such that all atoms lie in a straight line. The symmetry unique points can all be chosen in a
half plane: we have in fact a 2D integration problem. BAND however needs also the equivalent points
to evaluate correctly integrals over the rotational angular variable.
For each of the symmetry unique points (in the half plane) POINTS generates a circle of equidistant
points. The necessary number of points depends on the angular momentum quantum numbers of the
integrands. The maximum l-value is lintgx (see below); the required number of angularly equidistant
points to integrate the periodic functions of this order is lintgx+1.
If linrot is specified on input, the generated number of points on the circle is the smallest integer
multiple of linrot that equals or exceeds lintgx+1.
type b: keys with ntyp data in the next record
1
. The radii of the atomic spheres. By default they are determined for each type by a) the distance
d0 to the nearest atom and b) the nuclear charge Z, such that
a
rspher<d0 /2: the atomic sphere is inside the atomic voronoi polyhedron.
b
if d0 is small: rspherd0 /2: the sphere is as large as possible in narrow regions.
c
if d0 is large: rspherd0 /2: for free, isolated atoms integration in spherical coordinates is optimal,
so we extend the sphere as much as possible in such a situation. 'Very large' is defined to be frange,
the assumed function range, so rspher depends also on this parameter.
d
for 'normal' values of d0 : rspher equals approximately a standard value that depends on the nuclear
charge: heavier atoms get a larger 'core'-sphere. The implemented function can be found in
RDINTG. It results for instance in rspher=0.5 for hydrogen, 1.0 for oxygen, 1.3 for copper and 1.7 for
uranium. (In fact a smooth function has been devised that produces precisely these values). To get
an impression of the combined effect of all aspects fig.6 displays the value of rspher as a function of
d0 for copper, with frange=25 a.u.
2
: the maximum angular momentum quantum number of one-center integrands, one value for each
type of atom. (Lintgx, used to determine the default value of linrot above, is the maximum over this
array). BAND automatically computes these values from the fit functions and basis functions employed
in the calculation. Lintgx is used by POINTS to choose the appropriate angular integration formulas for
the atomic spheres.
Integration
82
The default value for linteg as implemented in RDINTG depends on the nuclear charge: linteg=0 (Z≤2),
4 (Z≤18), 8 (Z≤54) or 12. This is not relevant here because BAND overrules the default values anyway.
remarks
#
The defaults stated above might yield evidently ridiculous values in some situations. For instance dishul
(=2.3rsphx) would become unreasonably small if we explicitly specify extreme values for the atomic
spheres. This is accounted for in RDINTG by checking such extraordinary situations and changing the
default to more sensible values. The details of these checks and adaptations can be found by inspection
of RDINTG. We have attempted to provide reasonable defaults for all parameters, whatever the
specifications for the others.
Integration
#
83
As mentioned
before, the
treatment of the
outer region is
not so well
developed (yet)
40
frange
35
30
as the atomic
spheres and
polyhedra.
Although this is
usually not a
25
20
15
problem due to
the relative
accout
10
0
unimportance of
2
4
6
8
10
12
that part of
space and
because the
Fig.5. Assumed function-range (frange), in a.u., depending on the integration
parameter accout, for Zx =30. (see text)
implemented
strategy
functions
reasonably well
in practice, we
30
radius of atomic sphere
have
encountered
(minor)
difficulties with
some
25
20
15
molecules.
Invariably this
had to do with
the 'parallel'
integration
(parameter
10
5
distance to nearest neighbour
0
0
5
10
15
20
25
30
35
40
45
50
outpar) over
large polygons
that were part of
the enveloping
Fig.6. Radius of the atomic sphere (a.u.) for Cu, as a function of the distance
d0 (a.u.) to the nearest atom. (assumed function range: 25 a.u. (see text))
molecular
polyhedron.
These polygons are subdivided in quadrangles (and possibly a triangle). No further subdivision is made.
Consequently we may occasionally have to integrate over very large quadrangles with one single
Integration
84
product Gauss-Legendre formula. Functions that are relatively localized are not integrated easily then
and a subdivision in smaller quadrangles would probably be better.
When such troubles come up one may specify a larger value for dishul, thereby shifting the problematic
region away and making it less important, or one may increase outpar. It is not easy to say which of the
two leads effectively to more integration points. Increasing dishul implies enlarging some of the atomic
polyhedra; this results usually in significantly more points in the involved pyramids. At the other hand,
if the outer polygons are essentially too large for efficient integration, outpar will have to be increased
considerably before the problem is solved.
Interpolation and bloch sums
15
85
Interpolation and bloch sums
The construction of crystal basis functions involves interpolation and bloch summation. The crystal
functions are represented by their values in the integration points and they are computed (in most cases) as
bloch sums of one center functions
f(k;r) =

eik·R (rR)
(15.1)
R
The summation over the lattice points, or cells R is infinite but for the value of f(k;r) in a point r in the
central unit cell the loop may be cut off at some large cell distance because the employed one-center
functions (r) decay exponentially for large r.
The functions (r) are of the form Zlm() P(r ), consisting of a spherical harmonic times a radial
function in coordinates relative to some atom . The radial functions are in BAND represented as tables,
P(r)P(ri ),i=1,..,nr}.
interp o la tio n
To compute the sum (15.1) we must in particular determine from the table the radial function value P(r) for
any distance r. This is achieved by a three-point Lagrange interpolation: given r we determine the three
nearest points in the radial grid, ri , ri+1 and ri+2 . P(r) is then computed as a linear combination
P(r) = e1 Pi + e2 Pi+1 + e3 Pi+2
(15.2)
such that P(r) is implicitly approximated by a parabola through the three points.
For each occurring distance r we have to determine the index i in the radial table and then the combination
coefficients e1 , e2 and e3 . Let c be the multiplication constant that characterizes the radial mesh, crj+1
/rj . The interpolated value is given by the Lagrange formula
P(r) =
(rri+1)(rri+2)
(riri+1)(riri+2)
Pi +
(rri)(rri+2)
(ri+1ri)(ri+1ri+2)
Pi+1 +
(rri)(rri+1)
(ri+2ri)(ri+2ri+1)
Pi+2 =
(define b=r/ri )
=
(bc)(bc2)
(b1)(bc2)
(b1)(bc)
Pi +
Pi+1 + 2
)P
=
2
2
(1c)(1c )
(c1)(cc )
(c 1)(c2c) i+2
= d1 (bc)(bc2) Pi + d2 (b1)(bc2) Pi+1 + d3 (b1)(bc) Pi+2
(15.3)
Interpolation and bloch sums
86
d1 , d2 and d3 are constants of the radial mesh: d1 =1/{(1c)(1c2 )}, d2 =1/{(c1)(cc2 )}, d3 =1/{(c2
1)(c2 c))}. The interpolation coefficients in (15.2) are thus given by
e1 = d1 (bc)(bc2)
e2 = d2 (b1)(bc2)
(15.4)
e3 = d3 (b1)(bc)
Of course r may be so large that we need extrapolation from the table. This should make no difference. The
tables are expected to contain the whole range of the function so that the last few values Pnr , Pnr1 ,.. are
negligible and the result is almost zero. Polynomial extrapolation of an exponentially decaying function is in
principle hazardous however and we set the coefficients explicitly to zero as a safety measure.
The radial grid points are ri =r1 ci1 . If ri+1 is the grid point closest to r, the index i is computed as
i = nint
log(r/r1)
 log(c) 
(15.5)
Nint(x) is the integer nearest to x. The index i to be used is of course constrained by 1≤i≤nr2.
implementation
Radial functions are interpolated in ATMFNC, BASPNT and FITPNT. The organization is in all three routines
analogous. The radial mesh points (that is, their reciprocals) are stored in array rinv. For each type of atom
subroutine INTDAT provides various constants related to the radial mesh: d1, d2 and d3, c and csq(=c2 ),
nr2(=nr2), rfac(=1/r1 =rinv(1)) and clogi(=1/log(c)) (c.f. (15.4) and (15.5)).
Then we compute for a vector of integration points, with the coordinates relative to some atom  in some
cell R, the interpolation coefficients for each point as follows:
1. determine the radial distance r,R and (except in ATMFNC) also the values of the spherical harmonics.
2. get index i: intpl=nint {clogilog(rrfac)}
3. apply the constraints: intpl=max(intpl,1) and intpl=min(intpl,nr2) (routine VBND2I).
4. b=rrinv(intpl)
5. determine the coefficients from (15.4)
6. no extrapolation: if b>csq, set e1 =e2 =e3 =0 (routine CONDIT).
Some of these operations have been isolated in separate routines ( CONDIT, VBND2I) because the code
appeared to be not so easily vectorized (by the Cy205 compiler) in the original context.
accuracy and normalization
Interpolation implicates obviously some inaccuracy, which depends on the denseness of the radial mesh.
We have tested this by interpolating exponential functions rn er and comparing the resulting integrals
with the analytical values; a numerical integration scheme with very high precision has been used for this,
so that the errors are determined by the interpolation. Fixing the first and last mesh-points at 104 and 40
Interpolation and bloch sums
87
a.u. (the defaults in BAND) we found relative errors of the order 2105 for 300 mesh points, 1.5106 for
600 points, 2107 with 1200 points and 3108 using 2400 points. These figures are the r.m.s. errors
found for the employed set of test functions; the worst cases deviated less than an order of magnitude. The
default used in BAND, 2000 points, may thus be expected to give no significant interpolation errors.
Application of a four-point interpolation did not yield a substantial improvement. We have not tried a fivepoint interpolation; the interpolation routines (especially BASPNT) take a major part of the execution time
(in the preparation stage of the program) and the use of five instead of three interpolation coefficients would
considerably increase the cost.
Of some functions the exact integral is known. This holds for instance for various functions of the numerical
free atoms: the charge density, the potential, the individual orbital densities and the corresponding kinetic
energy functions. Their integrals, as evaluated numerically over the radial logarithmic grid, are so accurate
that they may be called exact. The numerical integral over the crystal grid gives an indication of the
interpolation accuracy, but is also determined of course by the quality of the crystal grid itself. One may
now apply some normalization to correct the deviations, by simply multiplying the interpolated function
values by a constant. We have experimented with this possibility but found that it is better not to do so: the
total energy of the crystal and the cohesive energy were less stable against variations in the integration
precision and hence, on the average, their deviations from the exact (converged) result were larger. We have
applied the normalization to several combinations of the obvious candidates (only the density, the density
and the kinetic energy, the potential, and so on); in none case we found an improvement.
b lo ch sum s
To limit the number of terms in the bloch sums (15.1) all radial function tables are analyzed; for each
function the minimum number of cells is determined to evaluate (15.1) with sufficient precision. First
RADMAX calculates the maximum radial extension of any of the tabulated functions. The computed
maximum cell-distance needed in any bloch sum, rcelx, is then used by CELLS to generate a list of ncel
lattice points, xyzcel(3,ncel), ordered according to their distances from the origin. Finally CELMAX reexamines the radial tables and copies them to another file together with the maximum number of cells to be
used for each of them individually.
RADMAX and CELMAX employ the same auxiliary routine RADMAA for the analysis of a single table:
RADMAA takes the absolute values of the function and integrates then with a repeated simpson rule
(SIMPSL) over a trial subrange 1..ntry of the nr points. This is compared with the total integral and the
minimum value of ntry is determined (by bisection) where the missing part represents less than a fraction
cutoff.
The criterium for negligibility cutoff is 10max(3.0, accint+1.0) (computed in RADIAL) ; accint is the general
accuracy parameter for integration over the crystal unit cell (SS^integration). cutoff is optionally read from
input (key:
; the second default 102 leads to a rather crude approximation of the bloch sums).
Interpolation and bloch sums
88
remarks
1. The generated list of 'cells' is usually larger than according to this section. The reason is that the lattice
summation of the multipole potentials in VMULTI requires more terms and employs the same list
xyzcel(). Rcelx is therefore determined also by the parameters rmadel and dmadel used in the evaluation
of these potentials (SS^coulomb potential).
2. Rcelx may be read from input with key
16
or
.
Iteration
The crystal hamiltonian equation is solved by an iterative procedure. Self consistency may be checked by
monitoring in subsequent cycles for instance the total energy, the electronic charge density or the potential.
BAND uses the potential.
Define the cycle operator F. It comprises one complete cycle, in which the potential V leads, via
diagonalization of the hamiltonian and construction of the resulting density to a computed new potential
F(V).
F: V  H  {e,}    F(V)
(16.1)
In the self consistent situation F(V)=V. Starting from some trial V0 we obtain F(V0 ) and may insert this
into the next cycle as the new potential V1 . The sequence Vk , k=1,2,.. may then be hoped to converge, but
in many cases it displays oscillatory behaviour.
o p tim ized d a m p ing
BAND solves this by damping. The potential used in the next cycle is a mixture of the previous and the
computed one:
Vk = (1)Vk1 + F(Vk1 )  H(Vk1)
(16.2)
where we introduced the operator H =(1)1+F.
In virtually all cases the oscillations are suppressed and convergence of the sequence {Vk } is achieved if
the mixing parameter  is small enough. Of course too small a value for  slows down the development of V
towards self consistency and one would like to use some optimum value opt.
It turns out that opt not only varies from one atomic system to another, but also that it may depend on the
stage of the iterative proces. Often it is necessary to use stronger damping, by diminishing , as self
consistency is approached. BAND tries to optimize  from cycle to cycle. The way in which this is done is
based on a discussion of self consistency strategies in [Marchuk 1975].
Define by G1F the operator that yields the difference between input and output of a cycle.
Iteration
89
G(V) = V  F(V)
(16.3)
The self consistent solution V* to be found satisfies G(V*)=0. We assume now that F and G can be
approximated by linear operators with real eigenvalues. Let then {un ,n}, n=0,1,.. be the eigensystem of F
F(un ) = nun
(16.4)
with u0 ~V*, 0=1.
A proper self consistent V* must at least be locally stable, so that n<1, n=1,2,.. [Dederichs and Zeller 1983]
The related eigensystem of G, {un ,n1n} thus satisfies 0=0, n>0, n=1,2,...
If V has the expansion
V = V* +  cn un
(16.5)
F(V) = V* +  ncn un
(16.6)
n=1
then
n=1
The (local) stability of the self-consistent solution implies n<1, but not n<1, so it is not assured that
the sequence F(V), F(F(V)),.. should converge to V*.
Consider now simple damping and define (16.2):
H(V) = V  G(V) = V* +
 (1n)
n=1
cn u n
(16.7)
The convergence rate of the n-th component is n(1n). The sequence H(V) , H(H(V)) ,.. converges
if n<1 for all n.
The (asymptotic) rate of convergence is determined by the most slowly decreasing component, i.e. by max
n
n. Denote the smallest and largest  by  and  respectively, 0<≤n≤, n=1,2,.. and the
corresponding (normalized) eigenvectors by  and . Best overall convergence is achieved with
opt = 2  (+)
(16.8)
For  and  this gives an absolute rate of convergence
1opt=1opt = ()  (+)
(16.9)
and all other components decrease faster. With optimal damping the deviation of Vk from V* becomes
more and more dominated by  and  as the iterations progress. Let us assume then for simplicity
Vk = V* + c  + c 
(16.10)
Iteration
90
so that
gk  G(Vk ) = c  + c  A + B
(16.11)
This gives at the next cycle
Vk+1 = Vk  gk = V* + (1)c  + (1)c  =
= V* + c  + c 
gk+1 = A + B
(16.12)
(16.13)
In optimal damping =2/( +), so that =. For all reasonable values of  we have at least 0≤<1
and 1<≤0: both components decrease and the -component oscillates. fig.7 displays gk and gk+1 as
vectors in the two-dimensional space spanned by  and . The coordinate axes are depicted oblique to
stress that  and  are not necessarily orthogonal.
Our main purpose now is to achieve optimal damping. In that case the angle  between successive vectors
gk is a constant, independent of k, and gk and gk+2 are parallel. The potential V and hence g are available
by their values in the integration points. The norm gk  and the inner product (gk ·gk+1 ) are thus easily
computed and defne the angle 
cosk =
(gk1·gk)
gk1 gk
(16.14)
If A and B of (16.11) and (16.13) differ very much in size cos k±1 and a change in  may be computed
less accurately. Therefore we strive to have A=B as a secondary goal, together with =opt. In the
optimal situation that A=B and =opt the vectors gk1 and gk are orthogonal, cos=0. The overall
strategy, combining the two goals, is to let cos go slowly to zero in the course of the iterations. The
smallness of the changes in  assures that opt; the direction of the changes corresponds to making the
two components equal.
At every cycle the angle k is calculated and compared with k1. The development in the angle tells us
whether  is smaller or larger than opt, while the value of  itself indicates the relative sizes of the - and
-components;  is then adapted accordingly.
This set-up is fairly straightforward but contains a few problems. In the first place we have to determine by
how much  should increase or decrease. Secondly we have made some strong assumptions. If they were
exact it would be possible to determine the vectors  and  and their coefficients A and B from a few
consecutive 'measurements' gk , gk+1 ,... We could then compute V* exactly. In reality however other
components than  and  are also present. Furthermore the true operator G is not linear. The linearized
form represents a first order approximation in the neighbourhood of the current potential V. Consequently
the apparent eigensystem {un ,n} varies from cycle to cycle as V changes. The angle k and its relation to
k1 are then not so simply related to the mixing parameter  as we have assumed above.
Iteration
91
In the past years we have
tried in several ways to use
various assumptions more
rigorously. Again and
again this has yielded
spectacular convergence in
some cases and divergence
or oscillations in others.
The strategy implemented
in BAND employs the
stated assumptions
cautiously and has proven
to be robust. Convergence
is reached almost always,
though it may occasionally
Fig.7. Two consecutive difference vectors g and their components along
take a large number of
the- and - axis (see text).
cycles, up to a few
hundred. Typically 30 cycles suffice.
implementation
Subroutine RHOPOT computes the new potential F(V) from the density and calls MIXITR, which determines
the new mixing parameter. The variables that are relevant for monitoring the development of  and  are in
common block VARDAT. At every cycle the previous potential Vk1 and the corresponding difference
vector gk1 are on file itpot. Together with the computed mixing parameter, parmix (), they define the
current potential: Vk =Vk1 gk1 .
To compute a next value of  it is assumed that it lies between a lower bound l and an upperbound u, the
variables parmxl and parmxu in the program. To accomodate to changes in the apparent eigensystem of G,
l and u are updated together with  itself. The interval (l,u) is intended to indicate something like an
error bar on  itself and is used to scale the amount of change in . The interval becomes smaller when  is
more or less stable over several cycles and it is enlarged when the adaptations to  appear to be insufficient.
To monitor the development of the angle k between the vectors gk1 and gk MIXITR employs an 'error'
function k, which is sensitive in the whole domain of , in particular near the undesirable extreme values 0
and π, which varies from ∞ to +∞, equalling zero in the optimum =π/2.
k = tg(k/2) 
1
tg(k/2)
(16.15)
k, rather than cosk is checked and guided to zero. The optimal development is defined to be k=0.95k1
. The relative deviation from this is
Iteration
92
k0.95k1
=
k
(16.16)
 is increased if <0 and diminished otherwise. The size of  is used to compute the new  in the appropriate
interval (previous,u) or (l,previous) respectively. The map d() is chosen as a fermi-distribution
function: approximately linear dependency on  around the cut-off point (=0) and an exponential approach
to the limit l (or u) for large .
The update of the boundaries is as follows. If  increases the lower boundary l is increased somewhat and
the upper boundary is shifted upward; this shift is larger when a) the last few changes in  have been in the
same direction and/or b) the previous increase of  was relatively large. Both cases suggest that the upper
bound has been too close and consequently the increments of  too small. A variable, pmixch, keeps track of
the number of consecutive changes of  in the same direction.
If  diminishes the boundaries are updated in a similar way, but the increasing and decreasing cases are not
treated on the same footing. The reason is that a too large  may easily lead to a (temporary) divergence in
the SCF procedure, while a too small value only slows down convergence (16.7). Therefore the
implemented adaptations are such, that a speed-up in boundary adaptations is triggered more easily for
decreases than for increases of .
The norms gk  are used to check convergence. The current value, potdif, is compared in routine SCFTST
with the convergence criterium, convrg. The quotient of successive values of potdif defines the rate of
convergence. From fig.7 it can be inferred that this quotient will oscillate if the 'coordinate' axes are oblique
(assuming optimal damping). Such an oscillatory trend in the convergence rate is encountered fairly often.
The situation is then better judged by considering the development over two cycles at a time; we do this by
taking the harmonic average of two successive quotients. Conv0 is the current quotient, convr0 (in common
VARDAT) that of the previous cycle, and averag(conv0convr0)
1/2
the quantity used. In order to measure
the overall development, neglecting moderate variations from one cycle to another, the convergence rate
convrt is defined in BAND as a weighted (harmonic) mean of averag-values of subsequent cycles:
convrt  convrt0.85  averag0.15
(16.17)
Convrt is initialized in INIT (0.5) and checked at every cycle in SCFTST by comparison with a criterium
scfrtx. It may happen that the automatic adaptations fail, in the sense that the SCF procedure converges too
slowly (or not at all): convrt exceeds scfrtx. This may be due to some weak point in the optimization
algorithm or to problematic aspects in the atomic system at hand. If for instance many bands are close to the
fermi level, previously unoccupied bands may drop below occupied ones as a consequence of the changes
in the potential. The occupation numbers are determined according to the aufbau-principle so that abrupt
variations in the electronic configuration may result with corresponding large changes in the computed F(V)
and hence in the difference vector g. When convergence seems to fail, SCFTST attempts to recover by
halving  abruptly; the resulting slowing down of the iterative potential updates has proved to solve the
convergence problems in the large majority of cases.
Iteration
93
At such an intervention by SCFTST, the boundary values l and u are adapted accordingly; convrt is reinitialized at 0.5 and scfrtx is slightly increased to allow slower convergence in the continuation.
This trick is repeated until it is concluded that convergence is still failing. This is judged by comparing the
values of potdif at the successive interventions by SCFTST; if this sequence does not converge the program
is stopped; the last value in that sequence is stored in the variable difold.
 as well as the boundaries l and u are allowed to assume values between zero and some maximum,
parmax. parmax is initialized at 2.0, but it is decreased by SCFTST when convergence is problematic.
INIT initializes parmix (0.15), parmxl (0.10) and parmxu (0.20).
Via input various aspects of the iterative procedure may be influenced, by using the following keys:
, or
specifies the initial value of parmix. If the key record contains a second real value, this is
assigned to a factor delmix (default: 0.75). The boundary values parmxl and parmxu are initialized at
parmixdelmix and parmix/delmix.
overrules the default of the absolute convergence criterium convrg.
supplies the slowest allowable rate of convergence scfrtx.
specifies the maximum number of cycles to be executed. For organizational reasons at least two cycles
will be executed. If maximally one cycle is ordered, then parmix is set at zero, so that the final results
correspond to the starting-up potential, without any update.
Cheb yshev a ccelera tio n
Strategies for iterative processes are discussed frequently in the literature. A few words may be spend here
on a method, called the Chebyshev acceleration [Marchuk 1975]. It may provide a significant improvement
over damping. Terminology and definitions are as above. In particular it is assumed again that the operator
G=1F is linear with real eigenvalues.
In damping Vk is a linear combination of F(Vk1 ) and Vk1 . The Chebyshev method is based on a
straightforward generalization of this: Vk is a linear combination of F(Vk1 ) and all previous Vm ,
m=k1,k2,... It will turn out that it is not necessary to store all these previous potentials.
Let V0 be some trial potential and let Pk(x) be a polynomial of degree k in the variable x. Define the
sequence {Vk } by
Vk = Pk(G) V0
(16.18)
With the expansion (16.5) for V0 this gives
Vk = Pk(G)

n=0
cn u n =

n=0
Pk(n) cn un
(16.19)
with c0 u0 =V*. Two conditions are imposed on the polynomials Pk :
a) Pk(G) V*=V*  Pk(0) =1. This is a normalization condition; it is analogous to the coefficients of
F(Vk1 ) and Vk1 adding up to unity in (16.2).
Iteration
94
b) Pk() is as small as possible for all  in the spectrum of G. This states that any component in VV*
should be made as small as possible (16.19). Since we do not know all (discrete) n, this condition is
generalized to the whole interval (,)
max
≤≤
Pk() is minimal
(16.20)
The Chebyshev polynomials of the first kind have the minimax property, so the conditions a and b are
satisfied by
Pk() =
+2
)

+
Tk(
)

Tk(
(16.21)
where Tk(x) is the Chebyshev polynomial (of the first kind).
The recurrence relation for Chebyshev polynomials, Tk(x) =2xTk1(x) Tk2(x) , can be used to obtain a
convenient expression for Vk from (16.19) and (16.21). Define
x=
+2G

z=
+

(16.22)
This gives
4
Tk(x) V0 2zTk1(x)V0  Tk2(x)V0  GTk1(x)V0
Vk = Pk(G) V0 = T (z) =
=
Tk(z)
k
=
=
2zTk1(z)Pk1(G)V0  Tk2(z)Pk2(G)V0 
4
T (z)GPk1(G)V0
 k1
Tk(z)
2zTk1(z)Vk1  Tk2(z)Vk2 
=
4
T (z)g
 k1 k1
Tk(z)
(16.23)
Equation (16.23) shows that the Chebyshev method is in principle hardly more complicated than damping.
We have to store Vk2 on file, in addition to Vk1 . In damping we must determine one parameter, ; here
we need two parameters,  and . If these are known the numbers Tk(z) , k=1,2,... are computable with the
recurrence relation.
The iterative procedure is started with some trial potential V0 . To use (16.23) also for the first cycle, define
T1(z) = T1(z) = z
(satisfying the recurrence relation) and V1 =V1 . For k=1 we obtain then
(16.24)
Iteration
95
2
V1 = V0 
g
+ 0
(16.25)
i.e.: optimal damping at the first cycle.
Compared with damping the Chebyshev procedure gives theoretically a considerably faster convergence. In
optimal damping the convergence rate is given by (16.9): damp=1/z. In the Chebyshev method the
components decrease as Pk(n) . Since Tk(x) is a periodic function of k for all x≤1, all components in
(16.19) decrease on the average as 1/Tk(z) , giving for the convergence rate
cheb = Tk1(z) Tk(z)
(16.26)
For large z>>1 Tk(z) is dominated by the leading term 2k1 zk , making cheb1/(2z). The gain over
damping is a factor two per cycle.
For z close to unity (>>), say z=1+ with <<1, damping converges very slowly: damp1. In the
Chebyshev method convergence is not very fast either in the first few cycles, but significantly better than
with damping: in a first order Taylor expansion Tk(1+) 1+k2 (k2<<1) giving 1k1). For larger k
the variation in  from one cycle to another will diminish. Assuming for simplicity that it is constant, the
recurrence relation can be used to compute
Tk1(z)
 = T (z) =
k
1
1
1/2
Tk2(z) = (2z)  1  (2)
2z  T (z)
k1
(16.27)
So, if for instance =0.1, damping needs 25 cycles to proceed one order of magnitude, while this is achieved
in 4 cycles with the Chebyshev method (for large k); if =0.01 these figures become 230 and 15 cycles
respectively.
All this presupposes of course that the assumptions are correct: that G is a linear operator and that  and 
are known, at least approximately. A useful implementation will in particular require some way to adapt the
procedure to changes in the apparent spectrum of G, i.e. to update  and  iteratively and to incorporate
such changes in the method.
Impressive results are reported by Akai and Dederichs [1985], but the presented data are very few, making
one suspicious that these may correspond only to non-problematic cases. We have done some
experimentation in a simple Hartree-Fock program, in calculations on small first-row molecules. With rigid,
fixed values for  and , the results varied from reasonable improvement over damping to problematic
oscillatory behaviour. In all cases it proved advantageous to use simple damping in the first few cycles,
before starting up the Chebyshev method; the same experience was mentioned in [Akai and Dederichs
1985]. We have thusfar not worked out a practical implementation with control of the developments and
provisions for problems that might occur.
Lattice vectors
17
96
Lattice vectors
LATTYP reads the lattice vectors from itinpt into array avec(3,3). The inverse-transpose, bvec(3,3)
describes the Bravais lattice in k-space (apart from a scaling factor 2π). The dimensionality in real space
ndim is the number of lattice vectors found in input. Ndimk is the dimensionality in k-space. By default
ndimk=ndim, but a lower value can be requested with the key
. This specifies the number of
directions in k-space for which the dispersion is to be suppressed; the second default, induced by giving
only the key
without a numerical value in the same record, is: ndimk=0, i.e. all dispersion is
neglected; only the -point K=0 is used then.
In one special case the program overrules whatever the input specifications imply by setting ndimk=0, when
each atom in the central unit cell is (or can be chosen to be) separated so far from any atom in a
neighbouring cell that the functions have no overlap. Effectively we have then in each unit cell a 'molecule'
isolated from all other atoms in the system. This is decided by ATMSEP. ATMSEP translates first by Bravais
translations all atoms in the central cell in such a way that they from a maximally compact group. Then the
3
minimum distance to any other atom is calculated, which is then compared with 2 rfar; rfar is the
maximum extension of any of the one-center functions in the crystal (not counting the multipole potentials
of course); rfar is calculated by RADMAX.
If dispersion is to be neglected in some but not all directions in k-space we have to determine which of the
ndim original vectors in bvec are discarded. BAND chooses for this the shortest of the lattice vector(s) that
span the BZ.
Orienta tio n
In many places in the program it is convenient to have the lattice vectors oriented such that only the first
ndim, respectively ndimk cartesian coordinates are needed to describe the Bravais lattices. As the input
lattice vectors are not subject to any condition of this kind BAND performs a rotation to a standard
orientation. The transformation matrix stdrot(3,3) is computed in LATTIC. The input-specified atom
coordinates are interpreted in the original coordinate system; they are rotated in GEMTRY to the new frame.
18
Legendre points
Points and weights for Gauss-Legendre quadrature are generated by routines LEGPNT(n,x,w) and
LEGEND(n,x,w,a,b).
Legendre points
97
LEGPNT is the basic routine. Input is n, the number of requested points; output are the n zeros x(i),i=1,n of
the n-th order Legendre polynomial and the associated weights w(i),i=1,n. The numerical integration with
this scheme is exact for polynomials of degree ≤2n1 [Krylov 1962]. The points are located on (1,1).
The integration scheme can be used for any finite interval by a linear transformation; the algebraic degree of
precision is then preserved. LEGEND calls LEGPNT to obtain the points on (1,1) and transforms them to
(a,b)
xi  a + 2
xi+1
ba
(18.1)
wi 
2wi
ba
computation of the points and weights
Let Pn(x) be the Legendre polynomial of degree n, with zeros xi , i=1,n.
For each zero a first guess x is made with a distribution function (see below) for the zeros. This guess is
improved iteratively by the Newton-Raphson method (i.e. first order Taylor expansion) until convergence is
reached
Pn(x)
x  x  P '(x)
(18.2)
n
Pn(x) and the derivative Pn '(x) are evaluated with the recurrence formulas for Legendre polynomials
[Arfken 1970] (P0(x) =1, P1(x) =x)
1
Pn(x) = n
Pn '(x) =
{(2n1)xPn1(x)
n
1x2
{Pn1(x)
 (n1)Pn2(x) }
n=2,3,..
 xPn(x) }
(18.3)
(18.4)
If x has converged to a zero, the associated weight can be computed from the relation [Krylov 1962]
w=
2
(1x2) (Pn'(x))2
(18.5)
Using recurrence relations for the Legendre polynomials this can be rewritten in numerous different forms.
Since x is only approximately the zero the weight is also only approximately the exact integration weight.
The different possible formulas for the weights result in significantly different deviations from the correct
value. The reason for this is that terms involving Pn(x) are omitted in the analytical forms but they are not
strictly zero in the computation. Trial and error taught us to adopt the formula (from 18.4 and 18.5)
1x2
w= 2
n (Pn1(x)  xPn(x))2
(18.6)
Legendre points
98
Omission of the second term in the numerator, analytically zero, deteriorates the result.
the d istrib utio n f unctio n
The zeros lie symmetrically around x=0. The distribution function gives the approximate location of the
zeros in (0,1). This approximation is
π

(n+12i)
xi  sin 2 [(n+1/2+1)/(8 f(n,i))]


i=1..n/2
π (n+12i)
f(n,i) = 1/2 + n cos 2 (n+1/2) 

(18.7a)
(18.7b)

remarks
#
The Newton-Raphson procedure is only guaranteed to converge (to the intended zero) if the initial
approximation is accurate enough. The approximation function (18.7) is purely empirical; we have not
tried to derive a formal proof of the adequacy. Tests with n≤10,000 gave no problems; they indicate
even that the approximation is better for higher n.
#
We have compared our results with the tables of Stroud and Secrest [1966] for n up to 1,000. For the
higher n-values the weights tend to be less precise than the points, presumably because the weights are
related to the derivative (18.5). For higher n the polynomial Pn(x) oscillates rapidly, especially near the
endpoints x=±1, where the zeros cluster; small errors in x give thus much larger deviations in Pn '(x) and
hence in the weights.
The computed integration weights are added in LEGPNT. This sum should equal 2.0, the length of the
interval (1,1). The deviation from this is 5e13 for n=100, 7e12 for n=1,000 and 7e11 for
n=10,000. The test computations have been carried out on a Cyber750, with approximately 14 digits
floating point accuracy.
#
19
LEGPNT multiplies finally all weights by a uniform normalizing factor to make their sum equal 2.0.
SCF linearization
In big calculations by far the most time-consuming parts in the SCF procedure are
a) the evaluation of the matrix elements of the potential, the iteratively computed part of the hamiltonian.
k
Vpq =
N

i=1
k*
k
wi V(ri ) p (ri) q(ri)
k=1,K p,q=1,n
(19.1)
V(r) is the potential; {wi } are the integration weights. With K k-points in the BZ, n basis functions (in each
k-point) and N integration points the computational effort is propertional to KNn2 .
b) the construction of the density.
SCF linearization
99
k*
k
(ri ) =   Ppq p (ri) q(ri)
k
i=1,N
(19.2)
k pq
Pk is the density matrix in the representation of the basis functions in the given k-point. Again we have the
factor KNn2 .
(In fact there are two ways to compute the density: from the density matrix or directly from the occupied
eigenstates; see SS^charge density for a discussion of this).
A considerable saving of time is achieved when the involved multiple loop structures can be circumvented.
For both cases this is possible, be it only at some cycles of the iterative procedure. The integration points
and the basis functions are fixed quantities, so the only factors that change from cycle to cycle are the
k
potential V(r) in (19.1) and the density matrices Pk in (19.2). The quantities to be computed, Vpq and (ri
), depend linearly on them. So if in the first case the current potential V0(r) is accurately described by a
linear combination of potentials in previous cycles, then the same holds for the potential matrices Vk .
The second case is analogous: a linear combination of density matrices Pk defines the corresponding linear
combination of density functions.
In the SCF procedure of BAND these approximations are tested in the routines POTAPP and RHOAPP
respectively for the potential and the density. The implementation is structured as follows.
a p p ro x im a tio n o f the p o tentia l
File itvstr contains a sequence of nstrp orthonormal (potential) functions vj(ri) , j=1,nstrp. These functions
are not the potentials on subsequent cycles, but they are derived from them. The current potential v0(ri) is
on file itp. By straightforward numerical integration v0 is expanded (POTAPP) in the set {vi }

nstrp
v0 

j=1
cj vj
(19.3)
N
cj =

i=1
wi v0(ri) vj(ri)
(19.4)

Define the error in the approximation as the length (squared) of the difference vector v0  v 0.
err = vnorm 

j
N
vnorm =

i=1
2
cj
2
wi v0(ri)
A logical approx tests the accuracy of the approximation, using a relative tolerance test
(19.5)
(19.6)
SCF linearization
approx = err < test  vnorm
100
(19.7)
As the SCF procedure approaches convergence ever smaller differences in the potential become relevant.
The tolerance parameter test must therefore vary accordingly. SCF convergence is expressed by  (the
variable potdif in BAND), measuring the difference between the crystal potential at the previous cycle and
the potential computed from the new density. The approximation criterium is set to
test = 107 
(19.8)
Now, if the test is passed (approx=true) the coefficients cj are subsequently used in EIGSYS to construct the
potential matrices (one in each k-point) as linear combinations of the (potential) matrices that correspond to
the set orthonormal potential vectors on file itvstr. The matrices are on file ithstr.

If the approximation is not sufficiently accurate, the difference vector v0  v 0 is constructed, normalized
and added to the set; nstrp increases by one. In EIGSYS the potential matrix is computed in the normal way,
by numerical integration (19.1). The file ithstr is extended with the matrix (one for each k-point)
corresponding to the potential vector that has been added to itvstr. The new matrix is the difference between
the exact potential matrix for that cycle and the linear combination, multiplied by the normalization factor
for the potential vector.
remarks
1. The number of potential vectors on itvstr and corresponding matrices on ithstr has a maximum nstrpx;
nstrpx is assigned a value (70) in INIT. If this maximum has been reached and approx=false, i.e. a new
vector should be added, the first vector in the set is removed. The new vector is then defined such, that
the true potential is exactly approximated by the new orthonormal set, i.e. as if the removed vector had
never been there. So, apart from normalization
nstrp
vnew in set = v0 

j=2
vj
(19.9)
Note that the lower bound in the summation is 2.
In practice nstrp hardly ever exceeds 19 at the end of the SCF stage; in most cases convergence is
already reached with nstrp equal to 4 or 5.
2. In a spin-unrestriced calculation we have different potentials for up- and down spins. The
implementation contains therefore a loop over the number of independent spins (one or two), which we
have omitted here for sake of clarity. For normalization and approximation tests the spin coordinate is
treated on the same footing as the real space coordinates, e.g. (19.4) is replaced by
N
cj =

 i=1


wi v0 (ri) vj (ri)
(19.10)
In particular this implies that the linear approximation is either used for both spins, or for neither.
SCF linearization
101
3. File itp does not contain the true potential v0 directly, but two vectors; v0 is defined as a specific linear
combination of them (SS^iteration).
a p p ro x im a tio n o f the d ensity
The approach in routines RHOAPP and RHOPMT is analogous to that in POTAPP and EIGSYS. File itpstr
contains nstrr (density) matrices (for each k-point in the BZ); nstrr is initially zero (INIT) and cannot exceed
nstrrx (=70,INIT).
The current P-matrix, on file itpmat, is expanded in the set on itpstr. Norms and orthogonality for the
matrices are defined by an inner product S
K
S(Pi ,Pj ) =

k=1 pq
k
k
Pi pq Pj pq
(19.11)
Note the summation over the k-points.
The corresponding density vectors j(ri) are on file itrstr.
remarks
1. As in the potential case, in spin-unrestricted calculations the density approximation is used for both
spins or for neither; a loop over the spins is included in the definition of the inner product S (19.11);
only one set of expansion coefficients is determined, applying to both spins simultaneously. The density
matrices and vectors are distinct of course for up- and down spins.
2. Real and imaginary parts of the density matrices are treated as if the matrix P had just one more index
to be summed over (with values 'real' and 'imaginary'). So in fact the inner product is defined (including
also spin now)
K
S(i,j) =

k=1 pq 
k,
k,
k,
k,
Re(Pi,pq ) Re(Pj,pq ) + Im(Pi,pq ) Im(Pj,pq )
(19.12)
Of course this is just a matter of definition. The only goal of the procedure is to have some way to test
the adequacy of the linear approximation. See however below.
final notes
1. The savings in computer time by the linearizations depend strongly on the size of the calculation. In a
big one the approximate evaluations take only a few percent of exact evaluations. During the first cycles
there are of course no data for an expansion: the set has to be build up. During the iterations an
additional vector is needed in the expansion set from time to time. In a typical big calculation with slow
convergence we would have, say 100 cycles to reach self-consistency and exact evaluations in 20 of
them, i.e. linear approximations would be used in 80 cycles. Accounting for other parts of the SCF
procedure and for the fact that the linearization procedures involve some overhead of testing and filemanipulation, the gain is then a factor between 2 and 4 (in the SCF part).
SCF linearization
102
2. The linearizations of the density and potential are tested and used independently. As a matter of
experience the potential behaves better in this respect than the density: linear approximations in EIGSYS
are used more often that in RHOPMT. In some cases the density is exactly computed twice as many
times as the potential matrix.
3. The criterium factor 107 in the approximation tests is a result of trial and error. Obviously a less
stringent test would allow the linear approximations more often, increasing at first sight the gain in
efficiency.
However the switch from approximate to exact evaluation unavoidably takes place from time to time
during the iterations. At such a moment there is a (slight) discontinuity in the development towards the
self-consistent solution: the function space in which the potential, or density, is described, changes
abruptly. The iterative procedure is rather sensitive to this and suffers from instability and oscillations if
these changes are too large. The adopted value 107 appears to be a reasonable compromise.
4. The approximation procedure for the potential is simple and straightforward; in particular the vectors,
their norms and the inner product are defined in a natural way. With respect to the density
approximation we feel less comfortable. The fact that this case performs worse and that the instability
always seems to occur when the computational mode is switched for the density, but not if it happens
for the potential, suggests some inbalance in the procedure. Further investigation might be useful; this
has no high priority however because in practice the implemented strategy functions reasonbly well.
20
Symmetry
The most obvious symmetry in crystals is the translational symmetry of the Bravais lattice. The
corresponding operators, the Bravais translations, are denoted by a capital T. All T are products of n basis
Ti that span the unit cell of the n-dimensional crystal, by which we mean an atomic system with
translational symmetry in n directions; n=0 (molecule), 1 (polymer), 2 (slab) or 3 (bulk crystal).
The complete space group consists of affine transformations {t;R}
{t;R}x = t + Rx
(20.1)
R is a unitary operator and t is a translation which may have components in the n directions of periodicity.
The generators of the space group is a subset of {t;R} that generates all {t;R} by multiplication with the
Bravais translations. The set of generators is defined to be minimal in the sense that not any pair of them is
related by a pure Bravais translation.
A few aspects:
#
The inverse of {t;R} is {t;R}
{t;R}{R
1
t;R
1
-1
={R
(
-1
1
t;R
1
1
}:
}x = t + R R t + R x
)
= t  RR
1
t + RR
1
x=x
(20.2)
Symmetry
#
103
All generators have different point group parts R. Suppose {t1 ;R} and {t2 ;R} are both generators.
Then
{t1 ;R}{t2 ;R}
1
x = t1  RR
1
t2 + RR
1
x = (t1 t2 ) + x
(20.3)
The translation (t1 t2 ) is then a symmetry operator and hence a Bravais translation T. So
{t1 ;R}{t2 ;R}
1
= T  {t1 ;R} = T{t2 ;R}
(20.4)
contrary to the definition that two generators are not related by a pure Bravais translation.
#
For each space group operator {t;R} the point-group part R is a symmetry operator of the Bravais
lattice. Let the symbol ~ denote symmetry equivalency. Then
x' ~ {t;R}
1
x ~{t;R}
1
(
1
x + T ~ {t;R} {t;R} x + T
)
= x + RT
(20.5)
So the translation RT is a symmetry operator and hence a Bravais translation.
#
Until now it has tacitly been assumed that the unit cell is chosen as small as possible. This is implicit in
the argument that a pure translation must be a Bravais translation if it is a symmetry operator. One may
of course define a larger unit cell however, for example the simple cubic 'double' unit cell in a bcc
crystal. This modifies the statements above a little:
With any choice of unit cell, the resulting Bravais group is a subgroup of the true Bravais group.
The larger unit cell can be considered as a conjuction of N smaller unit cells. They are characterized by
operators Ti , i=1...N, of the true Bravais lattice.
These Ti are naturally symmetry operators of the system, but they do not occur now in the employed
Bravais translation group. Consequently they will have to be included in the set of generators. In other
words, in case of a larger unit cell the point group parts R of the {t;R} are symmetry operators of the
true Bravais lattice, but not necessarily of the employed one; moreover each R in the set of generators
now occurs N times; the different translational parts t belonging to the same R differ (or can be chosen
to differ) by the Ti of the true lattice that define the enlarged unit cell.
#
From time to time we have to deal with sets of points in the unit cell and their symmetry relations. For a
point x in the unit cell the image {t;R}x may be located in a neighbouring cell. We define now the
operator {t;R} as the operator {t;R} followed by a Bravais back-translation to the central unit cell, i.e.
{t;R} = {t+T;R}
(20.6)
where T depends on the point to be operated upon; it is always the unique Bravais translation which
projects the image into the central unit cell. This definition is convenient because, given a set of all
equivalent points {x} in the unit cell the points {t;R} x now belong to the same set and are not 'only'
translationally equivalent points. We will call {t;R} a centralizing operator. It is easily verified that the
set of centralizing generators constitutes a mathematical group.
Symmetry
104
co m p uta tio n o f the o p era to rs
The operators {t;R} are represented and computed in the coordinate representation: R is a unitary 33
matrix and t a 3-component vector.
In accordance with the notes above the construction of the operators takes place in two steps:
A
compute the point group symmetry of the Bravais lattice (the holosymmetric point group)
B
for each operator R find the translational part t, if any, that makes it a space group operator.
point group operators
We address a more general problem. Let N sets of points be given, xij , j=1..N, i=1..Mj , in n-dimensional
1
2
n
space: xij (xij ,xij ,..,xij ). We want to compute all real unitary transformations R in the form of nn
matrices, that are simultaneously symmetry operators for each of the N groups of points, Rxij {xkj }k .
An example is the point group symmetry of a molecule: the points are the locations of the atoms; chemically
different atoms fall in distinct sets and the dimensionality is (usually) three. In case of the Bravais lattice we
have only one set, but infinitely many points; we will see however that it is sufficient to consider only the
first few 'stars' of lattice points around the origin.
The algorithm is based on the fact that n pairs of points (pi ,qi ), i=1..n, define a linear transformation in ndimensional space via the system of equations
Rpi = qi
i=1..n
(20.7)
The set {pi },i=1..n will be called the basis and the set {qi },i=1..n the projection. Any admissible basis and
projection must each constitute a linearly independent set, otherwise the system (20.7) does not properly
define the transformation R; in the following we assume that every basis or projection has been checked to
be independent.
Let now R be one of the symmetry operators to be found and let {p} be an arbitrary basis, the points of
which may be taken from any combination of the N sets. Since R is a by assumption a symmetry operator
there must obviously be a projection {q} such that the combination of R, {p} and {q} satisfies (20.7). So
the algorithm is
1. find any basis {p}.
2. loop over all distinct projections {q} and compute R from (20.7).
3. check for every point xij , j=1..N, i=1..Mj , whether Rxij coincides with a point in the appropriate (j-th)
set. If not, discard R.
Although this set-up suffices to determine all symmetry operators, the efficiency can be improved
somewhat. To avoid duplications and unnecessary work in loop 3) we check first, for each computed R the
unitarity and whether it has already been found before. Furthermore, as for each point xij its image Rxij
must belong to the same set j, we impose this condition also on the projection set {q}: qi and pi must
belong to the same set, for all i=1..n; this shortens loop 2).
Symmetry
105
implementation
Subroutine PNTGRP computes the nn matrices corresponding to a given configuration of points in
n-dimensional space.
First a linearly independent basis is constructed; array indgrp stores the types of basis-points: the i-th basis
point belongs to set indgrp(i). Then all distinct n-tuples of points are considered to serve as projection {q}.
Indgrp is used to control that pi and qi belong to the same set. After assuring linear independency of {q}
and unitarity of R, all image points Rxij is examined to see whether R is a symmetry operator.
Two index arrays, indbas and indprj keep track of which points of the sets are used in the basis and
(current) projection respectively.
remarks
#
The identity operator, a symmetry operator of every system of points, is generated first; it is the first
symmetry operator in the output list.
#
The inversion operator, if it occurs, is second in the output list. PNTGRP moves it to this position after
the calculation of all operators.
#
Identity of points and operators is checked by comparing real numbers. The tolerance is set at a fixed
absolute value 106 . To avoid numerical problems it may therefore be necessary to rescale the input
points.
#
The requirement that the basis be independent in n-dimensional space implies that PNTGRP cannot
compute for instance the three-dimensional point group of a flat molecule. A possible solution to this is:
a. rotate the molecule to the xy-plane.
(20.8)
b. generate the point group in two-dimensional space (PNTGRP).
c. copy the obtained 22-matrices to the left-upper parts of 33-matrices, setting R33 =1 and the
remaining matrix elements zero.
d. Incorporate the reflection in the xy-plane in the symmetry group (SYMADD).
e. rotate the operators back to the original coordinate frame.
space group operators
First we construct the point group of the Bravais lattice. For this purpose PNTGRP is called with one set of
points: the lattice points of a few 'stars' around the origin. Let Ti , i=1...n be the unit cell basis vectors of the
n-dimensional crystal. It is then sufficient to consider only the stars of these Ti , because their combined sets
of points has the same symmetry as the complete Bravais lattice:
#
Let R be a symmetry operator of the Bravais lattice. The image of T under R, TR say, belongs to the
star of T because R is unitary and hence preserves lengths. So R is a symmetry operator of each star
separately. In particular this applies to the basis vectors Ti and their stars.
#
Let now R be a symmetry operator for the basis stars. Then for any lattice point T, necessarily a linear
combination of the basis vectors, the image under R is
Symmetry
106
n
RT = R

i=1
n
mi T i =

i=1
R
mi T i
(20.9)
R
which belongs to the Bravais lattice because each Ti is a lattice point. Hence R is a symmetry operator
of the whole Bravais lattice.
Construction of the space group generators thus takes place (SYMCRY) as follows:
1. generate the stars of points corresponding to the basis vectors ( LATTPT).
2. compute from these points the Bravais point group operators R (PNTGRP).
3. for each R find out whether it can be made into a space group operator {t;R}:
The atoms are divided in (chemically) distinct types.
a. take any atom in the central unit cell, say the first atom of the first type, at position x and compute
the image under R: x'=Rx.
b. run over all atoms of the same type and in the same unit cell, at positions y and define the
translational part t of the space group operator as t=yx', so that {t;R}x=y.
c. loop over all atoms of all types and check whether their images under {t;R} coincide with atoms of
the same type, possibly in neighbouring cells. If not, discard this {t;R} and try the next possible
translation (step b) by taking another atom y.
remarks
1. We have to consider only atoms y in the central unit cell. Atoms in other cells result in operators that
differ from those obtained already by Bravais translations.
2. If for a particular R no translational part is found that satisfies the requirements, R is discarded
altogether.
3. If for a particular R more than one t is found, then the crystal unit cell has not properly been defined, as
discussed above. The true, smaller unit cell is then computed and the procedure restarted. This is
necessary because with a too large unit cell we may not have found all point group operators of the true
Bravais lattice.
The determination of the true unit cell is as follows:
Two arrays of lattice vectors are present. The first, the array avec in the program, is fixed and stores
the lattice vectors as defined by the user. The second, vlatt, describes the true unit cell; it is only used
locally (in SYMCRY). Vlatt is initiated by copying avec; it may be changed in the course of the
symmetry analysis.
Since the identity E is the first operator in the output list of PNTGRP, this is the first operator
processed in the construction of the space group generators. Obviously {0;E} is a space group operator.
Any next t that is found with E, must be an element of the true Bravais lattice. t is expanded in the
vectors vlatt.
n
t=

i=1
ci vlatti
(20.10)
Symmetry
107
If vlatt describes a too large unit cell, then for at least one t belonging to E, one of the expansion
coefficients is non-integer. By adding or subtracting integer multiples of the {vlatti } we construct the
Bravais translation (of the true lattice)
n
v=
 ci
i=1
vlatti
(20.11)

where all ci are in the unit interval [0,1) and differ from the ci in (20.10) by integers. By assumption at

least one of the obtained coefficients, cj say, is non-zero; in the set {vlatt} we replace then the j-th
vector by v. The new set describes a smaller unit cell. The procedure can now be restarted: generate the
pointgroup corresponding to {vlatt}, etc.
When each t found with E has an integer expansion (20.10) the correct unit cell has been found and the
procedure can be continued with all other operators R.
4. The same Bravais lattice may be generated with different sets of basis lattice vectors. The crystal unit
parallellepiped is not uniquely defined: we may add to any of the basis vectors an integer multiple of
any of the other basis vectors. For various reasons it is convenient in the program (though not strictly
necessary) to avoid extreme choices in this respect and define the unit parallellepiped as compact as
possible. This is done by recombining the vectors such that they have minimal lengths. LATTCH (lattice
check) performs that task; LATTCH checks in this way also linear dependency of the lattice vectors.
sym m etry in k - sp a ce
The connection between the space group symmetry in real space and the symmetry in k-space is easily
derived [Jones 1975]. According to Bloch's theorem any eigenstate n(k;r) can be written in the form
n(k;r) = un(k;r) eik·r
(20.12)
where un(k;r) is symmetric with respect to all Bravais translations
un(k;T+r) = un(k;r)
(20.13)
The occurrence of k in the argument list of the function u does therefore not imply that it transforms as the
corresponding irrep of the translation group, but it signifies only that the u-parts of eigenstates (k) at
different k-points are different.
It follows from the hamiltonian equation
{/2 + V(r)} (k;r) = e(k) (k;r)
(20.14)
that the periodic function u(k;r) satisfies
{/2 + k2 /2  ik· + V(r)} u(k;r) = e(k) u(k;r)
Let now {t;R} be a space group operator, defining the coordinate transformation rr'
(20.15)
Symmetry
108
ri '  ({t;R}r)i = ti +

j
Rij rj
(20.16)
∂
Rji ∂r '
(20.17)
For the derivatives this gives
∂
∂ri =
∂r ' ∂
 ∂rji ∂rj' = 
j
j
j
Define then the transformation in k-space kk' by
ki '  (Rk)i =

j
Rij kj
(20.18)
Simultaneous application of the transformations (20.16) and (20.18) leaves (20.15) invariant because R is
unitary and V(r')=V(r). Hence the solutions {e,} in k and k' are symmetry related:
en(Rk) = en(k)
(20.19a)
un(Rk;{t;R}r) = u(k;r)
(20.19b)
i  Rmntmkn
n(Rk;{t;R}r) = e mn
n(k;r)
(20.19c)
So, if {t;R} is a symmetry operator in real space, then R is a symmetry operator in k-space.
In addition the inversion operator J is a symmetry operator in k-space, regardless of the space group:
(20.15) is transformed into its complex conjugate; all eigenvalues en(k) are real because the hamiltonian is
an hermitian operator. Hence
en(k) = en(k)
*
un(k;r) = un(k;r)
(20.20a)
(20.20b)
The construction of the point group operators in k-space is thus straightforward once we have calculated the
space group generators {t;R}:
a. gather all distinct point group operators R occurring among the {t;R}.
b. add the inversion operator J if it is not yet present (and of course all products JR).
Whereas real space is 3-dimensional this does not necessarily hold for k-space: in an nD crystal k-space has
only n dimensions. The operators must then of course also be operators in nD space and can be represented
as nn matrices. For a slab for instance the reflection in the plane is an irrelevant operator in k-space.
With an appropriate orientation of the coordinate system the relevant part of the real space operators R is
then the left-upper nn submatrix. From (20.18) we conclude that this is an operator in k-space if the
Symmetry
109
summation can be restricted to the first n terms, that is if all Rij are zero when i≤n, j>n or vice-versa.
Otherwise a point k would be mapped onto a point 'outside' the nD region.
The computation of the k-space symmetry group is therefore
1. For each {t;R} check that it does not couple the first n coordinates to the last 3n, and gather all
resulting distinct nn matrices (SYMPRJ).
2. Add inversion in nD space (SYMADD).
remark
The dimensionality of k-space is in BAND not only determined by the translational symmetry of the atomic
system. If that were the case, none of the operators {t;R} could couple the nD space to the other
coordinates. However, BAND optionally neglects dispersion in certain directions in k-space, thereby
artifically reducing the dimensionality of the BZ. This option may be activated (via input) for example when
the crystal unit cell is much larger in one particular direction that in others (SS^BZ-integration).
integ ra tio n in k - sp a ce
The point group symmetry in k-space is applied to reduce integrations over the BZ to integrals over the
irreducible wedge, the symmetry unique part of the BZ. In the employed analytic-quadratic integration
method [Wiesenekker et al. 1988, Wiesenekker and Baerends 1990] the BZ is divided in simplices and the
integration points are the vertices and the midpoints of the edges. The BZ is constructed as a Wigner-Seitz
polyhedron (in two dimensions a polygon, in one dimension an interval). The irreducible wedge is
determined (SS^BZ-integration) and that region is divided in simplices etc. By use of symmetry the number
of k-points needed is thus reduced; this is an important saving in computer time since virtually all costdetermining aspects in a calculation scale with the number of k-points processed.
The energy bands en(k) are totally symmetric; hence a quantity like the density of states (DOS) can be
computed by integration over the irreducible wedge only. This does not apply however to all properties.
The eigenstates in symmetry related k-points are not identical (but they are symmetry related). If {t;R} is a
space group operator and k' and k are related by k'=Rk, then
n(k';r) = e it.k' n(k;{t;R}1r)
(20.21)
Let now F be a quantity to be computed as
F=


n
dk Fn(k)
(20.22)
BZ
where Fn(k) is the contribution to F from the one-particle state n(k;r)
Fn(k) = F[n(k;r)]
(20.23)
Symmetry
110
The summation is over the bands and integration is over the complete (first) BZ. Using the space group
generators {t;R} in real space and the corresponding operators R in k-space (20.22) can be rewritten as
F=


 dk Fn(Rk) =
{tR} n irr.BZ

1

 dk F[n(k;{t;R} r)]
(20.24)
{tR} n irr.BZ
The charge density for instance is given by
(r) =

1
2

 dk n(k;{t;R} r) 
{tR} n irr.BZ

irr({t;R}1r)
(20.25)
{tR}
irr(r) is the charge density resulting from integration over the irreducible wedge only.
The occupation numbers on(k) for the one-particle states, which are the weights for the numerical
integration over the irreducible BZ, are in the program determined such, that they represent also the
equivalent k-points. The integrals over the BZ are therefore scaled already and e.g. the symmetrized density
is computed as
1
(r) = N
G

{tR} n
k
1
1
on(k) n(k;{t;R} r)2  N
G
 irr({t;R}1r)
(20.26)
{tR}
NG is the number of symmetry operators.

We see that the total charge density is obtained by projecting out the symmetric component of irr(r) .
In BAND the density, like all crystal functions, is represented by its values in the crystal integration points.
The integration scheme is symmetric, which allows the symmetrization (20.26) by straightforward
numerical integration (see below).
integ ra tio n in rea l sp a ce
The numerical integration formula is symmetric: if r is an integration point and {t;R} a space group
operator then r'{t;R}r is also an integration point and the weights are equal. The symmetry unique points
are called the generators (of the integration formula).
The construction of the symmetric scheme is extensively discussed in chapter III. Here we are concerned
with the utilization of the symmetry property.
1.
The component of a particular irreducible representation (irrep) of the group can be projected out of a
function. Whereas this applies to any irrep, the most common application is the symmetrization of a
function, i.e. the projecting out of the (totally) symmetric conponent.
This is done in BAND with the density. As we saw above, the integration in k-space over the irreducible part
of the BZ yields an incorrect, that is, non-symmetric charge density: we need the symmetric component. The
projector is (20.26)
Symmetry
111
1
PA1 =
NG

{t;R}
(20.27)
{tR}
The points on file itpnt are grouped in a number of (large) blocks. Each block contains a sequence of sets of
symmetry related points. The file with basis function values is organized accordingly and the a-symmetric
density, computed from the basis functions (SS^charge density) has the same structure. Symmetrization is
performed by averaging the values in each set of symmetry related points (20.27). This happens in RHOPMT
(or RHOPSI, see SS^charge density).
2.
If a function belongs to an irrep of the group, we have to compute and store only the values in the
generators, the symmetry unique points. The values in related points can be derived when desired from the
transformation properties of the irrep. This possibility is used in BAND for the density, the potential, the fit
functions and the fit potentials. All these functions are totally symmetric.
With the data structure on itpnt the functions can easily be expanded again over all points. This is done for
the potential in EIGSYS, where the matrix elements of the hamiltonian are evaluated by numerical
integration of the potential against products of the basis functions (the basis functions currently used do not
belong to irreps).
3.
The numerical integral of a function is greatly simplified if it belongs to an irrep. Any irrep which is
not the symmetric one integrates to zero. For a symmetric integrand we have to loop only over the generator
points because the values in the related points are equal and they can be accounted for by multiplication by
the appropriate factor (the number of equivalent points), which can be included in the weights of the
generators.
This is used for the expansion of the density in the fit set (7.6a) in RHOFIT.
It would be an enormous improvement in efficiency of BAND if also the valence basis functions were
organized according to the irreps. This block diagonalizes the hamiltonian matrices. The off-diagonal
blocks are zero on grounds of symmetry and hence need not be computed at all. For the diagonal blocks the
loop over the integration points reduces to the generators.
sp ecia l sym m etry ro utines
Some specific tasks related to symmetry are performed by special routines. These are examined here. Most
of them are general as regards the dimensionality; the point group operators are represented as nn
matrices.
extension of a point group
Given a point group G0 and an operator R, SYMADD constructs the compounded group, consisting of G0
and R and all additional operators required to make the new set a group.
A set of point group operators can be extended to a group by computing all possible product operators and
adding them if they are not yet present, until the set is closed under multiplication. Both the 'left' and the
Symmetry
112
'right' products have to be checked since for an arbitrary pair of operators R1 and R2 the products R1 R2
and R2 R1 need not be identical.
This straightforward method assures that also the identity E and the inverse of each operator will be
1
M1
generated: for every point group operator R there is an integer M such that RM =E. Hence R =R
so
that R and E both occur among the products of R with itself.
SYMADD uses this algorithm but supposes that the input set G0 is already a group: products among its
operators are not checked for occurrence in the set.
checking and reducing a point group
SYMCHK checks a point group against a set of points and removes the operators that are not symmetry
operators of the point set. The points may be divided in subsets of equivalent points.
The algorithm is simple: for each operator R compute all image points and check whether these coincide
with points in the appropriate subsets.
Note that symmetry operators of the point set that were not present in the input group are not generated, i.e.
the output group does not necessarily represent all symmetry of the point set (compare routine PNTGRP).
checking completeness of a point group
GRPCHK checks whether a set of nn matrices is closed under multiplication, i.e. whether it constitutes a
group. An output error parameter ier gives the outcome (0:group, 1:not a group).
analysis and characterization of a symmetry operator
MAT3AN analyzes a unitary 33 matrix (unitarity is checked). Output are the determinant D, an axis a
(normalized vector) and an angle . For a pure rotation (D=1) a and  are the axis and angle of clockwise
rotation. An improper rotation (D=1) can be written as a reflection times a rotation around the normal on
the reflection plane; the axis and angle refer to that rotation; a pure reflection has =0.
Two special operators are the identity E (D=1, =0) and the inversion J (D=1, =π). For both of them the
axis is undefined and it is taken arbitrarily as the z-axis. MAT3AN checks first whether R is either of these
two operators. For all other operators R the axis and angle are computed as follows.
An auxiliary vector w orthogonal to the axis a is constructed.  is then the angle between w and Rw; the axis
a is the vector product wRw. A suitable w can be computed from an arbitrary vector v by removing the
component along the axis. For a pure rotation (D=1):
w = (R1)v
(20.28)
and for an improper rotation (D=1):
w = (R2 1)v
(20.29)
Symmetry
113
Of course the 'arbitrary' vector v must not coincide with the axis, since that would give w=0. This is
prevented in MAT3AN by taking for v the x-axis unless R11 =1 (20.28) respectively R11 =1 (20.29); in that
case the y-axis is taken.
A zero result for w is also found for special angles of rotation. For a pure rotation (20.28) if =0 (the
identity E); for an improper rotation (20.29) if =π (the inversion J) or =0 (a pure reflection). As E and J
have been checked a priori the only possibility is the pure reflection. So, if w is found to be zero, the
situation is immediately clear: =0 and the axis is given by
a = (R1)v
(20.30)
Finally, in case of a pure rotation (not the identity), the determination of the axis as the vector product
wRw may yield zero. If this situation is encountered (D=1, =π) a second general vector w' orthogonal to
the axis is constructed (20.28), where v is now the y-axis (or the z-axis). The axis a is then found as the
vector product ww'.
local symmetry around an atom
GRPTYP is used in the integration package POINTS to determine the appropriate type of integration formula
for the atomic spheres. The symmetry of the formula must correspond to the local symmetry of an atom.
The available special (Lebedev) formulas for the spherical surface all have the octahedral symmetry. Apart
from these we may generate a product formula in the coordinates cos and  [chapter III]; this can be used
for all axial groups. The only point group type not covered is the icosahedral symmetry. Special formulas of
this type are known [Stroud 1971], but they have not yet been implemented in the integration package.
Icosahedral symmetry is rarely encountered in polyatomic systems and thusfar this restriction in the
possibilities has not played a role.
GRPTYP determines from a set of point group operators (33 matrices) whether the group is of the
octahedral or axial type; the octahedral type includes subgroups such as the tetrahedral group; an
icosahedral group will be detected as an 'error'.
Apart from the type of symmetry, GRPTYP computes the rotation matrix that will bring the coordinate frame
into a standard orientation (in which the implemented integration formulas for the sphere are defined). The
standard orientation of an axial group has the z-axis as the axis of rotation and the x-axis in a 'vertical'
reflection plane (if there is any). For an octahedral group (or subgroup) the standard orientation is the usual
orientation of the cube: 4-fold rotations around the coordinate axes and 3-fold and improper 6-fold rotations
around the (111)-directions.
Finally GRPTYP outputs for axial groups the order of the rotation.
The algorithm in GRPTYP is based on the determination of the main axis and the highest order secondary
axis of the group. These are found by subsequently analyzing all operators ( MAT3AN) and updating the
main and secondary axes and their orders of rotation.
Symmetry
114
A group is defined (in MAT3AN) to be of the axial type whenever the secondary axis has a rotation order
not higher than 2; otherwise it is of the octahedral type. In the latter case the angle between the main and
secondary axis is checked in relation to their rotation orders (6-fold, 4-fold or 3-fold). In this way the
icosahedral symmetry is detected as an 'incorrect octahedral' symmetry.
correcting symmetry inaccuracies in a set of points
Given a set of points which is approximately symmetric with respect to a symmetry group, SYMTRZ
removes the numerical noise from the position coordinates. The points are displaced slightly, such that the
output set is exactly symmetric (to machine accuracy). This is done by explicitly projecting out the
symmetric component of the set of points.
The symmetry group is a space group, of which the generators {t;R} are input into SYMTRZ; the set of
points in the crystal unit cell must be complete: if x is in the set then {t;R} x must belong to the set, apart of
course from the coordinate inaccuracies to be corrected. {t;R} is the centralizing generator, defined earlier
in this section.
A 'normal' point group, e.g. for a molecule, can be handled by specifying the crystal dimensionality as zero
and giving the zero vector for all translational parts t of the operators.
Given x there is for every {t;R} exactly one point y in the set such that x is (approximately) the centralized

image of y under {t;R}. The exact image of y under {t;R} , approximately equal to x, is denoted x({t;R})

x  x({t;R}) = {t;R} y
(20.31)

Running over all operators, the points x({t;R}) are all images that should coincide with x. The exactly
symmetric points are computed as the mean positions
1
x' = N
G
 x({t;R})
(20.32)
{t;R}
NG is the number of operators. It is easily verified that this definition yields a perfectly symmetric set.
Consider the image of such a point
1
{t;R} x' = N
G
 {t;R}x({t;R}')
(20.33)
{t;R}'
Let z be the approximate image of x under {t;R} , then


{t;R}x({t;R}') = z ({t;R}{t;R}')
is one of the image points close to z. Hence
(20.34)
Symmetry
1
{t;R} x' = N
G

1

z ({t;R}{t;R}') = N
G
{t;R}'

115

z ({t;R}'') = z'
(20.35)
{t;R}''
This is precisely one of the points in the output set (the corrected point z), as required. In the second
equality in (20.35) we used that the centralizing generators constitute a group.
It is assumed here that the operators themselves are exact (otherwise the product of two operators might not
equal one of the operators exactly and hence the second equality in (20.35) would not be correct. As a
consequence, if the operators have been determined by PNTGRP from the (possibly inaccurate) set of points,
the symmetrizing cannot be done with these operators. It would be necessary to symmetrize first the
operators themselves.
operators in the spherical harmonics representation
A point group operator R transforms a spherical harmonic Zlm() into a linear combination of the (2l+1)
functions Zlm'() , m'=l...l. SYMZLM computes the matrix Zl of that representation (More precisely,
SYMZLM computes for a sequence of operators all Zlm -representation matrices up to a maximum l-value).
By spherical harmonics we understand here the real-valued spherical harmonics (23.30) as they are
employed in BAND.
The matrix elements of Zl are
l
*
*
1
Zmm' = 
 Zlm'() RZlm() d = 
 Zlm'() Zlm(R ) d
(20.36)
The integrand is an angular polynomial of degree 2l, so that (20.36) can be evaluated exactly by numerical
integration with a formula of the appropriate degree of precision. SPHPRD generates a product formula in
cos and  of any desired degree. Let the points and weights of the formula be i(x,y,z)i wi , i=1..n. The
matrix elements (20.36) are then computed by
1. generate an integration formula of the appropriate degree: SPHPRD (or SPHPNT, the general routine for
spherical integration).
2. calculate Zlm(i) , i=1..n for all required (l,m): VZLM.
3. rotate all points to obtain R
1
i. Since the operator R and the points (x,y,z)i are both in the coordinate
representation, this is straightforward.
1
4. call routine VZLM again, now with the rotated points as input, to evaluate Zlm(R ) , i=1..n.
l
5. compute Zmm' as
l
Zmm' =

i
*
1
wi Zlm'(i) Zlm(R i)
(20.37)
l
SYMZLM calculates Zmm' for all l-values l=0..lmax and for a sequence of noper operators. The result is one
large array oprzlm. It contains successively for all operators the (trivial) 11 l=0 matrices, then for all
operators the 33 l=1 matrices, and so on.
Symmetry
116
sym m etry a d a p ted f unctio ns
The symmetry property of the crystal integration grid allows the computation of all symmetry components
of any function from the function values in the integration points. To construct in this way from a set of
functions the combinations that transform as irreps, would be a time consuming procedure. It is easier to
calculate the combination coefficients from the analytical properties of the functions.
It is natural to set up the irreps in a crystal in two steps. First we take a point k in the (first) BZ, by which we
specify an irrep of the translation group. The generators {t;R} of the little group of k (i.e. Rk=k) are then
used for a further reduction to the irreps of the space group. For a general point k there are no such
operators (except the identity) and we are finished immediately. Many of the k-points that are used in an
average calculation have more symmetry however, because they lie often on symmetry elements (reflection
planes, rotation axes) in k-space; the central -point k=0 even has all generators in its group.
Let {t;R} be the operators of the k-point under consideration. The centralizing operators {t;R} constitute a
group and can be treated like any point group: analysis of the multiplication table to compute the classes,
characters and irreps. (For the moment we neglect the translational symmetry k).
n

Denote the matrices of the irreps by Dij , i,j=1...n ; n is the dimension of irrep . The projector Pi ,
which picks up from a given function the component corresponding to irrep  and column n is derived from
the orthogonality theorem of the theory of groups

{t;R}
NG


1
Dij({t;R} ) Dkl({t;R}) = n il jk 

(20.38)
NG is the number of operators. The projector is
n
n
Pi = N
G

{t;R}

1
Din({t;R} ) {t;R}
(20.39)
i is an arbitrary row-index of the representation matrix. It is usually kept fixed for all column-indices n; this
assures for instance that the hamiltonian matrices of all partner representation columns are identical.
Suppose now that we have M functions which span a reducible representation, and that we have the
operators {t;R} as MM matrices in the representation of that function set. (20.39) gives then also the
projector as an MM matrix. This projector matrix has eigenvalues 0 and 1. The eigenvectors corresponding
to the latter are the n-adapted functions. The combination coefficients can thus be computed by
diagonalization of the projector matrix. Equivalently they can be found by a Schmidt orthogonalization
procedure: take subsequently every column of the projector matrix, orthogonalize it on those obtained
before (and normalize). Some columns yield the zero vector; these singular solutions correspond to the
eigenvalues zero and can be discarded.
Symmetry
117
(The apparent arbitrariness in the Schmidt orthogonalization procedure (that we may take as the first
solution any non-zero column) corresponds to the fact that all non-singular eigenvectors are degenerate.
Any linear combination is also a solution. So it is also arbitrary which eigenvectors are actually computed,
although this may not be realized when one implements a call to some standard diagonalization routine).
The functions employed in BAND fall in two classes: spherical harmonics and plane waves. We discuss first
the spherical harmonics. The plane waves will be treated essentially in the same way.
spherical harmonics


Consider a function flm(r) =P(r )Zlm() centered on atom . A symmetry operator {t;R} maps flm(r) on
a symmetry equivalent atom , possibly  itself, and in addition it may rotate and/or reflect the orientation.
The radial part P(r) is not relevant here and can be omitted from the discussion. The degree l of the angular
polynomial is not changed by any operator because symmetry operators are linear operators. The functions

spanning a representation can thus be limited to {Zlm() }m=-l..l , where  runs over the equivalent atoms.
We deal with the translational symmetry later and restrict  to the atoms in the unit cell, N in number say. A
spherical harmonic Zlm() generates then a representation of dimension M=(2l+1)N.
l
SYMZLM computes for the point group operators R the matrices Zmm' which are their Zlm -representations
(see elsewhere in this section). From this we construct {t;R} as the MM matrix. This matrix is in a natural
way divided in NN blocks of size (2l+1) each. If {t;R} maps atom  on the equivalent atom , then the
l
corresponding matrix Zmm' occupies the (,)-block of the large matrix; the other blocks in the -th blockcolumn are zero. The complete MM matrix is determined by filling the appropriate block in each blockcolumn. Viewed as a NN matrix with unity for the occupied blocks and zero otherwise, we have the
operator as a permuation of the equivalent atoms in the unit cell.
Summation (20.39) yields the projection matrix Pn , which may then be diagonalized, or treated by the
Schmidt-orthogonalization method. This gives us all distinct n-adapted functions.
translational symmetry
n,l
Incorporation of the translation symmetry is as follows. Denote by cm , =1..N, m=l..l the coefficients

which define a n-adapted function as discussed above. Let flm(r) =P(r )Zlm() be a function to be
symmetry adapted and k the k-point under consideration. The required crystal function is then
k,n
l
(r) =

T
m
n,l 
cm flm(rT) eik·(R+T) =

m
n,l

cm eik·R 
 flm(rT) eik·T

T
(20.40)


The term in brackets is the familiar bloch sum of the generating function flm(r) ; R is the position of atom
. The procedure is thus straighforward: compute first the coefficients {c} from the operators. Then, for
each k-point: construct the bloch sums, calculate the phase-factors e
ik·R
and combine (20.40).
Symmetry
118
plane waves
The plane waves ei(k+K)·r are characterized by a point k in the BZ and a reciprocal lattice vector K. For a
given k a representation is generated by the star of K: all K that transform into each other under the
operators R. The matrix Z that represents {t;R} in this function set has elements Z =eiK'·t if RK'=K and
KK'
zero otherwise. Diagonalization of the projection matrix (20.39) (or Schmidt orthogonalization) gives the
n
coefficients cK for the symmetry adapted combinations
k,n
K
(r) = eik·r

K'
n
cK eiK'·r
(20.41)
The translation symmetry is incorporated automatically by the prefactor eik·r .
fit functions
The fit functions for the charge density are combined into symmetry adapted functions. The only
representation needed is the totally symmetric one; in particular the translation symmetry is k=0.
The projector (20.39) for the totally symmetric representation is simple. The dimension of the
representation is n =1 and all representation matrices D are scalars, equal to 1.
ZLMPRJ, with auxiliary routine ZLMPR1 constructs the projector. Via a Schmidt orthogonalization all
distinct symmetric functions are obtained from a given generating spherical harmonic fit function (plane
wave fit functions are not employed in BAND).
Given the number of generating one-center fit functions for each type of atom and their l-values the total
number of symmetric fit combinations can be computed. This is done in FITSYM to assist in the efficient
workspace organization (SS^workspace). The coefficient vectors describing the fit combinations, calculated
in ZLMPRJ, are used in FITPNT during the interpolation and bloch summation (k=0) of the one-center
functions to construct the symmetry adapted fit functions.
basis functions
Symmetry adaptation of the basis functions has not yet been implemented in BAND. Since that would
increase the efficiency enormously it may be one of the future developments of the program; we spend some
words here on a few aspects of a possible set-up.
1. The computation of the combination coefficients, which define the symmetry adapted functions in terms
of the primitive one-center functions and their bloch sums, can proceed as described before, repeating
the procedure for each k-point.
Since in each k-point the local symmetry group is a subgroup of the space group and all generators
belong to the local group of k=0, it is also possible to determine the combination coefficients only once,
using all generators (i.e. we do it for the local group of k=0). The coefficients can then be used in each
Symmetry
119
k-point and define suitable functions (20.40 and 20.41). However it has then to be determined yet,
which of the irreps of the k=0 group are coupled in the k-point under consideration.
2. The gain in efficiency from a symmetry oriented basis set is twofold. First: in each k-point the
hamiltonian Hk is automatically block-diagonalized. The off-diagonal blocks are zero on grounds of
symmetry and we don't have to compute them in EIGSYS. Similarly, in the construction of the density
(RHOPMT, see SS^charge density) the double loop over the basis functions is reduced to a sequence of
shorter double loops: the P-matrices are also block-diagonal.
The additional level in the multiple loop structures (the loop over the irreps) requires that we know the
partition of the basis functions over the irreps and suggests that the basis functions of the respective
irreps should be grouped together.
The computation of the basis functions in BASPNT may be organized in analogy with the
construction of the fit functions in FITPNT (with the symmetry coefficients from ZLMPRJ). For a proper
dimensioning of arrays one could use routines like FITSYM and ZLMPRJ to determine in advance the
number of symmetry adapted functions for each irrep. This information can be stored on file for usage
by EIGSYS, RHOPMT etcetera.
3. The second enhancement of efficiency results from the numerical integration. When the functions
correspond to irreps, a large number of integrals is zero on symmetry grounds and the remaining
integrands are (or can be chosen to be) totally symmetric so that the integration has to run only over the
symmetry unique points. This saves of course CP-time in the evaluation of the matrix elements. A
similar improvement results for the construction of the density. Consequently we have to store on file
only the function values in the generator points; this reduces also the demand on disc storage facilities,
which is currently a bottleneck in the program.
4. A slight complication is that the irreducible wedge is a different thing for different k-points. For each
k-point we have to determine anew which integration points are the symmetry unique points.
Then, for the evaluation of the potential matrix elements (EIGSYS) we have to expand the potential,
known in the truly unique points (i.e. symmetry unique for the complete space group), over the
integration points used in that k-point. In the current set-up the potential has to be expanded over all
integration points; this is done with help of the symmetry information on the points file itpnt: the points
are organized in sets of equivalent points (SS^integration). In a symmetry oriented set-up we have to
know analogously the relation between the truly unique and the locally unique point sets. Since this
depends on the k-point we may need a different point file for each k-point (more probably we will have
different sections on one point file, since the total amount of these data is limited).
The same type of k-dependent information is also needed in the construction of the density (SS^charge
density): From the P-matrix (RHOPMT), or from the hamiltonian eigenfunctions (RHOPSI) the density is
computed in the points which are used for the basis functions; finally the density is contracted to the
truly unique points (and symmetrized in the proces by averaging, see elsewhere in this section).
Temperature
120
21
Temperature
Bandstructure calculations usually
assume zero temperature. A finite
temperature implies, among other things,
a different distribution of electrons over
the eigenstates. At T=0 the fermi energy
eF is a sharp boundary between fully
occupied and completely empty states.
This is the limiting case of the general
fermi-dirac distribution at temperature T,
giving the occupation of a state with
energy e as
occ(e) =
1
(21.1)
1+e(eeF)/kT
eF is fixed by the condition that the
summation of occ(e) over all one electron
Fig.8. Fermi-dirac distribution with a Riemann-Stieltjes
like integration.
states yield the total number of electrons.
In finite systems, like molecules and
clusters, the eigen energies are discrete.
The determination of eF and the
occupation numbers is then straightforward. In bandstructure calculations this is more complicated as we
have bands of eigenstates. Only states at a few discrete k-points in the Brillouin Zone are computed. These
states may be thought to represent all states of the surrounding region in k-space and the occupation
numbers are then associated with the fraction of represented states that is occupied. However, the
occupations are calculated as weights belonging to a particular numerical integration scheme in k-space
(SS^BZ-integration) and the values may be unexpected, even negative in some cases. Incorporation of a
finite temperature in that integration procedure is not obvious. We approximate therefore the effect of the
correct fermi-dirac distribution in another way.
Of course we are not interested in the occupation numbers themselves but in quantities that are computed by
a summation over one-particle states. Let A be such a quantity and An(k) the contribution from state
n(k;r), A is then computed as
A=

kn
cn(k) An(k)
(21.2)
The integration over k-space has been replaced by a summation over discrete k-points and index n runs over
the bands. In case of the density for example A=(r) and An(k) =n(k;r)2. As mentioned above the
employed numerical integration method in k-space implies that the coefficients cn(k) cannot simply be
Temperature
121
associated with fermi-dirac distribution numbers. The true expression however, taking all infinitely many
k-points into account, would be
A=

BZ

n
occ(en(k) ) An(k) dk
(21.3)
where occ(e) is given by (21.1).
The value of A depends on eF and T via occ(e) in (21.3), so A=A(eF ,T).
Consider now the fermi-dirac distribution and imagine (fig.8) a Riemann-Stieltjes or similar integration
scheme in the variable occ on the interval [0,1]. The points oj , 0<o1 <o2 <...<oN <1, have weights wj such
that wj is the part of the occ-interval represented by oj ; ej is the energy value corresponding to oj via
equation (21.1): oj =occ(ej ). Hence (21.3) can be approximated by
A(eF ,T) 

j
wj A(ej ,0)
(21.4)
Our k-space integration procedure yields coefficients cn(k) corresponding to T=0, but depending on the
fermi energy: cn(k) =cn(k;eF) , so that A may be computed as
A(eF ,T) 

j

cn(k) =

j
wj A(ej ,0) 

wj
j

kn
cn(k;ej) An(k) 
 cn(k)
kn
An(k)
wj cn(k;ej)
(21.5)
(21.6)

The final coefficients c are thus defined as numerical integrals. The integration variable is occ, on the
interval [0,1], so (21.6) has to be read as

cn(k) =

j
wj cn(k;e(oj)) 

j
wj cn(k;oj)
(21.7)
where e(occ) is the inverse function of (21.1).
The accuracy of the result depends of course on the functional dependence of the coefficients c on occ and
on the employed numerical integration method. One would like to use some high precision method like
Gauss-Legendre quadrature. There are two types of problems. In the first place c may have discontinous
derivatives at occ-values, for which e(occ) is a band-edge or another van-Hove energy. This may be
remedied by splitting the interval [0,1] in appropriate subintervals and using separate numerical integrations
for each of them. In the second place the energy e(occ) has infinite derivatives for the end-values 0 and 1;
the same may be true for the coefficients c, at occ=0 or 1, as well as at band edges, so that some variable
transformation may be needed to make Gauss-Legendre quadrature applicable.
Temperature
122
At the current state of affairs BAND uses a straightforward Legendre integration without bothering about
these problematic aspects.
The temperature T and the number of integration points (nfdirc in the program) are input via the keys
and
(or
respectively; default values are nfdirc=2 and T=10. The latter has the effect of
T=0 on the final results: in energy units a temperature of one degree equals approximately 3106 a.u. so
that on the scale of interest (103 a.u.) a temperature of 10 degrees gives virtually a step function for the
fermi-dirac distribution.
The temperature is in BAND represented by the variable tfdirc; its numerical value is the temperature in
atomic energy units (input is in degrees however).
When the SCF procedure has convergence problems, detected in SCFTST (SS^iteration), nfdirc is increased
by 2. Convergence problems are often related to near degeneracies around the fermi-level; a more detailed
approximation of the correct distribution of occupation numbers over the involved states may facilitate the
iterative procedure (but usually this does not help much).
Workspace
22
123
Workspace
d yna m ica l a llo ca tio n o f a rra ys
BAND needs many arrays to store data: the coordinates of the atoms, the characteristics of the basis
functions, the values of functions in the integration points and so on. Fixed sizes for all these arrays are
unwieldy. With any configuration of fixed sizes we will soon encounter an application that does not fit,
while just a redistribution of the claimed workspace would suffice. This would then necessitate a
recompilation of the program with adapted array bounds. Fixed sizes imply also that in every run we pay for
memory we do not use.
FORTRAN77 does not support dynamical array allocation. BAND simulates dynamical allocation of arrays
with subroutine ARRAYS. We will refer to this routine and the related data structures as BAND's workspace
manager.
The total workspace available is fixed of course (FORTRAN77) and consists of two large arrays in (blank)
common, one for integers, iwork, and one for reals, rwork. Whenever an array is needed in the course of the
program execution ARRAYS is called. It gives back which element of iwork or rwork respectively is to be
used as starting address. Later it may be called to release the claimed space again, when the allocated array
is not needed anymore.
A typical structure would be
n = ..
call arrays ('allocate','real','store',n,ii)
call b(n,rwork(ii),....)
..
call arrays ('delocate','real','store',....)
(22.1)
end
subroutine b (n,store,....)
dimension store(n)
..
The allocated arrays are identified by a name, which is stored by ARRAYS in a list of names, together with
their positions in iwork or rwork. Between each pair of allocated arrays in iwork or rwork one word is kept
free. ARRAYS writes a marker there: a specific integer or real value that is improbable to occur often. As
soon as ARRAYS is called to give the space free again, it checks whether the markers directly before and
after the array are intact: global array-bounds checking afterwards.
ARRAYS discriminates two types of data: integers (type 1) and reals (type 2). Array arname(nnames,2)
contains the names of the currently allocated arrays. The maximum number per type, nnames, is a constant
Workspace
124
in the program. Indexw contains the locations of the arrays: indexw(i,j) is the position of the marker before
the i-th array of type j (j=1,2). ncwork stores how many arrays are currently allocated: the number of
allocated arrays of type j is (ncwork(j)1). The relevant data are stored in the common blocks CNTRLC
(data of type 'character') and CNTRLV (other variables).
One of the arguments of ARRAYS is action, a string to identify the purpose of the call. action may have the
values 'allocate' or 'delocate' with obvious meanings. Furthermore it may be
'available' to find out how much space is still free,
'space' to allocate an array as large as possible. This is a combination of 'available' and 'allocate'; output are
the pointer for the array and its size.
'check' to check the markers around a specific array,
'init' to initialize the information structure in ARRAYS (in this way it is called only once of course, by INIT),
'dump' to output the current state of affairs,
'free', which implies 'delocate' for the specified array and for all arrays of the same type that were allocated
after it.
remarks
1. If several arrays have been allocated, their de-allocations may be executed in any order. However, the
space of a delocated array is not available for subsequent allocations until all arrays after it have been
given free as well: free holes in iwork and rwork are not used, only the free final sections.
2. ARRAYS is also used to manage array allocation for logicals and complex numbers. Logicals are treated
on the same footing as integers; complex data are treated as reals, taking two real words for each
complex. The size of an array as specified in the argument list is therefore multiplied by m when sizes
are judged; m=1 in all cases except for complex numbers, where m equals 2.
3. It is possible in principle to have only one workarray in blank common and allocate both integer and
real arrays in it; this might be more efficient as we may now have a too small real array rwork and
wasted space in iwork, or vice versa. The gain would be small, because the real array is an order of
magnitude larger so that some extra space from the integer array is not very important.
The reason to keep integers and reals separate is that type-mixing may cause problems on some
machines. The same might of course be true as regards the mixing of logicals with integers and complex
numbers with reals. In that case the number of types used in ARRAYS should be extended; the necessary
adaptations will be obvious from inspection of the implementation.
o p tim ized vecto r leng ths
Even with the optimized use of available workspace, by way of the pseudo-dynamical allocation procedure
just discussed, it is impossible to have all data in core simultaneously. BAND evaluates most integrals
numerically. The relatively large number of integration points necessitates the storage of data on file, such
as the values of the basis functions in the points. Each time we need them, for instance to compute matrix
elements of the potential, they are read and used. In general only part of these data can be kept in core at the
same time and the numerical integration has to be performed in a number of steps. At each step the values
Workspace
125
corresponding to one block of points are read, processed in the computation and then replaced by the next
block of data.
The heart of the computation consists of vector operations; the length of the vector is the number of points
in a block. It is advantageous to have the vectors as long as possible or, to be more precise, to have as few
blocks as possible, because each extra block costs one more start-up of the vector operation.
For organizational reasons the block structure is the same on all involved files. By the block structure we
understand here the number of blocks (of points) and the length of each individual block. In each routine
where we have to handle such data the available workspace is determined by what other data must currently
be kept in core, so that for each routine there may be a different maximum possible vector length. Routine
WRKORG computes these values for all sections in the preparation stage and gives back the minimum over
them: the absolute maximum vector length for the whole preparation part: npx. This can then be used to
write the various data to file in the appropriate block structure.
As regards the SCF part alone, the maximum possible vector length is often much larger than in the
preparation stage: in the preparation part we need at certain moments the overlap matrices of the valence,
core, or fit functions; these occupy considerable parts of the total workspace. REORGF computes the SCF
vector length and reorganizes the relevant files after PREPAR and before SCF; auxiliary routines for
REORGF are REORG1, -2 and -3; each of these rewrites a (number of) specific files.
The determination of the block structure for PREPAR has two complications.
a) In BASPNT the bloch sums of the basis functions are computed for all employed k-points of the Brillouin
Zone. Much of the involved computational work does not depend on the k-point. Hence it seems obvious to
do this for as many k-points together as possible, instead of repeating the work for each k-point. However
this implicates that more basis function data have to be kept in core (proportional to the number of k-points
processed simultaneously), thereby reducing the maximum possible vector length in BASPNT; this again
makes the execution more expensive by increasing the cost of the vector operations.
To achieve the optimal situation we have to minimize some sort of cost function. This function has to
depend both on the vector length and on the number of k-points handled together. The optimal balance may
be determined by many conditions and not in the least by the machine; we find for example that the optimal
situation on the Cyber205 differs markedly from that on the Cyber995.
We have implemented a cost function (in RPNTRE), but the form will not be discussed here: it is only a first
guess; a systematic experimentation with it might very well yield a significant improvement in performance
of the program.
The number of k-points treated together is the variable kgrp. By default it is determined by RPNTRE, but
this may be overruled by fixing its value via input, with key
.
b) For some functions, in particular the fit functions, we need only the values in the symmetry unique points.
The maximum vector length according to the fit section is thus the maximum number of symmetry unique
Workspace
126
points in a block. The absolute maximum vector length can then be computed if we know the ratio between
the symmetry unique points and the total number of points in a block. However this is not a fixed value (e.g.
the number of symmetry operators): some points may not be general points. If they are located on a
symmetry element, such as a reflection plane or a rotation axis, the number of equivalent points is only a
fraction of the number of symmetry operators. So the ratio depends on what kind of points are in a
particular block. Then, assuming a quotient (the variable symfrc in the program) it must be seen afterwards
whether the actual value in the realized block structure does not deviate so much from it as to cause
problems; RPNTRE checks this and adapts in such (exceptional) cases the block structure.
WRKORG is called first from PREPAR to obtain a first assessment of the maximum vector length npx;
symfrc has been initiated (INIT) at the safe value 1.0. In RPNTID the file with integration points itpnt, output
from POINTS, is rewritten in accordance with the value of npx; meanwhile the maximum fraction of
symmetry unique points, symfrc, is determined. Then, in RPNTRE, the cost function mentioned above and
the new value of symfrc are used to optimize the points-file. The actual rewriting of itpnt occurs in the
auxiliary routine RPNTRW.
XC: exchange and correlation
23
127
XC: exchange and correlation
The exchange and correlation (XC) interactions between the electrons are approximated with the
density functional (DF) formalism. Many DF formulas have been advocated in the literature for the XC
potential. The first proposal was by Slater [Slater 1951], later parametrized to the famous X potential.
More recently the Gunnarson-Lundqvist (GL) [Gunnarsson and Lundqvist 1976] and the Vosko-WilkNusair (VWN) [Vosko et al. 1980] form were published. All these approximations are based on
homogeneous electron gas calculations. BAND has the possibility to take either of these. The VWN
function is presumably the most accurate, but for sake of comparison the other options are occasionally
useful.
Even the more sophisticated DF functions, like VWN, are rather severely in error, depending on the
(poly-) atomic system. The error is roughly 10% in the exchange part and up to 100% in the correlation
[Stoll et al. 1978, Gunnarsson and Jones 1985, Becke 1986]. The signs of these deviations are opposite
and the sizes such that the net result, as regards the self consistent charge density and the total energy,
is often fairly good. Not in all cases however and from a more principal point of view this situation is
of course unsatisfactory.
According to Stoll et al. [1978] the correlation error is mainly due to an overestimation of the
correlation between electrons of the same spin. The program has therefore the option to suppress this
term, partially or completely. To prevent the total result from getting worse we should of course also
repair the exchange error. That can be done to a large extent by including a non-local term, a function
of the gradient of the density [Becke 1988]. Implementation in BAND has not yet been undertaken. It is
not very difficult and will be discussed at the end of this section.
lo ca l XC f unctio ns
The energy density and potential will be denoted by (r) and (r) respectively.
Define:

+() is the charge density of spin-up (-down) electrons.

 = + + 

 = (+) /  is the spin polarization function
f() = ((1+)4/3+(1)4/32) / (24/32)
3 1/3
rs = 
4π

4 1/3
 = 9π
 
is the Wigner radius
(23.1)
XC: exchange and correlation
128
EXC = 
 dr (r) XC(r)
The energy density and the potential derived from it can conceptually be split into an exchange and a
correlation part
XC  XC(rs,) = X + C
rs dXC
dXC
XC(rs,) = ± ( XC) = XC  3 dr ± (1+ )
= X + C
d
s
d
(23.2)
d
±
The exchange part is given for all local spin density functionals by
 3  3  (+)4/3 + ()4/3
3 

4/3
4/3

 X = 
 ((1+ + (1) ) = 2 4π


8πrs
1 
1/3
 (1±)
π
 rs
X = 
±
= 2  3  (±)1/3
4π


(23.3a)
(23.3b)
The difference between the various XC functions is the treatment of the correlation.
X-alpha
In the X approximation the correlation part is represented by parametrizing the exchange term
3
XC = 2 X X
(23.4)
and similarly for XC. The X parameter X is usually taken between 0.6 and 1.0.
Gunnarsson-Lundqvist
p
f
p
C = C + (CC) f()
(23.5)
p and f indicate the para- and ferro magnetic states. (=0,1).
j
C = cj
{((1+3j ) log(1+ 1/j) + 1/2 j - 2j  1/3}
j = rs /dj
(23.6)
j=p,f
cj and dj are constants
cp = 0.0333 a.u.
dp = 11.4 a.u.
cf = 0.0203 a.u.
df = 15.9 a.u.
(23.7)
Vosko-Wilk-Nusair
p
f
p
C = C(rs) + 9/8 (24/3 2) C(rs)(14) f() + (C(rs)C(rs)) 4 f()
(23.8)
XC: exchange and correlation
p
129
f
The values of the parameters C, C and C are all given by a function H(rs,xi,Bi,Ci,Qi,CAi,CBi,CCi ).
The arguments are different for the three cases, i=p,f,.
H(rs,x,B,C,Q,CA,CB,CC ) = CA
{log(rs )  2CB log(√rsx) +
(23.9)
)}
(
+ (CB1) log(rs+B√rs+C) + CC arctan Q/(2√rs+B)
The constants x,B,C,Q,CA,CB and CC are given in table II.
The implementation of the function H needs a little care. As the density approaches zero several terms
in (23.9) become unbounded. The 'infinities' cancel each other analytically. To prevent rounding errors
and other numerical problems from influencing the result a few terms are recombined. Set g=(3/4π)1/3 ,
rs =g1/3:
H = CA
{log(g) + log(
)
1/3
 2CB (√gx1/6)  2CB log(1/6) +
(
)
+ (CB1) log(1/3) + (CB1) log(g + B√g1/6 + C1/3) + CC arctan Q1/6/(2√g+B1/6)
}=
= CA
{log(g)  2
(
)
(
)
CB √gx1/6 + (CB1) log g + B√g1/6 + C1/3 +
(
)
+ CC arctan Q1/6/(2√g+B1/6)
}
(23.10)
Sto ll's co rrectio n f o r the co rrela tio n
As mentioned before it may be argued [Stoll et al. 1978] that the commonly applied DF formalism,
based on the homogeneous electron gas, overestimates the correlation. They propose a correction
yielding an effective correlation function
eff
correc
C (rs,) = C(rs,)  C
(rs,)
(23.11)

correc
C
1
+

1
(rs,) = 2 (1+) C(rs ,=+1) + 2 (1) C(rs ,=1)
3 1/3
±
rs = 
4π±
(23.12)
(23.13)
The corresponding effective correlation potential is
eff ±
C
±
±
= C  C(rs ,=±1)  ±
d
d±
±
C(rs ,=±1)
(23.14)
BAND applies the correction with an (input specified) multiplication factor  between zero and one.
XC: exchange and correlation
eff
130
correc
C = C   C

(23.15)
implementation
p

f
The local density functional (LDF)
x
.10498
.325
.0047584
formulas above are evaluated by two
B
3.72744
7.06042
1.13107
main routines: XCENER and XCPOT
Q
6.15199
4.73093
7.12311
for the energy density and potential
.01688685
respectively. Input are a vector of
.000414034
density values and information
CA
.0310907
CB
.0311676
.144601
.01554535
CC
1.24742
3.3766
regarding the type of DF formula.
.317708
The GL form is not yet available
Table II. Constants in the VWN formula for the
because the evaluation of the energy
correlation functional.
density has not been implemented
thusfar.
A few auxiliary routines compute various (parts of the) formulas occurring in the VWN approximation.
These are XCVWND, XCVWNE, XCVWNF, XCVWNP.
The variable ioptxc determines the type of formula (1:X , 2:VWN, 3:GL); xcpar provides additional
information: for the X formula it is the usual X parameter, in the other cases it is the amount of
Stoll-correction applied ( in 23.15). Default is the VWN formula without Stoll's correction: ioptxc=2,
xcpar=0. Input is with the keys
,
or
. The last two request both the VWN formula to
be applied. In all three cases the key record may contain a value for xcpar; defaults respectively
0.7, 0, 1.
g ra d ient co rrectio n f o r the ex cha ng e
In recent years considerable success has been met in the elimination of the exchange error by inclusion
of a term that depends on the gradient of the density. Becke [1988] gives a function for the exchange
energy
LDA
EX = EX
LDA
EX

4/3
 ∫

x
(1+6xsinh1x)
dr
(23.16)
is the exchange energy in the local density approximation, given by (23.1) and (23.3),  is the
spin index,  is a constant whose value may be fitted to the exact Hartree Fock exchange energies for
atoms (0.0042 a.u.) and the dependence on the gradient is via
x 


Incorporation of this function is straightforward if the density gradient can be evaluated in the
integration points. The density in the crystal is known as an expansion in basis functions i(k)
(23.17)
XC: exchange and correlation
=
ij
k
131
k
Pij i(k) j(k)
(23.18)
k
Pij is the density matrix. For the gradient we have then
 =

k
ij
k
Pij (i(k)j(k)+i(k)j(k)) =
  (Pkij+Pkji)
k
ij
i(k)j(k)
(23.19)
The basis functions are either plane waves eik·r giving eik·r =ik·eik·r , or bloch sums of one center
functions i
i(k;r) =  i(rR) eik·R
(23.20)
R
The one center functions have the form
(r) = Zlm() P(r)
(23.21)
where the radial function P(r) is an exponential function P(r)=rl+k er or a numerical orbital from the
free atom subprogram DIRAC. In both cases the gradient i and hence i(k) can be computed.
For the evaluation of the gradient of the density we need in this formulation the derivatives of the basis
functions i(k). These values, in each integration point three numbers per basis function, will have to
be stored on file. In the current set-up of BAND the data of the basis functions themselves are already a
major bottleneck for the application to large systems so that this approach poses a serious problem. It
may therefore be advisable to employ the fit functions fi that are also used for the evaluation of the
coulomb potential. The density is approximated by
 =   + def 



 +  ci fi
(23.22)
i
 is the spherically symmetric charge density of free atom , def is the deformation charge
def  crystal   
(23.23)

The fit functions are bloch sums (23.20) (with k=0) of one center functions of the form (23.21). The
total number of one center functions that are used for the fit is of the same order as the number of one
center functions for the valence basis. The latter are combined in bloch sums for each k-point, the fit
functions however only for k=0. Furthermore BAND employs as fit functions only the totally symmetric
combinations of one-center functions (SS^symmetry), while for the valence basis of course all irreps
are used. So evaluation of  via (23.22), instead of (23.19) requires less additional data to be stored;
in most cases the difference will at least be an order of magnitude.
A second advantage is the computing time. Expression (23.22) leads to



 +

i
ci fi
(23.24)
XC: exchange and correlation
132
The striking difference with (23.19) is that not only the summation over the k-points is absent, but that
also the double summation over the functions has been replaced by a single sum.
derivatives of one-center functions
We conclude this section with the derivatives of one center functions of the form (23.21). It will be
convenient to express the desired derivatives with respect to the cartesian coordinates in spherical
coordinates. So we use
d/dx = ∂r/∂x d/dr + ∂/∂x d/d + ∂/∂x d/d
(23.25)
and similarly for d/dy and d/dz, with
x2+y2+z2 ,
r=
 = acos(z/r),  = arctan(y/x)
(23.26)
to write
d/dx
   cossin
d/dy =  sinsin
d/dz  cos
cossin/r
sin/rsin
sincos/r
cos/rsin
sin/r
0
d/dr
  
 d/d
 d/d
(23.27)
For an exponential function P(r)=rn er the radial derivative is
d n r
= (n/r ) rn er
dr r e
(23.28)
If P(r) is a numerical function, given as a table P(ri ) with logarithmically increasing ri , ri+1 /ri =c, the
derivative can be computed numerically by a 3-point interpolation. Define g(ri )R(ri )/ri , then

dP(r ) 
dr = 

i
1
(c+1)2g2  (c+2)g1  cg3
c2
c
gi + 2
g  gi1
c 1 i+1
1
cg
+ (2c2+1)gN  (c+1)2gN1
2
c 1 N2
(
)
(
i=1
)
(
i=2..N1
)
i=N



(23.29)
derivatives of spherical harmonics
The angular derivatives operate on the spherical harmonics. BAND uses real-valued spherical
harmonics
cl m Pl m(cos) cos(m)
Zlm() =  m m
cl Pl (cos) sin(m)
m
m≤0
m>0
m



Pl is the associated Legendre function and cl is a normalization factor (m≥0)
(23.30)
XC: exchange and correlation
m
cl = ()m
133
(2l+1) (lm)!
4π (l+m)!
(23.31)
The derivative with respect to  is obvious. For dZlm /d the recurrence relations and the definitions of
the Legendre functions can be used [Arfken 1970].
1
m
m
m-1
m+1
dPl /d = sin dPl /dcos = 2 ((l+m)(lm+1)Pl  Pl )
(23.32)
From the relation to the Legendre polynomials [Arfken 1970]
m
Pl = sinm  (d/dcos)
m
Pl
(23.33)
we may alternatively derive
m
m
m+1
dPl /d = (m/tan) Pl  Pl
(23.34)
The first term in (23.34) presents a problem for sin=0. For m=0 this term should be omitted (23.33),
m
while for m≠0 P contains a factor sinm  removing the singularity. In view furthermore of the
l
restriction m≤l a possible set-up is
1. use explicit expressions for the derivatives of Zlm with l=0,1
2. for l≥2, m=l+1,....,l1 use (23.32)
3. for m= l use (23.34)
m+1
4. for m=l combine (23.32) and (23.34) to eliminate Pl
m
m1
dPl /d = (l+m)(lm+1) Pl
:
m
 (m/tan) Pl
For =0,π/2 the first term in (23.34), respectively the second term in (23.35) have to be omitted.
(23.35)
Software reference list
134
Software Reference List A: subroutines
:
tstat............................................ 32
dirac ............................................ 8
cntrl-.................................... 31–33
prepar ........................................ 11
stopit ................................... 32–33
points ........................................ 12
atmfnc ....................................... 34
vauvw........................................ 15
rhofit ......................................... 35
machin ...................................... 15
coulom ...................................... 35
dirac .......................................... 16
fit- ............................................. 36
sltorb ......................................... 16
fitpnt.......................................... 48
planew....................................... 16
fitra- .......................................... 49
hamfix ....................................... 16
fitra- .......................................... 49
bas- ..................................... 16–20
basovl ........................................ 56
simtrf......................................... 20
popana ....................................... 56
kpnt- .......................................... 24
dos- ..................................... 16–20
simpls ........................................ 24
atmfnc ....................................... 61
lattpt .......................................... 25
elstab ......................................... 61
polygn ....................................... 25
energy ....................................... 61
plgirr ......................................... 25
fl- 16–20
polyhe ....................................... 25
formfa ....................................... 66
pyrrot ........................................ 25
celred ........................................ 66
bzintl ......................................... 26
rotmat ........................................ 68
quad- ......................................... 26
polygn ....................................... 69
quad- ......................................... 26
polyhe ....................................... 70
hybrid ........................................ 26
plgirr ......................................... 70
fermi- .................................. 26–27
simpls ........................................ 71
occ- ..................................... 25–27
headin........................................ 73
emnmxb .................................... 27
getinp ........................................ 73
eigsys ........................................ 28
key- ........................................... 24
pmatrx ....................................... 28
prnt- .......................................... 77
rhopsi ........................................ 29
prnt- .......................................... 77
rhopmt....................................... 29
skip- .......................................... 78
scf 29
skip- .......................................... 78
charge ....................................... 30
testst .......................................... 78
prprts ......................................... 30
points ........................................ 79
rpntid......................................... 30
gemtry ....................................... 79
start ........................................... 31
rpntid ......................................... 79
endof ......................................... 31
rdintg ......................................... 80
skip- .......................................... 32
atmfnc ....................................... 86
sectim ........................................ 32
baspnt ........................................ 86
itimer ......................................... 32
fitpnt.......................................... 86
Software reference list
intdat ......................................... 86
vbnd2i ....................................... 86
condit ........................................ 86
celmax ....................................... 87
radma- ....................................... 24
simpsl ........................................ 87
rhopot........................................ 91
mixitr ........................................ 91
vardat ........................................ 91
scftst .......................................... 92
atmsep ....................................... 96
lattic .......................................... 96
leg- ...................................... 16–20
potapp ....................................... 99
eigsys ...................................... 100
rhoapp ..................................... 101
rhopmt..................................... 101
pntgrp ...................................... 105
symcry..................................... 106
lattpt ........................................ 106
lattch ....................................... 107
symprj ..................................... 109
symzlm.................................... 115
sphprd ..................................... 115
sphpnt ..................................... 115
vzlm ........................................ 115
zlmprj ...................................... 118
fitsym ...................................... 118
fitpnt ....................................... 118
rhopmt..................................... 119
cntrl- ....................................... 124
arrays ........................................ 24
wrkorg ..................................... 125
reorgf ...................................... 125
baspnt ...................................... 125
rpntre....................................... 125
rpntrw...................................... 126
xc- ........................................... 130
alternating sequence
40–45, 50
arrays
123
atoms
positions
75
135
type of
75
basis
orthonormal
12, 16
basis
functions
8, 16
blank common
123
bloch
theorem
6, 107
bloch
functions [Bloch 1928]
8
sum
12, 85, 87
BZ-integration
22
hybrid-quadratic
26
quadratic
26
temperature
27
BZ-integration
accuracy
23
charge
7. (see density)
control
31–33, 78
convergence
95
Chebyshev acceleration
93
core
9, 17, 21
correlation
(see XC)
Stoll's correction;. They propose a correction
yielding an effective correlation function129
correlation
error
127
coulomb potential
10, 33, 50
average
50, 53
damping
88
debugging
65, 78
defaults
72
density
6, 28–31, 50, 110
deformation
10
expansion in spherical harmonics
31
plot
30
density
approximation
101
density of states
(see dos)
dependency
core-valence
18
fit 37
valence
17
dependency
coefficients
17, 37
derivative
angular
132
one-center function
132
radial
132
dimensionality
75
k-space
96, 108, 109
dipole
16–20
DOS
53
gross
55
overlap
55
partial
54
Software reference list
136
plot
56
polygon
70
efficiency8, 14, 20, 27, 29, 48, 98, 101, 111, 118, 119, 125
polyhedron
71
energy
irrep
111, 116
coulomb
58
matrix
116
electrostatic
60
projection
110
error
57
projector
116
kinetic
58
translation group
6, 107
Madelung
61
iteration
88
XC
58
key
73. (see input)
exchange
(see XC)
block
78
exchange
integration
80
error
127
key
execution stack
32–33
block
73
fermi energy
6, 25
record
73
fermi-dirac distribution
120, 121
kinetic energy
files
62
functions
16
format
64, 65
matrix
16, 61
input, output
65
k-point
size
62
equivalence index
24
string
62
lattice
75, 96
files
Bravais
102
manager
62
lattice
fit functions
10, 33–36, 131
sum
13
charge content
34
lattice sum
37
coefficients
35
lines
68
constraint
34
machine-dependency
14, 62, 73
error
59. (see form factors). (see energy)
Madelung
48
generating
35
memory banking conflict
14
plane waves
36
multipole
13, 51
potential
35
n-dimensional crystal
7
form factors
65
nearest neighbour
47
frozen core
(see core)
occupation numbers
9, 25, 28, 110, 120
geometry
67, 75
orientation
96
hamiltonian
6
orthogonality theorem
116
Herman-Skillman
8
output
77. (see print)
input
15, 72
heading
73
core
76
input data
73
defaults
72
plane waves
9, 76
functions
75
symmetry adaptation
118
nuclear charge
75
planes
67
input
P-matrix
21, 28, 101
file
72
point group
115
integration
79
polygons
68
accuracy
80
polyhedra
69
generators
110
population
30, 53
k-space
109
potential
(see coulomb potential)
points
approximation
99
blocks
79, 125
print
symmetry
79
instructions
77
symmetry
110
options
77
integration
RADIAL
48, 76
parameters
80
reciprocal values
76
points
radial
blocks
13
grid points
8
interpolation
48, 85, 16–20
tables
8
irreducible wedge
119
recurrency
33
Software reference list
relativistic effects
results
SCF
screening
screening function
modified
simplex
space group
generators
centralizing
generators
operators
spherical harmonics
definition
representation of operators
spherical harmonics
symmetry adaptation
spin
6
7
13
41
49
23, 71
102
103
102
105
132
115
117
7, 78, 100, 101
137
symmetrization
symmetry
functions
k-space
little group
symmetry
breaking
temperature
timing
unit cell
vector length
vectorization
workspace
workspace
manager
XC
error
X-ray
110
12, 102
116
107
116
12
120. (see BZ-integration)
29
103, 106
13, 125
14
13, 123
123
6, 127
127
(see form factors)
Download