Ionic optimisation
Georg KRESSE
Institut für Materialphysik and Center for Computational Material Science
Universität Wien, Sensengasse 8, A-1090 Wien, Austria
Overview
the mathematical problem
– minimisation of functions
– role of the Hessian matrix
– how to overcome slow convergence
the three implemented algorithms
– Quasi-Newton (DIIS)
– conjugate gradient (CG)
– damped MD
strengths and weaknesses
a little bit on molecular dynamics
The mathematical problem
search for the local minimum of a function f(x)
for simplicity we will consider a simple quadratic function
f(x) = a + b·x + (1/2) x·B·x = ā + (1/2) (x − x₀)·B·(x − x₀)
where B is the Hessian matrix
∂f/∂x = B (x − x₀),      B_ij = ∂²f / (∂x_i ∂x_j)
for a stationary point, one requires

g_i(x) = ∂f/∂x_i = ∑_j B_ij (x_j − x₀,j) = 0
at the minimum the Hessian matrix must be additionally positive definite
The Newton algorithm
educational example
start with an arbitrary starting point x₁
calculate the gradient g(x₁)
multiply with the inverse of the Hessian matrix and perform a step
x₂ = x₁ − B⁻¹ g(x₁)

by inserting g(x₁) = ∂f/∂x = B (x₁ − x₀), one immediately recognises that x₂ = x₀
hence one can find the minimum in one step
in practice, the calculation of B is not possible in a reasonable time-span, and one needs to replace B by a reasonable approximation
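As a minimal numerical illustration (plain NumPy, not VASP code; the 2×2 Hessian and gradient below are made-up numbers), a single Newton step recovers the minimum of a quadratic function exactly:

    import numpy as np

    B = np.array([[4.0, 1.0],
                  [1.0, 2.0]])          # assumed (exactly known) Hessian matrix
    b = np.array([1.0, -1.0])

    def gradient(x):
        return b + B @ x                # g(x) = b + B x for the quadratic model

    x1 = np.array([2.0, 2.0])           # arbitrary starting point
    x2 = x1 - np.linalg.solve(B, gradient(x1))   # x2 = x1 - B^-1 g(x1)
    print(x2, gradient(x2))             # the gradient vanishes at x2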
Steepest descent
approximate B by Γmax, the largest eigenvalue of the Hessian matrix

steepest descent algorithm (equivalent to the Jacobi algorithm for linear equations):
1. initial guess x1
2. calculate the gradient g(x₁)
3. make a step into the direction of the steepest descent
   x₂ = x₁ − (1/Γmax) g(x₁)
4. repeat steps 2 and 3 until convergence is reached
for functions with long steep valleys convergence can be very slow
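A short sketch of this steepest descent loop for the same made-up quadratic model as above (illustration only, not VASP code):

    import numpy as np

    B = np.array([[4.0, 1.0],
                  [1.0, 2.0]])
    b = np.array([1.0, -1.0])
    gamma_max = np.linalg.eigvalsh(B).max()   # largest eigenvalue of the Hessian

    x = np.zeros(2)                            # 1. initial guess
    for step in range(200):
        g = b + B @ x                          # 2. gradient g(x)
        if np.linalg.norm(g) < 1e-8:           # 4. repeat until converged
            break
        x = x - g / gamma_max                  # 3. steepest descent step
    print(step, x)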
Speed of convergence
how many steps are required to converge to a predefined accuracy
assume that B is diagonal, B = diag(Γ₁, Γ₂, Γ₃, ..., Γn) with Γ₁ < Γ₂ < ... < Γn, and start from x₁ = 0, i.e. x₁ − x₀ = (−x₀,₁, ..., −x₀,n)

gradient g(x₁) and x₂ after the steepest descent step are:

g(x₁) = B (x₁ − x₀) = (−Γ₁ x₀,₁, ..., −Γn x₀,n)

x₂ = x₁ − (1/Γn) g(x₁) = ((Γ₁/Γn) x₀,₁, ..., (Γn/Γn) x₀,n)
Convergence
the error reduction is given by

(x₂ − x₀)ᵢ = (1 − Γᵢ/Γn) (x₁ − x₀)ᵢ

(figure: reduction factors 1 − Γ/Γmax and 1 − 2Γ/Γmax plotted for the eigenvalues Γ₁ ... Γ₅; they remain between 1 and −1)
– the error is reduced for each component
– in the highest frequency component the error vanishes after one step
– for the low frequency component the reduction is smallest
the derivation is also true for non-diagonal matrices
in this case, the eigenvalues of the Hessian matrix are relevant
for ionic relaxation, the eigenvalues of the Hessian matrix correspond to the
vibrational frequencies of the system
the highest frequency mode determines the maximum stable step-width (“hard
modes limit the step-size”)
but the soft modes converge slowest
to reduce the error in all components to a predefined fraction ε,
k iterations are required
(1 − Γmin/Γmax)^k = ε

k ln(1 − Γmin/Γmax) = ln ε      ⇒      k ≈ −ln ε · Γmax/Γmin      (for Γmin ≪ Γmax)

k ∝ Γmax/Γmin
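For illustration: with Γmax/Γmin = 100 and a target reduction of ε = 0.01, this estimate gives k ≈ 100 · ln(100) ≈ 460 steepest descent steps.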
Pre-conditioning
if an approximation P of the inverse Hessian matrix B⁻¹ is known, the convergence speed can be much improved:

x_{N+1} = x_N − λ P g(x_N)

in this case the convergence speed depends on the eigenvalue spectrum of PB

for P = B⁻¹, the Newton algorithm is obtained
Variable-metric schemes, Quasi-Newton scheme
variable-metric schemes maintain an iteration history
they construct an implicit or explicit approximation B⁻¹approx of the inverse Hessian matrix

search directions are given by B⁻¹approx g(x)

the asymptotic convergence rate is given by

number of iterations ∝ √(Γmax/Γmin)
Simple Quasi-Newton scheme, DIIS
direct inversion in the iterative subspace (DIIS)
set of points {xᵢ | i = 1, ..., N} and gradients {gᵢ | i = 1, ..., N}
search for a linear combination of xi which minimises the gradient, under the
constraint ∑ᵢ αᵢ = 1:

g(∑ᵢ αᵢ xᵢ) = ∑ᵢ αᵢ g(xᵢ) = ∑ᵢ αᵢ B (xᵢ − x₀) = B (∑ᵢ αᵢ xᵢ − x₀)
the gradient is linear in its arguments for a quadratic function
Full DIIS algorithm
1. single initial point x₁
2. gradient g₁ = g(x₁), move along the gradient (steepest descent): x₂ = x₁ − λ g₁
3. calculate the new gradient g₂ = g(x₂)
4. search in the space spanned by the gᵢ, i = 1, ..., N, for the minimal gradient
   g_opt = ∑ᵢ αᵢ gᵢ
   and calculate the corresponding position
   x_opt = ∑ᵢ αᵢ xᵢ
5. construct a new point x₃ by moving from x_opt along g_opt: x₃ = x_opt − λ g_opt
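A compact sketch of this DIIS recipe for the same toy quadratic function as before (illustration only, not the VASP implementation; the constrained minimisation of the gradient norm is written out with a Lagrange multiplier):

    import numpy as np

    B = np.array([[4.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, -1.0])
    grad = lambda x: b + B @ x
    lam = 0.2                                   # fixed steepest descent step length

    xs, gs = [np.zeros(2)], [grad(np.zeros(2))]
    xs.append(xs[0] - lam * gs[0])              # steps 1-2: initial point + SD step
    gs.append(grad(xs[1]))                      # step 3: new gradient

    for _ in range(5):
        n = len(gs)
        G = np.array(gs)
        # step 4: minimise |sum_i alpha_i g_i| subject to sum_i alpha_i = 1
        A = np.zeros((n + 1, n + 1))
        A[:n, :n] = G @ G.T
        A[:n, n] = A[n, :n] = 1.0
        rhs = np.zeros(n + 1); rhs[n] = 1.0
        alpha = np.linalg.lstsq(A, rhs, rcond=None)[0][:n]
        g_opt = alpha @ G
        x_opt = alpha @ np.array(xs)
        xs.append(x_opt - lam * g_opt)          # step 5: move from x_opt along g_opt
        gs.append(grad(xs[-1]))
        if np.linalg.norm(gs[-1]) < 1e-10:
            break
    print(xs[-1], np.linalg.norm(gs[-1]))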
1. steepest descent step from x0 to x1 (arrows correspond to gradients g0 and g1 )
2. the gradient along the indicated red line is now known; determine the optimal position x1opt
3. another steepest descent step from x1opt along g(x1opt)
4. calculate the gradient g(x₂); now the gradient is known in the entire 2-dimensional space (linearity condition) and the function can be minimised exactly
(figure: points x0, x1, x1opt and x2; the red line is the set of combinations a0·x0 + a1·x1 with a0 + a1 = 1)
Conjugate gradient
first step is a steepest descent step with line minimisation
search directions are “conjugated” to the previous search directions
1. calculate the gradient at the current position, g(x_N)
2. conjugate this gradient to the previous search direction using:
γ = ( (g(x_N) − g(x_{N−1})) · g(x_N) ) / ( g(x_{N−1}) · g(x_{N−1}) ),      s_N = g(x_N) + γ s_{N−1}
3. line minimisation along this search direction sN
4. continue with step 1), if the gradient is not sufficiently small.
the search directions satisfy:

s_N · B · s_M ∝ δ_NM
the conjugate gradient algorithm finds the minimum of a quadratic function with k degrees of freedom in exactly k+1 steps
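A sketch of this conjugate gradient recipe for the toy quadratic model (illustration only; for a quadratic function the line minimisation can be done analytically instead of with trial steps):

    import numpy as np

    B = np.array([[4.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, -1.0])
    grad = lambda x: b + B @ x

    x = np.zeros(2)
    g_old, s = None, None
    for step in range(10):
        g = grad(x)
        if np.linalg.norm(g) < 1e-10:
            break
        if s is None:
            s = g                                        # first step: steepest descent
        else:
            gamma = (g - g_old) @ g / (g_old @ g_old)    # conjugation of the gradient
            s = g + gamma * s
        lam = (s @ g) / (s @ (B @ s))                    # exact line minimisation along s
        x = x - lam * s
        g_old = g
    print(step, x)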
1. steepest descent step from x0: search for the minimum along g0 by performing several trial steps (crosses, at least one trial step is required) → x1
2. determine the new gradient g1 = g(x1) and conjugate it to get s1 (green arrow); for 2d functions the gradient then points directly to the minimum
3. minimisation along the search direction s1 → x2
Asymptotic convergence rate
the asymptotic convergence rate is the convergence behaviour for the case that the number of degrees of freedom is much larger than the number of steps
e.g. 100 degrees of freedom but you perform only 10-20 steps
how quickly do the forces decrease?
this depends entirely on the eigenvalue spectrum of the Hessian matrix:
– steepest descent: Γmax/Γmin steps are required to reduce the forces to a fraction ε
– DIIS, CG, damped MD: √(Γmax/Γmin) steps are required to reduce the forces to a fraction ε

Γmax and Γmin are the maximal and minimal eigenvalues of the Hessian matrix
Damped molecular dynamics
instead of using a fancy minimisation algorithm it is possible to treat the minimisation problem using a simple “simulated annealing” algorithm
regard the positions as dynamic degrees of freedom
the forces serve as accelerations and an additional friction term is introduced
equation of motion (x are the positions)
ẍ = −2α g(x) − µ ẋ
using a velocity Verlet algorithm this becomes

v_{N+1/2} = ( (1 − µ/2) v_{N−1/2} + 2α F_N ) / (1 + µ/2)
x_{N+1} = x_N + v_{N+1/2}

for µ = 2, this is equivalent to a simple steepest descent step
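A sketch of this damped velocity Verlet update for the toy quadratic model (illustration only; α and µ are chosen by hand here, whereas in VASP they are controlled through POTIM and SMASS):

    import numpy as np

    B = np.array([[4.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, -1.0])
    force = lambda x: -(b + B @ x)            # F(x) = -g(x)

    alpha, mu = 0.1, 0.8                      # step size and friction (made-up values)
    x, v = np.zeros(2), np.zeros(2)           # v plays the role of v_{N-1/2}
    for step in range(500):
        F = force(x)
        if np.linalg.norm(F) < 1e-8:
            break
        v = ((1 - mu / 2) * v + 2 * alpha * F) / (1 + mu / 2)
        x = x + v
    print(step, x)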
behaves like a rolling ball with friction
it will accelerate initially, and then decelerate when close to the minimum
if the optimal friction is chosen, the ball will glide right away into the minimum
for a too small friction it will overshoot the minimum and accelerate back
for a too large friction the relaxation will also slow down (behaves like steepest descent)
Algorithms implemented in VASP
algorithm     flag         additional flags    termination
DIIS          IBRION=1     POTIM, NFREE        EDIFFG
CG            IBRION=2     POTIM               EDIFFG
damped MD     IBRION=3     POTIM, SMASS        EDIFFG
POTIM generally determines the step size
for the CG algorithm, where line minimisations are performed, this is the size of the very first trial step
EDIFFG determines when to terminate relaxation
positive values: energy change between steps must be less than EDIFFG
negative values: all forces must satisfy |Fᵢ| < |EDIFFG|, i = 1, ..., Nions
DIIS
POTIM determines the step size in the steepest descent steps
no line minimisations are performed!
NFREE determines how many ionic steps are stored in the iteration history
the algorithm keeps the set of points {xᵢ | i = 1, ..., N} and gradients {gᵢ | i = 1, ..., N} and searches for a linear combination of the xᵢ that minimises the gradient
NFREE is the maximum N
for complex problems NFREE can be large (e.g. 10-20)
for small problems, it is advisable to count the degrees of freedom carefully
(symmetry inequivalent degrees of freedom)
if NFREE is not specified, VASP will try to determine a reasonable value, but
usually the convergence is then slower
CG
the only required parameter is POTIM
this parameter determines how large the trial steps are
CG requires line minimisations along the search direction
this is done using a variant of Brent’s algorithm
– trial step along search direction (conjg. gradient scaled by POTIM)
– quadratic or cubic interpolation using the energies and forces at x0 and x1 allows one to determine the approximate minimum
– continue minimisation as long as approximate minimum is not accurate
enough
Damped MD
two parameters POTIM and SMASS
v_{N+1/2} = ( (1 − µ/2) v_{N−1/2} + 2α F_N ) / (1 + µ/2),      x_{N+1} = x_N + v_{N+1/2}
α ∝ POTIM and µ ∝ SMASS
POTIM must be as large as possible, but without leading to divergence
and SMASS must be set to µ = 2 √(Γmin/Γmax), where Γmin and Γmax are the minimal and maximal eigenvalues of the Hessian matrix
a practical optimisation procedure:
– set SMASS=0.5-1 and use a small POTIM of 0.05-0.1
– increase POTIM by 20 % until the relaxation runs diverge
– fix POTIM to the largest value for which convergence was achieved
– try a set of different SMASS until convergence is fastest (or stick to
SMASS=0.5-1.0)
Damped MD — QUICKMIN
alternatively do not specify SMASS (or set SMASS < 0)
this selects an algorithm sometimes called QUICKMIN
QUICKMIN:

v_new = F (v_old · F) / (F · F) + α F      for v_old · F > 0
v_new = α F                                otherwise
– if the forces are antiparallel to the velocities, quench the velocities to zero and
restart
– otherwise increase the “speed” and make the velocities parallel to the present
forces
I have not often used this algorithm, but it is supposed to be very efficient
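A sketch of a QUICKMIN-type velocity update as described above (illustration only, not the VASP routine; the helper quickmin_step and the toy quadratic are hypothetical test code):

    import numpy as np

    def quickmin_step(x, v, force, alpha):
        F = force(x)
        if v @ F > 0:
            # keep only the velocity component along the present forces, then accelerate
            v = F * (v @ F) / (F @ F) + alpha * F
        else:
            # forces antiparallel to the velocities: quench and restart
            v = alpha * F
        return x + v, v

    B = np.array([[4.0, 1.0], [1.0, 2.0]])
    b = np.array([1.0, -1.0])
    force = lambda x: -(b + B @ x)

    x, v, alpha = np.zeros(2), np.zeros(2), 0.1
    for step in range(500):
        if np.linalg.norm(force(x)) < 1e-8:
            break
        x, v = quickmin_step(x, v, force, alpha)
    print(step, x)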
Damped MD — QUICKMIN
my experience is that damped MD (as implemented in VASP) is faster than QUICKMIN
but QUICKMIN requires less playing around
(figure: defective ZnO surface with 96 atoms allowed to move, relaxation after a finite temperature MD at 1000 K; log(E−E0) versus ionic steps for damped MD with SMASS=0.4 and for QUICKMIN)
Why so many algorithms :-(... decision chart
– 1−3 degrees of freedom, or close to the minimum → CG
– very broad vibrational spectrum, or more than about 20 degrees of freedom → damped MD or QUICKMIN
– otherwise → DIIS

(Really, this is too complicated.)
Two cases where the DIIS has huge troubles
– rigid unit modes, e.g. in perovskites (rotation of the octahedra), and rotations in molecular systems: in cartesian coordinates the Hessian matrix changes when the octahedron rotates!
– the force increases along the search direction: DIIS is dead, since it considers the forces only; it will move uphill instead of downhill
How bad can it get
the convergence speed depends on the eigenvalue spectrum of the Hessian matrix
– larger systems (thicker slabs) are more problematic (acoustic modes are very
soft)
– molecular systems are terrible (weak intermolecular and strong intramolecular forces)
– rigid unit modes and rotational modes can be exceedingly soft
the spectrum can vary over three orders of magnitude → 100 or even more steps might be required; ionic relaxation can be painful
to model the behaviour of the soft modes, you need very accurate forces since
otherwise the soft modes are hidden by the noise in the forces
EDIFF must be set to very small values (10⁻⁶) if soft modes exist
Electronic optimization
Georg KRESSE
Institut für Materialphysik and Center for Computational Material Science
Universität Wien, Sensengasse 8, A-1090 Wien, Austria
Overview
Determination of the electronic groundstate
– general strategies
– strategy adopted in VASP
iterative matrix diagonalization and mixing
– how to overcome slow convergence
molecular dynamics
the algorithms are particularly well suited for molecular dynamics
Density functional theory according to Kohn-Sham
density and kinetic energy:
sum of one electron charge densities and kinetic energies
ρ(r) = ∑_{n=1}^{Ne/2} 2 |ψn(r)|²,      ρtot(r) = ρ(r) + ρion(r)      (Ne = number of electrons)

E[ρ] = − ℏ²/(2 me) ∑_{n=1}^{Ne/2} 2 ∫ ψn*(r) ∇² ψn(r) d³r            (kinetic energy)
       + 1/2 ∫∫ ρtot(r) ρtot(r′) / |r − r′| d³r d³r′                  (electrostatic energy)
       + Exc[ρ(r)]                                                    (LDA/GGA)
KS-functional has a (the) minimum at the electronic groundstate
Numerical determination of the Kohn-Sham groundstate
direct minimization of the DFT functional (Car-Parrinello, modern)
start with a set of wavefunctions ψn(r), n = 1, ..., Ne/2 (random numbers) and minimize the value of the functional (iteration)

Gradient:   Fn(r) = ( −ℏ²/(2me) ∇² + Veff(r) − εn ) ψn(r)
iteration – self consistency (old fashioned)
start with a trial density ρ, set up the Schrödinger equation, and solve the Schrödinger equation
to obtain wavefunctions ψn(r):

( −ℏ²/(2me) ∇² + Veff[ρ](r) ) ψn(r) = εn ψn(r)

as a result one obtains a new charge density ρ(r) = ∑_{n=1}^{Ne/2} 2 |ψn(r)|² and a new Schrödinger equation → iteration
(figure: log10|E−E0| and log10|F−F0| versus iteration for disordered diamond (an insulator) and disordered fcc Fe (a metal), comparing direct Car-Parrinello-like minimisation with the self-consistent scheme for n = 1, 2, 4, 8)
G. Kresse and J. Furthmüller, Phys. Rev. B 54, 11169 (1996).
Direct minimization (not supported by vasp.4.5)
preconditioned conjugate gradient algorithm was applied
Gradient:   Fn(r) = ( −ℏ²/(2me) ∇² + Veff(r) − εn ) ψn(r)
the main troubles are
– to keep the set of wavefunctions ψn(r), n = 1, ..., Ne/2 orthogonal
– “sub-space” rotation: at the end one aims to have a set of wavefunctions ψn(r), n = 1, ..., Nbands that diagonalize the Hamiltonian

  ⟨ψn|H|ψm⟩ = δnm ε̄n
for metals, this condition is difficult to achieve with direct algorithms
in metals, actually this optimisation subproblem leads to a linear slowdown
with the longest dimension of the (super)cell
Selfconsistency Scheme
self-consistency loop:

1. trial charge ρin and trial wavefunctions ψn
2. set up the Hamiltonian H(ρin)
3. iterative refinement of the wavefunctions ψn (DIIS or Davidson algorithm)
4. new charge density ρout(r) = ∑n fn |ψn(r)|²
5. refinement of the density ρin, ρout → new ρin (DIIS algorithm, P. Pulay, Chem. Phys. Lett. 73, 393 (1980))
6. if ∆E is not yet below Ebreak, continue with step 2; otherwise calculate forces and update the ions

two subproblems: optimization of ψn and of ρin
ALGO flag
ALGO determines how the wavefunctions are optimized
all algorithms are fully parallel for any data distribution scheme
– ALGO= Normal (default): blocked Davidson algorithm
– ALGO= Very Fast: DIIS algorithm
– ALGO= Fast: 5 initial steps blocked Davidson, afterwards DIIS algorithm
after ions are moved: 1 Davidson step, afterwards again DIIS
RMM-DIIS is 1.5 to 2 times faster, but Davidson is more stable
ALGO= Fast is a very reasonable compromise, and should be specified for systems with more than 10-20 atoms
generally the user cannot influence the behavior of these algorithms (delicately optimized black box routines)
OSZICAR and OUTCAR files
POSCAR, INCAR and KPOINTS ok, starting setup
WARNING: wrap around errors must be expected
prediction of wavefunctions initialized
entering main loop
       N       E                     dE             d eps         ncg     rms          rms(c)
DAV:   1     0.483949E+03    0.48395E+03   -0.27256E+04    96   0.166E+03
DAV:   2     0.183581E+01   -0.48211E+03   -0.47364E+03    96   0.375E+02
DAV:   3    -0.340781E+02   -0.35914E+02   -0.35238E+02    96   0.129E+02
DAV:   4    -0.346106E+02   -0.53249E+00   -0.53100E+00   112   0.158E+01
DAV:   5    -0.346158E+02   -0.52250E-02   -0.52249E-02    96   0.121E+00    0.198E+01
RMM:   6    -0.286642E+02    0.59517E+01   -0.50136E+01    96   0.584E+01    0.450E+00
RMM:   7    -0.277225E+02    0.94166E+00   -0.47253E+00    96   0.192E+01    0.432E+00
initial charge corresponds to the charge of isolated overlapping atoms (POTCAR)
DAV: blocked Davidson algorithm
RMM: RMM-DIIS was used
ALGO=F: 5 initial steps blocked Davidson, then RMM-DIIS
4 steps charge fixed, then the charge is updated (rms(c) column)
OSZICAR file
N        iteration count
E        total energy
dE       change of the total energy
d eps    change of the eigenvalues (fixed potential)
ncg      number of optimisation steps Hψ
rms      total initial residual vector ∑nk wk fnk |(H − εnk) ψnk|
rms(c)   charge density residual vector
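As a small convenience (not part of VASP), the N, E and dE columns of the DAV/RMM lines can be extracted with a few lines of Python, e.g. to follow the convergence of a run; the helper read_oszicar below is hypothetical example code:

    import re

    line_re = re.compile(r"^(DAV|RMM):\s*(\d+)\s+(\S+)\s+(\S+)")

    def read_oszicar(path="OSZICAR"):
        steps = []
        with open(path) as f:
            for line in f:
                m = line_re.match(line)
                if m:
                    algo, n, e, de = m.groups()
                    steps.append((algo, int(n), float(e), float(de)))
        return steps

    for algo, n, e, de in read_oszicar():
        print(f"{algo} {n:3d}  E = {e: .6E}  dE = {de: .3E}")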
OUTCAR file
initial steps (delay, no charge update); the first time is the CPU time, the second the wall clock time:

 POTLOK:  VPU time    0.04: CPU time    0.04    (local potential)
 SETDIJ:  VPU time    0.08: CPU time    0.08    (set PAW strength coefficients)
 EDDAV :  VPU time    0.94: CPU time    0.94    (blocked Davidson)
 DOS   :  VPU time    0.00: CPU time    0.00    (new density of states)
 ----------------------------------------
 LOOP:    VPU time    1.06: CPU time    1.06

steps with charge update:

 POTLOK:   VPU time    0.04: CPU time    0.04    (new local potential)
 SETDIJ:   VPU time    0.09: CPU time    0.09    (set PAW strength coefficients)
 EDDIAG:   VPU time    0.14: CPU time    0.14    (sub-space rotation)
 RMM-DIIS: VPU time    0.77: CPU time    0.77    (RMM-DIIS step, wavefunctions)
 ORTHCH:   VPU time    0.01: CPU time    0.02    (orthogonalisation)
 DOS   :   VPU time   -0.01: CPU time    0.00    (new density of states)
 CHARGE:   VPU time    0.07: CPU time    0.07    (new charge)
 MIXING:   VPU time    0.01: CPU time    0.01    (mixing of charge)
What have all iterative matrix diagonalisation schemes in common ?
one usually starts with a set of trial vectors (wavefunctions) representing the filled states and a few empty one-electron states, ψn, n = 1, ..., Nbands
these are initialized using a random number generator
then the wavefunctions are improved by adding to each a certain amount of the
residual vector
the residual vector is defined as

|R(ψn)⟩ = (H − εapp S) |ψn⟩,      εapp = ⟨ψn|H|ψn⟩

adding a small amount of the residual vector

|ψn⟩ → |ψn⟩ + λ |R(ψn)⟩
is in the spirit of the steepest descent approach (usually termed “Jacobi relaxation”)
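A sketch of this residual-vector (“Jacobi relaxation”) update for a small random model Hamiltonian (illustration only, not VASP code; a small negative λ is used here so that the iteration settles into the lowest eigenstate, and in practice the step would be preconditioned):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 6
    H = rng.standard_normal((n, n)); H = (H + H.T) / 2     # model Hamiltonian
    S = np.eye(n)                                          # overlap matrix (unity here)

    psi = rng.standard_normal(n)
    lam = -0.1                                             # small step along the residual
    for it in range(1000):
        psi = psi / np.sqrt(psi @ S @ psi)                 # normalise
        eps_app = psi @ H @ psi                            # approximate eigenvalue
        R = (H - eps_app * S) @ psi                        # residual vector
        if np.linalg.norm(R) < 1e-6:
            break
        psi = psi + lam * R                                # Jacobi-relaxation-like update
    print(it, eps_app, np.linalg.eigvalsh(H)[0])           # compare with exact lowest eigenvalue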
Iterative matrix diagonalization based on the DIIS algorithm
for our case we need a rather specialized eigenvalue solver
– it should be capable of doing only little work
– efficiency and parallelization are important issues
two step procedure
– start with a set of trial vectors (wavefunctions) representing the electrons, ψn, n = 1, ..., Nbands (initialized with random numbers)
– apply Rayleigh-Ritz optimization in the space spanned by all bands (“sub-space” rotation): transform the ψn, n = 1, ..., Nbands so that ⟨ψn|H|ψm⟩ = δnm ε̄n
– then optimize each vector ψn, n = 1, ..., Nbands individually, two or three times
“In space” optimization EDDIAG
a set of vectors that represent the valence electrons, ψn, n = 1, ..., Nbands
Rayleigh-Ritz optimization in the space spanned by these vectors (subspace)
search for the unitary matrix Ū such that the transformed wavefunctions

ψ′n = ∑m Ūmn ψm

fulfill ⟨ψ′n|H|ψ′m⟩ = εm δnm

this requires the calculation of the subspace matrix H̄mn = ⟨ψn|H|ψm⟩ and its diagonalisation
(⟨ψn|S|ψm⟩ = δmn always holds)

the setup of the matrix scales like N²bands · NFFT (worst scaling part of VASP)
in the parallel version, communication is required, but modest
worse is the fact that the diagonalisation of H̄mn is not done in parallel
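A dense-matrix sketch of this sub-space rotation (illustration only, plain NumPy instead of the distributed VASP routines; the model Hamiltonian, orbitals and dimensions are made-up):

    import numpy as np

    rng = np.random.default_rng(1)
    nplane, nbands = 50, 4
    H = rng.standard_normal((nplane, nplane)); H = (H + H.T) / 2

    # current (orthonormal) trial orbitals, stored as columns of psi
    psi, _ = np.linalg.qr(rng.standard_normal((nplane, nbands)))

    Hbar = psi.T @ H @ psi                 # subspace matrix H̄_mn = <psi_m|H|psi_n>
    eps_bar, U = np.linalg.eigh(Hbar)      # diagonalise H̄ (this part is serial in VASP)
    psi_rot = psi @ U                      # rotated orbitals psi'_n = sum_m U_mn psi_m

    # the rotated orbitals diagonalise H within the subspace
    print(np.allclose(psi_rot.T @ H @ psi_rot, np.diag(eps_bar)))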
Iterative matrix diagonalization based on the DIIS algorithm
“out of space optimization” EDDRMM
– minimize the norm of the residual vector using the DIIS method:

  ⟨R(ψn)|R(ψn)⟩ → minimal,      |R(ψn)⟩ = (H − εapp S) |ψn⟩
– each vector is optimized individually (without regard to any other vector)
– easy to implement on parallel computers since each processor handles a subset of
the vectors (no communication required, NPAR=number of proc.)
scaling is proportional to Nbands · NFFT with a large prefactor
dominates the compute time for medium to large problems
orthogonalization of wavefunctions ORTHCH
Problem of the DIIS algorithm
eigenstates can be missed for large systems
and there is no clear way to tell when this happens
– in the “best” case no convergence
– but convergence might also slow down after reaching a precision of 10⁻² or 10⁻³
– in the worst case, you might not notice anything
in these cases, switch to the blocked Davidson algorithm (the manual contains a number of tricks for using the DIIS algorithm even when it initially fails)
things are not that bad
if the Davidson algorithm is used for the first steps, there is practically no danger of
missing eigenstates
VASP.4.5: new blocked Davidson algorithm
combines
“in space” and “out of space” optimization
selects a subset of all bands: ψn, n = 1, ..., Nbands → ψk, k = n1, ..., n2
– optimize this subset by adding the orthogonalized residual vectors to the presently considered subspace:

  { ψk, (H − εapp S) ψk | k = n1, ..., n2 }
– apply Rayleigh-Ritz optimization in the space spanned by these vectors (“sub-space” rotation in a 2(n2−n1+1) dim. space)
– add additional residuals calculated from the already optimized bands (“sub-space” rotation in a 3(n2−n1+1) dim. space)
approximately a factor of 1.5-2 slower than RMM-DIIS, but always stable
available in parallel for any data distribution
charge density mixing (RMM-DIIS)
VASP aims at the minimization of the norm of the residual vector:

min ‖R[ρin]‖,      R[ρin] = ρout[ρin] − ρin,      with ρout(r) = ∑occupied wk fnk |ψnk(r)|²
DIIS algorithm is used for the optimization of the norm of the residual vector
linearization of R[ρin] around ρsc (linear response theory):

R[ρ] = J (ρ − ρsc)      with the charge dielectric function J = 1 − χ U,  U = 4πe²/q²
leads to

ρsc = ρin + J⁻¹ R[ρin] = ρin + J⁻¹ (ρout[ρin] − ρin)
Divergence of the dielectric function
eigenvalue spectrum of J determines convergence
J = 1 − χ · 4πe²/q²

“broader” eigenvalue spectrum → slower convergence
for insulators and semi-conductors the width of the eigenvalue spectrum is constant
and system size independent !
for metals the eigenvalue spectrum diverges, its width is proportional to the square of
the longest dimension of the cell:
– short wavelength limit: J → 1 (no screening)
– long wavelength limit: J ∝ 1/q² ∝ L² (metallic screening)
complete screening in metals causes slow convergence to the groundstate (charge
sloshing)
VASP charge density mixer
VASP uses a model dielectric function which is a good initial approximation for most systems:

J⁻¹(G) = max( AMIX · G² / (G² + BMIX²), AMIN )

defaults: AMIX=0.4 ; AMIN=0.1 ; BMIX=1.0

(figure: J⁻¹(G) versus G (1/Å), rising from AMIN at small G towards AMIX at large G)
this is combined with a convergence accelerator
the initial guess for the dielectric matrix is improved using information accumulated
in each electronic (mixing) step
direct inversion in the iterative subspace (DIIS)
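A sketch of such a Kerker-type mixing step in reciprocal space (illustration only; the function name and the made-up density components are not from VASP, and the real mixer additionally applies the DIIS acceleration mentioned above):

    import numpy as np

    def kerker_mix(rho_in_G, rho_out_G, G, amix=0.4, bmix=1.0, amin=0.1):
        """Feed back a G-dependent fraction of the residual rho_out - rho_in."""
        weight = np.maximum(amix * G**2 / (G**2 + bmix**2), amin)
        return rho_in_G + weight * (rho_out_G - rho_in_G)

    # tiny fake example with three plane-wave components (|G| in 1/Angstrom)
    G = np.array([0.5, 1.0, 3.0])
    rho_in = np.array([1.0, 0.5, 0.2])
    rho_out = np.array([0.8, 0.6, 0.25])
    print(kerker_mix(rho_in, rho_out, G))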
How can you tune VASP to achieve faster convergence
try linear mixing (AMIX=0.1-0.2, BMIX=0.0001)
J⁻¹(G) = A      (a constant, i.e. simple linear mixing)
VASP also gives information on how good the initial mixing parameters are
allow VASP to run until selfconsistency is achieved and search for the last occurrence
of
eigenvalues of (default mixing * dielectric matrix)
average eigenvalue GAMMA= 2.200
– for linear mixing (e.g. AMIX=0.1 ; BMIX=0.0001) the optimal AMIX is given by the present AMIX × GAMMA
– for Kerker-like mixing (model dielectric matrix):
  GAMMA larger than 1 → decrease BMIX
  GAMMA smaller than 1 → increase BMIX
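For illustration: if the run used linear mixing with AMIX=0.1 and reports GAMMA=2.2, the rule above suggests an optimal AMIX of roughly 0.1 × 2.2 = 0.22.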
What to do when electronic convergence fails
– use the Davidson algorithm (ALGO=N)
– if it fails to converge: switch off the charge update (ICHARG=12); if that converges, the problem lies in the charge density mixing
– then set ICHARG=2 and play with the mixing parameters, e.g. AMIX=0.1 ; BMIX=0.01
  – if this converges, use this setting
  – if it fails to converge: increase BMIX (BMIX=3.0 ; AMIN=0.01)
  – if it still fails to converge: send a bug report (after the positions have been checked)
ab initio Molecular dynamics
– Car-Parrinello (CP) approach: elegant, simple to implement; requires a small timestep; problematic for metals, since the electrons must decouple from the ionic degrees of freedom, which is not the case for metals
– exact KS-groundstate in every step (allows a large timestep):
  – direct minimization: problematic for metals, large memory requirements
    damped second order dynamics (Tassone, Mauri, Car)
    conjugate gradient (Arias, Payne, Joannopoulos)
    RMM-DIIS (Hutter, Lüthi, Parrinello)
  – selfconsistency cycle: very stable, efficient for insulators and metals
Selfconsistency cycle is very well suited for MDs
MD on the Born Oppenheimer surface (exact KS-groundstate)
selfconsistency cycle determines the dielectric matrix
first time step is rather expensive
but since the dielectric matrix changes only little when ions are moved, the method
becomes very fast in successive steps
wavefunctions and charges etc. are “forward” extrapolated between time-steps
all this makes an extremely efficient scheme that is competitive with the so called
“Car-Parrinello” scheme for insulators
for metals, our scheme is generally much more robust and efficient than the
Car-Parrinello scheme
to select this feature in VASP, set MAXMIX in the INCAR file
Using MAXMIX
usually VASP resets the dielectric matrix to its default after moving the ions
but if the ions move only a little bit one can bypass this reset
– definitely a good option for molecular dynamics
– damped molecular dynamics (optimisation)
– works also well during relaxations, if the forces are not too large (< 0.5 eV/Å)
you need to specify MAXMIX in the INCAR file
set MAXMIX to roughly three times the number of iterations in the first ionic step
the resulting speedups can be substantial (a factor 2 to 3 less electronic steps for each
ionic step)
Using Molecular dynamics
a simple INCAR file
ENMAX  = 250 ; LREAL = A       # electronic degrees
ALGO   = V                     # very fast algorithm
MAXMIX = 80                    # mixing
IBRION = 0                     # MD
NSW    = 1000                  # number of MD steps
POTIM  = 3.0                   # time step
TEBEG  = 1500 ; TEEND = 500    # target temperature 1500-500 K
SMASS  = -1 ; NBLOCK = 50      # scale velocities every 50 steps
SMASS  = 2                     # use a Nose Hoover thermostat
SMASS  = -3                    # micro canonical
Using Molecular dynamics
the timestep POTIM depends on the vibrational frequencies and the required energy conservation
as a rule of thumb: increase POTIM until about 3 electronic minimisation steps are required per timestep
another rule of thumb: hydrogen requires roughly 0.5 fs, Li-F roughly 1 fs; the timestep can be increased by about 1 fs for each further row of the periodic table
SMASS controls the MD simulation
– SMASS=-3 micro canonical ensemble
– for equilibration and simulated annealing SMASS = -1 ; NBLOCK = 50-100
microcanonical MD, and every NBLOCK steps the kinetic energy is scaled to meet the required temperature criterion
– for positive values a Nose Hoover thermostat is introduced