Adventures with Fermion Monte Carlo

Fermion Quantum Monte Carlo
based on the idea of sampling “graphs”
Ali Alavi
University of Cambridge
Alex Thom
James Spencer
EPSRC
Overview
Introduction and motivation
Path integrals and the Fermion sign problem
FSP as a problem in “path counting”
A useful combinatorial formula
From path-sums to graph-sums
Applications to molecular systems
Towards application to periodic systems
Essence of idea
Express the many-electron path integral in a finite Slater
Determinant basis
Resum the path integral over exponentially large numbers of
paths to convert
path-sums => graph-sums
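[Figure: a closed path i1 -> i2 -> ... -> iP-1 -> iP -> i1 in determinant space is resummed into a graph on the vertices i, j, k, l.]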
The graphs are much more stable entities which can then be
sampled.
A graphical, or diagrammatic, expansion of the partition function
$Q = \sum_n \sum_G w^{(n)}[G]$
[Figure: Q drawn as a sum of graphs G: 2-vertex + 3-vertex + ...]
Each vertex is a Slater determinant
Each graph represents the sum over all paths of length P which visit
all vertices of the graph
Non-perturbative expansion
Path Integrals
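Consider the (thermal) density matrix:
$\hat\rho(\beta) = e^{-\beta\hat H}$
In terms of the eigenstates of the Hamiltonian:
$\hat\rho = \sum_i |\Psi_i\rangle e^{-\beta E_i} \langle\Psi_i| \;\to\; e^{-\beta E_0} |\Psi_0\rangle\langle\Psi_0|$ for $\beta \to \infty$ (i.e. zero temperature)
The energy can be calculated from:
$Q = \mathrm{Tr}[e^{-\beta\hat H}]$
$E = -\frac{\partial \ln Q}{\partial \beta} = \frac{\mathrm{Tr}[\hat H e^{-\beta\hat H}]}{\mathrm{Tr}[e^{-\beta\hat H}]}$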
The density matrix can be represented in real space
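For a single electron located at x:
$\rho(x, x'; \beta) = \langle x| e^{-\beta\hat H} |x'\rangle$, $Q = \int dx\, \langle x| e^{-\beta\hat H} |x\rangle$
$e^{-\beta\hat H} = e^{-(\beta/P)\hat H}\, e^{-(\beta/P)\hat H} \cdots e^{-(\beta/P)\hat H}$ (P factors, or "timeslices")
$\rho(x, x'; \beta) = \int dx_2 \ldots dx_P\, \langle x| e^{-(\beta/P)\hat H} |x_2\rangle \langle x_2| e^{-(\beta/P)\hat H} |x_3\rangle \cdots \langle x_P| e^{-(\beta/P)\hat H} |x'\rangle$
$\langle x| e^{-(\beta/P)(\hat T + \hat V)} |x'\rangle \approx e^{-(mP/2\beta)(x - x')^2}\, e^{-(\beta/2P)(V(x) + V(x'))}$ (up to a normalisation constant)
KE terms: harmonic spring, $e^{-(mP/2\beta)(x - x')^2}$
PE terms: $e^{-(\beta/2P)(V(x) + V(x'))}$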
In the limit $P \to \infty$ the path denoted by
$x \to x_2 \to \ldots \to x_P \to x$
tends to a continuous function
$x(\tau)$, $\tau = 0 \to \beta$, with boundary condition $x(\beta) = x(0)$
$Q = \int dx \int \mathcal{D}x(\tau)\, e^{-S[x(\tau)]}$ (path integral)
$S[x(\tau)] = \int_0^\beta \left\{ \tfrac{1}{2} m \dot{x}(\tau)^2 + V[x(\tau)] \right\} d\tau$ (KE along the path + PE along the path)
One can simulate an electron
as a ring-polymer, moving
in the external field (which itself
can be dynamic).
Polarons [Parrinello Rahman]
For N electrons
$X = (x_1, x_2, \ldots, x_N)$
[Figure: two-particle paths $X(\tau)$: direct ($x_1 \to x_1$, $x_2 \to x_2$) and exchange ($x_2 \to x_1$, $x_1 \to x_2$).]
Describes closed paths which can exchange identical particle coordinates
$X(\beta) = \hat P X(0) = (x_{i_1}, x_{i_2}, \ldots, x_{i_N})$
$S[X(\tau)] = \int_0^\beta d\tau \left\{ \sum_i \tfrac{1}{2} m_i \dot{x}_i^2 + V[X(\tau)] + U[X(\tau)] \right\}$, where U is the Coulomb interaction
$Q = \frac{1}{N!} \sum_{\hat P} \int dX \int \mathcal{D}X(\tau)\, (-1)^{\hat P} e^{-S/\hbar}$
Odd permutations subtract from the
partition function: Fermion sign problem
As N or $\beta$ increases, there is an exponential cancellation of contributions
arising from even and odd paths.
Slater Determinant space
Let $D_i$ be a Slater determinant composed out of N orthonormal
spin-orbitals [e.g. Hartree-Fock orbitals, Kohn-Sham, etc]
chosen out of a set of 2M, $\{u_1, u_2, \ldots, u_{2M}\}$:
$D_i = D_{n_1 n_2 \ldots n_N} = \frac{1}{\sqrt{N!}} \det[u_{n_1} u_{n_2} \ldots u_{n_N}]$
$N_{det} = \binom{2M}{N}$
e.g. $M = 100$, $N = 10$: $N_{det} \approx 2 \times 10^{16}$
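The size of the determinant space can be checked directly from the binomial formula. The snippet below is only an illustrative sketch; the function name `n_determinants` is a hypothetical helper, not something from the talk. It ignores any spin or spatial symmetry restrictions.

```python
from math import comb

def n_determinants(n_spatial: int, n_electrons: int) -> int:
    """Number of N-electron determinants built from 2M spin-orbitals: C(2M, N)."""
    return comb(2 * n_spatial, n_electrons)

# M = 100, N = 10 (the example quoted on the slide)
print(f"{n_determinants(100, 10):.3e}")
# 8-site and 10-site Hubbard examples used later in the talk
print(n_determinants(8, 6), n_determinants(10, 10))
```

The two Hubbard counts reproduce the figures 8008 and 184756 quoted for the 8-site and 10-site lattices.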
The $D_i$ form an orthonormal set of antisymmetric functions.
They are solutions to a non-interacting, or uncorrelated (mean-field)
Hamiltonian $H_0$:
$H_0 D_i = E_i^{(0)} D_i$
Full problem: $H = T + U + V$,
$T = -\frac{1}{2} \sum_i^N \nabla_i^2$, $U = \sum_{i<j}^N \frac{1}{|r_i - r_j|}$, $V = \sum_i^N v(r_i)$
$H \Psi_a = E_a \Psi_a$, $\Psi_a = \sum_i c_{a,i} D_i$
Exact solutions are linear superpositions
of uncorrelated determinants
Paths in Slater determinant space
A closed path in S.D. space
$D_{i_1} \to D_{i_2} \to \ldots \to D_{i_P} \to D_{i_1}$
[Figure: closed path $i_1 \to i_2 \to \ldots \to i_{P-1} \to i_P \to i_1$.]
$w^{(P)}[D_{i_1}, D_{i_2}, \ldots, D_{i_P}, D_{i_1}] = \langle D_{i_1}| e^{-\beta H/P} |D_{i_2}\rangle \langle D_{i_2}| e^{-\beta H/P} |D_{i_3}\rangle \cdots \langle D_{i_P}| e^{-\beta H/P} |D_{i_1}\rangle$
A path of P steps in SD space
$Q = \mathrm{Tr}[e^{-\beta H}] = \sum_i \langle D_i| e^{-\beta H} |D_i\rangle = \sum_{i_1} \sum_{i_2} \cdots \sum_{i_P} w^{(P)}[i_1, i_2, \ldots, i_P, i_1]$
A given S.D. can occur multiple times along a path
$\rho_{ij} = \langle D_i| e^{-\beta H/P} |D_j\rangle$
$\rho_{ij}$ is a computable, diagonally-dominant and extremely
sparse matrix:
$\rho_{ij} = \delta_{ij} - \frac{\beta}{P} H_{ij} + O[(\beta/P)^2]$ (primitive approximation)
and a better approximation is:
$\rho_{ij} = e^{-\beta (E_i^{(0)} + E_j^{(0)})/2P} \left( \delta_{ij} - \frac{\beta}{P} H^{(1)}_{ij} \right) + O[(\beta/P)^2]$, with $H = H^{(0)} + H^{(1)}$
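Both short-time forms of the density matrix can be compared with the exact exponential on a small matrix. This is a minimal sketch under stated assumptions: the test Hamiltonian is random, and $H^{(0)}$ is taken to be the diagonal of H in the determinant basis (the talk only requires that the $D_i$ be eigenfunctions of $H^{(0)}$). The helper names are hypothetical.

```python
import numpy as np

def exact_rho(h, beta, p):
    """Exact exp(-beta*H/P) via eigendecomposition of the symmetric matrix h."""
    evals, evecs = np.linalg.eigh(h)
    return (evecs * np.exp(-beta * evals / p)) @ evecs.T

def primitive_rho(h, beta, p):
    """Primitive approximation: delta_ij - (beta/P) H_ij."""
    return np.eye(len(h)) - (beta / p) * h

def split_rho(h, beta, p):
    """Improved form; ASSUMES H0 = diag(H), so H1 is the off-diagonal part."""
    e0 = np.diag(h)
    h1 = h - np.diag(e0)
    half = np.exp(-beta * (e0[:, None] + e0[None, :]) / (2 * p))
    return half * (np.eye(len(h)) - (beta / p) * h1)

rng = np.random.default_rng(0)
a = rng.normal(size=(6, 6))
h = (a + a.T) / 2 + np.diag(np.arange(6.0))   # arbitrary symmetric test matrix
for p in (1000, 2000):
    exact = exact_rho(h, 1.0, p)
    err_prim = np.abs(primitive_rho(h, 1.0, p) - exact).max()
    err_split = np.abs(split_rho(h, 1.0, p) - exact).max()
    print(p, err_prim, err_split)
```

Doubling P should reduce both errors by roughly a factor of four, consistent with the $O[(\beta/P)^2]$ error per timeslice.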
Hamiltonian matrix elements (Slater-Condon rules)
Since H contains at most 2-body interactions:
$\langle D_i| H |D_j\rangle = 0$ if $D_i$ and $D_j$ differ by more than 2 spin-orbitals
For a double excitation $D_j$ of $D_i$ (orbitals $i, j \to a, b$): $\langle D_i| U |D_j\rangle = \langle ij| r_{12}^{-1} |ab\rangle - \langle ij| r_{12}^{-1} |ba\rangle$
Hamiltonian connects only single and double excitations:
Maximum connectivity
$N(N-1)(2M-N)(2M-N-1)/4 \sim N^2 M^2$
Spin selection rule:
$\langle D_i| H |D_j\rangle = 0$ if $S_z[D_i] \ne S_z[D_j]$
Other symmetries may also exist
Hubbard model: translational invariance;
Molecules: point group symmetry
Cost of calculation
$\langle D_i| H |D_i\rangle \sim O(N^2)$
$\langle D_i| H |D_j\rangle \sim O(N)$ if $D_i$ and $D_j$ differ by 1 spin-orbital
$\langle D_i| H |D_j\rangle \sim O(1)$ if $D_i$ and $D_j$ differ by 2 spin-orbitals
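Search for a power series in $\rho_{ii}$
$\rho_{ij} = \langle D_i| e^{-\beta H/P} |D_j\rangle$, $Q = \sum_i w_i$
$w_i = \rho_{ii}^P$
$+ \sum_{j \ne i} \sum_{n_1=0}^{P-2} \sum_{n_2=0}^{P-2-n_1} \rho_{ii}^{n_1} \rho_{ij} \rho_{jj}^{n_2} \rho_{ji} \rho_{ii}^{P-2-n_1-n_2}$ (two-hop: $i \to j \to i$)
$+ \sum_{j \ne i} \sum_{k \ne j,i} \sum_{n_1=0}^{P-3} \sum_{n_2=0}^{P-3-n_1} \sum_{n_3=0}^{P-3-n_1-n_2} \rho_{ii}^{n_1} \rho_{ij} \rho_{jj}^{n_2} \rho_{jk} \rho_{kk}^{n_3} \rho_{ki} \rho_{ii}^{P-3-n_1-n_2-n_3}$ (3-hop: $i \to j \to k \to i$)
$+ \ldots$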
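Rearranging
Two-hop terms:
$\sum_{j \ne i} \sum_{n_1=0}^{P-2} \sum_{n_2=0}^{P-2-n_1} \rho_{ii}^{n_1} \rho_{ij} \rho_{jj}^{n_2} \rho_{ji} \rho_{ii}^{P-2-n_1-n_2} = \rho_{ii}^{P-2} \sum_{j \ne i} \rho_{ij} \rho_{ji} \left[ \sum_{n_1=0}^{P-2} \sum_{n_2=0}^{P-2-n_1} \left( \frac{\rho_{jj}}{\rho_{ii}} \right)^{n_2} \right]$
(the prefactor $\rho_{ij} \rho_{ji}$ collects the transition matrix elements; the bracket is a nested sum)
3-hop terms:
$\sum_{j \ne i} \sum_{k \ne j,i} \sum_{n_1=0}^{P-3} \sum_{n_2=0}^{P-3-n_1} \sum_{n_3=0}^{P-3-n_1-n_2} \rho_{ii}^{n_1} \rho_{ij} \rho_{jj}^{n_2} \rho_{jk} \rho_{kk}^{n_3} \rho_{ki} \rho_{ii}^{P-3-n_1-n_2-n_3}$
$= \rho_{ii}^{P-3} \sum_{j \ne i} \sum_{k \ne j,i} \rho_{ij} \rho_{jk} \rho_{ki} \left[ \sum_{n_1=0}^{P-3} \sum_{n_2=0}^{P-3-n_1} \sum_{n_3=0}^{P-3-n_1-n_2} \left( \frac{\rho_{jj}}{\rho_{ii}} \right)^{n_2} \left( \frac{\rho_{kk}}{\rho_{ii}} \right)^{n_3} \right]$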
Define the nested sum:
$Z_h^{(P)}(x_1, x_2, \ldots, x_h) = \sum_{n_1=0}^{P-h} \sum_{n_2=0}^{P-h-n_1} \cdots \sum_{n_h=0}^{P-h-\sum_{i=1}^{h-1} n_i} x_1^{n_1} x_2^{n_2} \cdots x_h^{n_h}$
which appears in the h-hop term
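The “hop” series
$w_i = \rho_{ii}^P \left[ 1 + \sum_{j \ne i} \frac{\rho_{ij} \rho_{ji}}{\rho_{ii}^2} Z_2^{(P)}\left(1, \frac{\rho_{jj}}{\rho_{ii}}\right) + \sum_{j \ne i} \sum_{k \ne j,i} \frac{\rho_{ij} \rho_{jk} \rho_{ki}}{\rho_{ii}^3} Z_3^{(P)}\left(1, \frac{\rho_{jj}}{\rho_{ii}}, \frac{\rho_{kk}}{\rho_{ii}}\right) + \sum_{j \ne i} \sum_{k \ne j} \sum_{l \ne k,i} \frac{\rho_{ij} \rho_{jk} \rho_{kl} \rho_{li}}{\rho_{ii}^4} Z_4^{(P)}\left(1, \frac{\rho_{jj}}{\rho_{ii}}, \frac{\rho_{kk}}{\rho_{ii}}, \frac{\rho_{ll}}{\rho_{ii}}\right) + \ldots \right]$
[Figure: the 2-hop ($i \to j \to i$), 3-hop ($i \to j \to k \to i$) and 4-hop ($i \to j \to k \to l \to i$) paths.]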
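Using induction, one can show:
$Z_h^{(P)}(x_1, x_2, \ldots, x_h) = \frac{1}{2\pi i} \oint_C \frac{z^P - 1}{z - 1} \prod_{\alpha=1}^{h} \frac{1}{z - x_\alpha}\, dz$
where the contour C encloses the poles $x_1, x_2, x_3, \ldots$
A. J. W. Thom and A. Alavi, J. Chem. Phys. 123, 204106 (2005)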
Residue theorem gives
For $x_1 \ne x_2 \ne x_3 \ldots$
$Z_h^{(P)}(x_1, x_2, \ldots, x_h) = \sum_{i=1}^{h} \frac{x_i^P - 1}{x_i - 1} \prod_{j \ne i} \frac{1}{x_i - x_j}$
e.g.
$Z_1^{(P)}(x_1) = \frac{x_1^P - 1}{x_1 - 1}$
$Z_2^{(P)}(x_1, x_2) = \frac{x_1^P - 1}{x_1 - 1} \frac{1}{x_1 - x_2} + \frac{x_2^P - 1}{x_2 - 1} \frac{1}{x_2 - x_1}$
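The residue form of the Z-sums can be checked numerically against the nested-sum definition. The sketch below is illustrative only; `z_nested` and `z_residue` are hypothetical helper names, and the residue version is restricted to arguments that are all distinct and different from 1, where the simple-pole formula applies directly.

```python
from itertools import product
from math import prod

def z_nested(p, xs):
    """Z_h^{(P)} by direct enumeration of n_1..n_h >= 0 with n_1 + ... + n_h <= P - h."""
    h = len(xs)
    if h > p:
        return 0.0
    total = 0.0
    for ns in product(range(p - h + 1), repeat=h):
        if sum(ns) <= p - h:
            total += prod(x ** n for x, n in zip(xs, ns))
    return total

def z_residue(p, xs):
    """Sum of simple-pole residues; valid when all x_i are distinct and none equals 1."""
    total = 0.0
    for i, xi in enumerate(xs):
        denom = prod(xi - xj for j, xj in enumerate(xs) if j != i)
        total += (xi ** p - 1) / (xi - 1) / denom
    return total

xs = [0.5, 0.8, 1.3]   # arbitrary distinct test arguments
print(z_nested(6, xs), z_residue(6, xs))
```

The enumeration costs grow quickly with P and h, which is the practical reason for preferring the closed residue form.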
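Some useful properties of Z-sums
$Z_h^{(P)}(x_1, x_2, \ldots, x_h) = 0$ if $h > P$, for all $x_1, x_2, \ldots, x_h$
=> Replace upper limit of sums over h by $\infty$
$Z_0^{(P)} = \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1}\, dz = 0$
Symmetry:
$Z_h^{(P)}(x_1, x_2, \ldots, x_h) = Z_h^{(P)}(x_{i_1}, x_{i_2}, \ldots, x_{i_h})$ for any permutation $(i_1, \ldots, i_h)$ of the arguments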
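As a worked instance of the hop series, consider two determinants with $\rho_{ii} = \rho_{jj} = a$ and $\rho_{ij} = b$. Only even numbers of hops return to i, and every argument of the Z-sum equals 1, so the sum simply counts the tuples $(n_1, \ldots, n_h)$ with $n_1 + \ldots + n_h \le P - h$. The following is a sketch of that special case, not a general proof:

```latex
Z_h^{(P)}(1, \ldots, 1) = \binom{P}{h},
\qquad
w_i = a^P \sum_{h\ \mathrm{even}} \binom{P}{h} \left( \frac{b}{a} \right)^{h}
    = \tfrac{1}{2} \left[ (a + b)^P + (a - b)^P \right]
```

The right-hand side is the binomial theorem applied to $(a + b)^P$ and $(a - b)^P$, and it agrees with the diagonalisation result for the two-determinant system.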
From “hop”-expansion to “vertex” expansion
Consider the 4-hop terms $i \to j \to k \to l \to i$:
4-vertex graph: i, j, k, l all distinct
3-vertex graph (“chain”): j = l
3-vertex graph (“star”): i = k
2-vertex graph: i = k and j = l
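Analytic summation over alternating series
[Figure: the 3-cycle $i \to j \to k \to i$ traversed once ($Z_3^{(P)}$), twice ($Z_6^{(P)}$), ...]
$S = \sum_{n=0}^{\infty} \left( \frac{\rho_{ij} \rho_{jk} \rho_{ki}}{\rho_{ii}^3} \right)^n Z_{3n}^{(P)}\left(1, \frac{\rho_{jj}}{\rho_{ii}}, \frac{\rho_{kk}}{\rho_{ii}}, \ldots\right)$ (the three arguments repeated n times)
“Cycle function”
Define: $A_{ijk}(z) = \frac{\rho_{ij} \rho_{jk} \rho_{ki}}{(z\rho_{ii} - \rho_{ii})(z\rho_{ii} - \rho_{jj})(z\rho_{ii} - \rho_{kk})}$
$S = \frac{1}{2\pi i} \oint_C \frac{z^P - 1}{z - 1} \sum_{n=0}^{\infty} A_{ijk}(z)^n\, dz = \frac{1}{2\pi i} \oint_C \frac{z^P - 1}{z - 1} \frac{1}{1 - A_{ijk}(z)}\, dz$
Solve: $1 - A_{ijk}(z) = 0$ => feed solutions into the residue theorem.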
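Eg. A 2-vertex graph
[Figure: two determinants i and j.]
$\rho_{ii} = \rho_{jj} = a$ (for simplicity), $\rho_{ij} = b$
$A(z) = \frac{b^2}{(za - a)(za - a)} = \frac{b^2}{a^2 (z - 1)^2}$
Next compute $S_2$:
$S_2 = \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1} \frac{1}{1 - A(z)}\, dz = \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1} \frac{1}{1 - \frac{b^2}{a^2 (z - 1)^2}}\, dz$
$= \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1} \frac{(z - 1)^2}{(z - 1)^2 - (b/a)^2}\, dz = \frac{1}{2\pi i} \oint (z^P - 1) \frac{(z - 1)}{(z - 1)^2 - (b/a)^2}\, dz$
$= \frac{1}{2\pi i} \oint (z^P - 1) \frac{(z - 1)}{(z - 1 - b/a)(z - 1 + b/a)}\, dz$
$= \frac{[(1 + b/a)^P - 1](b/a)}{2(b/a)} + \frac{[(1 - b/a)^P - 1](-b/a)}{-2(b/a)}$
$= \frac{1}{2} \left[ \frac{(a + b)^P}{a^P} - 1 \right] + \frac{1}{2} \left[ \frac{(a - b)^P}{a^P} - 1 \right]$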
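Star graphs
[Figure: star graphs built from circuits attached to a common vertex i, with their multiplicities; the terms shown involve $Z_5^{(P)}$, $Z_8^{(P)}$, ...]
For a star-graph with g spokes, $G_1, G_2, \ldots, G_g$, attached to i:
$S_{star} = \sum_{n_1, n_2, \ldots, n_g} \binom{n_1 + n_2 + \ldots + n_g}{n_1, n_2, \ldots, n_g} \frac{1}{2\pi i} \oint_C \frac{z^P - 1}{z - 1} A_{G_1}^{n_1} A_{G_2}^{n_2} \cdots A_{G_g}^{n_g}\, dz$
$= \frac{1}{2\pi i} \oint_C \frac{z^P - 1}{z - 1} \frac{1}{1 - A_{G_1} - A_{G_2} - \ldots - A_{G_g}}\, dz$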
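Chain graphs
[Figure: chain graphs in which circuit $G_2$ is attached to $G_1$, $G_3$ to $G_2$, and so on, with their multiplicities; the terms shown involve $Z_7^{(P)}$, $Z_{10}^{(P)}$, ...]
$S_{chain} = \sum_{n_1, n_2, \ldots, n_g} \binom{n_1 - 1 + n_2}{n_2} \binom{n_2 - 1 + n_3}{n_3} \cdots \frac{1}{2\pi i} \oint_C \frac{z^P - 1}{z - 1} A_{G_1}^{n_1} A_{G_2}^{n_2} \cdots A_{G_g}^{n_g}\, dz$
$= \frac{1}{2\pi i} \oint_C \frac{z^P - 1}{z - 1}\, \cfrac{1}{1 - \cfrac{A_{G_1}}{1 - \cfrac{A_{G_2}}{1 - \cfrac{\cdots}{1 - A_{G_g}}}}}\, dz$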
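The continued-fraction form can be made plausible for the shortest chain, a circuit $G_2$ attached to a circuit $G_1$, using the negative-binomial series. This is a sketch under two assumptions: that the multiplicity $\binom{n_1 - 1 + n_2}{n_2}$ counts the ways of distributing $n_2$ repeats of $G_2$ over the $n_1$ traversals of $G_1$, and that both series converge on the contour.

```latex
\sum_{n_2 \ge 0} \binom{n_1 - 1 + n_2}{n_2} A_{G_2}^{n_2} = \frac{1}{(1 - A_{G_2})^{n_1}}
\quad\Longrightarrow\quad
\sum_{n_1 \ge 0} \left( \frac{A_{G_1}}{1 - A_{G_2}} \right)^{n_1}
  = \frac{1}{1 - \dfrac{A_{G_1}}{1 - A_{G_2}}}
```

Repeating the same step link by link along a longer chain builds up the nested fraction.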
General 3-vertex graph
Folded representation: the graph on the vertices i, j, k
Unfolded representation: each spoke represents an
independent circuit on the graph
$S_3(ijk) = \frac{1}{2\pi i} \oint_C \frac{z^P - 1}{z - 1}\, \frac{1}{1 - \frac{2 A_{ijk}}{1 - A_{jk}} - \frac{A_{ij}}{1 - A_{jk}} - \frac{A_{ik}}{1 - A_{kj}}}\, dz$
$= \frac{1}{2\pi i} \oint_C \frac{z^P - 1}{z - 1}\, \frac{1 - A_{jk}}{1 - 2 A_{ijk} - A_{ij} - A_{ik} - A_{jk}}\, dz$
Denominator is a cubic polynomial in z, i.e. there are 3 residues
Unfolded 4-vertex graph
Denominator is a quartic polynomial in z
A graphical, or diagrammatic, expansion of the partition function
$Q = \sum_n \sum_G w^{(n)}[G]$
[Figure: Q drawn as a sum of graphs (2-vertex + 3-vertex + ...), with vertices labelled by excited determinants such as $D_{ij}^{ab}$, $D_{ij}^{a'b'}$, $D_{i'j'}^{a'b'}$ and $D_{ijkl}^{abcd}$.]
Each vertex is a Slater determinant
Each graph represents the sum over all paths of length P which visit
all vertices of the graph
2-, 3- and 4-vertex graphs
[Figure: the 2-, 3- and 4-vertex graph topologies.]
Monte Carlo sampling of graphs
The energy can be obtained from:
$E = -\frac{\partial \ln Q}{\partial \beta} = \frac{1}{Q} \sum_n \sum_G \left( -\frac{1}{w^{(n)}[G]} \frac{\partial w^{(n)}[G]}{\partial \beta} \right) w^{(n)}[G]$
If graphs can be sampled with an un-normalised probability given by
$w^{(n)}[G]$, then the energy estimator is:
$\tilde E^{(n)}[G] = -\frac{\partial \ln w^{(n)}[G]}{\partial \beta}$
i.e.
$E = \langle \tilde E^{(n)}[G] \rangle_{w^{(n)}[G]} = \frac{\langle \mathrm{sign}(w^{(n)}[G])\, \tilde E^{(n)}[G] \rangle_{|w|}}{\langle \mathrm{sign}(w^{(n)}[G]) \rangle_{|w|}}$
For this to be useful, the denominator has to be well-behaved as $\beta \to \infty$,
i.e. the number of positive sampled graphs should exceed the number of
negative sampled graphs in such a way that this difference is finite and
does not vanish.
Monitor fraction of sampled graphs which are trees, positive cyclic and
negative cyclic graphs.
$f[T] = \langle \delta[G \in T] \rangle_{|w|}$, $f[C^+] = \langle \delta[G \in C^+] \rangle_{|w|}$, $f[C^-] = \langle \delta[G \in C^-] \rangle_{|w|}$
$\langle \mathrm{sign}(w^{(n)}[G]) \rangle_{|w|} = f[T] + f[C^+] - f[C^-]$
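The signed estimator can be illustrated on a toy set of "graphs" with made-up weights and local energies. This is a minimal sketch, not the production sampler: the numbers are hypothetical, and samples are drawn directly from $|w|$ rather than by a Markov chain.

```python
import numpy as np

def signed_energy(weights, local_energies):
    """<sign(w) E~>_{|w|} / <sign(w)>_{|w|} from samples drawn with probability ~ |w|."""
    signs = np.sign(weights)
    return float(np.mean(signs * local_energies) / np.mean(signs))

# Hypothetical toy graphs: signed weights and local energies (illustrative numbers only).
w = np.array([2.0, 1.0, -0.5, 1.5])
e = np.array([-1.0, -0.5, 0.3, -0.8])
exact = float(np.sum(w * e) / np.sum(w))      # the w-weighted average we want

rng = np.random.default_rng(1)
picks = rng.choice(len(w), size=200_000, p=np.abs(w) / np.abs(w).sum())
estimate = signed_energy(w[picks], e[picks])
print(exact, estimate)
```

If the negative weights nearly cancelled the positive ones, the average sign in the denominator would approach zero and the statistical error would grow, which is the sign problem in this language.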
For graphs that contain the HF determinant:
$w_{HF} = \sum_n \sum_{G \ni HF} w^{(n)}[G]$
$\tilde E_{HF} = -\frac{\partial \ln w_{HF}}{\partial \beta}$
$\lim_{\beta \to 0} \tilde E_{HF} = E_{HF}$ [Hartree-Fock energy]
$\lim_{\beta \to \infty} \tilde E_{HF} = E_0$ [Ground state energy]
Approximation: truncate series at 2-vertex, 3-vertex or higher-vertex graphs.
$w_{HF}^{(v)} = \sum_{n \le v} \sum_{G \ni HF} w^{(n)}[G]$
2-vertex: double excitations; number of graphs ~ $[N^2 M^2]$
3-vertex: quadruple excitations; number of graphs ~ $[N^4 M^4]$
4-vertex: hextuple excitations; number of graphs ~ $[N^6 M^6]$
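$N_2$ molecule
[Figure: $\tilde E_{HF}$ for the $N_2$ molecule in the VDZ basis.]
[Figure: types of sampled graphs (4-vertex level) and $\langle \mathrm{sign}(w^{(n)}[G]) \rangle_{|w|}$.]
[Figure: $N_2$ sampled energies (4-vertex level).]
[Figure: $N_2$ binding curve, sampling graphs which contain the HF determinant.]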
Applications to periodic systems
Taking a plane-wave PP code (CPMD) which can solve for
(i) KS orbitals and potential
(ii) KS virtuals
-> Use these as the basis for the vertex series
KS Hamiltonian becomes the reference (single-excitations now
contribute)
Need 2-index and 4-index integrals, which are computed on-the-fly
using FFTs (time consuming part)
Advantage: (i) Treatment of periodic systems
(ii) No BSSE
(iii) Can be used as a post-DFT method
Graphite (4 atom) primitive cell. 16el, BHS PP (Ec=90Ry)
2-vertex
Conclusions and outlook
Development of QMC methods based on graphs gives a method
to combat the Fermion sign problem
Proof of concept for small molecular systems
Major effort is now being expended on developing a periodic code…..
…..perhaps to return to surface problems in due course!
Advantage of graph-sampling algorithm
$O(N^2)$ scaling!
The observed stability at the 4-vertex level is extremely encouraging.
Current work:
(1) Extension to higher order graphs
(2) Improved Monte Carlo sampling
(3) Applications to large systems
Graphs
A graph is a set of n distinct elements (in no particular order)
with a given connectivity
$G = \{D_i, D_j, D_k, \ldots\}$ (n distinct determinants)
Connectivity of graph is determined by $\rho_{ij}$
[Figure: two example graphs $G_a$ and $G_b$ drawn on the determinants i, j, k, l, m.]
Compactly expressed:
$Q = \sum_n \sum_G w^{(n)}[G]$
where $G = \{D_i, D_j, D_k, \ldots\}$ is a set of n connected determinants
and $w^{(n)}[G]$ is the sum over all paths which visit all the determinants in G
Each graph represents a sum over exponentially large numbers of paths;
its weight can be expected to be much better behaved than that of
individual paths.
A graph, G, is an object on which we can represent the paths which
visit all the vertices in G
The weight of a given graph is the sum over all paths of length
P which visit all the vertices of the graph:
$w^{(n)}[G] = \sum_{i_1 \in G} \sum_{i_2 \in G} \cdots {\sum_{i_P \in G}}' w^{(P)}[i_1, i_2, \ldots, i_P, i_1]$
The prime indicates that the summation indices must be chosen in
such a way that each vertex in G is visited at least once.
This condition ensures that the weights of two different graphs
$G_a$ and $G_b$ (i.e. two graphs that differ in at least one vertex)
do not double-count paths which visit only $G_a \cap G_b$:
=> $w[G_a] + w[G_b]$ will not double-count the paths that make up $w[G_a \cap G_b]$
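The "visit every vertex at least once" condition can be illustrated by brute force on a tiny $\rho$ matrix using inclusion-exclusion over missed vertices. This is not the resummation used in the talk, only a check of the bookkeeping; the matrix entries and helper names are hypothetical.

```python
from itertools import combinations
import numpy as np

def closed_path_sum(rho, verts, p):
    """Sum over closed P-step paths confined to `verts`: Tr[(rho restricted to verts)^P]."""
    if not verts:
        return 0.0
    sub = rho[np.ix_(verts, verts)]
    return float(np.trace(np.linalg.matrix_power(sub, p)))

def graph_weight(rho, graph, p):
    """w[G]: closed paths confined to G that visit every vertex of G (inclusion-exclusion)."""
    total = 0.0
    for size in range(len(graph) + 1):
        for subset in combinations(graph, size):
            total += (-1) ** (len(graph) - size) * closed_path_sum(rho, list(subset), p)
    return total

# Hypothetical 4-determinant rho; determinants 0 and 3 are not directly connected.
rho = np.array([[0.90, 0.05, 0.02, 0.00],
                [0.05, 0.80, -0.03, 0.01],
                [0.02, -0.03, 0.85, 0.04],
                [0.00, 0.01, 0.04, 0.70]])
p = 8
all_graphs = [g for n in range(1, 5) for g in combinations(range(4), n)]
q_from_graphs = sum(graph_weight(rho, g, p) for g in all_graphs)
q_exact = float(np.trace(np.linalg.matrix_power(rho, p)))
print(q_from_graphs, q_exact)
```

Every closed path visits exactly one set of vertices, so the graph weights add up to $Q = \mathrm{Tr}[\rho^P]$ with no double counting, and a disconnected set such as {0, 3} has zero weight.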
Quantum Chemical applications
Dissociation of diatomic molecules:
Multiple-bond dissociation, e.g. the N2 molecule, is a major challenge to any
ab initio method.
Use HF orbitals generated from MOLPRO
Gaussian basis set [cc-pVDZ or VTZ]
Two-electron primitive integrals read in from MOLPRO output
and the $\rho$ matrix constructed on the fly.
Cost of the calculations
2-vertex: < 1 s
3-vertex: 150 s
4-vertex: 1 week [over $10^9$ 4-vertex graphs to sum]
On a Pentium 4 processor [2003 vintage]
How to make 4-vertex (and eventually higher vertex) calculations
practical?
So if on step t of an MC simulation consisting of K steps we are at graph $G_t$:
$E = \lim_{K \to \infty} \frac{1}{K} \sum_t \tilde E^{(n)}[G_t]$
In order to perform a Metropolis MC simulation, one needs to ensure that
microscopic reversibility is satisfied. In the present implementation, we
generate fresh graphs at each step according to an algorithm to be described
shortly.
In addition one needs to compute the generation probability of a graph using
this algorithm, in order to unbias the Metropolis MC acceptance ratio.
$P_{acc}[G'|G] = \min\left(1, \frac{w^{(n)}[G']\, P_{gen}[G]}{P_{gen}[G']\, w^{(n)}[G]}\right)$
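The role of the generation probability in the acceptance ratio can be seen in a toy chain with three "graphs" that are proposed afresh at every step. This is a sketch under assumptions: the weights and generation probabilities are invented numbers, and $|w|$ is used in the ratio so that negative-weight graphs can be sampled and reweighted by their sign afterwards.

```python
import random

def acceptance(w_new, w_old, pgen_new, pgen_old):
    """min(1, |w'| P_gen[G] / (P_gen[G'] |w|)) for a freshly generated proposal G'."""
    return min(1.0, (abs(w_new) * pgen_old) / (pgen_new * abs(w_old)))

def run_chain(weights, pgen, n_steps, seed=0):
    """Propose graphs with probability pgen, accept with the unbiased ratio, return visit fractions."""
    rng = random.Random(seed)
    graphs = list(range(len(weights)))
    current = rng.choices(graphs, weights=pgen)[0]
    counts = [0] * len(weights)
    for _ in range(n_steps):
        proposal = rng.choices(graphs, weights=pgen)[0]
        if rng.random() < acceptance(weights[proposal], weights[current],
                                     pgen[proposal], pgen[current]):
            current = proposal
        counts[current] += 1
    return [c / n_steps for c in counts]

weights = [2.0, -0.5, 1.5]      # hypothetical graph weights
pgen = [0.2, 0.5, 0.3]          # hypothetical, deliberately non-uniform generation probabilities
fractions = run_chain(weights, pgen, 200_000)
target = [abs(w) / sum(abs(x) for x in weights) for w in weights]
print(fractions, target)
```

Without the $P_{gen}$ factors the chain would over-visit whichever graphs the generator happens to produce most often.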
Tree graphs are graphs that do not contain cycles
[Figure: a tree graph on the vertices i, j, k, l.]
The weight of trees is positive definite at all $\beta$
[Figure: $N_2$ VDZ binding curve, E/a.u. (-109.4 to -108) against r/a.u.; curves: FCI, RHF, v=2 b=5, v=3 b=5 all, v=4 b=5. The FCI space contains 15820024220 determinants and was exactly diagonalised by Krogh, Olsen CPL 344, 578, (2001), and by Chan, Kallay and Gauss, JCP, 121, 610 (2004).]
To summarize:
$w_i = \rho_{ii}^P \left[ 1 + \sum_{j \ne i} S_2(ij) + \sum_{k > j \ne i} S_3(ijk) + \sum_{l > k > j \ne i} S_4(ijkl) + \ldots \right]$
Approximation: truncate series at 2, 3 or higher vertex terms.
2-vertex: double excitations $[N^2 M^2]$
3-vertex: quadruple excitations $[N^4 M^4]$
4-vertex: hextuple excitations $[N^6 M^6]$
By comparison: CCSD: $N^2 (M-N)^4 N_{it}$
CCSDT: $N^3 (M-N)^5 N_{it}$
CCSDTQ: $N^4 (M-N)^6 N_{it}$
CCSDTQ56: $N^6 (M-N)^8 N_{it}$
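An illustration of the Monte Carlo: $\sqrt{8} \times \sqrt{8}$ Hubbard lattice with 6 electrons, momentum-space basis
8-site system at or near half-filling is strongly open-shell
[Figure: the k-points (0,0), (1,0), (1,1), (2,0) and their one-particle energies, ranging from +4 to -4.]
$\epsilon(k_x, k_y) = -2t \{ \cos[(k_x + k_y)/2] + \cos[(k_x - k_y)/2] \}$
$\binom{12}{4} = 495$
$N_{det} = \binom{16}{6} = 8008$
[Figure: graphs containing $D_{HF}$, with vertices labelled by excited determinants such as $D_{ij}^{ab}$, $D_{ij}^{a'b'}$, $D_{i'j'}^{a'b'}$ and $D_{ijkl}^{abcd}$.]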
Finite T
We wish to compute the energy at a finite $\beta^{-1} = kT$ as
$E = \frac{\mathrm{Tr}[H e^{-\beta H}]}{\mathrm{Tr}[e^{-\beta H}]} = \frac{\sum_i \langle D_i| H e^{-\beta H} |D_i\rangle}{\sum_i \langle D_i| e^{-\beta H} |D_i\rangle}$
where the trace is taken over all $N_{det}$ determinants.
Problem is that these sums are not “Monte Carlo-able”.
Sampling Slater determinants
Letting $w_i = \langle D_i| e^{-\beta H} |D_i\rangle$ =>
$E = \frac{\sum_i w_i\, \langle D_i| H e^{-\beta H} |D_i\rangle / w_i}{\sum_i w_i}$
Since $e^{-\beta H}$ is a positive definite operator, its diagonal
matrix elements are positive =>
$p_i = \frac{w_i}{\sum_i w_i} \ge 0$ is a probability in the usual sense of the word
(i.e. non-negative and normalised)
Noting:
$\frac{\langle D_i| H e^{-\beta H} |D_i\rangle}{w_i} = -\frac{\partial \ln \langle D_i| e^{-\beta H} |D_i\rangle}{\partial \beta} = -\frac{\partial \ln w_i}{\partial \beta}$
one can write the energy in a form suitable for a
Monte-Carlo experiment:
writing: $\tilde E_i = -\frac{\partial \ln w_i}{\partial \beta}$
=> $E = \sum_i p_i \tilde E_i = \frac{\sum_i w_i \tilde E_i}{\sum_i w_i} = \langle \tilde E_i \rangle_{w_i}$
where the expectation value is taken over an ensemble of determinants
sampled with probability $p_i$.
Perform Metropolis sampling of $D_i$ chosen according to $w_i$
$E = \lim_{K \to \infty} \frac{1}{K} \sum_t \tilde E_{i_t}$
where $i_t$ is the determinant on step t of the MC simulation
$D_i \to D_j$, $P_{acc} = \min(1, w_j / w_i)$
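That the $p_i$-weighted average of the local energies reproduces the thermal energy can be confirmed on a matrix small enough to diagonalise. The sketch below uses an arbitrary 3x3 symmetric matrix in place of a real Hamiltonian, and the helper name is hypothetical.

```python
import numpy as np

def determinant_weights(h, beta):
    """Return w_i = <D_i|exp(-beta H)|D_i> and E~_i = <D_i|H exp(-beta H)|D_i> / w_i."""
    evals, evecs = np.linalg.eigh(h)
    boltz = np.exp(-beta * evals)
    overlaps = evecs ** 2                 # |<D_i|Psi_a>|^2, rows i, columns a
    w = overlaps @ boltz
    e_local = (overlaps @ (evals * boltz)) / w
    return w, e_local

# Arbitrary small symmetric "Hamiltonian" in the determinant basis (illustrative only).
h = np.array([[-1.0, 0.3, 0.2],
              [0.3, 0.0, 0.1],
              [0.2, 0.1, 1.0]])
beta = 2.0
w, e_local = determinant_weights(h, beta)
p = w / w.sum()
evals = np.linalg.eigvalsh(h)
thermal = float(np.sum(evals * np.exp(-beta * evals)) / np.sum(np.exp(-beta * evals)))
print(float(np.sum(p * e_local)), thermal)
```

All $w_i$ come out positive, so at this level the determinants can be sampled without any sign problem; the difficulty lies entirely in computing each $w_i$.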
Problem: the weight itself is a path-integral!
Define:
$\rho_{ij} = \langle D_i| e^{-\beta H/P} |D_j\rangle$ [High-temperature DM], with $\beta/P \sim 10^{-3} - 10^{-4}$
$w_i = \langle D_i| e^{-\beta H} |D_i\rangle = (\rho^P)_{ii}$, with $P \sim 10^4 - 10^5$
=> $w_i = \sum_{j,k,\ldots,l} \rho_{ij} \rho_{jk} \cdots \rho_{li}$ (P factors)
Discrete path integral: wildly oscillatory integrand.
Can’t use Monte Carlo!
Hopeless to calculate by brute-force!
[Figure: a closed path $i \to j \to k \to \ldots \to l \to i$.]
Generation of graphs with a computable generation probability
We adopt a Markov chain algorithm in which successive determinants
are added to a list until the desired size of graph is reached. However,
since the connectivity of each determinant is not uniform, such an
algorithm can produce a non-uniform generation probability.
[Figure: determinant i and its connected determinants j, j2, j3, with k connected to j.]
Start at i, and select a connected determinant, j, with probability $p_{ij}$. This
results in a 2-vertex graph, G={i,j}.
Next, select k, connected to j, with probability $p_{jk}$. If k is distinct, then add k
to the list: G={i,j,k}. Otherwise, select a new determinant from the current
position (i.e. the last visited determinant).
Continue this process until n distinct vertices have been visited.
The generation probability can be calculated by examining all possible
ways of generating G according to this algorithm.
For example, for a 3-vertex graph, G={i,j,k}:
$P_{gen}[G] = \sum_{n \ge 0} (p_{ij} p_{ji})^n p_{ij} p_{jk} + \sum_{n \ge 1} (p_{ij} p_{ji})^n p_{ik} + \sum_{n \ge 0} (p_{ik} p_{ki})^n p_{ik} p_{kj} + \sum_{n \ge 1} (p_{ik} p_{ki})^n p_{ij}$
$= \frac{p_{ij} (p_{jk} + p_{ji} p_{ik})}{1 - p_{ij} p_{ji}} + \frac{p_{ik} (p_{kj} + p_{ki} p_{ij})}{1 - p_{ik} p_{ki}}$
This procedure can be generalised for an n-vertex graph (n>3).
The general case is most compactly expressed in matrix notation.
Let us call our n vertices G={i1,i2,…,in}, all distinct, with i1=i.
Consider the generation probability of G in the given order (i1,i2,…,in).
According to this algorithm, we visit i2 for the first time from i1, i3 for
the first time from either i1 or i2, etc. In general we visit ik for the first
time from any of the previously visited k-1 vertices. The algorithm
terminates when we first visit the n-th vertex.
This is a first-passage problem in Markov chain theory.
We will construct a series of transition probability matrices in which
vertex $i_k$ is an absorbing state:
$P^{(k)}[i_1, i_2, \ldots, i_k] = \begin{pmatrix} 0 & p_{i_1 i_2} & \cdots & p_{i_1 i_k} \\ p_{i_2 i_1} & 0 & p_{i_2 i_3}\ \cdots & p_{i_2 i_k} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$
Note that:
$[(P^{(k)})^n]_{i_{k-1}, i_k}$
represents the probability of arriving at $i_k$ in exactly n steps given we
started from $i_{k-1}$, passing through some or all of $(i_1, i_2, \ldots, i_{k-1})$.
The total probability of arriving at $i_k$ is therefore simply the geometric
series (taken over walks that reach $i_k$ for the first time on their final step, so that the sum converges):
$\sum_n [(P^{(k)})^n]_{i_{k-1}, i_k} = \left[ \frac{1}{I - P^{(k)}} \right]_{k-1,k}$
Therefore the probability to generate the graph G in the sequence
$i_1 \to i_2 \to \ldots \to i_k$ is
$P_{gen}[i_1, i_2, \ldots, i_k] = \left[ \frac{1}{I - P^{(2)}} \right]_{1,2} \left[ \frac{1}{I - P^{(3)}} \right]_{2,3} \cdots \left[ \frac{1}{I - P^{(k)}} \right]_{k-1,k}$
The probability to generate G in any order is given by the sum over all n!
permutations:
$P_{gen}[G] = \sum_{\hat P} P_{gen}[\hat P(i_1, i_2, \ldots, i_k)]$
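The first-passage construction can be sketched numerically and compared with the explicit 3-vertex formula. Two assumptions are made here and are not taken from the talk: the walk always starts from a fixed determinant, so only orderings of the remaining vertices are summed, and the geometric series is evaluated on the block of already-visited vertices by solving a linear system. The transition matrix is random test data.

```python
import itertools
import numpy as np

def first_passage(p, visited, target):
    """Probability of reaching `target` for the first time, starting from visited[-1],
    while stepping only on vertices already in `visited`."""
    idx = list(visited)
    q = p[np.ix_(idx, idx)]          # moves among already-visited vertices
    r = p[idx, target]               # one-step moves into the new vertex
    hit = np.linalg.solve(np.eye(len(idx)) - q, r)
    return float(hit[-1])            # the walk sits on the most recently added vertex

def p_gen_ordered(p, order):
    """Generation probability of the vertices in one specific order of first visits."""
    prob = 1.0
    for k in range(1, len(order)):
        prob *= first_passage(p, order[:k], order[k])
    return prob

def p_gen(p, start, others):
    """Sum over all orders in which the remaining vertices can first be visited."""
    return sum(p_gen_ordered(p, (start,) + perm)
               for perm in itertools.permutations(others))

rng = np.random.default_rng(3)
p = rng.random((4, 4))
np.fill_diagonal(p, 0.0)
p /= p.sum(axis=1, keepdims=True)    # hypothetical transition probabilities; vertex 3 lies outside G
i, j, k = 0, 1, 2
closed = (p[i, j] * (p[j, k] + p[j, i] * p[i, k]) / (1 - p[i, j] * p[j, i])
          + p[i, k] * (p[k, j] + p[k, i] * p[i, j]) / (1 - p[i, k] * p[k, i]))
print(p_gen(p, i, (j, k)), closed)
```

For larger graphs the number of orderings grows factorially, so this brute-force sum is only practical for the small vertex counts discussed here.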
In the current implementation we choose
$p_{ij} = \frac{1}{N_i}$
[Figure: determinant i and its connected determinants j1, j2, j3.]
where $N_i$ is the number of determinants connected to i. In other words
we do not introduce an energetic bias in the selection of determinants.
Conclusions
A new approach for Fermion Monte Carlo is being developed, based on
sampling Slater determinant space with weights computed according to
a novel path counting scheme.
The mathematics of path-counting needs further investigation.
The scheme has been applied to the Hubbard model and the N2 problem with
encouraging results.
Topology of graphs
Cyclic: i, j and k must all be single or double excitations of
each other.
Star (tree): j and k must both be s- or d-excitations of i,
but not necessarily of each other.
Chain (tree): j must be an s- or d-excitation of i, and k
is a triple or quadruple excitation of i.
Future work
Technical
Sampling graphs to counter the scaling problem
Calculation of electron density
Parallelisation of code
Systems:
Hubbard models [e.g. stability of striped phases]
Dispersion interactions (e.g. graphitic systems)
Contribution to the weights
$w_i = \rho_{ii}^P \left[ 1 + \sum_{j \ne i} S_2(ij) + \sum_{k > j \ne i} S_3(ijk) + \sum_{l > k > j \ne i} S_4(ijkl) + \ldots \right]$
[Figure: 10-site, N=10, U=4 weights. Contribution to the weight (0 to 9) against vertex number (0 to 6); series: momentum b=1, UHF b=1, UHF b=2, UHF b=5, UHF b=10.]
Tentative conclusion is: for the Hubbard model with U=4, the 3-vertex
approximation is not perfect, although it is nevertheless an improvement
over UHF: Captures about 20-50% of the correlation energy.
Can we estimate the contribution of the higher order graphs through a MC
sampling?
=> work in this direction is in progress
[Figure: distribution of terms among the 2-, 3- and 4-vertex graphs (UHF basis).]
[Figure: convergence of Êi with the vertex approximation, N=10, U=4, beta=1. Energy (-9 to -5) against vertex number (0 to 6); series: vertex [mom], vertex UHF b=1; reference lines: exact GS, UHF, RHF.]
$\sqrt{10} \times \sqrt{10}$ Hubbard Model
184756 determinants at N=10.
Exactly diagonalisable with effort
on a P4
Half-filled system is closed-shell
[Figure: one-particle energy levels at +4, +1, -1, -4.]
Two important questions
How good is the 3-vertex approximation?
What is the best one-particle basis to use?
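A 3-determinant star
[Figure: star graph with centre i and spokes j, k.]
Diagonalisation:
$\rho = \begin{pmatrix} a & b & b \\ b & a & 0 \\ b & 0 & a \end{pmatrix}$, $\lambda_\pm = a \pm \sqrt{2}\, b$, $\lambda_3 = a$
$c_+ = (1/\sqrt{2},\ 1/2,\ 1/2)^T$, $c_- = (-1/\sqrt{2},\ 1/2,\ 1/2)^T$, $c_3 = (0,\ 1/\sqrt{2},\ -1/\sqrt{2})^T$
$w = \frac{(a + \sqrt{2}\, b)^P}{2} + \frac{(a - \sqrt{2}\, b)^P}{2}$
Contour integral solution:
$A_{ij}(z) = A_{ik}(z) = \frac{b^2}{a^2 (z - 1)^2}$
$S_3 = \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1} \frac{1}{1 - A_{ij}(z) - A_{ik}(z)}\, dz = \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1} \frac{1}{1 - \frac{2 b^2}{a^2 (z - 1)^2}}\, dz$
$= \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1} \frac{(z - 1)^2}{(z - 1)^2 - 2 (b/a)^2}\, dz = \frac{1}{2\pi i} \oint (z^P - 1) \frac{(z - 1)}{(z - 1)^2 - 2 (b/a)^2}\, dz$
$= \frac{1}{2\pi i} \oint (z^P - 1) \frac{(z - 1)}{(z - 1 - \sqrt{2}\, b/a)(z - 1 + \sqrt{2}\, b/a)}\, dz$
$= \frac{[(1 + \sqrt{2}\, b/a)^P - 1](\sqrt{2}\, b/a)}{2\sqrt{2}\, (b/a)} + \frac{[(1 - \sqrt{2}\, b/a)^P - 1](-\sqrt{2}\, b/a)}{-2\sqrt{2}\, (b/a)}$
$= \frac{1}{2} \left[ \frac{(a + \sqrt{2}\, b)^P}{a^P} - 1 \right] + \frac{1}{2} \left[ \frac{(a - \sqrt{2}\, b)^P}{a^P} - 1 \right]$
Therefore
$w = a^P (1 + S_3) = \frac{1}{2} \left[ (a + \sqrt{2}\, b)^P + (a - \sqrt{2}\, b)^P \right]$
Again in exact agreement with the diagonalisation result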
Fully connected 3-vertex graph
[Figure: triangle on the vertices i, j, k.]
Via diagonalisation:
$\rho = \begin{pmatrix} a & b & b \\ b & a & b \\ b & b & a \end{pmatrix}$, $\lambda_1 = a + 2b$, $\lambda_{2,3} = a - b$
$\psi_1 = (1/\sqrt{3},\ 1/\sqrt{3},\ 1/\sqrt{3})^T$, $\psi_2 = (0,\ 1/\sqrt{2},\ -1/\sqrt{2})^T$, $\psi_3 = (2/\sqrt{6},\ -1/\sqrt{6},\ -1/\sqrt{6})^T$
$w_i = \frac{(a + 2b)^P}{3} + \frac{2}{3} (a - b)^P$
Via the contour integral:
$S_3 = \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1} \frac{1 - A_{jk}}{1 - 2 A_{ijk} - A_{ij} - A_{ik} - A_{jk}}\, dz$
$= \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1}\, \frac{1 - \frac{b^2}{a^2 (z - 1)^2}}{1 - \frac{2 b^3}{a^3 (z - 1)^3} - \frac{3 b^2}{a^2 (z - 1)^2}}\, dz$
[multiply top and bottom by $(z - 1)^3$]
$= \frac{1}{2\pi i} \oint (z^P - 1) \frac{(z - 1)^2 - (b/a)^2}{(z - 1)^3 - 2 (b/a)^3 - 3 (b/a)^2 (z - 1)}\, dz$
[Factorise]
$= \frac{1}{2\pi i} \oint (z^P - 1) \frac{(z - 1 - b/a)(z - 1 + b/a)}{(z - 1 + b/a)^2 (z - 1 - 2b/a)}\, dz$
[Cancel factors]
$= \frac{1}{2\pi i} \oint (z^P - 1) \frac{(z - 1 - b/a)}{(z - 1 + b/a)(z - 1 - 2b/a)}\, dz$
[Evaluate two residues at $z = 1 - b/a$ and $z = 1 + 2b/a$]
$= \frac{[(1 - b/a)^P - 1](-2b/a)}{(-3b/a)} + \frac{[(1 + 2b/a)^P - 1](b/a)}{3 (b/a)}$
$= \frac{2}{3} \left[ \frac{(a - b)^P}{a^P} - 1 \right] + \frac{1}{3} \left[ \frac{(a + 2b)^P}{a^P} - 1 \right]$
Therefore
$w = a^P (1 + S_3) = \frac{2}{3} (a - b)^P + \frac{1}{3} (a + 2b)^P$
[Figure: $N_2$ VDZ binding curve, E/a.u. (-109.4 to -107.6) against r/a.u. (0 to 10); curves: FCI, RHF, CCSD, CCSD(T), CCSDT, v=2 b=5, v=3 b=5 all, v=3 b=5 spec, v=3 b=5 spec3, v=4 b=5 fullsum spec3.]

8-site Hubbard model with N=6 electrons
3-vertex weights against exact weights
Some simple examples.
(a) A two-determinant system
[Figure: two determinants i and j, with $\rho_{ii} = \rho_{jj} = a$ and $\rho_{ij} = b$.]
Exact solution via diagonalisation:
$\rho = \begin{pmatrix} a & b \\ b & a \end{pmatrix}$, $\lambda_\pm = a \pm b$, $\psi_\pm = (1/\sqrt{2},\ \pm 1/\sqrt{2})^T$
$w_i = \sum_k \lambda_k^P |\langle D_i | \psi_k \rangle|^2 = \frac{(a + b)^P}{2} + \frac{(a - b)^P}{2}$
Solution via contour integral formula:
First define A(z):
$A(z) = \frac{b^2}{(za - a)(za - a)} = \frac{b^2}{a^2 (z - 1)^2}$
Next compute $S_2$:
$S_2 = \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1} \frac{1}{1 - A(z)}\, dz = \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1} \frac{1}{1 - \frac{b^2}{a^2 (z - 1)^2}}\, dz$
$= \frac{1}{2\pi i} \oint \frac{z^P - 1}{z - 1} \frac{(z - 1)^2}{(z - 1)^2 - (b/a)^2}\, dz = \frac{1}{2\pi i} \oint (z^P - 1) \frac{(z - 1)}{(z - 1)^2 - (b/a)^2}\, dz$
$= \frac{1}{2\pi i} \oint (z^P - 1) \frac{(z - 1)}{(z - 1 - b/a)(z - 1 + b/a)}\, dz$
$= \frac{[(1 + b/a)^P - 1](b/a)}{2(b/a)} + \frac{[(1 - b/a)^P - 1](-b/a)}{-2(b/a)}$
$= \frac{1}{2} \left[ \frac{(a + b)^P}{a^P} - 1 \right] + \frac{1}{2} \left[ \frac{(a - b)^P}{a^P} - 1 \right]$
Therefore
$w = a^P (1 + S_2) = \frac{1}{2} \left[ (a + b)^P + (a - b)^P \right]$
In exact agreement with diagonalisation result
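The closed forms for the simple examples (the two-determinant system, the 3-determinant star and the fully connected 3-vertex graph) can be checked against a direct matrix power, since $w_i = (\rho^P)_{ii}$. The values of a, b and P below are arbitrary test numbers, and `path_weight` is a hypothetical helper.

```python
import numpy as np

def path_weight(rho, i, p):
    """w_i = (rho^P)_{ii}: the sum over all closed P-step paths that start and end at i."""
    return float(np.linalg.matrix_power(rho, p)[i, i])

a, b, p = 0.98, -0.01, 50
matrices = {
    "two-determinant": np.array([[a, b], [b, a]]),
    "star": np.array([[a, b, b], [b, a, 0.0], [b, 0.0, a]]),
    "fully connected": np.array([[a, b, b], [b, a, b], [b, b, a]]),
}
closed_forms = {
    "two-determinant": 0.5 * ((a + b) ** p + (a - b) ** p),
    "star": 0.5 * ((a + np.sqrt(2) * b) ** p + (a - np.sqrt(2) * b) ** p),
    "fully connected": ((a + 2 * b) ** p + 2 * (a - b) ** p) / 3,
}
for name, rho in matrices.items():
    print(name, path_weight(rho, 0, p), closed_forms[name])
```

In each case vertex 0 plays the role of i, the centre of the star or one corner of the triangle.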
Residue theorem
$\frac{1}{2\pi i} \oint_C f(z)\, dz = $ (sum of enclosed residues)
Residue at pole of order m at $z_0$:
$a_{-1} = \frac{1}{(m - 1)!} \left. \frac{d^{(m-1)}}{dz^{(m-1)}} \left[ (z - z_0)^m f(z) \right] \right|_{z = z_0}$
[Figure: exact ground-state energy, UHF and lowest Êi vs particle number for the 10-site, U=4 model. Energy/t (-11 to -6) against Nel (0 to 12); series: Ground E, E UHF, v=3 [momentum], v=3 UHF.]
The electron correlation problem
How to account for the fact that electrons move around in a
correlated fashion?
Quantum chemistry approach is:
Start from Hartree-Fock and try to improve systematically
Hartree-Fock [mean-field theory, $N^3 \sim N^4$]: $\Psi_{HF} = D_0$
Coupled Cluster [CCSD(T), $N^7$] + perturbation theory: $\Psi = e^T \Psi_{HF}$
Full configuration interaction [$e^N$]: $\Psi = \Psi_{HF} + \sum_j c_j D_j$
(expansions in antisymmetric functions)
Essential feature of HF theory: maintains an orbital (one-particle)
picture of electronic structure
$\Psi_{HF} = \det[u_1(x_1) u_2(x_2) \ldots u_N(x_N)]$
Quantum Monte Carlo
QMC refers to stochastic methods to solve the Schrödinger Equation
(or sample path integrals) based on interpreting the S.E. as a “diffusion equation
in imaginary time”:
$-\frac{\partial \Psi}{\partial \tau} = H \Psi = -\frac{1}{2} \nabla^2 \Psi + V \Psi$
$\Psi$ is interpreted as a probability distribution. Long-time stochastic propagation
[diffusion + life/death processes] leads to sampling the nodeless eigenstate
of H.
Application to Fermion systems is severely hampered by “sign problems”.
Unconstrained sampling of the configuration space of Fermions leads to Boson
Catastrophe.
QMC can be stabilised by the introduction of constraints:
-Fixed node approximation in diffusion MC [J Anderson]
-Restricted path integral MC (fixing nodes of density matrix) [Ceperley]
-Positive projection and constrained path MC for auxiliary field QMC
[Fahy and Hamann, Zhang]
Why not use antisymmetric spaces?
We would like to explore the possibility of using an antisymmetrized space
as the basis for quantum Monte Carlo.
Can we sample a set of Slater determinants in such a way that we can
extract meaningful physical quantities (e.g. energy) at the end of
the simulation?
This strategy avoids the Boson catastrophe from the outset without
imposition of fixed-node type approximations.
We will show that
(1) Such a method is indeed numerically stable
(2) The MC weights are obtained by summing over many paths of
fluctuating sign.
(3) Method depends on combinatorial ideas for path counting
-> So far it is not exact
(4) Applications to (a) Hubbard model and (b) Dissociating molecules
A major conceptual advantage is that it allows one to build directly on the
one-particle picture of mean-field theory.
What’s the problem?
$N_{det} = \binom{2M}{N}$
e.g. $M = 100$, $N = 10$: $N_{det} \approx 2 \times 10^{16}$
[Figure: $N_2$ VDZ binding curve (15820024220 determinants; exactly diagonalised by Krogh, Olsen CPL 344, 578, (2001), and by Chan, Kallay and Gauss, JCP, 121, 610 (2004)). Energy/Hartrees (-109.4 to -107.6) against r/a.u. (0 to 12); curves: FCI, RHF, v=2 b=5, HF v=3, SPEC2, SPEC3, v=4 b=5.]
[Figure: molecular-orbital diagrams for the dissociation $N_2(^1\Sigma_g^+) \to N(^4S) + N(^4S)$, showing the orbitals $1\sigma_g$, $2\sigma_u$, $3\sigma_g(2p_z)$, $1\pi_u$, $2\pi_g$ and $4\sigma_u(2p_z)$ and the occupations for the $^1\Sigma_g^+$, $^5\Sigma_g^+$ and $^7\Sigma_u^+$ states.]
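Formally speaking, in the eigenvalue basis of H:
$w_i = \sum_a e^{-\beta E_a} |\langle D_i | \Psi_a \rangle|^2$
$\tilde E_i = \frac{\sum_a E_a e^{-\beta E_a} |\langle D_i | \Psi_a \rangle|^2}{\sum_a e^{-\beta E_a} |\langle D_i | \Psi_a \rangle|^2}$
$\lim_{\beta \to \infty} \tilde E_i = E_0$ when $\langle D_i | \Psi_0 \rangle \ne 0$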
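The two limits of the local energy can be observed on a small matrix: at large $\beta$ it approaches the lowest eigenvalue, and at small $\beta$ it approaches the diagonal element $\langle D_i | H | D_i \rangle$. The sketch below uses an arbitrary 3x3 symmetric matrix as a stand-in for H; the helper name is hypothetical.

```python
import numpy as np

def local_energy(h, i, beta):
    """E~_i = sum_a E_a e^{-beta E_a} |<D_i|Psi_a>|^2 / sum_a e^{-beta E_a} |<D_i|Psi_a>|^2."""
    evals, evecs = np.linalg.eigh(h)
    shifted = evals - evals.min()    # a common Boltzmann factor cancels in the ratio
    weights = np.exp(-beta * shifted) * evecs[i] ** 2
    return float(np.sum(evals * weights) / np.sum(weights))

# Arbitrary small symmetric matrix in the determinant basis (illustrative only).
h = np.array([[-1.0, 0.3, 0.2],
              [0.3, 0.0, 0.1],
              [0.2, 0.1, 1.0]])
e0 = float(np.linalg.eigvalsh(h).min())
for beta in (1e-8, 1.0, 10.0, 200.0):
    print(beta, local_energy(h, 0, beta))
print("H_00 =", h[0, 0], " E_0 =", e0)
```

Subtracting the lowest eigenvalue before exponentiating avoids overflow at large $\beta$ without changing the ratio.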
Hubbard Model
$H = -t \sum_{\sigma, \langle i,j \rangle} [c^\dagger_{\sigma,i} c_{\sigma,j} + \mathrm{h.c.}] + U \sum_i n_{\uparrow,i} n_{\downarrow,i}$
Model of itinerant magnetism for narrow-band systems.
Intensively studied since the mid-80’s in the context of High Tc.
Partition function
$Q = \mathrm{Tr}[e^{-\beta H}] = \sum_i \langle D_i| e^{-\beta H} |D_i\rangle = \sum_{i_1} \sum_{i_2} \cdots \sum_{i_P} w^{(P)}[i_1, i_2, \ldots, i_P, i_1]$
Note that the sign of $w^{(P)}$ is a very poorly behaved quantity:
it depends on the product of P matrix elements. Therefore
small variations in the path can lead to wild fluctuations in
the sign of the path.
Exact Diagonalisation
$H = T + U + V$
$T = -\frac{1}{2} \sum_i^N \nabla_i^2$, $U = \sum_{i<j}^N \frac{1}{|r_i - r_j|}$, $V = \sum_i^N v(r_i)$
$H \Psi_a = E_a \Psi_a$, $\Psi_a = \sum_i c_{a,i} D_i$
$\sum_j \langle D_i| H |D_j\rangle c_{a,j} = E_a c_{a,i}$: a linear eigenvalue problem
Exact solutions are expressed as linear superpositions of uncorrelated
determinants.
Conjecture: the n-vertex graph gives rise to a polynomial of
degree n in the denominator of the contour integral
Contour integrals reduce to a sum over n residues
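$Q = \int dx_1 dx_2 \ldots dx_P\, \langle x_1| e^{-(\beta/P)\hat H} |x_2\rangle \langle x_2| e^{-(\beta/P)\hat H} |x_3\rangle \cdots \langle x_P| e^{-(\beta/P)\hat H} |x_1\rangle$
[Figure: ring of beads $x_1, x_2, x_3, \ldots, x_P$.]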
Harmonic springs hold together a “ring polymer”
$E = \frac{1}{Q} \int dx_1 dx_2 \ldots dx_P\, E[x_1, x_2, \ldots, x_P]\, \langle x_1| e^{-(\beta/P)\hat H} |x_2\rangle \langle x_2| e^{-(\beta/P)\hat H} |x_3\rangle \cdots \langle x_P| e^{-(\beta/P)\hat H} |x_1\rangle$
Motivation
The development of stable fermion QMC algorithms which do not
require fixed-node approximations, but which maintain a ~ $N^2$ or $N^3$ scaling.
Does working in antisymmetric spaces (e.g. Slater determinants) help?
Intuitively, Slater determinant spaces are the “right” spaces for dealing
with fermions: i.e. one should build in the fermion antisymmetry from the outset
in any N-particle representation.
Computational cost of electronic structure
methods
Hartree Fock: $N^3$
MP2 - MP4: $N^4$ - $N^7$
Coupled Cluster CCSD-(T): $N^6$ - $N^7$
FCI: $e^N$
DFT: $N^3$
QMC: $N^2$ - $N^3$
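Electron correlation is ubiquitous in chemistry
e.g. in molecular dissociation
[Figure: $H_2$ orbital diagram; the atomic orbitals $s_A$ and $s_B$ combine into $\sigma_g$ and $\sigma_u$.]
$\sigma_g = \frac{1}{\sqrt{2}} (s_A + s_B)$
$D_0(x_1, x_2) = \frac{1}{\sqrt{2}} \begin{vmatrix} \sigma_g \alpha(1) & \sigma_g \beta(1) \\ \sigma_g \alpha(2) & \sigma_g \beta(2) \end{vmatrix} = \sigma_g(1) \sigma_g(2)\, \frac{1}{\sqrt{2}} (\alpha_1 \beta_2 - \beta_1 \alpha_2)$
Slater determinant: an uncorrelated wavefunction
$D_0 \sim \frac{1}{2} (s_A(1) + s_B(1))(s_A(2) + s_B(2))$
$= \frac{1}{2} (s_A(1) s_A(2) + s_A(1) s_B(2) + s_A(2) s_B(1) + s_B(1) s_B(2))$
The four terms correspond to $H^- H^+$, $H\cdot\ H\cdot$, $H\cdot\ H\cdot$ and $H^+ H^-$
Incorrect dissociation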
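Configuration Interaction
Consider the doubly excited determinant:
$D_2 = \frac{1}{\sqrt{2}} \begin{vmatrix} \sigma_u \alpha(1) & \sigma_u \beta(1) \\ \sigma_u \alpha(2) & \sigma_u \beta(2) \end{vmatrix}$
$\Psi_{corr} = \frac{1}{\sqrt{2}} (D_0 - D_2) \sim \frac{1}{\sqrt{2}} (s_A(1) s_B(2) + s_A(2) s_B(1))$
Both terms correspond to $H\cdot\ H\cdot$
Correlated wavefunction
Correct dissociation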
What is the problem with configuration interaction?
-Slowly convergent with respect to short and intermediate
range correlation.
Must include many determinants: the problem grows exponentially
with number of electrons and the number of virtuals
-(linear) Truncated CI lacks size consistency:
Coupled cluster methods are nowadays preferred.