Approximation Methods Physics 130B, UCSD Fall 2009 Joel Broida November 15, 2009

Contents

1 The Variation Method
  1.1 The Variation Theorem
  1.2 Excited States
  1.3 Linear Variation Functions
      1.3.1 Proof that the Roots of the Secular Equation are Real

2 Time-Independent Perturbation Theory
  2.1 Perturbation Theory for a Nondegenerate Energy Level
  2.2 Perturbation Theory for a Degenerate Energy Level
  2.3 Perturbation Treatment of the First Excited States of Helium
  2.4 Spin–Orbit Coupling and the Hydrogen Atom Fine Structure
      2.4.1 Supplement: Miscellaneous Proofs
  2.5 The Zeeman Effect
      2.5.1 Strong External Field
      2.5.2 Weak External Field
      2.5.3 Intermediate-Field Case
      2.5.4 Supplement: The Electromagnetic Hamiltonian

3 Time-Dependent Perturbation Theory
  3.1 Transitions Between Two Discrete States
  3.2 Transitions to a Continuum of States
1 The Variation Method

1.1 The Variation Theorem
The variation method is one approach to approximating the ground state energy
of a system without actually solving the Schrödinger equation. It is based on the
following theorem, sometimes called the variation theorem.
Theorem 1.1. Let a system be described by a time-independent Hamiltonian H, and let ϕ be any normalized, well-behaved function that satisfies the boundary conditions of the problem. If E0 is the true ground state energy of the system, then

    ⟨ϕ|Hϕ⟩ ≥ E0 .    (1.1)
Proof. Consider the integral I = ⟨ϕ|(H − E0)ϕ⟩. Then

    I = ⟨ϕ|Hϕ⟩ − E0⟨ϕ|ϕ⟩ = ⟨ϕ|Hϕ⟩ − E0 .

We must show that I ≥ 0. Let {ψn} be the true (stationary state) solutions to the Schrödinger equation, so that Hψn = En ψn. By assumption, the ψn form a complete, orthonormal set, so we can write

    ϕ = Σ_n an ψn

where ⟨ψn|ψm⟩ = δnm. Then

    I = Σ_{n,m} an* am ⟨ψn|(H − E0)ψm⟩
      = Σ_{n,m} an* am (⟨ψn|Hψm⟩ − E0 δnm)
      = Σ_{n,m} an* am (Em − E0) δnm
      = Σ_n |an|² (En − E0) .

But |an|² ≥ 0 and En ≥ E0 for all n, since E0 is the ground state energy of the system. Therefore I ≥ 0 as claimed.
Suppose we have a trial function ϕ that is not normalized. Then multiplying by a normalization constant N, equation (1.1) becomes |N|² ⟨ϕ|Hϕ⟩ ≥ E0. But by definition we know that 1 = ⟨Nϕ|Nϕ⟩ = |N|² ⟨ϕ|ϕ⟩ so that |N|² = 1/⟨ϕ|ϕ⟩, and hence our variation theorem becomes

    ⟨ϕ|Hϕ⟩ / ⟨ϕ|ϕ⟩ ≥ E0 .    (1.2)
The integral in (1.1) (or the ratio of integrals in (1.2)) is called the variational
integral.
So the idea is to try a number of different trial functions, and see how low we can
get the variational integral to go. Fortunately, the variational integral approaches
E0 a lot faster than ϕ approaches ψ0 , so it is possible to get a good approximation
to E0 even with a fairly poor ϕ. A common refinement is to introduce adjustable parameters into the trial function and minimize the variational integral with respect to them.
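Since the variational integral is just a Rayleigh quotient, the theorem is easy to check numerically. Below is my own finite-difference sketch (not part of the notes), in units ℏ = m = 1 for a particle in a box of length l = 1: every trial vector that vanishes at the walls gives a variational integral at or above the ground state energy.

```python
import numpy as np

# Finite-difference sketch: particle in a box with hbar = m = l = 1,
# exact ground state energy E_0 = pi^2/2 ~ 4.9348.
n = 500
dx = 1.0 / (n + 1)
x = np.linspace(dx, 1.0 - dx, n)   # interior points; the wavefunction vanishes at the walls

# H = -(1/2) d^2/dx^2 via the standard 3-point stencil (V = 0 inside the box)
D2 = (np.diag(np.ones(n - 1), 1) - 2 * np.eye(n) + np.diag(np.ones(n - 1), -1)) / dx**2
H = -0.5 * D2

def variational_integral(phi):
    """<phi|H phi> / <phi|phi> for a trial function sampled on the grid."""
    return (phi @ H @ phi) / (phi @ phi)

E0 = np.linalg.eigvalsh(H)[0]   # ground state energy of the discretized problem

# A simple polynomial trial function already comes within ~1.3% of E0
W = variational_integral(x * (1 - x))
print(W)   # ~5.0, slightly above E0

# Every admissible trial function bounds E0 from above, even random ones
rng = np.random.default_rng(0)
assert all(variational_integral(rng.standard_normal(n)) >= E0 for _ in range(100))
```

The grid size, stencil, and random-vector check are my own choices for illustration; the bound itself is exactly the statement of Theorem 1.1.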
Before continuing with an example, there are two points I need to make. First, I
state without proof that the bound stationary states of a one-dimensional system are
characterized by having no nodes interior to the boundary points in the ground state
(i.e., the wavefunction is never zero), and the number of nodes increases by one for
each successive excited state. While the proof of this statement is not particularly
difficult (it’s really a statement about Sturm-Liouville type differential equations),
it would take us too far astray at the moment. If you are interested, a proof may
be found in Messiah, Quantum Mechanics, Chapter III, Sections 8-12.
A related issue is the following: in one dimension, the bound states are nondegenerate. To prove this, suppose we have two degenerate states ψ1 and ψ2, both with the same energy E. Multiply the Schrödinger equation for ψ1 by ψ2:

    −(ℏ²/2m) ψ2 (d²ψ1/dx²) + V ψ1ψ2 = Eψ1ψ2

and multiply the Schrödinger equation for ψ2 by ψ1:

    −(ℏ²/2m) ψ1 (d²ψ2/dx²) + V ψ1ψ2 = Eψ1ψ2 .

Subtracting, we obtain

    ψ2 (d²ψ1/dx²) − ψ1 (d²ψ2/dx²) = 0 .

But then

    (d/dx)[ψ2 (dψ1/dx) − ψ1 (dψ2/dx)] = ψ2 (d²ψ1/dx²) − ψ1 (d²ψ2/dx²) = 0

so that

    ψ2 (dψ1/dx) − ψ1 (dψ2/dx) = const .

However, we know that ψ → 0 as x → ±∞, and hence the constant must equal zero. Dividing by ψ1ψ2 and rewriting, this result becomes d ln ψ1/dx = d ln ψ2/dx, or ln ψ1 = ln ψ2 + ln k where ln k is an integration constant. This is equivalent to ψ1 = kψ2, so that ψ1 and ψ2 are linearly dependent; they describe the same physical state, and hence the level is nondegenerate as claimed.
The second topic I need to address is the notion of classification by symmetry. So, let us consider the time-independent Schrödinger equation Hψ = Eψ, and suppose that the potential energy function V(x) is symmetric, i.e.,

    V(−x) = V(x) .

Under these conditions, the total Hamiltonian is also symmetric:

    H(−x) = H(x) .

To understand the consequences of this, let us introduce an operator Π called the parity operator, defined by

    Πf(x) = f(−x)
where f(x) is an arbitrary function. It is easy to see that Π is Hermitian because

    ⟨f|Πg⟩ = ∫_{−∞}^{∞} f(x)* Πg(x) dx = ∫_{−∞}^{∞} f(x)* g(−x) dx
           = ∫_{−∞}^{∞} f(−x)* g(x) dx = ∫_{−∞}^{∞} [Πf(x)]* g(x) dx
           = ⟨Πf|g⟩
where in going from the first line to the second we simply changed variables x → −x.
(I will use the symbol dx to denote the volume element in whatever n-dimensional
space is under consideration.)
Now what can we say about the eigenvalues of Π? Well, if Πf = λf, then

    Π²f = Π(Πf) = λΠf = λ²f .

On the other hand, it is clear that

    Π²f(x) = Π(Πf(x)) = Πf(−x) = f(x)

and hence we must have λ² = 1, so the eigenvalues of Π are ±1. Let us denote the corresponding eigenfunctions by f±:

    Πf+ = f+    and    Πf− = −f− .

In other words,

    f+(−x) = f+(x)    and    f−(−x) = −f−(x) .

Thus f+ is any even function, and f− is any odd function. Note that what we have shown is the existence of a Hermitian operator with only two eigenvalues, each of which is infinitely degenerate. (I leave it as an easy exercise for you to show that f+ and f− are orthogonal, as they should be.)
Next, note that any f(x) can always be written in the form

    f(x) = f+(x) + f−(x)

where

    f+(x) = [f(x) + f(−x)]/2    and    f−(x) = [f(x) − f(−x)]/2

are obviously symmetric and antisymmetric, respectively. Thus the eigenfunctions of the parity operator are complete, i.e., any function can be written as the sum of a symmetric function and an antisymmetric function.
It will be extremely convenient to now introduce the operators Π± defined by

    Π± = (1 ± Π)/2 .

In terms of these operators, we can write

    Π± f = f± .

It is easy to see that the operators Π± satisfy the three properties

    Π±² = Π±        Π+Π− = Π−Π+ = 0        Π+ + Π− = 1 .

The operators Π± are called projection operators.
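The parity and projection operators are easy to play with numerically. The following is my own sketch (not from the notes): on a grid symmetric about the origin, Π is simply "reverse the sample order", and the identities above can be checked directly.

```python
import numpy as np

# Sample functions on a grid symmetric about the origin; the parity operator
# Pi is then just "reverse the sample order" (x -> -x).
x = np.linspace(-5.0, 5.0, 1001)

def parity(f):
    return f[::-1]

def P_plus(f):    # Pi_+ = (1 + Pi)/2, projects onto the even part
    return 0.5 * (f + parity(f))

def P_minus(f):   # Pi_- = (1 - Pi)/2, projects onto the odd part
    return 0.5 * (f - parity(f))

f = np.exp(-x) * np.sin(3 * x) + x**2   # an arbitrary test function
fe, fo = P_plus(f), P_minus(f)

assert np.allclose(fe + fo, f)          # f = f_+ + f_-
assert np.allclose(parity(fe), fe)      # Pi f_+ = +f_+
assert np.allclose(parity(fo), -fo)     # Pi f_- = -f_-

# Projection-operator identities: Pi_+^2 = Pi_+, Pi_+ Pi_- = 0, Pi_+ + Pi_- = 1
assert np.allclose(P_plus(P_plus(f)), P_plus(f))
assert np.allclose(P_plus(P_minus(f)), 0.0)
assert np.allclose(P_plus(f) + P_minus(f), f)

# f_+ and f_- are orthogonal: the grid sum of (even)*(odd) cancels
assert abs(np.sum(fe * fo)) < 1e-6
```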
Returning to our symmetric Hamiltonian, we observe that
Π(H(x)ψ(x)) = H(−x)ψ(−x) = H(x)ψ(−x) = H(x)Πψ(x)
and thus the Hamiltonian commutes with the parity operator. But if [H, Π] = 0, then it is trivial to see that [H, Π±] = 0 also, and therefore acting on HψE = EψE with Π± (writing ψE± = Π±ψE) we see that

    HψE+ = EψE+    and    HψE− = EψE− .
Thus the stationary states in a symmetric potential can always be classified according to their parity, i.e., they can always be chosen to have a definite symmetry.
Moreover, since, as we saw above, the bound states in one dimension are nondegenerate, it follows that each bound state in a one-dimensional symmetric potential
must be either even or odd.
Example 1.1. Let us find a trial function for a particle in a one-dimensional box
of length l. Since the true wavefunction vanishes at the ends x = 0 and x = l, our
trial function must also have this property. A simple (un-normalized) function that
obeys these boundary conditions is
    ϕ = x(l − x)    for 0 ≤ x ≤ l

and ϕ = 0 outside the box.
The integrals in equation (1.2) are

    ⟨ϕ|Hϕ⟩ = −(ℏ²/2m) ∫₀ˡ x(l − x) (d²/dx²)[x(l − x)] dx
            = (ℏ²/m) ∫₀ˡ x(l − x) dx = ℏ²l³/6m

and

    ⟨ϕ|ϕ⟩ = ∫₀ˡ x²(l − x)² dx = l⁵/30 .

Therefore

    E0 ≤ ⟨ϕ|Hϕ⟩ / ⟨ϕ|ϕ⟩ = 5 ℏ²/ml² .
For comparison, the exact solution has energy levels

    En = n²π²ℏ²/2ml² ,    n = 1, 2, . . .

so the ground state (n = 1) has energy

    π²ℏ²/2ml² = 4.9348 ℏ²/ml²

for an error of 1.3%. The figure below is a plot of the exact normalized ground state solution to the particle in a box together with the normalized trial function. You can see how close the trial function is to the exact solution.
[Figure 1: Plot of √2 sin πx (exact solution) and √30 x(1 − x) (trial function), in units where l = 1; the two curves nearly coincide.]
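The two integrals in this example can be reproduced symbolically. Here is a sketch using sympy (my own check, not part of the notes):

```python
import sympy as sp

x, l, hbar, m = sp.symbols('x l hbar m', positive=True)
phi = x * (l - x)   # the trial function of this example

# <phi|H phi> = -(hbar^2/2m) * integral of phi * phi'' over the box
num = -(hbar**2 / (2 * m)) * sp.integrate(phi * sp.diff(phi, x, 2), (x, 0, l))
den = sp.integrate(phi**2, (x, 0, l))   # <phi|phi>

W = sp.simplify(num / den)
print(W)   # 5 hbar^2 / (m l^2)

# relative error against the exact ground state energy pi^2 hbar^2 / (2 m l^2)
E1 = sp.pi**2 * hbar**2 / (2 * m * l**2)
error = float(sp.simplify((W - E1) / E1))
print(error)   # ~0.013, i.e. the 1.3% quoted above
```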
Example 1.2. Let us construct a variation function with a parameter for the one-dimensional harmonic oscillator, and find the optimal value of that parameter.

What do we know in general? First, the wavefunction must vanish as x → ±∞. The most obvious function that satisfies this is e^{−x²}. However, x has units of length, and we can only take the exponential of a dimensionless quantity (think of the power series expansion of e^{−x²}). If we include a constant α with dimensions of length⁻², then e^{−αx²} is satisfactory from a dimensional standpoint. In addition, since the potential V = ½kx² = ½mω²x² is symmetric, we know that the eigenstates will have a definite parity. And since the ground state has no nodes, it must be an even function (since an odd function has a node at the origin). Thus the trial function ϕ = e^{−αx²} has all of our desired properties.
Since ϕ is unnormalized, we use equation (1.2). The Hamiltonian is

    H = −(ℏ²/2m) d²/dx² + ½mω²x²
and hence

    ⟨ϕ|Hϕ⟩ = −(ℏ²/2m) ∫_{−∞}^{∞} e^{−αx²} (d²/dx²) e^{−αx²} dx + ½mω² ∫_{−∞}^{∞} x² e^{−2αx²} dx

           = −(ℏ²/2m) ∫_{−∞}^{∞} [4α²x² − 2α] e^{−2αx²} dx + ½mω² ∫_{−∞}^{∞} x² e^{−2αx²} dx

           = (−2ℏ²α²/m + ½mω²) ∫_{−∞}^{∞} x² e^{−2αx²} dx + (ℏ²α/m) ∫_{−∞}^{∞} e^{−2αx²} dx .

The second integral is easy (and you should already know the answer):

    ∫_{−∞}^{∞} e^{−2αx²} dx = √(π/2α) .

Using this, the first integral is also easy. Letting β = 2α we have

    ∫_{−∞}^{∞} x² e^{−2αx²} dx = ∫_{−∞}^{∞} x² e^{−βx²} dx = −(∂/∂β) ∫_{−∞}^{∞} e^{−βx²} dx

    = −(∂/∂β) √(π/β) = (1/2) π^{1/2}/β^{3/2} = (1/2) π^{1/2}/(2α)^{3/2} .

After a little algebra, we now arrive at

    ⟨ϕ|Hϕ⟩ = ℏ²π^{1/2}α^{1/2}/(2^{3/2} m) + mω²π^{1/2}α^{−3/2}/2^{7/2} .

And the denominator in equation (1.2) is just

    ⟨ϕ|ϕ⟩ = ∫_{−∞}^{∞} e^{−2αx²} dx = √(π/2α) .
Thus our variational integral becomes

    W := ⟨ϕ|Hϕ⟩ / ⟨ϕ|ϕ⟩ = ℏ²α/2m + mω²/8α .

To minimize this with respect to α we set dW/dα = 0 and solve for α:

    ℏ²/2m − mω²/8α² = 0

or

    α = ± mω/2ℏ .

The negative root must be rejected because otherwise ϕ = e^{−αx²} would be divergent. Substituting the positive root for α into our expression for W yields

    W = ½ℏω

which is the exact ground state harmonic oscillator energy. This isn't surprising, because up to normalization, our ϕ with α = mω/2ℏ is just the exact ground state harmonic oscillator wave function.
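The whole calculation in this example can be verified symbolically. A sketch with sympy (my own check; the symbol names are mine):

```python
import sympy as sp

x, alpha = sp.symbols('x alpha', positive=True)
hbar, m, w = sp.symbols('hbar m omega', positive=True)

phi = sp.exp(-alpha * x**2)
H_phi = -hbar**2 / (2 * m) * sp.diff(phi, x, 2) + sp.Rational(1, 2) * m * w**2 * x**2 * phi

num = sp.integrate(phi * H_phi, (x, -sp.oo, sp.oo))   # <phi|H phi>
den = sp.integrate(phi**2, (x, -sp.oo, sp.oo))        # <phi|phi>
W = sp.simplify(num / den)   # hbar^2 alpha/(2m) + m omega^2/(8 alpha)

# Minimize over the variational parameter and keep the positive root
roots = sp.solve(sp.diff(W, alpha), alpha)
alpha_opt = [r for r in roots if r.is_positive][0]
W_min = sp.simplify(W.subs(alpha, alpha_opt))
print(alpha_opt, W_min)   # m*omega/(2*hbar) and hbar*omega/2
```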
1.2 Excited States
So far all we have discussed is how to approximate the ground-state energy of a system. Now we want to take a look at how to go about approximating the energy of an excited state. Let us assume that the stationary states of our system are numbered so that

    E0 ≤ E1 ≤ E2 ≤ · · · .

If {ψn} is a complete set of orthonormal eigenstates of H, then our normalized trial function can be written ϕ = Σ_n an ψn where an = ⟨ψn|ϕ⟩. Then as we have seen

    ⟨ϕ|Hϕ⟩ = Σ_{n,m} an* am Em ⟨ψn|ψm⟩ = Σ_{n,m} an* am Em δnm = Σ_{n=0}^∞ |an|² En

and

    ⟨ϕ|ϕ⟩ = Σ_{n=0}^∞ |an|² = 1 .

Suppose we restrict ourselves to trial functions that are orthogonal to the true ground-state wavefunction ψ0. Then a0 = ⟨ψ0|ϕ⟩ = 0 and we are left with

    ⟨ϕ|Hϕ⟩ = Σ_{n=1}^∞ |an|² En    and    ⟨ϕ|ϕ⟩ = Σ_{n=1}^∞ |an|² = 1 .
For n ≥ 1 we have En ≥ E1, so that |an|² En ≥ |an|² E1 and hence

    Σ_{n=1}^∞ |an|² En ≥ Σ_{n=1}^∞ |an|² E1 = E1 Σ_{n=1}^∞ |an|² = E1 .

This gives us our desired result

    ⟨ϕ|Hϕ⟩ ≥ E1    if ⟨ψ0|ϕ⟩ = 0 and ⟨ϕ|ϕ⟩ = 1 .    (1.3)
While equation (1.3) gives an upper bound on the energy E1 of the first excited state, it depends on the restriction ⟨ψ0|ϕ⟩ = 0, which can be problematic. However, for some systems this is not a difficult requirement to achieve even though we don't know the exact ground-state wavefunction. For example, a one-dimensional problem with a symmetric potential has a ground-state wavefunction that is always even, while the first excited state is always odd. This means that any (normalized) trial function ϕ that is an odd function will automatically satisfy ⟨ψ0|ϕ⟩ = 0.
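Here is my own numerical sketch of this trick (units ℏ = m = l = 1): the odd polynomial x(l − x)(l/2 − x) is automatically orthogonal to the even ground state of the box, so its variational integral bounds the first excited energy E2 = 2π² ≈ 19.74 from above.

```python
import numpy as np

# Particle in a box again (hbar = m = l = 1), discretized with a 3-point stencil.
n = 800
dx = 1.0 / (n + 1)
x = np.linspace(dx, 1.0 - dx, n)
D2 = (np.diag(np.ones(n - 1), 1) - 2 * np.eye(n) + np.diag(np.ones(n - 1), -1)) / dx**2
H = -0.5 * D2

# Exact energies are E_n = n^2 pi^2 / 2; the first excited state is E_2.
E2_exact = 4 * np.pi**2 / 2

# The potential is symmetric about x = 1/2, so the ground state is even about
# the center and the first excited state is odd.  An odd trial function is
# therefore automatically orthogonal to psi_0, and its variational integral
# bounds E_2 from above.
phi_odd = x * (1 - x) * (0.5 - x)
W = (phi_odd @ H @ phi_odd) / (phi_odd @ phi_odd)
print(W)  # ~21.0, an upper bound on E_2 ~ 19.74
assert W >= E2_exact
```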
It is also possible to extend this approach to approximating the energy levels of higher excited states. In particular, if we somehow choose the trial function ϕ so that

    ⟨ψ0|ϕ⟩ = ⟨ψ1|ϕ⟩ = · · · = ⟨ψn|ϕ⟩ = 0 ,

then, following exactly the same argument as above, it is easy to see that if ⟨ϕ|ϕ⟩ = 1 we have

    ⟨ϕ|Hϕ⟩ ≥ En+1 .
For example, consider any particle moving in a central potential V(r) (e.g., the hydrogen atom). Then the Schrödinger equation separates into a radial equation that depends on V(r) and an angular equation (independent of V) whose solutions are the spherical harmonics Ylm(θ, φ). It may very well be that we can't solve the radial equation for this potential, but we know that spherical harmonics with different values of l are orthogonal. Thus we can get an upper bound on the energy of the lowest state with a particular angular momentum l by choosing a trial function that contains the factor Ylm.
1.3 Linear Variation Functions

The approach that we are now going to describe is probably the most common method of finding approximate molecular wave functions. A linear variation function ϕ is a linear combination of n linearly independent functions fi:

    ϕ = Σ_{i=1}^n ci fi .

The functions fi are called basis functions, and they must obey the boundary conditions of the problem. The coefficients ci are to be determined by minimizing the variational integral.
We shall restrict ourselves to a real ϕ, so the functions fi and coefficients ci are taken to be real. Later we will remove this requirement. Furthermore, note that the basis functions are not generally orthogonal, since they are not necessarily the eigenfunctions of any operator. Let us define the overlap integrals Sij by

    Sij := ⟨fi|fj⟩ = ∫ fi* fj dx

(where the asterisk on fi isn't necessary because we are assuming that our basis functions are real). Then (remember that the ci are real)

    ⟨ϕ|ϕ⟩ = Σ_{i,j=1}^n ci cj ⟨fi|fj⟩ = Σ_{i,j=1}^n ci cj Sij .

Next, we define the integrals

    Hij := ⟨fi|Hfj⟩ = ∫ fi* H fj dx

so that

    ⟨ϕ|Hϕ⟩ = Σ_{i,j=1}^n ci cj ⟨fi|Hfj⟩ = Σ_{i,j=1}^n ci cj Hij .
Then the variation theorem (1.2) becomes

    W = ⟨ϕ|Hϕ⟩ / ⟨ϕ|ϕ⟩ = Σ_{i,j=1}^n ci cj Hij / Σ_{i,j=1}^n ci cj Sij

or

    W Σ_{i,j=1}^n ci cj Sij = Σ_{i,j=1}^n ci cj Hij .    (1.4)
Now W is a function of the n ci's, and we know that W ≥ E0. In order to minimize W with respect to all of the ck's, we must require that at the minimum we have

    ∂W/∂ck = 0 ;    k = 1, . . . , n .

Taking the derivative of (1.4) with respect to ck and using

    ∂ci/∂ck = δik

we have

    (∂W/∂ck) Σ_{i,j=1}^n ci cj Sij + W Σ_{i,j=1}^n (δik cj + ci δjk) Sij = Σ_{i,j=1}^n (δik cj + ci δjk) Hij
or (since ∂W/∂ck = 0)

    W Σ_{j=1}^n cj Skj + W Σ_{i=1}^n ci Sik = Σ_{j=1}^n cj Hkj + Σ_{i=1}^n ci Hik .

However, the basis functions fi are real, so we have

    Sik = ∫ fi fk dx = Ski

and since H is Hermitian (and H(x) is real) we also have

    Hik = ⟨fi|Hfk⟩ = ⟨Hfi|fk⟩ = ⟨fk|Hfi⟩* = ⟨fk|Hfi⟩ = Hki .

Therefore, because the summation indices are dummy indices, we see that the two terms on each side of the last equation are identical, and we are left with

    W Σ_{j=1}^n cj Skj = Σ_{j=1}^n cj Hkj

or

    Σ_{j=1}^n (Hkj − W Skj) cj = 0 ;    k = 1, . . . , n .    (1.5)
This is just a system of n homogeneous linear equations in n unknowns (the n coefficients cj), and hence for a nontrivial solution to exist (we don't want all of the cj's to be zero) we must have the secular equation

    det(Hkj − W Skj) = 0 .    (1.6)

(You can think of this as a system of the form Σ_j akj xj = 0, where the matrix A = (akj) must be singular or else A⁻¹ would exist and then the equation Ax = 0 would imply that x = 0. The requirement that A be singular is equivalent to the requirement that det A = 0.) Written out, equation (1.6) looks like
    | H11 − W S11   H12 − W S12   · · ·   H1n − W S1n |
    | H21 − W S21   H22 − W S22   · · ·   H2n − W S2n |
    |      ⋮              ⋮                     ⋮      |
    | Hn1 − W Sn1   Hn2 − W Sn2   · · ·   Hnn − W Snn |  = 0 .
The determinant in (1.6) is a polynomial in W of degree n, and it can be proved
that all n roots of this equation are real. (The proof is given at the end of this
section for those who are interested.) Let us arrange the roots in order of increasing
value as
W0 ≤ W1 ≤ · · · ≤ Wn−1 .
Similarly, we number the bound states of the system so that the corresponding true
energies of these bound states are also arranged in increasing order:
E0 ≤ E1 ≤ · · · ≤ En−1 ≤ En ≤ · · · .
From the variation theorem we know that E0 ≤ W0 . Furthermore, it can also be
proved (see the homework) that
Ei ≤ Wi
for each i = 0, . . . , n − 1 .
In other words, the linear variation method provides upper bounds for the energies
of the lowest n bound states of the system. It can also be shown that increasing the number of basis functions used (and hence the number of states whose energies are approximated) improves the accuracy of the previously calculated energies.
Once we have found the n roots Wi, we can substitute them one at a time back into equation (1.5) and solve for the coefficients cj^(i), where the superscript denotes the fact that this particular set of coefficients applies to the root Wi. (Again, this is just like finding the eigenvector corresponding to a given eigenvalue.) Note also that all we can really find is the ratios of the coefficients, say relative to c1, and then fix c1 by normalization.
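In matrix language, equations (1.5) and (1.6) say that the Wi and the coefficient vectors c^(i) solve the generalized eigenvalue problem Hc = WSc, so in practice one hands the matrices H and S to a generalized eigensolver. A sketch (my own illustration, with a small random symmetric H and positive definite S standing in for actual integrals):

```python
import numpy as np
from scipy.linalg import eigh

# (H - W S) c = 0 is the generalized symmetric eigenproblem H c = W S c.
# Random stand-ins for the integral matrices H_ij and S_ij:
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
H = A + A.T                      # any real symmetric "Hamiltonian" matrix
B = rng.standard_normal((3, 3))
S = B @ B.T + 3 * np.eye(3)      # a positive definite "overlap" matrix

W, C = eigh(H, S)   # roots W_0 <= W_1 <= W_2, coefficient vectors as columns of C

# Each column solves (H - W_k S) c = 0 ...
for k in range(3):
    assert np.allclose(H @ C[:, k], W[k] * S @ C[:, k])

# ... and the secular determinant vanishes at each root
for k in range(3):
    assert abs(np.linalg.det(H - W[k] * S)) < 1e-8
```

Note that `scipy.linalg.eigh` returns the roots already sorted in increasing order and the eigenvectors S-orthonormalized, which matches the ordering and normalization conventions used in these notes.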
There are some tricks that can simplify the solution of equation (1.6). For
example, if we choose the basis functions to be orthonormal, then Skj = δkj . If
the originally chosen set of basis functions isn’t orthonormal, we can always use the
Gram-Schmidt process to construct an orthonormal set. Also, we can make some of
the off-diagonal Hkj ’s vanish if we choose our basis functions to be eigenfunctions
of some other Hermitian operator A that commutes with H. This is because of the following theorem:

Theorem 1.2. Let fi and fj be eigenfunctions of a Hermitian operator A corresponding to the eigenvalues ai ≠ aj. If H is an operator that commutes with A, then

    Hji = ⟨fj|Hfi⟩ = 0 .
Proof. Let us first assume that the eigenvalue ai is nondegenerate. Then Afi = ai fi and

    A(Hfi) = HAfi = ai (Hfi) .

Thus Hfi is in the eigenspace V_{ai} of A corresponding to the eigenvalue ai. But ai is nondegenerate, so the eigenspace is one-dimensional and spanned by fi. Hence we must have Hfi = bi fi for some scalar bi. Recalling that eigenfunctions belonging to distinct eigenvalues of a Hermitian operator are orthogonal, we have

    ⟨fj|Hfi⟩ = bi ⟨fj|fi⟩ = 0 .

Now assume that the eigenvalue ai is degenerate. This means that the eigenspace V_{ai} has dimension greater than one, say dim V_{ai} = n. Then V_{ai} has a basis g1, . . . , gn consisting of eigenvectors of A corresponding to the eigenvalue ai, i.e., Agk = ai gk for each k = 1, . . . , n. Since Hfi is in V_{ai}, we can write Hfi = Σ_{k=1}^n ck gk for some expansion coefficients ck. But then we again have

    ⟨fj|Hfi⟩ = Σ_{k=1}^n ck ⟨fj|gk⟩ = 0

because the eigenfunctions fj and gk belong to the distinct eigenvalues aj and ai respectively.
Another (possibly easier) way to prove Theorem 1.2 is this. Let Afi = ai fi and Afj = aj fj where ai ≠ aj. (In other words, fi and fj belong to different eigenspaces of A.) Then on the one hand we have

    ⟨fj|HAfi⟩ = ai ⟨fj|Hfi⟩

while on the other hand, we can use the fact that H and A commute, along with the fact that A is Hermitian and hence has real eigenvalues, to write

    ⟨fj|HAfi⟩ = ⟨fj|AHfi⟩ = ⟨Afj|Hfi⟩ = aj ⟨fj|Hfi⟩ .

Equating these results shows that (ai − aj)⟨fj|Hfi⟩ = 0. Therefore, if ai ≠ aj, we must have ⟨fj|Hfi⟩ = 0.
Finally, it is left as a homework problem to show that equations (1.5) and (1.6)
also hold if the variation function is in fact allowed to be complex.
Example 1.3. In Example 1.1 we constructed the trial function ϕ = x(l − x) for the ground state of the one-dimensional particle in a box. Let us now construct a linear variation function ϕ = Σ_i ci fi to approximate the energies of the first four states. This means that we need at least four linearly independent functions fi that obey the boundary conditions of vanishing at the ends of the box. While there are an infinite number of possibilities, we want to limit ourselves to integrals that are easy to evaluate.

We begin by taking

    f1 = x(l − x) ,

and another simple function that obeys the proper boundary conditions is

    f2 = x²(l − x)² .

If the origin were chosen to be at the center of the box, we know that the exact solutions would have a definite parity, alternating between even and odd functions, starting with the even ground state. To see that both f1 and f2 are even functions,
we shift the origin to the center of the box by changing variables to x′ = x − l/2. Then x = x′ + l/2 and we find

    f1 = (x′ + l/2)(l/2 − x′)    and    f2 = (x′ + l/2)²(l/2 − x′)²

which shows that f1 and f2 are both clearly even functions of x′.

Since both f1 and f2 are even functions, if we took ϕ = c1f1 + c2f2 we would end up with an upper bound for the two lowest energy even states (the n = 1 and n = 3 states). In order to also approximate the odd n = 2 and n = 4 states, we must add in two odd functions. Thus we need two functions that vanish at x = 0, x = l and x = l/2. Two functions that satisfy these requirements are

    f3 = x(l − x)(l/2 − x)    and    f4 = x²(l − x)²(l/2 − x) .

By again changing variables as we did for f1 and f2, you can easily show that f3 and f4 are indeed odd functions. Note also that the four functions we have chosen are linearly independent, as they must be.
One of the advantages in choosing our functions to have a definite parity is that
many of the integrals that occur in equation (1.6) will vanish. In particular, since
any integral of an odd function over an even interval is identically zero, and since
the product of an even function with an odd function is odd, it should be clear that
S13 = S31 = 0
S14 = S41 = 0
S23 = S32 = 0
S24 = S42 = 0 .
Furthermore, since the functions have a definite parity, they are eigenfunctions of the
parity operator Π with Πf1,2 = +f1,2 and Πf3,4 = −f3,4 . And since the potential
is symmetric, we have [Π, H] = 0 so that by Theorem 1.2 we know that Hij = 0 if
one index refers to an even function and the other refers to an odd function:
H13 = H31 = 0
H14 = H41 = 0
H23 = H32 = 0
H24 = H42 = 0 .
With these simplifications, (1.6) becomes

    | H11 − W S11   H12 − W S12        0              0        |
    | H21 − W S21   H22 − W S22        0              0        |
    |      0             0        H33 − W S33   H34 − W S34    |
    |      0             0        H43 − W S43   H44 − W S44    |  = 0 .

Since the determinant of a block diagonal matrix is the product of the determinants of the blocks, we can find all four roots by finding the two roots of each of the
following equations:

    | H11 − W S11   H12 − W S12 |
    | H21 − W S21   H22 − W S22 |  = 0    (1.7a)

    | H33 − W S33   H34 − W S34 |
    | H43 − W S43   H44 − W S44 |  = 0 .   (1.7b)

Let the roots of (1.7a) be denoted W1, W3. These are the approximations to the energies of the n = 1 and n = 3 even states. Similarly, the roots W2, W4 of (1.7b) are the approximations to the odd energy states n = 2 and n = 4. Once we have the roots Wi, we substitute them one at a time back into equation (1.5) to determine the set of coefficients cj^(i) corresponding to that particular root. In the particular case of W1, this yields the set of equations

    (H11 − W1 S11)c1^(1) + (H12 − W1 S12)c2^(1) = 0
    (H21 − W1 S21)c1^(1) + (H22 − W1 S22)c2^(1) = 0    (1.8a)

    (H33 − W1 S33)c3^(1) + (H34 − W1 S34)c4^(1) = 0
    (H43 − W1 S43)c3^(1) + (H44 − W1 S44)c4^(1) = 0 .   (1.8b)
Now, W1 was a root of (1.7a), so the determinant of the coefficients in (1.8a) must vanish, and we have a nontrivial solution for c1^(1) and c2^(1). However, W1 was not a root of (1.7b), so the determinant of the coefficients in (1.8b) does not vanish, and hence there is only the trivial solution c3^(1) = c4^(1) = 0. Thus the trial function for W1 is ϕ1 = c1^(1) f1 + c2^(1) f2. Exactly the same reasoning applies to the other three roots, and we have the trial functions

    ϕ1 = c1^(1) f1 + c2^(1) f2        ϕ3 = c1^(3) f1 + c2^(3) f2
    ϕ2 = c3^(2) f3 + c4^(2) f4        ϕ4 = c3^(4) f3 + c4^(4) f4 .

So we see that the even states ψ1 and ψ3 are approximated by the trial functions ϕ1 and ϕ3 consisting of linear combinations of the even functions f1 and f2. Similarly, the odd states ψ2 and ψ4 are approximated by the trial functions ϕ2 and ϕ4 that are linear combinations of the odd functions f3 and f4.
To proceed any further, we need to evaluate the non-zero integrals Hij and Sij. From Example 1.1 we can immediately write down H11 and S11. The rest of the integrals are also straightforward to evaluate, and the result is

    H11 = ℏ²l³/6m            S11 = l⁵/30
    H12 = H21 = ℏ²l⁵/30m     S12 = S21 = l⁷/140
    H22 = ℏ²l⁷/105m          S22 = l⁹/630
    H33 = ℏ²l⁵/40m           S33 = l⁷/840
    H44 = ℏ²l⁹/1260m         S44 = l¹¹/27720
    H34 = H43 = ℏ²l⁷/280m    S34 = S43 = l⁹/5040 .
Substituting these results into equation (1.7a) to determine W1 and W3 we have

    | ℏ²l³/6m − (l⁵/30)W      ℏ²l⁵/30m − (l⁷/140)W  |
    | ℏ²l⁵/30m − (l⁷/140)W    ℏ²l⁷/105m − (l⁹/630)W |  = 0 .

To evaluate this, it is easiest to recall that multiplying any single row of a determinant by some scalar is the same as multiplying the original determinant by that same scalar. (This is an obvious consequence of the definition

    det A = Σ_{i1,...,in=1}^n ε_{i1···in} a_{1i1} · · · a_{nin} .)

Since the right-hand side of this equation is zero, we don't change anything by multiplying any row in this determinant by some constant. Multiplying the first row by 420m/l³ and the second row by 1260m/l⁵ we obtain

    | 70ℏ² − 14ml²W    14ℏ²l² − 3ml⁴W |
    | 42ℏ² − 9ml²W     12ℏ²l² − 2ml⁴W |  = 0    (1.9)

or

    m²l⁴W² − 56ml²ℏ²W + 252ℏ⁴ = 0 .

The roots of this quadratic are

    W1,3 = (ℏ²/ml²)(28 ± √532) = 4.93487 ℏ²/ml² , 51.0651 ℏ²/ml² .
Similarly, substituting the values for Hij and Sij into (1.7b) results in

    W2,4 = (ℏ²/ml²)(60 ± √1620) = 19.7508 ℏ²/ml² , 100.249 ℏ²/ml² .

For comparison, the first four exact energies En = n²ℏ²π²/2ml² are

    En = 4.9348 ℏ²/ml² , 19.7392 ℏ²/ml² , 44.4132 ℏ²/ml² , 78.9568 ℏ²/ml²

so the errors are (in order of increasing energy) 0.0014%, 0.059%, 15.0% and 27.0%. As expected, we did great for n = 1 and n = 2, but not so great for n = 3 and n = 4.
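These roots can be double-checked by handing the Hij and Sij above (in units ℏ = m = l = 1) directly to a generalized eigensolver; a sketch (my own check):

```python
import numpy as np
from scipy.linalg import eigh

# Even block (f1, f2) and odd block (f3, f4), in units hbar = m = l = 1
H_even = np.array([[1/6,  1/30], [1/30,  1/105]])
S_even = np.array([[1/30, 1/140], [1/140, 1/630]])
H_odd  = np.array([[1/40,  1/280], [1/280,  1/1260]])
S_odd  = np.array([[1/840, 1/5040], [1/5040, 1/27720]])

W13 = eigh(H_even, S_even, eigvals_only=True)   # roots of (1.7a)
W24 = eigh(H_odd,  S_odd,  eigvals_only=True)   # roots of (1.7b)
print(W13)   # ~[4.93487, 51.0651]
print(W24)   # ~[19.7508, 100.249]

# fractional errors against the exact E_n = n^2 pi^2 / 2
exact = np.array([1, 4, 9, 16]) * np.pi**2 / 2
print((np.sort(np.concatenate([W13, W24])) - exact) / exact)
```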
We still have to find the approximate wave functions that correspond to each of the Wi's. We want to substitute W1 = 4.93487 ℏ²/ml² into equations (1.8a) and use the integrals we have already evaluated. However, it is somewhat easier to note that the coefficients of c1^(1) and c2^(1) in equations (1.8a) are equivalent to the entries in equation (1.9). Furthermore, as we have already noted, all we can find is the ratio of the ci's, so the two equations in (1.9) are equivalent, and we only need to use either one of them. (That the equations are equivalent is a consequence of the fact that the determinant (1.9) is zero, so the rows must be linearly dependent. Hence we get no new information by using both rows.)
So choosing the first row we have

    70ℏ² − 14ml²W1 = 70ℏ² − 14ml²(4.93487 ℏ²/ml²) = 0.91182 ℏ²
    14ℏ²l² − 3ml⁴W1 = 14ℏ²l² − 3ml⁴(4.93487 ℏ²/ml²) = −0.80461 ℏ²l²

so that

    c2^(1) = (0.91182 ℏ² / 0.80461 ℏ²l²) c1^(1) = 1.133 c1^(1)/l² .

To fix the value of c1^(1) we use the normalization condition:

    1 = ⟨ϕ1|ϕ1⟩ = ⟨c1^(1) f1 + c2^(1) f2 | c1^(1) f1 + c2^(1) f2⟩
      = [c1^(1)]² S11 + 2 c1^(1) c2^(1) S12 + [c2^(1)]² S22
      = [c1^(1)]² [ S11 + 2 (1.133/l²) S12 + (1.133²/l⁴) S22 ]
      = [c1^(1)]² [ l⁵/30 + 2 (1.133/l²)(l⁷/140) + (1.133²/l⁴)(l⁹/630) ]
      = 0.05156 [c1^(1)]² l⁵

and hence c1^(1) = 4.404 l^{−5/2}.

Putting this all together we finally obtain

    ϕ1 = 4.404 l^{−5/2} f1 + 4.990 l^{−9/2} f2
       = 4.404 l^{−5/2} x(l − x) + 4.990 l^{−9/2} x²(l − x)²
       = l^{−1/2} [ 4.404 (x/l)(1 − x/l) + 4.990 (x/l)²(1 − x/l)² ] .
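The coefficients can also be read off directly from the generalized eigenproblem: scipy's `eigh` returns eigenvectors normalized so that cᵀSc = 1, which is exactly the normalization condition used above. A sketch (my own check, units ℏ = m = l = 1):

```python
import numpy as np
from scipy.linalg import eigh

# Even-block matrices from this example (hbar = m = l = 1)
H = np.array([[1/6,  1/30], [1/30,  1/105]])
S = np.array([[1/30, 1/140], [1/140, 1/630]])

W, C = eigh(H, S)                # eigh normalizes so that C.T @ S @ C = identity
c = C[:, 0] * np.sign(C[0, 0])   # coefficients for the lowest root, sign fixed so c1 > 0

print(W[0])        # ~4.93487
print(c)           # ~[4.404, 4.990]  (c1 and c2, in units of l^(-5/2) and l^(-9/2))
print(c[1] / c[0]) # ~1.133, the ratio found above
```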
As you can see from the plot below, the function ϕ1 is almost identical to the exact solution ψ1 = √(2/l) sin(πx/l):

[Figure 2: Plot of ψ1 and ϕ1 vs x/l; the two curves are nearly indistinguishable.]
Repeating all of this with the other roots W2, W3 and W4 we eventually arrive at

    ϕ2 = l^{−1/2} [ 16.78 (x/l)(1 − x/l)(1/2 − x/l) + 71.85 (x/l)²(1 − x/l)²(1/2 − x/l) ]
    ϕ3 = l^{−1/2} [ 28.65 (x/l)(1 − x/l) − 132.7 (x/l)²(1 − x/l)² ]
    ϕ4 = l^{−1/2} [ 98.99 (x/l)(1 − x/l)(1/2 − x/l) − 572.3 (x/l)²(1 − x/l)²(1/2 − x/l) ]
1.3.1 Proof that the Roots of the Secular Equation are Real
In this section we will prove that the roots of the polynomial in W defined by
equation (1.6) are in fact real. In order to show this, we must first review some
basic linear algebra.
Let V be a vector space over C. By an inner product on V (sometimes called the Hermitian inner product), we mean a mapping ⟨· , ·⟩ : V × V → C such that for all u, v, w ∈ V and a, b ∈ C we have

(IP1) ⟨au + bv, w⟩ = a*⟨u, w⟩ + b*⟨v, w⟩ ;
(IP2) ⟨u, v⟩ = ⟨v, u⟩* ;
(IP3) ⟨u, u⟩ ≥ 0, and ⟨u, u⟩ = 0 if and only if u = 0 .
If {ei} is a basis for V, then in terms of components we have

    ⟨u, v⟩ = Σ_{i,j} ui* vj ⟨ei, ej⟩ := Σ_{i,j} ui* vj gij

where we have defined the (square) matrix G = (gij) with gij = ⟨ei, ej⟩. As a matrix product, we may write

    ⟨u, v⟩ = u*ᵀ G v .
I emphasize that this is the most general inner product on V, and any inner product can be written in this form. (For example, if V is a real space and gij = ⟨ei, ej⟩ = δij, then we obtain the usual Euclidean inner product on V.) Notice that

    gij = ⟨ei, ej⟩ = ⟨ej, ei⟩* = gji*

and hence G = G†, so that G is in fact a Hermitian matrix. (Some of you may realize that in the case where V is a real vector space, the matrix G is just the usual metric on V.)
Now, given an inner product, we may define a norm on V by ‖u‖ = ⟨u, u⟩^{1/2}. Note that because of condition (IP3), we have ‖u‖ ≥ 0, and ‖u‖ = 0 if and only if u = 0. This imposes a condition on G because

    ‖u‖² = ⟨u, u⟩ = u*ᵀ G u = Σ_{i,j} ui* uj gij ≥ 0

and equality holds if and only if u = 0. A Hermitian matrix G with the property that u*ᵀ G u > 0 for all u ≠ 0 is said to be positive definite.

It is important to realize that, conversely, given a positive definite Hermitian matrix G, we can define an inner product by ⟨u, v⟩ = u*ᵀ G v. That this is true follows easily by reversing the above steps.
Another fundamental concept is that of the kernel of a linear transformation (or
matrix). If T is a linear transformation, we define the kernel of T to be the set
Ker T = {u ∈ V : T u = 0} .
A linear transformation whose kernel is zero is said to be nonsingular.
The reason the kernel is so useful is that it allows us to determine whether or not
a linear transformation is an isomorphism (i.e., one-to-one). A linear transformation
T on V is said to be one-to-one if u ≠ v implies T u ≠ T v. An equivalent way to say this is that T u = T v implies u = v (this is the contrapositive statement). Thus, if T u = T v, then using the linearity of T we see that 0 = T u − T v = T(u − v) and hence u − v ∈ Ker T. But if Ker T = {0}, then we in fact have u = v, so that T is an isomorphism. Conversely, if T is an isomorphism, then we must have Ker T = {0}. This is because T is one-to-one, and any linear transformation has the property that T 0 = 0. (Because T u = T(u + 0) = T u + T 0, so that T 0 = 0.)
Now suppose that T is a nonsingular surjective (i.e., onto) linear transformation
on V . Such a T is said to be a bijection. You should already know that the matrix
representation A = (aij ) of T with respect to the basis {ei } for V is defined by
X
ej aji .
T ei =
j
This is frequently written as A = [T ]e . Then the fact that T is a bijection simply
means that the matrix A is invertible (i.e., that A−1 exists).
(Actually, if T : U → V is a nonsingular (one-to-one) linear transformation
between two finite-dimensional vector spaces of equal dimensions, then it is
automatically surjective. This is a consequence of the well-known rank theorem,
which says
rank T + dim Ker T = dim U
where rank T is another term for the dimension of the image of T . Therefore, if
Ker T = {0} we have dim Ker T = 0, so that rank T = dim U = dim V . The proof of
the rank theorem is also not hard: Let dim U = n, and let {w₁, . . . , w_k} be a basis
for Ker T . Extend this to a basis {w₁, . . . , w_n} for U . Then Im T is spanned by
{T w_{k+1}, . . . , T w_n}, and it is easy to see that these are linearly independent. Thus
dim U = n = k + (n − k) = dim Ker T + dim Im T .)
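As a quick sanity check, the rank theorem is easy to verify numerically. The sketch below (an illustration of mine, not part of the text) uses numpy's `matrix_rank` for rank T and computes the nullity as dim U − rank T:

```python
import numpy as np

# A hypothetical map T: R^4 -> R^3 whose matrix has third row equal to the
# sum of the first two, so rank T = 2 and dim Ker T = 4 - 2 = 2.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 1.0, 1.0]])

rank = np.linalg.matrix_rank(A)   # dimension of Im T
dim_U = A.shape[1]                # dimension of the domain U
dim_ker = dim_U - rank            # nullity, as forced by the rank theorem

print(rank + dim_ker == dim_U)    # True
```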
Note that if G is positive definite, then we must have Ker G = {0}. This is
because if u ≠ 0 and Gu = 0, we would have ⟨u, u⟩ = u∗ᵀ G u = 0, in contradiction to
the assumed positive definiteness of G. Thus a positive definite matrix is necessarily
nonsingular.
Let us take a more careful look at Sᵢⱼ = ⟨fᵢ|fⱼ⟩. I claim that the matrix S =
(Sᵢⱼ) is positive definite. To show this, I will prove a general result. Suppose
I have n linearly independent (complex) vectors v₁, . . . , v_n, and I construct the
nonsingular matrix M whose columns are just the vectors vᵢ. Letting vᵢⱼ denote
the jth component of the vector vᵢ, we have
        | v11  v21  ···  vn1 |
        | v12  v22  ···  vn2 |
M =     |  ⋮    ⋮          ⋮  |
        | v1n  v2n  ···  vnn |
From this we see that
        | v11∗  v12∗  ···  v1n∗ |
        | v21∗  v22∗  ···  v2n∗ |
M† =    |  ⋮     ⋮           ⋮  |
        | vn1∗  vn2∗  ···  vnn∗ |
and therefore

        | ⟨v1|v1⟩  ⟨v1|v2⟩  ···  ⟨v1|vn⟩ |
        | ⟨v2|v1⟩  ⟨v2|v2⟩  ···  ⟨v2|vn⟩ |
M†M =   |    ⋮        ⋮              ⋮    |        (1.10)
        | ⟨vn|v1⟩  ⟨vn|v2⟩  ···  ⟨vn|vn⟩ |
A matrix of this form is called a Gram matrix.
If I denote the Hermitian matrix M†M by S, then for any vector c ≠ 0 we have
⟨c|Sc⟩ = ⟨c|M†Mc⟩ = ⟨Mc|Mc⟩ = ‖Mc‖² > 0
so that S is positive definite. That this is strictly greater than zero (and not greater
than or equal to zero) follows from the fact that M is nonsingular so its kernel is
{0}, together with the assumption that c ≠ 0. In other words, any matrix of the
form (1.10) is positive definite.
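This argument is easy to watch numerically. The sketch below (illustrative, not from the text) builds M with random complex columns, forms the Gram matrix S = M†M, and confirms that S is Hermitian with strictly positive eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
# n linearly independent complex vectors as the columns of M (a random
# complex matrix is nonsingular with probability 1)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

S = M.conj().T @ M                   # Gram matrix: S_ij = <v_i | v_j>
assert np.allclose(S, S.conj().T)    # S is Hermitian

eigvals = np.linalg.eigvalsh(S)      # real, since S is Hermitian
print(eigvals.min() > 0)             # True: S is positive definite
```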
But this is exactly what we had when we defined Sᵢⱼ = ⟨fᵢ|fⱼ⟩ = ⟨i|j⟩, where the
linearly independent functions fᵢ define a basis for a vector space. In other words,
what we really have is fᵢ = vᵢ, so that the matrix M†M defined above is exactly
the matrix S defined by Sᵢⱼ = ⟨i|j⟩.
With all of this formalism out of the way, it is now easy to show that the roots
of the secular equation are real. Let us write equation (1.5) in matrix form as
Hc = W Sc
so that
⟨c|Hc⟩ = W ⟨c|Sc⟩ .
On the other hand, using the fact that H is Hermitian and S is real and symmetric,
we can write
⟨c|Hc⟩ = ⟨Hc|c⟩ = ⟨W Sc|c⟩ = W∗⟨Sc|c⟩ = W∗⟨c|Sc⟩ .
Thus we have
(W − W∗)⟨c|Sc⟩ = 0
which implies W = W∗ because c ≠ 0, so that ⟨c|Sc⟩ > 0.
Note that this proof is also valid in the case where ϕ is complex, because (1.5)
still holds and S = M†M is Hermitian, so that ⟨Sc|c⟩ = ⟨c|Sc⟩.
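One can also check this numerically. The sketch below (my own illustration; the names are arbitrary) reduces the generalized problem Hc = W Sc to an ordinary Hermitian eigenproblem via the Cholesky factor of S, so the roots W come out real, exactly as the proof requires:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = X + X.conj().T                    # a Hermitian "H" matrix
Y = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
S = Y.conj().T @ Y + np.eye(3)        # Hermitian positive definite overlap

# With S = L L†, the roots of det(H - W S) = 0 are the eigenvalues of the
# Hermitian matrix L^{-1} H L^{-dagger}, which are real.
L = np.linalg.cholesky(S)
Linv = np.linalg.inv(L)
W = np.linalg.eigvalsh(Linv @ H @ Linv.conj().T)

print(np.isrealobj(W))                # True: all roots W are real
```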
2 Time-Independent Perturbation Theory

2.1 Perturbation Theory for a Nondegenerate Energy Level
Suppose that we want to solve the time-independent Schrödinger equation Hψn =
En ψn , but the Hamiltonian is too complicated for us to find an exact solution.
However, let us suppose that the Hamiltonian can be written in the form
H = H 0 + λH ′
where we know the exact solutions to H⁰ψn(0) = En(0)ψn(0). (We will use a
superscript 0 to denote the energies and eigenstates of the unperturbed Hamiltonian H⁰.)
The additional term H ′ is called a perturbation, and it must in some sense be
considered small relative to H 0 . The dimensionless parameter λ is redundant, but
is introduced for mathematical convenience; it will not remain a part of our final
solution. For example, the unperturbed Hamiltonian H 0 could be the (free) hydrogen atom, and the perturbation H ′ could represent the interaction energy eE · r of
the electron with an electric field E. (This leads to an energy level shift called the
Stark effect.)
The full (i.e., interacting or perturbed) Schrödinger equation is written
Hψn = (H⁰ + λH′)ψn = En ψn            (2.1)
and the unperturbed equation is
H⁰ψn(0) = En(0) ψn(0) .               (2.2)
We think of the parameter λ as varying from 0 to 1, taking the system smoothly
from the unperturbed system described by H⁰ to the fully interacting system
described by H. And as long as we are discussing nondegenerate states, we can think
of each unperturbed state ψn(0) as undergoing a smooth transition to the exact state
ψn. In other words,
lim_{λ→0} ψn = ψn(0)    and    lim_{λ→0} En = En(0) .
Since the states ψn = ψn (λ, x) and energies En = En (λ) depend on λ, let us
expand both in a Taylor series about λ = 0:
ψn = ψn(0) + λ (∂ψn/∂λ)|_{λ=0} + (λ²/2!) (∂²ψn/∂λ²)|_{λ=0} + ···
En = En(0) + λ (dEn/dλ)|_{λ=0} + (λ²/2!) (d²En/dλ²)|_{λ=0} + ··· .
Now introduce the notation
ψn(k) = (1/k!) (∂^k ψn/∂λ^k)|_{λ=0}    and    En(k) = (1/k!) (d^k En/dλ^k)|_{λ=0}
so we can write
ψn = ψn(0) + λψn(1) + λ²ψn(2) + ···            (2.3a)
En = En(0) + λEn(1) + λ²En(2) + ··· .          (2.3b)
For each k = 1, 2, . . . we call ψn(k) and En(k) the kth-order corrections to the
wavefunction and energy. We assume that the series converges for λ = 1, and that
the first few terms give a good approximation to the exact solutions.
It will be convenient to simplify some of our notation, so integrals such as
⟨ψn(j)|ψn(k)⟩ will simply be written ⟨n(j)|n(k)⟩. We assume that the unperturbed
states are orthonormal, so that
⟨m(0)|n(0)⟩ = δmn
and we also choose our normalization so that
⟨n(0)|n⟩ = 1 .            (2.4)
If this last condition on ψn isn't satisfied, then multiplying ψn by ⟨n(0)|n⟩⁻¹ will
ensure that it is. Since multiplying the Schrödinger equation Hψn = En ψn by a
constant doesn't change En, this has no effect on the energy levels. If so desired,
at the end of the calculation we can always re-normalize ψn in the usual way.
Substituting (2.3a) into (2.4) yields
1 = ⟨n(0)|n(0)⟩ + λ⟨n(0)|n(1)⟩ + λ²⟨n(0)|n(2)⟩ + ··· .
Now, it is a general result that if you have a power series equation of the form
Σ_{n=0}^∞ aₙxⁿ = 0 for all x, then aₙ = 0 for all n. That a₀ = 0 follows by letting
x = 0. Now take the derivative with respect to x and let x = 0 to obtain a₁ = 0.
Taking the derivative again and letting x = 0 yields a₂ = 0. Clearly we can continue
this procedure to arrive at aₙ = 0 for all n. Applying this result to the above power
series in λ and using the fact that ⟨n(0)|n(0)⟩ = 1, we conclude that
⟨n(0)|n(k)⟩ = 0    for all k = 1, 2, . . . .            (2.5)
We now substitute equations (2.3) into the Schrödinger equation (2.1):
(H⁰ + λH′)(ψn(0) + λψn(1) + λ²ψn(2) + ···)
  = (En(0) + λEn(1) + λ²En(2) + ···)(ψn(0) + λψn(1) + λ²ψn(2) + ···)
or, grouping powers of λ,
H⁰ψn(0) + λ(H⁰ψn(1) + H′ψn(0)) + λ²(H⁰ψn(2) + H′ψn(1)) + ···
  = En(0)ψn(0) + λ(En(0)ψn(1) + En(1)ψn(0))
    + λ²(En(0)ψn(2) + En(1)ψn(1) + En(2)ψn(0)) + ··· .
Again ignoring questions of convergence, we can equate powers of λ on both sides
of this equation. For λ⁰ we simply have
H⁰ψn(0) = En(0)ψn(0)            (2.6a)
which doesn't tell us anything new. For λ¹ we have
H⁰ψn(1) + H′ψn(0) = En(0)ψn(1) + En(1)ψn(0)
or
(H⁰ − En(0))ψn(1) = (En(1) − H′)ψn(0) .            (2.6b)
For λ² we have
H⁰ψn(2) + H′ψn(1) = En(0)ψn(2) + En(1)ψn(1) + En(2)ψn(0)
or
(H⁰ − En(0))ψn(2) = (En(1) − H′)ψn(1) + En(2)ψn(0) .            (2.6c)
And in general we have for k ≥ 1
(H⁰ − En(0))ψn(k) = (En(1) − H′)ψn(k−1) + En(2)ψn(k−2) + ··· + En(k)ψn(0) .            (2.6d)
Notice that at each step along the way, ψn(k) is determined by ψn(k−1), ψn(k−2),
. . . , ψn(0). We can also add an arbitrary multiple of ψn(0) to each ψn(k) without
affecting the left side of these equations. Hence we can choose this multiple so that
⟨n(0)|n(k)⟩ = 0 for k ≥ 1, which is the same result as we had in (2.5).
Now using the hermiticity of H⁰ and the fact that En(0) is real, we have
⟨n(0)|H⁰n(k)⟩ = ⟨H⁰n(0)|n(k)⟩ = En(0)⟨n(0)|n(k)⟩ = 0    for k ≥ 1 .
Then multiplying (2.6d) from the left by ψn(0)∗ and integrating, we see that the
left-hand side vanishes, and we are left with (since ⟨n(0)|n(0)⟩ = 1)
0 = −⟨n(0)|H′n(k−1)⟩ + En(k)
or
En(k) = ⟨n(0)|H′n(k−1)⟩    for k ≥ 1 .            (2.7)
In particular, we have the extremely important result for the first-order energy
correction to the nth state
En(1) = ⟨n(0)|H′n(0)⟩ = ∫ ψn(0)∗ H′ ψn(0) dx .            (2.8)
Letting λ = 1 in (2.3b), we see that to first order, the energy of the nth state is
given by
En ≈ En(0) + En(1) = En(0) + ∫ ψn(0)∗ H′ ψn(0) dx .
Example 2.1. Let the unperturbed system be the free harmonic oscillator, with
ground-state wavefunction
ψ0(0) = (mω/πħ)^{1/4} e^{−mωx²/2ħ}
and energy levels
En(0) = (n + 1/2) ħω .
Now consider the anharmonic oscillator with Hamiltonian
H = H⁰ + H′ := H⁰ + ax³ + bx⁴ .
The first-order energy correction to the ground state is given by
E0(1) = ⟨0(0)|H′0(0)⟩ = (mω/πħ)^{1/2} ∫_{−∞}^{∞} e^{−mωx²/ħ} (ax³ + bx⁴) dx .
However, the integral over x³ vanishes by symmetry (the integral of an odd function
over an even interval), and we are left with (where α := mω/ħ)
E0(1) = b (mω/πħ)^{1/2} ∫_{−∞}^{∞} x⁴ e^{−mωx²/ħ} dx = b (α/π)^{1/2} ∫_{−∞}^{∞} x⁴ e^{−αx²} dx
      = b (α/π)^{1/2} (∂²/∂α²) ∫_{−∞}^{∞} e^{−αx²} dx = b (α/π)^{1/2} (∂²/∂α²) (π/α)^{1/2}
      = 3b/4α² = (3b/4)(ħ²/m²ω²) .
Thus, to first order, the ground state energy of the anharmonic oscillator is given
by
E0 ≈ E0(0) + E0(1) = ħω/2 + (3b/4)(ħ²/m²ω²) .
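The result of Example 2.1 can be checked by direct numerical integration. This is only a sketch, in units ħ = m = ω = 1 (so α = 1), with arbitrary illustrative couplings a and b; the computed ⟨0|H′|0⟩ should match 3b/4α²:

```python
import numpy as np
from math import pi, sqrt

alpha, a, b = 1.0, 0.3, 0.1           # alpha = m*omega/hbar; a, b are arbitrary

x = np.linspace(-10.0, 10.0, 200001)  # grid wide enough for the Gaussian tail
dx = x[1] - x[0]
psi0_sq = sqrt(alpha / pi) * np.exp(-alpha * x**2)   # |psi_0(x)|^2

# <0|H'|0>: the odd x^3 term cancels on the symmetric grid
E1 = np.sum(psi0_sq * (a * x**3 + b * x**4)) * dx

print(abs(E1 - 3 * b / (4 * alpha**2)) < 1e-6)       # True: E1 = 3b/(4 alpha^2)
```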
Now let's find the first-order correction to the wavefunction. Since the unperturbed
states ψn(0) form a complete orthonormal set, we may expand ψn(1) in terms
of them as
ψn(1) = Σ_m a_m ψm(0)
where
a_m = ⟨m(0)|n(1)⟩ .
(It would be way too cluttered to try and label these expansion coefficients to denote
the fact that they also refer to the first-order correction of the nth state.) Then for
m ≠ n, we multiply (2.6b) from the left by ψm(0)∗ and integrate:
⟨m(0)|(H⁰ − En(0))n(1)⟩ = En(1)⟨m(0)|n(0)⟩ − ⟨m(0)|H′n(0)⟩
or (since H⁰ψm(0) = Em(0)ψm(0) and ⟨m(0)|n(0)⟩ = 0 for m ≠ n)
(Em(0) − En(0))⟨m(0)|n(1)⟩ = −⟨m(0)|H′n(0)⟩ .
Therefore
a_m = ⟨m(0)|n(1)⟩ = ⟨m(0)|H′n(0)⟩ / (En(0) − Em(0))    for m ≠ n .            (2.9)
You should realize that this last step was where the assumed nondegeneracy of the
states came in. In order to divide by En(0) − Em(0), we must assume that
it is nonzero. This is true as long as m ≠ n implies that En(0) ≠ Em(0). Since
a_n = ⟨n(0)|n(1)⟩ = 0 (this is equation (2.5)), we finally obtain
ψn(1) = Σ_{m≠n} [⟨m(0)|H′n(0)⟩ / (En(0) − Em(0))] ψm(0) .            (2.10)
Now that we have the first-order correction to the wavefunction, it is easy to
get the second-order correction to the energy. Using (2.10) in (2.7) with k = 2 we
immediately have
En(2) = Σ_{m≠n} ⟨n(0)|H′m(0)⟩⟨m(0)|H′n(0)⟩ / (En(0) − Em(0))
      = Σ_{m≠n} |⟨m(0)|H′n(0)⟩|² / (En(0) − Em(0)) .            (2.11)
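The sums in (2.8) and (2.11) are easy to test on a toy problem. The sketch below (a hypothetical matrix Hamiltonian of my own, not from the text) compares the perturbation series through second order against exact diagonalization:

```python
import numpy as np

# Toy check of (2.8) and (2.11): H = H0 + lam*H1 with nondegenerate levels.
E0 = np.array([0.0, 1.0, 2.5])      # unperturbed energies E_n(0)
H0 = np.diag(E0)
H1 = np.array([[0.0, 0.2, 0.1],
               [0.2, 0.0, 0.3],
               [0.1, 0.3, 0.0]])    # matrix elements <m(0)|H'|n(0)>
lam, n = 0.05, 0                    # small coupling; perturb level n = 0

E1 = H1[n, n]                                       # first order, eq. (2.8)
E2 = sum(H1[m, n]**2 / (E0[n] - E0[m])              # second order, eq. (2.11)
         for m in range(3) if m != n)

E_pert = E0[n] + lam * E1 + lam**2 * E2
E_exact = np.linalg.eigvalsh(H0 + lam * H1)[0]      # exact lowest eigenvalue
print(abs(E_pert - E_exact) < 1e-4)                 # True: agreement to O(lam^3)
```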
The last term we will compute is the second-order correction to the wavefunction.
We again expand in terms of the ψn(0) as
ψn(2) = Σ_m b_m ψm(0)
where b_m = ⟨m(0)|n(2)⟩. Multiplying (2.6c) from the left by ψm(0)∗ and integrating,
we have (assuming m ≠ n)
(Em(0) − En(0))⟨m(0)|n(2)⟩ = En(1)⟨m(0)|n(1)⟩ − ⟨m(0)|H′n(1)⟩
or
b_m = ⟨m(0)|n(2)⟩ = [En(1)/(Em(0) − En(0))] ⟨m(0)|n(1)⟩ − ⟨m(0)|H′n(1)⟩/(Em(0) − En(0)) .
Now use (2.9) in the first term on the right-hand side and use (2.10) in the second
term to write
b_m = −En(1)⟨m(0)|H′n(0)⟩/(En(0) − Em(0))²
      − Σ_{k≠n} ⟨m(0)|H′k(0)⟩⟨k(0)|H′n(0)⟩ / [(Em(0) − En(0))(En(0) − Ek(0))] .
Using (2.8) we finally obtain
ψn(2) = Σ_{m≠n} Σ_{k≠n} [⟨m(0)|H′k(0)⟩⟨k(0)|H′n(0)⟩ / ((En(0) − Em(0))(En(0) − Ek(0)))] ψm(0)
        − Σ_{m≠n} [⟨m(0)|H′n(0)⟩⟨n(0)|H′n(0)⟩ / (En(0) − Em(0))²] ψm(0) .            (2.12)
Let me make several points. First, recall that because of equation (2.4), our
states are not normalized. Second, be sure to realize that the sums in equations
(2.10), (2.11) and (2.12) are over states, and not energy levels. If some of the
energy levels other than the nth are degenerate, then we must include a term in
each of these sums for each linearly independent wavefunction corresponding to the
degenerate energy level. The reason for this is that the expansions of ψn(1) and ψn(2)
were in terms of a complete set of functions, and hence we must be sure to include
all linearly independent states in the sums. Furthermore, if there happens to be
a continuum of states in the unperturbed system, then we must also include an
integral over these so that we have included all linearly independent states in our
expansion.
2.2 Perturbation Theory for a Degenerate Energy Level
We now turn to the perturbation treatment of a degenerate energy level, meaning
that there are multiple unperturbed states that all have the same energy. If we
let d be the degree of degeneracy, then we have states ψ1(0), . . . , ψd(0) satisfying the
unperturbed Schrödinger equation
H⁰ψn(0) = En(0)ψn(0)            (2.13a)
with
E1(0) = E2(0) = ··· = Ed(0) .            (2.13b)
You must be careful with the notation here, because we don't want to clutter it up
with too many indices. Even though we write E1(0), . . . , Ed(0), this does not mean
that these are necessarily the d lowest-lying states that satisfy the unperturbed
Schrödinger equation. We are referring here to a single degenerate energy level.
The interacting (or perturbed) Schrödinger equation is
Hψn = (H 0 + λH ′ )ψn = En ψn .
In our treatment of a nondegenerate energy level, we assumed that limλ→0 En =
En(0) and limλ→0 ψn = ψn(0), where the state ψn(0) was unique. However, in the case
of degeneracy, the second of these does not hold. While it is true that as λ goes to
zero we still have
lim_{λ→0} En = En(0)
the presence of the perturbation generally splits the degenerate energy level into
multiple distinct states. However, there are varying degrees of splitting, and while
the perturbation may completely remove the degeneracy, it may also only partially
remove it or have no effect at all. This is illustrated in the figure below.
[Figure 3: Splitting of energy levels due to a perturbation, shown as the energies
vary with λ from 0 to 1: some degenerate levels split completely, some split only
partially, and some are unaffected.]
The important point to realize here is that in the limit λ → 0, the state ψn does
not necessarily go to a unique ψn(0), but rather only to some linear combination
of the normalized degenerate states ψ1(0), . . . , ψd(0). This is because any such linear
combination
c1ψ1(0) + c2ψ2(0) + ··· + cdψd(0)
will satisfy (2.13a) with the same eigenvalue En(0). Thus there are an infinite number
of such linear combinations made up of these d linearly independent normalized
eigenfunctions, and any of them will work as the unperturbed state.
For example, recall that the hydrogen atom states are labeled ψnlm, where (for the
pure Coulomb potential) the energy depends only on n, and the factor e^{imϕ} makes
the wave function complex for m ≠ 0. The 2p states correspond to n = 2 and l = 1,
and these include the complex wave functions 2p₁ and 2p₋₁. However, instead of
these complex wave functions, we can take the real linear combinations defined by
ψ2px = (1/√2)(ψ2p₁ + ψ2p₋₁)
and
ψ2py = (1/i√2)(ψ2p₁ − ψ2p₋₁)
which have the same energies. For most purposes in chemistry, these real wave
functions are much more convenient to work with. And while the 2p₀, 2p₁ and 2p₋₁
states are degenerate, the presence of an electric or magnetic field will split the
degeneracy because the interaction term in the Hamiltonian depends on the orientation
of the electron's orbital angular momentum (i.e., the m value).
Returning to our problem, all we can say is that
lim_{λ→0} ψn = Σ_{i=1}^{d} cᵢ ψᵢ(0) ,    1 ≤ n ≤ d .
Hence the first thing we must do is determine the correct zeroth-order wave functions,
which we denote by φn(0). In other words,
φn(0) := lim_{λ→0} ψn = Σ_{i=1}^{d} cᵢ ψᵢ(0) ,    1 ≤ n ≤ d            (2.14)
where each φn(0) has a different set of coefficients cᵢ. (These should be labeled cᵢ(n),
but I'm trying to keep it simple.) Note that since H⁰ψᵢ(0) = Ed(0)ψᵢ(0) for each
i = 1, . . . , d, it follows that
H⁰φn(0) = Ed(0)φn(0) .            (2.15)
For the d-fold degenerate case, we proceed as in the nondegenerate case, except
that now we use φn(0) instead of ψn(0) for the zeroth-order wave function. Then
equations (2.3) become
ψn = φn(0) + λψn(1) + λ²ψn(2) + ···            (2.16a)
En = Ed(0) + λEn(1) + λ²En(2) + ···            (2.16b)
where we have used (2.13b). Equations (2.16) apply for each n = 1, . . . , d. As in the
nondegenerate case, we substitute these into the Schrödinger equation Hψn = En ψn
and equate powers of λ. This is exactly the same as we had before, except that now
we have φn(0) instead of ψn(0), so we can immediately write down the results from
equations (2.6).
Equating the coefficients of λ⁰ we have H⁰φn(0) = Ed(0)φn(0). Since for each n =
1, . . . , d the linear combination φn(0) is an eigenstate of H⁰ with eigenvalue Ed(0) (this
is just the statement of equation (2.15)), this doesn't give us any new information.
From the coefficients of λ¹ we have (for each n = 1, . . . , d)
(H⁰ − Ed(0))ψn(1) = (En(1) − H′)φn(0) .            (2.17)
Multiplying this from the left by φn(0)∗ and integrating, we have (here I'm not using
n(0) as a shorthand for ψn(0), to make sure there is no confusion with φn(0))
⟨φn(0)|H⁰ψn(1)⟩ − Ed(0)⟨φn(0)|ψn(1)⟩ = En(1)⟨φn(0)|φn(0)⟩ − ⟨φn(0)|H′φn(0)⟩ .
Using (2.15) we see that the left-hand side of this equation vanishes, so assuming
that the correct zeroth-order wave functions are normalized, we arrive at the first-order
correction to the energy
En(1) = ⟨φn(0)|H′φn(0)⟩ .            (2.18)
This is similar to the nondegenerate result (2.8), except that now we use the correct
zeroth-order wave functions. Of course, in order to evaluate these integrals, we must
know the functions φn(0) which, so far, we don't.
So, for any 1 ≤ m ≤ d, we multiply (2.17) from the left by one of the d-fold
degenerate unperturbed wave functions ψm(0) and integrate to obtain
⟨ψm(0)|H⁰ψn(1)⟩ − Ed(0)⟨ψm(0)|ψn(1)⟩ = En(1)⟨ψm(0)|φn(0)⟩ − ⟨ψm(0)|H′φn(0)⟩ .
Since H⁰ψm(0) = Ed(0)ψm(0), we see that the left-hand side of this equation vanishes,
and we are left with
⟨ψm(0)|H′φn(0)⟩ − En(1)⟨ψm(0)|φn(0)⟩ = 0 ,    m = 1, . . . , d .
There is no loss of generality in assuming that the zeroth-order wave functions ψᵢ(0)
of the degenerate level are orthonormal, so we take
⟨ψm(0)|ψᵢ(0)⟩ = δmi    for m, i = 1, . . . , d .            (2.19)
(If the zeroth-order wave functions ψᵢ(0) aren't orthonormal, then apply the Gram–Schmidt
process to construct an orthonormal set. Since the new orthonormal functions are just
linear combinations of the original set, and the correct zeroth-order functions φn(0) are
linear combinations of the ψᵢ(0), the φn(0) will just be different linear combinations of
the new orthonormal functions.) Then substituting the definition (2.14) for φn(0) we have
Σ_{i=1}^{d} cᵢ ⟨ψm(0)|H′ψᵢ(0)⟩ − En(1) Σ_{i=1}^{d} cᵢ ⟨ψm(0)|ψᵢ(0)⟩ = 0
or
Σ_{i=1}^{d} (H′mi − En(1)δmi) cᵢ = 0 ,    m = 1, . . . , d            (2.20a)
where
H′mi = ⟨ψm(0)|H′ψᵢ(0)⟩ .
This is just another homogeneous system of d equations in the d unknowns cᵢ. In
fact, if we let c be the vector with components cᵢ, then we can write (2.20a) in
matrix form as
H′c = En(1)c            (2.20b)
which shows that this is nothing more than an eigenvalue equation for the matrix
H′ acting on the d-dimensional eigenspace of degenerate wave functions.
As usual, if (2.20a) is to have a nontrivial solution, we must have the secular
equation
det(H′mi − En(1)δmi) = 0 .            (2.21)
Written out, this looks like

| H′11 − En(1)   H′12           ···   H′1d          |
| H′21           H′22 − En(1)   ···   H′2d          |
|   ⋮              ⋮                    ⋮            | = 0 .
| H′d1           H′d2           ···   H′dd − En(1)  |
This is a polynomial of degree d in En(1), and the d roots E1(1), E2(1), . . . , Ed(1) are
the first-order corrections to the energy of the d-fold degenerate unperturbed state.
So, we solve (2.21) for the eigenvalues En(1), and use these in (2.20b) to solve for the
eigenvectors c. These then define the correct zeroth-order wave functions according
to (2.14).
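Computationally, then, degenerate perturbation theory amounts to one call to a Hermitian eigensolver. A minimal sketch with a hypothetical 3×3 matrix H′ (the entries are mine, chosen only for illustration):

```python
import numpy as np

# Hypothetical matrix H'_{mi} = <psi_m(0)|H' psi_i(0)> in a d = 3 degenerate
# subspace; only states 1 and 2 are coupled.
Hp = np.array([[0.0, 0.4, 0.0],
               [0.4, 0.0, 0.0],
               [0.0, 0.0, 0.7]])

E1, C = np.linalg.eigh(Hp)   # E1: first-order shifts; columns of C: vectors c
print(E1)                    # [-0.4  0.4  0.7] -- the degeneracy is fully split
```

The columns of C are exactly the coefficient vectors that define the correct zeroth-order wave functions through (2.14).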
Again, note that all we are doing is finding the eigenvalues and eigenvectors
of the matrix H′mi. And since H′ is Hermitian, eigenvectors belonging to distinct
eigenvalues are orthogonal. But each eigenvector c has components that are just
the expansion coefficients in (2.14), and therefore (reverting to a more complete
notation)
⟨φm(0)|H′φn(0)⟩ = Σ_{i,j=1}^{d} cᵢ(m)∗ ⟨ψᵢ(0)|H′ψⱼ(0)⟩ cⱼ(n) = Σ_{i,j=1}^{d} cᵢ(m)∗ H′ij cⱼ(n)
               = c(m)† H′ c(n) = En(1) c(m)† c(n) = En(1) ⟨c(m)|c(n)⟩
or
⟨φm(0)|H′φn(0)⟩ = En(1) δmn            (2.22)
where we assume that the eigenvectors are normalized.
In the case where m = n, we arrive back at (2.18). What about the case m ≠ n?
Recall that in our treatment of nondegenerate perturbation theory, the reason we
had to assume the nondegeneracy was because equations (2.10) and (2.11) would
blow up if there were another state ψm(0) with the same energy as ψn(0). However, in
that case, we would be saved if the numerator also went to zero, and that is precisely
what happens if we use the correct zeroth-order wave functions. Essentially then,
the degenerate case proceeds just like the nondegenerate case, except that we must
use the correct zeroth-order wave functions.
Returning to (2.21), if all d roots are distinct, then we have completely split the
degeneracy into d distinct levels
Ed(0) + E1(1) ,    Ed(0) + E2(1) ,    . . . ,    Ed(0) + Ed(1) .
If not all of the roots are distinct, then we have only partly removed the degeneracy
(at least to first order). We will assume that all d roots are distinct, and hence that
the degeneracy has been completely lifted in first order.
Now that we have the d roots En(1), we can take them one at a time and plug
back into the system of equations (2.20a) and solve for c2, . . . , cd in terms of c1.
(Recall that because the determinant of the coefficient matrix of the system (2.20a)
is zero, the d equations in (2.20a) are linearly dependent, and hence we can only find
d − 1 of the unknowns in terms of one of them.) Finally, we fix c1 by normalization,
using equations (2.14) and (2.19):
1 = ⟨φn(0)|φn(0)⟩ = Σ_{i,j=1}^{d} cᵢ∗cⱼ ⟨ψᵢ(0)|ψⱼ(0)⟩ = Σ_{i,j=1}^{d} cᵢ∗cⱼ δᵢⱼ = Σ_{i=1}^{d} |cᵢ|² .            (2.23)
Also be sure to realize that we obtain a separate set of coefficients cᵢ for each root
En(1). This is how we get the d independent zeroth-order wave functions.
Obviously, finding the roots of (2.21) is a difficult problem in general. However,
under some special conditions, the problem may be much more tractable. The
best situation would be if all off-diagonal elements H′mi, m ≠ i, vanished. Then the
determinant is just the product of the diagonal elements, and the d roots are simply
En(1) = H′mm for m = 1, . . . , d, or
E1(1) = H′11 ,    E2(1) = H′22 ,    . . . ,    Ed(1) = H′dd .
Let us assume that all d roots are distinct. Taking the root En(1) = E1(1) = H′11 as
a specific example, (2.20a) becomes the set of d − 1 equations
(H′22 − E1(1))c2 = 0
(H′33 − E1(1))c3 = 0
⋮
(H′dd − E1(1))cd = 0 .
Since E1(1) = H′11 ≠ H′mm for m = 2, 3, . . . , d, it follows that c2 = c3 = ··· = cd = 0.
Normalization then implies that c1 = 1, and the corresponding zeroth-order wave
function defined by (2.14) is φ1(0) = ψ1(0). Clearly this applies to any of the d roots,
so we have
φᵢ(0) = ψᵢ(0) ,    i = 1, . . . , d .
Thus we have shown that when the secular equation is diagonal and the d matrix
elements H′mm are all distinct, then the initial wave functions ψᵢ(0) are the correct
zeroth-order wave functions φᵢ(0).
Another situation that lends itself to a relatively simple solution is when the
secular determinant is block diagonal. For example, in the case where d = 4 we
would have

| H′11 − En(1)   H′12           0              0             |
| H′21           H′22 − En(1)   0              0             |
| 0              0              H′33 − En(1)   H′34          | = 0 .
| 0              0              H′43           H′44 − En(1)  |
This is of the same form as we had in Example 1.3 (except with Sij = δij). Exactly
the same reasoning we used to show that two of the variation functions were linear
combinations of f1 and f2, and two of the variation functions were linear combinations
of f3 and f4, now shows that the correct zeroth-order wave functions are of
the form
φ1(0) = c1(1)ψ1(0) + c2(1)ψ2(0)        φ2(0) = c1(2)ψ1(0) + c2(2)ψ2(0)
φ3(0) = c3(3)ψ3(0) + c4(3)ψ4(0)        φ4(0) = c3(4)ψ3(0) + c4(4)ψ4(0)
Is there any way we can choose our initial wave functions ψi to make things
easier? Well, referring back to Theorem 1.2, suppose we have a Hermitian operator A that commutes with both H 0 and H ′ . If we choose our initial wave functions to be eigenfunctions of both A and H 0 , then the off-diagonal matrix elements
(0)
(0)
(0)
(0)
′
Hij
= hψi |H ′ ψj i will vanish if ψi and ψj belong to different eigenspaces of
(0)
A. Therefore, if the functions ψi all have different eigenvalues of A, the secular
(0)
(0)
determinant will be diagonal so that the φi = ψi .
(0)
If more than one ψi belongs to a given eigenvalue ak of A (in other words,
dim Vak > 1), then this subcollection will form a block in the secular determinant.
So in general, we will have a secular determinant that is block diagonal where each
block has size dim Vak . In this case, each correct zeroth-order wave function will be
(0)
a linear combination of those ψi that belong to the same eigenvalue of A.
Before proceeding with an example, let me prove a very important and useful
property of the spherical harmonics. The parity operation is r → −r, and in
spherical coordinates, this is equivalent to θ → π − θ and ϕ → ϕ + π.
[Figure: the spherical coordinates (r, θ, ϕ) of a point relative to the x, y and z axes.]
Indeed, we know that (for the unit sphere) z = cos θ, and from the figure we see
that −z would be at π − θ. Similarly, a point on the x-axis at ϕ = 0 goes to the
point −x at ϕ = π. Alternatively, letting θ → π −θ in x = sin θ cos ϕ doesn’t change
x, so in order to have x → −x we need cos ϕ → − cos ϕ which is accomplished by
letting ϕ → ϕ + π.
Now observe that under parity, r → −r and p → −p, so that L = r × p is
unchanged. Thus angular momentum is a pseudo-vector, as you probably already
knew. But this means that the parity operation Π commutes with the quantum
mechanical operator L, so that the three operators L2 , Lz and Π are mutually
commuting, and the eigenfunctions Ylm (θ, ϕ) of angular momentum can be chosen
to have a definite parity. Note also that since Π and L commute, it follows that Π
and L± commute, so acting on any Ylm with L± won’t change its parity.
Look at the explicit form of the state Yll:
Yll(θ, ϕ) = (−1)^l [(2l + 1)!/4π]^{1/2} (1/2^l l!) (sin θ)^l e^{ilϕ} .
Letting θ → π − θ we have (sin θ)^l → (sin θ)^l, but under ϕ → ϕ + π we have
e^{ilϕ} → e^{ilπ} e^{ilϕ} = (−1)^l e^{ilϕ}. Therefore, under parity we see that Yll → (−1)^l Yll.
But we can get to any Ylm by repeatedly applying L− to Yll, and since this doesn't
change the parity of Ylm, we have the extremely useful result
Π Ylm(θ, ϕ) = (−1)^l Ylm(θ, ϕ) .            (2.24)
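Equation (2.24) is easy to confirm numerically with scipy's spherical harmonics. A caution on conventions (this check is mine, not the text's): `scipy.special.sph_harm(m, l, theta, phi)` takes theta as the azimuthal angle and phi as the polar angle, so the parity map θ → π − θ, ϕ → ϕ + π acts on its arguments as shown:

```python
import numpy as np
from scipy.special import sph_harm

az, pol = 0.7, 1.1                   # an arbitrary test point
for l in range(4):
    for m in range(-l, l + 1):
        y = sph_harm(m, l, az, pol)
        y_parity = sph_harm(m, l, az + np.pi, np.pi - pol)   # r -> -r
        assert np.allclose(y_parity, (-1)**l * y)
print("parity (-1)^l verified for l = 0..3")
```

(Recent scipy versions rename this function to `sph_harm_y`; the argument-convention point matters more than the exact name.)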
Example 2.2 (Stark Effect). In this example we will take a look at the effect
of a uniform electric field E = E ẑ on a hydrogen atom, where the unperturbed
Hamiltonian is given by
H⁰ = p²/2m − e²/r
and r = r1 − r2 is the relative position vector from the proton to the electron. We
first need to find the perturbing potential energy.
The force on a particle of charge q in an electric field E = −∇φ is F = qE =
−q∇φ, where φ(r) is the electric potential. On the other hand, the force is also
given in terms of the potential energy V(r) by F = −∇V, and hence ∇V = q∇φ,
so that
∫₀^r ∇V · dr = q ∫₀^r ∇φ · dr
or
V(r) − V(0) = q[φ(r) − φ(0)] .
If we take V(0) = φ(0) = 0, then we have
V(r) = qφ(r) .
Thus the interaction Hamiltonian H′ consists of both the energy eφ(r2) of the
proton and the energy −eφ(r1) of the electron, and therefore
H′ = e[φ(r2) − φ(r1)] .
But the electric field is constant, so that
∫_{r1}^{r2} E · dr = E · (r2 − r1) = −E · (r1 − r2) = −E · r = −E z
while we also have
∫_{r1}^{r2} E · dr = −∫_{r1}^{r2} ∇φ · dr = −[φ(r2) − φ(r1)] .
Hence the final form of our perturbation is H′ = eE · r or
H′ = eE z .
Note also that if we define the electric dipole moment µe = e(r2 − r1 ) = −er,
then H ′ can be called a dipole interaction because
H ′ = −µe · E .
Let us first consider the ground state ψ100 of the hydrogen atom. This state
is nondegenerate, so the first-order energy correction to the ground state is, from
equation (2.8),
E100(1) = ⟨ψ100|eE z|ψ100⟩ = eE ⟨ψ100|z|ψ100⟩ .
But H⁰ is parity invariant, so the states ψnlm all have a definite parity (−1)^l. Then
E100(1) is the integral of an odd function over an even interval, and hence it vanishes:
E100(1) = 0 .
In fact, this shows that any nondegenerate state of the hydrogen atom has no
first-order Stark effect.
Now consider the n = 2 levels of hydrogen. This is a four-fold degenerate state
consisting of the wave functions ψ200 , ψ210 , ψ211 and ψ21 −1 . Since the parity of the
states is given by (−1)l , we see that the l = 0 state has even parity while the l = 1
states are odd.
However, it is not hard to see that [H′, Lz] = 0. This is either a consequence of
the fact that H′ is a function of z = r cos θ while Lz = −iħ ∂/∂ϕ, or you can note
that [Li, rj] = iħ Σ_k εijk rk, so that [Lz, z] = 0. Either way, we have
0 = ⟨ψnl′m′|[H′, Lz]|ψnlm⟩ = ⟨ψnl′m′|H′Lz − LzH′|ψnlm⟩ = ħ(m − m′)⟨ψnl′m′|H′|ψnlm⟩
and hence we have the selection rule
⟨ψnl′m′|H′|ψnlm⟩ = 0    if m ≠ m′ .
(This is an example of Theorem 1.2.) This shows that H ′ can only connect states
with the same m values. And since H ′ has odd parity, it can only connect states
with opposite parities, i.e., in the present case it can only connect an l = 0 state
with an l = 1 state.
Suppressing the index n = 2, we order our basis states ψlm as
{ψ00, ψ10, ψ11, ψ1,−1}. (In other words, the rows and columns are labeled by these
functions in this order.) Then the secular equation (2.21) becomes (also writing E
instead of E(1) for simplicity)

| −E              ⟨ψ00|H′|ψ10⟩    0     0  |
| ⟨ψ10|H′|ψ00⟩   −E               0     0  | = 0
| 0               0              −E     0  |
| 0               0               0    −E  |

or (since it's block diagonal)
[E² − (H′12)²] E² = 0
where
H′12 = ⟨ψ00|H′|ψ10⟩ = ⟨ψ10|H′|ψ00⟩ = H′21
because both H′ and the wave functions are real. Therefore the roots of the secular
equation are
En(1) = ±H′12 , 0 , 0 .
For our wave functions we have ψnlm = Rnl Ylm, or
ψ200 = (1/2a0³)^{1/2} (1 − r/2a0) e^{−r/2a0} Y0⁰
ψ210 = (1/24a0³)^{1/2} (r/a0) e^{−r/2a0} Y1⁰
where a0 is the Bohr radius defined by a0 = ħ²/me e², and hence
H′12 = ⟨ψ200|eE z|ψ210⟩
     = eE (2a0)⁻³ (2/√3) ∫ e^{−r/a0} (r/a0)(1 − r/2a0) z Y0⁰∗ Y1⁰ r² dr dΩ .
But
Y0⁰∗ = 1/√4π    and    z = r cos θ = r (4π/3)^{1/2} Y1⁰∗
so that using
∫ dΩ Ylm∗ Yl′m′ = δll′ δmm′
we have
H′12 = eE (2a0)⁻³ (2/3a0) ∫ e^{−r/a0} r⁴ (1 − r/2a0) Y1⁰∗ Y1⁰ dr dΩ
     = eE (2a0)⁻³ (2/3a0) ∫₀^∞ (r⁴ − r⁵/2a0) e^{−r/a0} dr .
Using the general result
∫₀^∞ rⁿ e^{−αr} dr = (−1)ⁿ (∂ⁿ/∂αⁿ) ∫₀^∞ e^{−αr} dr = (−1)ⁿ (∂ⁿ/∂αⁿ) α⁻¹ = n!/αⁿ⁺¹
we finally arrive at
H′12 = −3eE a0 .
Now we need to find the corresponding eigenvectors c that will specify the correct
zeroth-order wave functions. These are the solutions to the system of equations
H′c = En(1)c for each value of En(1) (see equation (2.20b)). Let E1(1) = H′12. Then
the eigenvector c(1) satisfies

| −H′12    H′12    0       0      | |c1|
|  H′12   −H′12    0       0      | |c2|  = 0 .
|  0       0      −H′12    0      | |c3|
|  0       0       0      −H′12   | |c4|

This implies that c1 = c2 and c3 = c4 = 0. Normalizing, we have c1 = c2 = 1/√2, so
that
ϕ1(0) = (1/√2)(ψ200 + ψ210) .
Next we let E2(1) = −H′12. Now the eigenvector c(2) satisfies

| H′12   H′12   0      0     | |c1|
| H′12   H′12   0      0     | |c2|  = 0
| 0      0      H′12   0     | |c3|
| 0      0      0      H′12  | |c4|

so that c1 = −c2 and c3 = c4 = 0. Again, normalization yields c1 = −c2 = 1/√2,
and hence
ϕ2(0) = (1/√2)(ψ200 − ψ210) .
Finally, for the two degenerate roots E3(1) = E4(1) = 0 we have

| 0      H′12   0   0 | |c1|
| H′12   0      0   0 | |c2|  = 0
| 0      0      0   0 | |c3|
| 0      0      0   0 | |c4|

so that c1 = c2 = 0 while c3 and c4 are completely arbitrary. Thus we can simply
choose
ϕ3(0) = ψ211    and    ϕ4(0) = ψ21,−1 .
In summary, the correct zeroth-order wave functions for treating the Stark effect
are φ₁^(0), which gets a first-order energy shift of −3eℰa₀; the wave function φ₂^(0),
which gets a first-order energy shift of +3eℰa₀; and the original degenerate states
φ₃^(0) = ψ_211 and φ₄^(0) = ψ_21,−1, which remain degenerate to this order.
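The same conclusion can be checked by brute force: build the 4 × 4 perturbation matrix in the degenerate n = 2 basis and diagonalize it. A minimal Python sketch (numpy assumed available; energies in units of eℰa₀, so H′_12 = −3):

```python
import numpy as np

h12 = -3.0  # H'_12 in units of e·ℰ·a0, from the integral above
# Perturbation matrix in the degenerate basis (ψ200, ψ210, ψ211, ψ21,-1)
Hp = np.array([[0.0, h12, 0.0, 0.0],
               [h12, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0, 0.0],
               [0.0, 0.0, 0.0, 0.0]])

vals, vecs = np.linalg.eigh(Hp)   # eigenvalues returned in ascending order
print(vals)                       # → [-3.  0.  0.  3.]

# The -3 e ℰ a0 root mixes ψ200 and ψ210 with equal weights 1/√2:
print(np.abs(vecs[:, 0]))
```

The lowest root is H′_12 = −3eℰa₀ with eigenvector (ψ_200 + ψ_210)/√2, exactly the φ₁^(0) found above.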
2.3  Perturbation Treatment of the First Excited States of Helium
The helium atom consists of a nucleus with two protons and two neutrons, and two
orbiting electrons. If we take the nuclear charge to be +Ze instead of +2e, then
our discussion will apply equally well to helium-like ions such as H− , Li+ or Be2+ .
Neglecting terms such as spin–orbit coupling, the Hamiltonian is
    H = −(ℏ²/2m_e)∇₁² − (ℏ²/2m_e)∇₂² − Ze²/r₁ − Ze²/r₂ + e²/r₁₂        (2.25)
where ri is the distance to electron i, r12 is the distance from electron 1 to electron
2, and ∇2i is the Laplacian with respect to the coordinates of electron i. The
Schrödinger equation is thus a function of six variables, the three coordinates for
each of the two electrons. (Technically, the electron mass me should be replaced by the reduced mass
m = me M/(me + M ) where M is the mass of the nucleus. But M ≫ me so that
m ≈ me . If this isn’t familiar to you, we will treat two-body problems such as this
in detail when we discuss identical particles.)
Because of the term e2 /r12 the Schrödinger equation isn’t separable, and we
must resort to approximation methods. We write
    H = H⁰ + H′
where
    H⁰ = H₁⁰ + H₂⁰ = −(ℏ²/2m_e)∇₁² − Ze²/r₁ − (ℏ²/2m_e)∇₂² − Ze²/r₂        (2.26)
is the sum of two independent hydrogen atom Hamiltonians, and
    H′ = e²/r₁₂ .        (2.27)
We can now use separation of variables to write the unperturbed wave function
Ψ(r1 , r2 ) as a product
Ψ(r1 , r2 ) = ψ1 (r1 )ψ2 (r2 ) .
In this case we have the time-independent equation
    H⁰Ψ = (H₁⁰ + H₂⁰)ψ₁ψ₂ = ψ₂H₁⁰ψ₁ + ψ₁H₂⁰ψ₂ = Eψ₁ψ₂
so that dividing by ψ1 ψ2 yields
    H₁⁰ψ₁/ψ₁ = E − H₂⁰ψ₂/ψ₂ .
Since the left side of this equation is a function of r1 only, and the right side is a
function of r2 only, each side must in fact be equal to a constant, and we can write
E = E1 + E2
where each Ei is the energy of a hydrogenlike wave function:
    E₁ = −Z²e²/(n₁² 2a₀)        E₂ = −Z²e²/(n₂² 2a₀)
and a₀ is the Bohr radius

    a₀ = ℏ²/m_e e² = 0.529 Å .
In other words, we have the unperturbed zeroth-order energies

    E^(0) = −Z² (1/n₁² + 1/n₂²) (e²/2a₀) ,    n₁ = 1, 2, . . . ,  n₂ = 1, 2, . . . .        (2.28)
Correspondingly, the zeroth-order wave functions are products of the usual hydrogenlike wave functions.
The lowest excited states of helium have n1 = 1, n2 = 2 or n1 = 2, n2 = 1.
Then from (2.28) we have (for Z = 2)

    E^(0) = −2² (1/1² + 1/2²) (e²/2a₀) = −5(13.606 eV) = −68.03 eV .
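This zeroth-order value is a one-liner to reproduce; a small Python sketch (using e²/2a₀ = 13.606 eV from the text):

```python
RY = 13.606  # e^2 / 2a0 in eV

def E0(Z, n1, n2):
    # Zeroth-order two-electron energy, equation (2.28)
    return -Z**2 * (1.0 / n1**2 + 1.0 / n2**2) * RY

print(E0(2, 1, 2))  # → -68.03 eV (up to float rounding)
```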
For n = 2, the possible values of l are l = 0, 1, and since there are 2l + 1 values of
ml , we see that the n = 2 level of a hydrogenlike atom is fourfold degenerate. (This
just says that the 2s and 2p states have the same energy.) Thus the first excited
unperturbed state of He is eightfold degenerate, and the eight unperturbed wave
functions are
    ψ₁^(0) = 1s(1)2s(2)      ψ₂^(0) = 2s(1)1s(2)
    ψ₃^(0) = 1s(1)2p_x(2)    ψ₄^(0) = 2p_x(1)1s(2)
    ψ₅^(0) = 1s(1)2p_y(2)    ψ₆^(0) = 2p_y(1)1s(2)
    ψ₇^(0) = 1s(1)2p_z(2)    ψ₈^(0) = 2p_z(1)1s(2)
Here the notation 1s(1)2s(2) means, for example, that electron 1 is in the 1s state
and electron 2 is in the 2s state. I have also chosen to use the real hydrogenlike
wave functions 2px , 2py and 2pz which are defined as linear combinations of the
complex wave functions 2p0 , 2p1 and 2p−1 :
    2p_x := (1/√2)(2p₁ + 2p₋₁) = (1/4√(2π)) (Z/a₀)^{5/2} r e^{−Zr/2a₀} sin θ cos φ
          = (1/4√(2π)) (Z/a₀)^{5/2} x e^{−Zr/2a₀}        (2.29a)

    2p_y := (1/i√2)(2p₁ − 2p₋₁) = (1/4√(2π)) (Z/a₀)^{5/2} r e^{−Zr/2a₀} sin θ sin φ
          = (1/4√(2π)) (Z/a₀)^{5/2} y e^{−Zr/2a₀}        (2.29b)

    2p_z := 2p₀ = (1/4√(2π)) (Z/a₀)^{5/2} r e^{−Zr/2a₀} cos θ
          = (1/4√(2π)) (Z/a₀)^{5/2} z e^{−Zr/2a₀}        (2.29c)
This is perfectly valid since any linear combination of solutions with a given energy
is also a solution with that energy. (However, the 2px and 2py functions are not
eigenfunctions of Lz since they are linear combinations of eigenfunctions with different values of ml .) These real hydrogenlike wave functions are more convenient
for many purposes in constructing chemical bonds and molecular wave functions.
In fact, you have probably seen these wave functions in more elementary chemistry
courses. For example, a contour plot in the plane (i.e., a cross section) of a real 2p
wave function is shown in Figure 4 below. (Let φ = π/2 in any of equations (2.29).)
The three-dimensional orbital is obtained by rotating this plot about the horizontal
axis, so we see that the actual shape of a real 2p orbital (i.e., a one-electron wave
function) is two separated, distorted ellipsoids.
It is not hard to verify that the real 2p wave functions are orthonormal, and
hence the eight degenerate wave functions ψ_i^(0) are also orthonormal as required by
equation (2.19). The secular determinant contains 8² = 64 elements. However, H′
Figure 4: Contour plot in the plane of a real 2p wave function.
is real, as are the ψ_i^(0), so that H′_ij = H′_ji and the determinant is symmetric about
the main diagonal. This cuts the number of integrals almost in half.
Even better, by using parity we can easily show that most of the H′_ij are zero.
Indeed, the perturbing Hamiltonian H′ = e²/r₁₂ is an even function of r since

    r₁₂ = [(x₁ − x₂)² + (y₁ − y₂)² + (z₁ − z₂)²]^{1/2}

and this is unchanged if r₁ → −r₁ and r₂ → −r₂. Also, the hydrogenlike s-wave
functions depend only on r = |r| and hence are invariant under r → −r.
Furthermore, you can see from the above forms that the 2p wave functions are odd
under parity since they depend on r and either x, y or z. Hence, since we are
integrating over all space, any integral with only a single factor of 2p must vanish:
    H′_13 = H′_14 = H′_15 = H′_16 = H′_17 = H′_18 = 0

and

    H′_23 = H′_24 = H′_25 = H′_26 = H′_27 = H′_28 = 0 .
Now consider an integral such as

    H′_35 = ∫ 1s(1)2p_x(2) (e²/r₁₂) 1s(1)2p_y(2) dr₁ dr₂ .
If we let x1 → −x1 and x2 → −x2 , then r12 is unchanged as are 1s(1) and 2py (2).
However, 2px (2) changes sign, and the net result is that the integrand is an odd
function under this transformation. Hence it is not hard to see that the integral
vanishes. This lets us conclude that
    H′_35 = H′_36 = H′_37 = H′_38 = 0

and

    H′_45 = H′_46 = H′_47 = H′_48 = 0 .

Similarly, by considering the transformation y₁ → −y₁ and y₂ → −y₂, it follows
that

    H′_57 = H′_58 = H′_67 = H′_68 = 0 .
With these simplifications, the secular equation becomes

    | b₁₁    H′_12   0      0      0      0      0      0    |
    | H′_12  b₂₂     0      0      0      0      0      0    |
    | 0      0      b₃₃    H′_34   0      0      0      0    |
    | 0      0      H′_34  b₄₄     0      0      0      0    |  = 0        (2.30)
    | 0      0      0      0      b₅₅    H′_56   0      0    |
    | 0      0      0      0      H′_56  b₆₆     0      0    |
    | 0      0      0      0      0      0      b₇₇    H′_78 |
    | 0      0      0      0      0      0      H′_78  b₈₈   |

where

    b_ii = H′_ii − E^(1) ,    i = 1, 2, . . . , 8 .
Since the secular determinant is in block-diagonal form with 2 × 2 blocks on the
diagonal, the same logic that we used in Example 1.3 would seem to tell us that the
correct zeroth-order wave functions have the form
    φ₁^(0) = c₁ψ₁^(0) + c₂ψ₂^(0)      φ₂^(0) = c̄₁ψ₁^(0) + c̄₂ψ₂^(0)
    φ₃^(0) = c₃ψ₃^(0) + c₄ψ₄^(0)      φ₄^(0) = c̄₃ψ₃^(0) + c̄₄ψ₄^(0)
    φ₅^(0) = c₅ψ₅^(0) + c₆ψ₆^(0)      φ₆^(0) = c̄₅ψ₅^(0) + c̄₆ψ₆^(0)
    φ₇^(0) = c₇ψ₇^(0) + c₈ψ₈^(0)      φ₈^(0) = c̄₇ψ₇^(0) + c̄₈ψ₈^(0)
where the barred and unbarred coefficients distinguish between the two roots of
each second-order determinant. However, while that argument applies to the upper
2 × 2 determinant (i.e., the first two equations of the system), it doesn't apply to
the whole determinant in this case. This is because it turns out (as we will see
below) that the lower three 2 × 2 determinants are identical. Therefore, their pairs
of roots are the same, and all we can say is that there are two three-dimensional
eigenspaces. In other words, all we can say is that for each of the two roots and
for each n = 3, 4, . . . , 8, the function φ_n^(0) will be a linear combination of ψ₃^(0), . . . ,
ψ₈^(0). However, we can choose any basis we wish for this six-dimensional space, so
we choose the three two-dimensional orthonormal φ_n^(0)'s as shown above.
The first determinant is

    | H′_11 − E^(1)    H′_12          |
    | H′_12            H′_22 − E^(1)  |  = 0        (2.31)
where

    H′_11 = ∫ 1s(1)2s(2) (e²/r₁₂) 1s(1)2s(2) dr₁ dr₂ = ∫ [1s(1)]² [2s(2)]² (e²/r₁₂) dr₁ dr₂

    H′_22 = ∫ [1s(2)]² [2s(1)]² (e²/r₁₂) dr₁ dr₂ .
Since the integration variables are just dummy variables, it is pretty obvious that
letting r₁ ↔ r₂ shows that

    H′_11 = H′_22 .

Similarly, it is easy to see that

    H′_33 = H′_44      H′_55 = H′_66      H′_77 = H′_88 .
The integral H′_11 is sometimes denoted by J_1s2s and called a Coulomb integral:

    H′_11 = J_1s2s = ∫ [1s(1)]² [2s(2)]² (e²/r₁₂) dr₁ dr₂ .

The reason for the name is that this represents the electrostatic energy of repulsion
between an electron with the probability density function [1s]² and an electron with
probability density function [2s]². The integral H′_12 is denoted by K_1s2s and called
an exchange integral:

    H′_12 = K_1s2s = ∫ 1s(1)2s(2) (e²/r₁₂) 2s(1)1s(2) dr₁ dr₂ .

Here the functions to the left and right of H′ differ from each other by the exchange
of electrons 1 and 2. The general definitions of the Coulomb and exchange integrals
are

    J_ij = ⟨f_i(1)f_j(2)| e²/r₁₂ |f_i(1)f_j(2)⟩
    K_ij = ⟨f_i(1)f_j(2)| e²/r₁₂ |f_j(1)f_i(2)⟩

where the range of integration is over the full range of spatial coordinates of particles
1 and 2, and the functions f_i, f_j are spatial orbitals.
Substituting these integrals into (2.31) we have

    | J_1s2s − E^(1)    K_1s2s          |
    | K_1s2s            J_1s2s − E^(1)  |  = 0        (2.32)

or

    J_1s2s − E^(1) = ±K_1s2s

and hence the two roots are

    E₁^(1) = J_1s2s − K_1s2s    and    E₂^(1) = J_1s2s + K_1s2s .
Just as in Example 1.3, we substitute E₁^(1) back into (2.20a) to write

    K_1s2s c₁ + K_1s2s c₂ = 0
    K_1s2s c₁ + K_1s2s c₂ = 0

and hence c₂ = −c₁. Normalizing φ₁^(0) we have (using the orthonormality of the ψ_i^(0))

    ⟨φ₁^(0)|φ₁^(0)⟩ = ⟨c₁ψ₁^(0) − c₁ψ₂^(0)|c₁ψ₁^(0) − c₁ψ₂^(0)⟩ = |c₁|² + |c₂|² = 1

so that c₁ = 1/√2. Thus the zeroth-order wave function corresponding to E₁^(1) is

    φ₁^(0) = 2^{−1/2}[ψ₁^(0) − ψ₂^(0)] = 2^{−1/2}[1s(1)2s(2) − 2s(1)1s(2)] .

Similarly, the wave function corresponding to E₂^(1) is easily found to be

    φ₂^(0) = 2^{−1/2}[ψ₁^(0) + ψ₂^(0)] = 2^{−1/2}[1s(1)2s(2) + 2s(1)1s(2)] .
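The 2 × 2 block and its roots J ± K can also be checked numerically. A small Python sketch (numpy assumed; J and K set to the 1s2s values in eV quoted later in this section):

```python
import numpy as np

J, K = 11.42, 1.19   # J_1s2s and K_1s2s in eV (computed later in the text)
block = np.array([[J, K],
                  [K, J]])

vals, vecs = np.linalg.eigh(block)   # ascending order: J - K first, then J + K
print(vals)                          # → [10.23 12.61]

v = vecs[:, 0]                       # eigenvector of the lower root J - K
print(v[0] * v[1] < 0)               # → True: the antisymmetric combination (ψ1 - ψ2)/√2
```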
This takes care of the first determinant in (2.30), but we still have the remaining
three to handle.
First look at the integrals H′_33 and H′_55:

    H′_33 = ∫ 1s(1)2p_x(2) (e²/r₁₂) 1s(1)2p_x(2) dr₁ dr₂

    H′_55 = ∫ 1s(1)2p_y(2) (e²/r₁₂) 1s(1)2p_y(2) dr₁ dr₂ .
The only difference between these is the 2p(2) orbital, and the only difference between the 2p_x and 2p_y orbitals is their spatial orientation. Since the 1s orbitals are
spherically symmetric, it should be clear that these integrals are the same. Formally, in H′_33 we can change variables by letting x₁ → y₁, y₁ → x₁, x₂ → y₂ and
y₂ → x₂. This leaves r₁₂ unchanged, and transforms H′_33 into H′_55. The same
argument shows that H′_77 = H′_33 also. Hence we have

    H′_33 = H′_55 = H′_77 = ∫ 1s(1)2p_z(2) (e²/r₁₂) 1s(1)2p_z(2) dr₁ dr₂ := J_1s2p .
A similar argument shows that we also have equal exchange integrals:

    H′_34 = H′_56 = H′_78 = ∫ 1s(1)2p_z(2) (e²/r₁₂) 2p_z(1)1s(2) dr₁ dr₂ := K_1s2p .

Thus the remaining three determinants in (2.30) are the same and have the form

    | J_1s2p − E^(1)    K_1s2p          |
    | K_1s2p            J_1s2p − E^(1)  |  = 0 .
But this is the same as (2.32) if we replace 2s by 2p, and hence we can immediately
write down the solutions:

    E₃^(1) = E₅^(1) = E₇^(1) = J_1s2p − K_1s2p
    E₄^(1) = E₆^(1) = E₈^(1) = J_1s2p + K_1s2p

and

    φ₃^(0) = 2^{−1/2}[1s(1)2p_x(2) − 1s(2)2p_x(1)]
    φ₄^(0) = 2^{−1/2}[1s(1)2p_x(2) + 1s(2)2p_x(1)]
    φ₅^(0) = 2^{−1/2}[1s(1)2p_y(2) − 1s(2)2p_y(1)]
    φ₆^(0) = 2^{−1/2}[1s(1)2p_y(2) + 1s(2)2p_y(1)]
    φ₇^(0) = 2^{−1/2}[1s(1)2p_z(2) − 1s(2)2p_z(1)]
    φ₈^(0) = 2^{−1/2}[1s(1)2p_z(2) + 1s(2)2p_z(1)]
So what has happened? Starting from the eight degenerate (unperturbed) states
ψ_i^(0) that would exist in the absence of electron–electron repulsion, we find that including this repulsion term splits the degenerate states into two nondegenerate levels
associated with the configuration 1s2s, and two triply degenerate levels associated
with the configuration 1s2p. Interestingly, going to higher-order energy corrections
will not completely remove the degeneracy, and in fact it takes the application of
an external magnetic field to do so.
In order to evaluate the Coulomb and exchange integrals in the expressions for
E^(1) we need to use the expansion

    1/r₁₂ = Σ_{l=0}^∞ Σ_{m=−l}^{l} (4π/(2l+1)) (r_<^l / r_>^{l+1}) [Y_l^m(θ₁, φ₁)]* Y_l^m(θ₂, φ₂)        (2.33)
where r_< means the smaller of r₁ and r₂ and r_> is the larger of these. The details
of this type of integral are left to the homework, and the results are

    J_1s2s = (17/81) Ze²/a₀ = 11.42 eV        J_1s2p = (59/243) Ze²/a₀ = 13.21 eV

    K_1s2s = (16/729) Ze²/a₀ = 1.19 eV        K_1s2p = (112/6561) Ze²/a₀ = 0.93 eV
where we used Z = 2 and e²/2a₀ = 13.606 eV. Recalling that E^(0) = −68.03 eV
we obtain

    E^(0) + E₁^(1) = E^(0) + J_1s2s − K_1s2s = −57.8 eV
    E^(0) + E₂^(1) = E^(0) + J_1s2s + K_1s2s = −55.4 eV
    E^(0) + E₃^(1) = E^(0) + J_1s2p − K_1s2p = −55.7 eV
    E^(0) + E₄^(1) = E^(0) + J_1s2p + K_1s2p = −53.9 eV .
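These numbers follow directly from the quoted fractions; a short Python sketch reproducing the four first-order levels:

```python
RY = 13.606                  # e^2 / 2a0 in eV
e2_over_a0 = 2 * RY          # e^2 / a0 in eV
Z = 2

E0 = -Z**2 * (1 + 1/4) * RY  # zeroth-order energy of the 1s2l configurations

J1s2s = (17/81)    * Z * e2_over_a0
J1s2p = (59/243)   * Z * e2_over_a0
K1s2s = (16/729)   * Z * e2_over_a0
K1s2p = (112/6561) * Z * e2_over_a0

levels = [E0 + J1s2s - K1s2s,    # lower 1s2s level
          E0 + J1s2s + K1s2s,    # upper 1s2s level
          E0 + J1s2p - K1s2p,    # lower 1s2p level
          E0 + J1s2p + K1s2p]    # upper 1s2p level
print([round(E, 1) for E in levels])  # → [-57.8, -55.4, -55.7, -53.9]
```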
Figure 5: The first excited levels of the helium atom. (The unperturbed level
E^(0) = −68.0 eV is split by the first-order corrections into the 1s2s levels at
−57.8 eV and −55.4 eV, and the 1s2p levels at −55.7 eV and −53.9 eV.)
(See Figure 5.) The first-order energy corrections place the lower 1s2p level
below the upper 1s2s level, which disagrees with the actual helium spectrum. This is
due to the neglect of higher-order corrections. Since the electron-electron repulsion
is not a small quantity, this is not surprising.
Finally, let us look at the sources of the degeneracy of the original eight zeroth-order wave functions and the reason for the partial lifting of this degeneracy. There
are three types of degeneracy to consider: (1) The degeneracy between states with
the same n but different values of l. The 2s and 2p functions have the same energy.
(2) The degeneracy between wave functions with the same n and l but different
values of ml . The 2px , 2py and 2pz functions have the same energy. (This could
just as well have been the 2p0 , 2p1 and 2p−1 complex functions.) (3) There is an
exchange degeneracy between functions that differ only in the exchange of electrons between the orbitals. For example, ψ₁^(0) = 1s(1)2s(2) and ψ₂^(0) = 1s(2)2s(1)
have the same energy.
By introducing the electron-electron perturbation H ′ = e2 /r12 we removed the
degeneracy associated with l and the exchange degeneracy, but not the degeneracy
due to ml . To understand the reason for the lifting of the l degeneracy, realize that
a 2s electron has a greater probability than a 2p electron of being closer to the
nucleus than a 1s electron, and hence a 2s electron is not as effectively shielded
from the nucleus by the 1s electrons as a 2p electron is. Since the energy levels are
given by
    E = −Z²e²/(n² 2a₀)
we see that a larger nuclear charge means a lower energy, and hence the 2s electron
has a lower energy than the 2p electron. This is also evident from the Coulomb
integrals, where we see that J1s2s is less than J1s2p . These integrals represent the
electrostatic repulsion of their respective charge distributions: when the 2s electron
penetrates the 1s charge distribution it only feels a repulsion due to the unpenetrated portion of the 1s distribution. Therefore the 1s-2s electrostatic repulsion is
less than the 1s-2p repulsion, and the 1s2s levels lie below the 1s2p levels. So we
see that the interelectronic repulsion in many-electron atoms lifts the l degeneracy,
and the orbital energies for the same value of n increase with increasing l.
To understand the removal of the exchange degeneracy, note that the original zeroth-order wave functions specified which electron went into which orbital.
Since the secular determinant wasn't diagonal, these couldn't have been the correct
zeroth-order wave functions. In fact, the correct zeroth-order wave functions do not
assign a specific electron to a specific orbital, as is evident from the form of each
φ_i^(0). This is a consequence of the indistinguishability of identical particles, and will
be discussed at length a little later in this course. Since, for example, φ₁^(0) and φ₂^(0)
have different energies, the exchange degeneracy is removed by using the correct
zeroth-order wave functions.
2.4  Spin–Orbit Coupling and the Hydrogen Atom Fine Structure
The Hamiltonian

    H⁰ = −(ℏ²/2m)(∂²/∂r² + (2/r)(∂/∂r)) + L²/2mr² − e²/r        (2.34)
used to derive the hydrogen atom wave functions ψnlm that we have worked with
so far consists of the kinetic energy of the electron plus the potential energy of
the Coulomb force binding the electron and proton together. (Recall that in this
equation, m is really the reduced mass m = me Mp /(me + Mp ) ≈ me .) While this
works very well, the actual Hamiltonian is somewhat more complicated than this.
In this section we derive an additional term in the Hamiltonian that is due to a
coupling between the orbital angular momentum L and the spin angular momentum
S.
The discussion that follows is a somewhat heuristic approach to deriving an
interaction term that agrees with experiment. You shouldn’t take the physical
picture too seriously. However, the basic idea is simple enough. From the point of
view of the electron, the moving nucleus (i.e., a proton) generates a current that
is the source of a magnetic field B. This current is proportional to the electron’s
angular momentum L. The interaction energy of a magnetic moment µ with this
magnetic field is −µ · B. Since the magnetic moment of an electron is proportional
to its spin S, we see that the interaction energy will be proportional to L · S.
With the above disclaimer, the interaction term we are looking for is due to
the fact that from the point of view of the electron, the moving hydrogen nucleus
(the proton) forms a current, and thus generates a magnetic field. From special
relativity, we know that the electric and magnetic fields are related by a Lorentz
transformation so that
    B_⊥ = γ(B′_⊥ + β × E′)        B_∥ = B′_∥
    E_⊥ = γ(E′_⊥ − β × B′)        E_∥ = E′_∥

where β = v/c is the velocity of the primed frame with respect to the unprimed
frame, γ = (1 − β²)^{−1/2}, and ⊥, ∥ refer to the components perpendicular or parallel
to β.
[Figure: the lab frame O and the primed frame O′, moving with relative velocity β.]
We let the primed frame be the proton rest frame, and note that there is no B′
field in the proton's frame due to the proton itself. Also, if β ≪ 1, then γ ≈ 1 and
we then have

    B = β × E′    and    E = E′ .

If v is the electron's velocity with respect to the lab (or the proton), then β = −v/c
so the field felt by the electron is

    B = −(v/c) × E′ .        (2.35)

The electric field E′ due to the proton is

    E′ = (e/r²) r̂ = (e/r³) r        (2.36)
where e > 0 and r is the position vector from the proton to the electron.
From basic electrodynamics, we know that the energy of a particle with magnetic
moment µ in a magnetic field B is given by (see the end of this section)

    W = −µ · B        (2.37)

so we need to know µ. Consider a particle of charge q moving in a circular orbit.
It forms an effective current

    I = Δq/Δt = q/(2πr/v) = qv/2πr .

By definition, the magnetic moment has magnitude

    µ = (I/c) × area = (qv/2πrc) · πr² = qvr/2c .

But the angular momentum of the particle is L = mvr so we conclude that the
magnetic moment due to orbital motion is

    µ_l = (q/2mc) L .        (2.38)
47
The ratio of µ to L is called the gyromagnetic ratio.
While the above derivation of (2.38) was purely classical, we know that the
electron also possesses an intrinsic spin angular momentum. Let us hypothesize
that the electron magnetic moment associated with this spin is of the form
    µ_s = g (−e/2mc) S .

The constant g is found by experiment to be very close to 2. (However, the relativistic Dirac equation predicts that g is exactly 2. Higher-order corrections in
quantum electrodynamics predict a slightly different value, and the measurement
of g − 2 is one of the most accurate experimental results in all of physics.)
So we now have the electron magnetic moment given by

    µ_s = −(e/mc) S        (2.39)
and hence the interaction energy of the electron with the magnetic field of the
proton is (using equations (2.35) and (2.36))

    W = −µ_s · B = (e/mc) S · B = −(e/mc) S · [(v/c) × (e/r³) r] = (e²/m²c²r³) S · (r × p)

or

    W = (e²/m²c²r³) S · L .        (2.40)
Alternatively, we can write W in another form as follows. If we assume that the
electron moves in a spherically symmetric potential field, then the force −eE on the
electron may be written as the negative gradient of this potential energy:

    −eE = −∇V(r) = −(dV/dr) r̂ = −(r/r)(dV/dr) .

Using this in (2.35) we have

    B = −(v/c) × r (1/er)(dV/dr) = (1/mc)(1/er)(dV/dr) (r × p)

and hence

    W = −µ_s · B = (e/m²c²)(1/er)(dV/dr) S · (r × p)

or

    W = (1/m²c²)(1/r)(dV/dr) S · L .        (2.41)
However, we have made one major mistake. The classical equation that leads
to (2.37) is

    dL/dt = N = µ × B        (2.42)

where L is the angular momentum of the particle in its rest frame, N is the applied
torque, and B is the magnetic field in that frame. But this only applies if the
electron's rest frame isn't rotating. If it is, then the left side of this equation isn't
valid (i.e., it isn't equal to only the applied torque), and we must use the correct
(operator) expression from classical mechanics:

    (d/dt)_lab = (d/dt)_rot + ω × .        (2.43)

(If you don't know this result, I will derive it at the end of this section so you can
see what is going on and why.)
For the electron, (2.42) gives dS/dt in the lab frame, so in the electron's frame
we must use

    (dS/dt)_rot = (dS/dt)_lab − ω_T × S        (2.44)

where ω_T is called the Thomas precessional frequency. Thus we see that the
change in the spin angular momentum of the electron, (dS/dt)_rot, is given by the
change due to the applied torque µ × B minus an effect due to the rotation of the
coordinate system:

    (dS/dt)_rot = µ × B − ω_T × S = −(e/mc) S × B + S × ω_T

or

    (dS/dt)_rot = S × (−eB/mc + ω_T) .        (2.45)

This is the analogue of (2.42), so the analogue of (2.37) is

    W = −S · (−eB/mc + ω_T) = (e/mc) S · B − S · ω_T .        (2.46)
Note that the first term is what we already calculated in equation (2.40). What we
need to know is the Thomas factor S·ω T . This is not a particularly easy calculation
to do exactly, so we will give a very simplified derivation. (See Jackson, Classical
Electrodynamics, Chapter 11 if you want a careful derivation.)
Basically, Thomas precession can be attributed to time dilation, i.e., observers
on the electron and proton disagree on the time required for one particle to make a
revolution about the other. Let T be the time required for a revolution according
to the electron, and let it be T ′ according to the proton. Then T ′ = γT where
γ = (1 − β 2 )−1/2 . (Note that a circular orbit means an acceleration, so even this
isn’t really correct.) Then the electron and proton each measure orbital angular
velocities of 2π/T and 2π/T ′ respectively.
To the electron, its spin S maintains its direction in space, but to the proton, it
appears to precess at a rate equal to the difference in angular velocities, or
    ω_T = 2π/T − 2π/T′ = 2π(1/T − 1/T′) = 2π(γ/T′ − 1/T′)
        = (2π/T′)[(1 − β²)^{−1/2} − 1] ≈ (2π/T′)(β²/2) .
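The small-β approximation in the last step is easy to check numerically; a quick Python sketch:

```python
from math import sqrt

# Compare γ - 1 with its leading term β²/2
for beta in (0.1, 0.01, 0.001):
    exact = 1.0 / sqrt(1.0 - beta**2) - 1.0
    approx = beta**2 / 2.0
    print(beta, exact / approx)   # ratio approaches 1 as β → 0
```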
But in general we know that ω = v/r and hence

    2π/T′ = v/r = mvr/mr² = L/mr²

and therefore

    ω_T = (L/mr²)(β²/2) = (L/mr²)(v²/2c²) = (L/2m²c²r)(mv²/r) .

We also know that F = ma, where for circular motion we have an inward
directed acceleration a = v²/r. Since F = −∇V, we have

    F = −(mv²/r) r̂ = −(dV/dr) r̂

and we can write

    ω_T = (1/2)(1/m²c²)(1/r)(dV/dr) L .        (2.47)
From this we see that S · ω_T is just one-half the energy given by equation (2.41),
and equation (2.46) shows that it is subtracted off. Therefore the correct spin–orbit
energy is given by

    W = (1/2m²c²)(1/r)(dV/dr) L · S        (2.48a)

or, from (2.40) with a slight change of notation,

    H_so = (e²/2m²c²r³) L · S .        (2.48b)
Calculating the spin–orbit interaction energy E_so by finding the eigenfunctions
and eigenvalues of the Hamiltonian H = H⁰ + H_so is a difficult problem. Since
the effect of H_so is small compared to H⁰ (at least for the lighter atoms), we will
estimate the value of E_so by using first-order perturbation theory. Then the first-order
energy shifts for the hydrogen atom will be the integrals

    E_so^(1) ≈ ⟨Ψ|H_so Ψ⟩

where the hydrogen atom wave functions including spin are of the form

    Ψ = R_nl(r) Y_l^m(θ, φ) χ(s) .

From J = L + S, we have J² = L² + S² + 2L · S so that

    L · S = (1/2)(J² − L² − S²) .        (2.49)

Note that neither L nor S separately commutes with L · S, but you can easily show
that J = L + S does in fact commute with L · S. Because of this, we can choose our
states to be simultaneous eigenfunctions of J², J_z, L² and S², all of which commute
with H.
Since Ylm is an eigenfunction of Lz and χ is an eigenfunction of Sz , the wave
function Ylm χ is an eigenfunction of Jz = Lz + Sz but not of J 2 . However, by
the usual addition of angular momentum problem, in this case L and S, we can
construct simultaneous eigenfunctions ψ of J 2 , Jz , L2 and S 2 . In this case we have
s = 1/2, so we know that the resulting possible j values are j = l − 1/2, l + 1/2.
The reason we want to do this is because there are 2(2l + 1) degenerate levels for
a given n and l, where the additional factor of 2 comes from the two possible spin
orientations.
Let us assume that we have constructed these eigenfunctions, and we now denote
the hydrogen atom wave functions by

    Ψ = R_nl(r) ψ(θ, φ, s)

where, by (2.49),

    L · S ψ = (ℏ²/2)[j(j + 1) − l(l + 1) − s(s + 1)] ψ
            = (ℏ²/2)[j(j + 1) − l(l + 1) − 3/4] ψ .
Using this, our first-order energy estimate becomes

    E_so^(1) ≈ ⟨R_nl ψ| (e²/2m²c²r³) L · S |R_nl ψ⟩
             = (e²ℏ²/4m²c²) [j(j + 1) − l(l + 1) − 3/4] ⟨R_nl| 1/r³ |R_nl⟩        (2.50)

where

    ⟨R_nl ψ| 1/r³ |R_nl ψ⟩ = ⟨R_nl| 1/r³ |R_nl⟩

because ⟨ψ|ψ⟩ = 1. The integral in (2.50) is not at all hard to do if you use some
clever tricks. I will show how to do it at the end of this section, and the answer is

    ⟨R_nl| 1/r³ |R_nl⟩ = 1 / [a₀³ n³ l(l + 1/2)(l + 1)]        (2.51)
where the Bohr radius is

    a₀ = ℏ²/me² = ℏ/mcα        (2.52)

and the fine structure constant is

    α = e²/ℏc ≈ 1/137 .        (2.53)
Note that for l = 0 we also have L · S = 0 anyway, so there is no spin–orbit energy.
Recall that the energy corresponding to H⁰ is

    E_n^(0) = −me⁴/2ℏ²n² = −mc²α²/2n²        (2.54a)

or

    E_n^(0) = E₁^(0)/n² = −13.6 eV/n² .        (2.54b)

Combining (2.50) and (2.51) we have

    E_so^(1) = (e²ℏ²/4m²c²a₀³n³) [j(j + 1) − l(l + 1) − 3/4] / [l(l + 1/2)(l + 1)]
             = −(E_n^(0)α²/2n) [j(j + 1) − l(l + 1) − 3/4] / [l(l + 1/2)(l + 1)] .        (2.55)

Since j = l ± 1/2, this gives us the two corrections to the energy

    E_so^(1) = −(E_n^(0)α²/n) · 1/[(2l + 1)(l + 1)]    for j = l + 1/2 and l ≠ 0        (2.56a)

    E_so^(1) = (E_n^(0)α²/n) · 1/[l(2l + 1)]    for j = l − 1/2 and l ≠ 0 .        (2.56b)
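For a concrete number, the 2p level (n = 2, l = 1) splits according to (2.56); a small Python sketch (all energies in eV, using the sign convention of (2.56), so E_n^(0) < 0):

```python
alpha = 1 / 137.036       # fine structure constant
E2 = -13.606 / 4          # E_2^(0) for hydrogen in eV
n, l = 2, 1

# Equation (2.56): first-order spin-orbit shifts of the 2p level
E_up   = -E2 * alpha**2 / n / ((2*l + 1) * (l + 1))   # j = 3/2, raised
E_down =  E2 * alpha**2 / n / (l * (2*l + 1))         # j = 1/2, lowered

print(E_up > 0, E_down < 0)   # → True True
print(E_up - E_down)          # 2p spin-orbit splitting, ≈ 4.5e-5 eV
```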
There is yet another correction to the hydrogen atom energy levels due to the
relativistic contribution to the kinetic energy of the electron. The kinetic energy is
really the difference between the total relativistic energy E = (p²c² + m²c⁴)^{1/2} and
the rest energy mc². To order p⁴ this is

    T = (p²c² + m²c⁴)^{1/2} − mc² ≈ p²/2m − p⁴/8m³c² .

Since the Hamiltonian is the sum of kinetic and potential energies, we see from this
that the term

    H_rel = −p⁴/8m³c²        (2.57)

may be treated as a perturbation to the states ψ_nlm.
While the states ψ_nlm are in general degenerate, in this case we don't have to
worry about it. The reason is that H_rel is rotationally invariant, so it's already
diagonal in the ψ_nlm basis, and that is precisely what the zeroth-order wave functions
φ_n^(0) accomplish (see equation (2.22)). Therefore we can use simple first-order
perturbation theory so that

    E_rel^(1) = −(1/8m³c²) ⟨ψ_nlm| p⁴ |ψ_nlm⟩ .

Using H⁰ = p²/2m − e²/r we can write

    p⁴ = 4m² (p²/2m)² = 4m² (H⁰ + e²/r)²
and therefore

    E_rel^(1) = −(1/2mc²) [(E_n^(0))² + 2E_n^(0) e² ⟨1/r⟩ + e⁴ ⟨1/r²⟩]

where ⟨·⟩ is shorthand for ⟨ψ_nlm| · |ψ_nlm⟩. These integrals are not hard to evaluate
(see the end of this section), and the result (in different forms) is
    E_rel^(1) = −((E_n^(0))²/2mc²) [−3 + 4n/(l + 1/2)]
              = −(E_n^(0)α²/n²) [3/4 − n/(l + 1/2)]
              = −(1/2) mc²α⁴ [−3/(4n⁴) + 1/(n³(l + 1/2))] .        (2.58)
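Since the three forms of (2.58) differ only by the substitution E_n^(0) = −mc²α²/2n², they are easy to cross-check numerically; a Python sketch (mc² in eV):

```python
alpha = 1 / 137.036
mc2 = 0.511e6          # electron rest energy in eV

def Erel(n, l):
    En = -mc2 * alpha**2 / (2 * n**2)    # E_n^(0), equation (2.54a)
    f1 = -(En**2 / (2 * mc2)) * (-3 + 4*n / (l + 0.5))
    f2 = -(En * alpha**2 / n**2) * (0.75 - n / (l + 0.5))
    f3 = -0.5 * mc2 * alpha**4 * (-3 / (4 * n**4) + 1 / (n**3 * (l + 0.5)))
    return f1, f2, f3

print(Erel(2, 1))   # → three equal numbers, ≈ -2.6e-5 eV for the 2p level
```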
Adding equations (2.56) and (2.58) we obtain the fine structure energy shift

    E_fs^(1) = −(mc²α⁴/2n³) [−3/4n + 1/(j + 1/2)]
             = −(E_n^(0)α²/n²) [3/4 − n/(j + 1/2)]        (2.59)

which is valid for both j = l ± 1/2. This is the first-order energy correction due to
the "fine structure Hamiltonian"

    H_fs = H_so + H_rel .        (2.60)
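That (2.59) really is the sum of (2.56) and (2.58) for both j = l ± 1/2 can be verified directly; a short Python sketch:

```python
alpha = 1 / 137.036
mc2 = 0.511e6    # electron rest energy in eV

def En0(n):
    return -mc2 * alpha**2 / (2 * n**2)

def E_so(n, l, j):                     # equation (2.56), valid for l != 0
    if j == l + 0.5:
        return -En0(n) * alpha**2 / n / ((2*l + 1) * (l + 1))
    return En0(n) * alpha**2 / n / (l * (2*l + 1))

def E_rel(n, l):                       # equation (2.58)
    return -0.5 * mc2 * alpha**4 * (-3/(4*n**4) + 1/(n**3 * (l + 0.5)))

def E_fs(n, j):                        # equation (2.59)
    return -mc2 * alpha**4 / (2 * n**3) * (1/(j + 0.5) - 3/(4*n))

for n, l in [(2, 1), (3, 1), (3, 2)]:
    for j in (l - 0.5, l + 0.5):
        assert abs(E_so(n, l, j) + E_rel(n, l) - E_fs(n, j)) < 1e-12
print("fine structure sum checks out for j = l ± 1/2")
```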
2.4.1  Supplement: Miscellaneous Proofs
Now let’s go back and prove several miscellaneous results stated in this section. The
first thing we want to show is that the energy of a magnetic moment in a uniform
magnetic field is given by −µ · B where µ for a loop of area A carrying current
I is defined to have magnitude IA/c (in Gaussian units) and pointing perpendicular to the loop in the
direction of your thumb if the fingers of your right hand are along the direction of
the current. To see this, we simply calculate the work required to rotate a current
loop from its equilibrium position to the desired orientation.
Consider Figure 6 below, where the current flows counterclockwise out of the
page at the bottom and into the page at the top. Let the loop have length a on the
sides and b across the top and bottom, so its area is ab. The magnetic force on a
current-carrying wire is
    F_B = (I/c) ∫ dl × B
and hence the forces on the opposite “a sides” of the loop cancel, and the force on
the top and bottom "b sides" is F_B = IbB/c. The equilibrium position of the loop is
Figure 6: A current loop in a uniform magnetic field.
horizontal, so the potential energy of the loop is the work required to rotate it from
θ = 0 to some value θ. This work is given by W = ∫ F · dr where F is the force that
I must apply against the magnetic field to rotate the loop.
Since the loop is rotating, the force I must apply at the top of the loop is in the
direction of µ and perpendicular to the loop, and hence has magnitude F_B cos θ.
Then the work I do is (the factor of 2 takes into account both the top and bottom
sides)

    W = ∫ F · dr = 2 ∫ F_B cos θ (a/2) dθ = (IabB/c) ∫₀^θ cos θ dθ = µB sin θ .
But note that µ · B = µB cos(90° + θ) = −µB sin θ, and therefore

    W = −µ · B .        (2.61)

In this derivation, I never explicitly mentioned the torque on the loop due to B.
However, we see that

    ‖N‖ = ‖r × F_B‖ = 2(a/2)F_B sin(90° + θ) = (IabB/c) sin(90° + θ)
        = µB sin(90° + θ) = ‖µ × B‖

and therefore

    N = µ × B .        (2.62)

Note that W = ∫ ‖N‖ dθ.
Next I will prove equation (2.43). Let A be a vector as seen in both the rotating
and lab frames, and let {e_i} be a fixed basis in the rotating frame. Then (using the
summation convention) A = A_i e_i so that

    dA/dt = d(A_i e_i)/dt = (dA_i/dt) e_i + A_i (de_i/dt) .

Now (dA_i/dt) e_i is the rate of change of A with respect to the rotating frame, so we
have

    (dA_i/dt) e_i = (dA/dt)_rot .
And e_i is a fixed basis vector in the frame that is rotating with respect to the lab
frame. Then, just like any vector rotating in the lab with angular velocity ω, we
have

    de_i/dt = ω × e_i .

(See the figure below. Here ω = dφ/dt, and dv = v sin θ dφ so dv/dt = v sin θ ω or
dv/dt = ω × v.)

[Figure: a vector v rotating about ω; in time dt its tip sweeps through dv as the angle dφ advances, with θ the angle between v and ω.]

Then

    A_i (de_i/dt) = A_i ω × e_i = ω × A_i e_i = ω × A .
Putting this all together we have

    (dA/dt)_lab = (dA/dt)_rot + ω × A .

Equation (2.43) is just the 'operator' version of this result.
Finally, let me show how to evaluate the integrals ⟨1/r⟩, ⟨1/r²⟩ and ⟨1/r³⟩ where
the expectation values are taken with respect to the hydrogen atom wave functions
ψ_nlm.
First, instead of ⟨1/r⟩, consider ⟨λ/r⟩. This can be interpreted as the first-order
correction to the energy due to the perturbation λ/r. But H⁰ = T + V = T − e²/r,
so H = H⁰ + H′ = H⁰ + λ/r = T − (e² − λ)/r, and this is just our original problem
if we replace e² by e² − λ everywhere. In particular, the exact energy solution is
then

    E_n(λ) = −m(e² − λ)²/2ℏ²n² = −me⁴/2ℏ²n² + λ me²/ℏ²n² − λ² m/2ℏ²n² .

But another way of looking at this is as the expansion of E_n(λ) given in (2.3b):

    E_n = E_n^(0) + λ (dE_n/dλ)|_{λ=0} + (λ²/2!) (d²E_n/dλ²)|_{λ=0} + · · ·
        = E_n^(0) + λE_n^(1) + λ²E_n^(2) + · · ·

where the first-order correction E_n^(1) = ⟨H′⟩ is just the term linear in λ. Therefore,
letting λ → 1, we have ⟨1/r⟩ = ⟨H′⟩ = me²/ℏ²n² or

    ⟨1/r⟩ = 1/a₀n² .        (2.63)

Note that if you have the exact solution E_n(λ), you can obtain E_n^(1) by simply
evaluating λ(dE_n/dλ)|_{λ=0}.
Before continuing, let me rewrite the hydrogen atom Hamiltonian as follows:

    H⁰ = −(ℏ²/2m)[∂²/∂r² + (2/r) ∂/∂r] + L²/2mr² − e²/r
       = p_r²/2m + L²/2mr² − e²/r                                 (2.64)

where I have defined the "radial momentum" p_r by

    p_r = −iℏ(∂/∂r + 1/r) .
Now consider ⟨λ/r²⟩. Again, letting H = H⁰ + H′ = H⁰ + λ/r², we can still
solve the problem exactly because all we are doing is modifying the centrifugal term:

    L²/2mr² → (L² + 2mλ)/2mr² = [ℏ²l(l+1) + 2mλ]/2mr² = ℏ²l′(l′+1)/2mr²

where l′ = l′(λ) is a function of λ. (Just write ℏ²l′(l′+1) = ℏ²l(l+1) + 2mλ and
use the quadratic formula to find l′ as a function of λ.)

Recall that the exact energies were defined by

    E_n = − me⁴/2ℏ²n² = − me⁴/2ℏ²(k+l+1)²

where k = 0, 1, 2, . . . was the integer that terminated the power series solution of
the radial equation. Now what we have is

    E(l′) = − me⁴/2ℏ²(k+l′+1)² = E(λ) = E^(0) + λE^(1) + ···

where (note λ = 0 implies l′ = l)

    E^(1) = (dE/dλ)|_{λ=0} = (dE/dl′)|_{l′=l} (dl′/dλ)|_{l′=l} .

Then from the explicit form of E(l′) and the definition of n we have

    (dE/dl′)|_{l′=l} = me⁴/ℏ²(k+l+1)³ = me⁴/ℏ²n³

and taking the derivative of ℏ²l′(l′+1) = ℏ²l(l+1) + 2mλ with respect to λ yields

    (dl′/dλ)|_{l′=l} = (2m/ℏ²) 1/(2l+1) = (m/ℏ²) 1/(l+1/2) .

Therefore

    E^(1) = (me²/ℏ²)²/(l+1/2)n³

and ⟨λ/r²⟩ = λE^(1), so that

    ⟨1/r²⟩ = 1/a₀²(l+1/2)n³ .                                     (2.65)
The last integral to evaluate is ⟨1/r³⟩. Since there is no term in H⁰ that goes
like 1/r³, we have to try something else. Note that H⁰ψ_nlm = E_n ψ_nlm, so that

    ⟨[H⁰, p_r]⟩ = ⟨ψ_nlm|H⁰p_r − p_r H⁰|ψ_nlm⟩ = E_n⟨p_r⟩ − ⟨p_r⟩E_n = 0 .

Using

    [1/r, ∂/∂r] = 1/r²    and    [1/r², ∂/∂r] = 2/r³

(recall [ab, c] = a[b, c] + [a, c]b), it is easy to use (2.64) and show that

    [H⁰, p_r] = −(iℏ/m) L²/r³ + iℏe²/r² .

But now

    0 = ⟨[H⁰, p_r]⟩ = −(iℏ/m)⟨L²/r³⟩ + iℏe²⟨1/r²⟩
      = −[iℏ³l(l+1)/m]⟨1/r³⟩ + iℏe²⟨1/r²⟩

and therefore

    ⟨1/r³⟩ = [me²/ℏ²l(l+1)]⟨1/r²⟩ = [1/a₀l(l+1)]⟨1/r²⟩ .          (2.66)

Combining this with (2.65) we have

    ⟨1/r³⟩ = 1/a₀³l(l+1)(l+1/2)n³ .                               (2.67)
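These three results are easy to check numerically. The following sketch (my own illustration, assuming Python with numpy is available) integrates against the textbook radial function R₂₁ for the n = 2, l = 1 state, in units where a₀ = 1:

```python
# Numerical check of (2.63), (2.65), (2.67) for n = 2, l = 1, in units a0 = 1.
import numpy as np

r = np.linspace(1e-6, 60.0, 400_001)
R21 = r * np.exp(-r / 2.0) / (2.0 * np.sqrt(6.0))  # hydrogen radial function R_21

def expect(f):
    """<f(r)> = integral of |R_21|^2 f(r) r^2 dr (trapezoid rule)."""
    y = R21**2 * f * r**2
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r))

n, l = 2, 1
one_r  = expect(1 / r)      # formula gives 1/(a0 n^2)          = 1/4
one_r2 = expect(1 / r**2)   # formula gives 1/((l+1/2) n^3)     = 1/12
one_r3 = expect(1 / r**3)   # formula gives 1/(l(l+1)(l+1/2)n^3) = 1/24

assert abs(expect(np.ones_like(r)) - 1) < 1e-5              # normalization
assert abs(one_r  - 1 / n**2) < 1e-5                        # (2.63)
assert abs(one_r2 - 1 / ((l + 0.5) * n**3)) < 1e-5          # (2.65)
assert abs(one_r3 - 1 / (l * (l + 1) * (l + 0.5) * n**3)) < 1e-5  # (2.67)
```

The same check works for any ψ_nlm with l > 0 if you substitute the appropriate radial function.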
2.5    The Zeeman Effect

In the previous section we studied the effect of an atomic electron's magnetic moment
interacting with the magnetic field generated by the nucleus (a proton). In
this section, I want to investigate what happens when a hydrogen atom is placed
in a uniform external magnetic field B. These types of interactions are generally
referred to as the Zeeman effect, and they were instrumental in the discovery of
spin. (Pieter Zeeman and H.A. Lorentz shared the second Nobel prize in physics
in 1902. For a very interesting summary of the history of spin, read Chapter 10 in
the text Quantum Mechanics by Hendrik Hameka.)
The hydrogen atom Hamiltonian, including fine structure, is given by

    H = H⁰ + H_fs = H⁰ + H_so + H_rel

where

    H⁰ = −(ℏ²/2m)[∂²/∂r² + (2/r) ∂/∂r] + L²/2mr² − e²/r     (equation (2.34))

    H_so = (e²/2m²c²r³) L·S                                 (equation (2.48b))

    H_rel = −p⁴/8m³c²                                       (equation (2.57)) .

(And where I'm approximating the reduced mass by the electron mass m_e.) The
easy way to include the presence of an external field B is to simply add an interaction
energy

    H_mag = −µ_tot · B

where, from equations (2.38) and (2.39), we know that the total magnetic moment
for a hydrogenic electron is

    µ_tot = µ_l + µ_s = −(e/2m_e c)(L + 2S) = −(e/2m_e c)(J + S) .    (2.68)

However, the correct way to arrive at this is to rewrite the Hamiltonian taking into
account the presence of an electromagnetic field. For those who are interested, I
work through this approach at the end of this section.
In any case, the Hamiltonian for a hydrogen atom in an external uniform magnetic
field is then

    H = H⁰ + H_so + H_rel + H_mag .

There are really three cases to consider. (I'll ignore H_rel for now because it's a
correction to the kinetic energy and irrelevant to this discussion.) The first is when
B is strong enough that H_mag is large relative to H_so. In this case we can treat
H_so as a perturbation on the states defined by H⁰ + H_mag, where these states are
simultaneous eigenfunctions of L², S², L_z and S_z (rather than J² and J_z). The
reason that J is not a good quantum number is that the external field exerts a
torque µ_tot × B on the total magnetic moment, and this is equivalent to a changing
total angular momentum dJ/dt. Thus J is not conserved, and in fact precesses
about B. In addition, if there is a spin–orbit interaction, then this internal field
causes L and S to precess about J.
The second case is when B is weak and H_so dominates H_mag. In this situation,
H_mag is treated as a perturbation on the states defined by H⁰ + H_so. As we saw in
our discussion of H_so, in this case we must choose our states to be eigenfunctions
of L², S², J² and J_z, because L and S are not conserved separately, even though
J = L + S is conserved. (Neither L nor S alone commutes with L·S, but
[J_i, L·S] = 0 and hence J² commutes with H.)

The third and most difficult case is when H_so and H_mag are of roughly equal
magnitude. In this "intermediate-field" situation, we must treat them together and
use degenerate perturbation theory to break the degeneracies of the basis states.
2.5.1    Strong External Field

Let us first consider the case where the external magnetic field is much stronger
than the internal field felt by the electron due to its orbital motion. Taking B = Bẑ
we have

    H_mag = (eB/2m_e c)(L_z + 2S_z) .                             (2.69)
If we first ignore spin, then the first-order correction to the hydrogen atom energy
levels is

    E_nlm^(1) = ⟨ψ_nlm| (eB/2m_e c) L_z |ψ_nlm⟩ = (eℏ/2m_e c) Bm := µ_B Bm

where

    µ_B = eℏ/2m_e c = 5.79 × 10⁻⁹ eV/gauss = 9.29 × 10⁻²¹ erg/gauss

is called the (electron) Bohr magneton. Thus we see that for a given l, the (2l+1)-fold
degeneracy is lifted. For example, the 3-fold degenerate l = 1 state is split into
three states, with an energy difference of µ_B B between states:

[Figure: the l = 1 level splits into the three levels m = 1, 0, −1, with spacing µ_B B
between adjacent levels.]

This strong field case is sometimes called the Paschen-Back effect.
If we now include spin, then

    E_{nlm_lm_s}^(1) = µ_B B(m_l + 2m_s)                          (2.70)

where m_s = ±1/2. This yields the further splitting (or lifting of degeneracies)
sometimes called the anomalous Zeeman effect:

[Figure: each l = 1 level splits further according to m_l + 2m_s = 2, 1, 0, −1, −2,
with the states (m_l, m_s) = (1, −1/2) and (−1, 1/2) remaining degenerate.]

This gives us the energy levels E_n^(0) + E_{nlm_lm_s}^(1), where E_n^(0) is given
by (2.54a).
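The splitting pattern in (2.70) is easy to tabulate; this little sketch (plain Python, purely illustrative) lists the shifts for l = 1 in units of µ_B B and picks out the pair of states that remain degenerate:

```python
# Paschen-Back shifts (2.70) for l = 1, s = 1/2, in units of mu_B * B.
levels = {}
for ml in (1, 0, -1):
    for ms in (0.5, -0.5):
        levels.setdefault(ml + 2 * ms, []).append((ml, ms))

shifts = sorted(levels)
print(shifts)                  # [-2.0, -1.0, 0.0, 1.0, 2.0]: five distinct levels
print(levels[0.0])             # the two states that stay degenerate
assert shifts == [-2.0, -1.0, 0.0, 1.0, 2.0]
assert len(levels[0.0]) == 2   # (1, -1/2) and (-1, 1/2)
```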
However, since the basis states we used here are just the usual hydrogen atom
wave functions, it is easy to include further corrections due to both H_so and the
relativistic correction H_rel discussed in Section 2.4. We simply apply first-order
perturbation theory using these as the perturbing potentials. For H_rel, we can
simply use the result (2.58). However, we can't just use equations (2.56) for H_so,
because they were derived using the eigenfunctions of J², which don't apply when
there is a strong external magnetic field.

To get around this problem, we simply calculate ⟨ψ_{nlm_lm_s}|L·S|ψ_{nlm_lm_s}⟩. We
have

    L·S = L_x S_x + L_y S_y + L_z S_z

where L_x = (L₊ + L₋)/2 and L_y = (L₊ − L₋)/2i, with similar results for S_x and
S_y. Using these, it is quite easy to see that the orthogonality of the eigenfunctions
yields

    ⟨ψ|L_x S_x|ψ⟩ = ⟨ψ|L_y S_y|ψ⟩ = 0

while

    ⟨ψ_{nlm_lm_s}|L_z S_z|ψ_{nlm_lm_s}⟩ = ℏ²m_l m_s .             (2.71)
Combining the results for H_rel and H_so, we obtain the following corrections to
the "unperturbed" energies E_n^(0) + E_{nlm_lm_s}^(1):

    E_rel^(1) + E_so^(1) = (mc²α⁴/2n³)[3/4n − 1/(l+1/2)]
                               + (e²/2m²c²) ℏ²m_l m_s /a₀³n³l(l+1)(l+1/2)

where we used equations (2.58), (2.48b), (2.67) and (2.71). After a little algebra,
which I leave to you, we arrive at

    E_rel^(1) + E_so^(1) = (me⁴α²/2ℏ²n³){3/4n − [l(l+1) − m_l m_s]/l(l+1)(l+1/2)}

                         = −E_1^(0) (α²/n³){3/4n − [l(l+1) − m_l m_s]/l(l+1)(l+1/2)} .   (2.72)
2.5.2    Weak External Field

Now we turn to the second case, where the external field is weak relative to the
spin–orbit term. As we discussed above, now we must take our basis states to be
eigenfunctions of L², S², J² and J_z.
For a many-electron atom, there are basically two ways to calculate the total J.
The first way is to calculate L = Σ L_i and S = Σ S_i and then evaluate J = L + S.
This is called L–S or Russell-Saunders coupling. It is applicable to the lighter
elements, where interelectronic repulsion energies are significantly greater than the
spin–orbit interaction energies. This is because if the spin–orbit coupling is weak,
then L and S "almost" commute with H⁰ + H_so.

The second way is to first calculate J_i = L_i + S_i so that J = Σ J_i. This is called
j–j coupling. It is used for heavier elements, where the electrons are moving very
rapidly and hence there is a strong spin–orbit interaction. Because of this, L and
S no longer commute with H, even though J does. This type of coupling is also
more difficult to use, so we will deal only with the L–S scheme.
Here is the physical situation:

[Figure: L and S precess about J = L + S, which in turn precesses about the
external field B; the moment satisfies −µ_tot ∼ J + S = L + 2S.]
Since J commutes with H⁰ + H_so, it is conserved (and hence is fixed in space), even
though L and S are not. This means that L and S both precess about J. If the
applied external B field is much weaker than the internal field, then J will precess
much more slowly about B than L and S precess about J. We need to evaluate the
correction (2.69) in first-order perturbation theory.

Since our basis states are eigenfunctions of J² and J_z but not L_z and S_z, we
can't directly evaluate the expectation value of L_z + 2S_z = J_z + S_z. The correct
way to handle this is to use the Wigner-Eckart theorem, which is rather beyond the
scope of this course. Instead, we will use a physical argument that gets us to the
same answer.
We note that since L and S (and hence µ_tot) precess rapidly about J, the time
average of the Hamiltonian H_mag^av = −⟨µ_tot · B⟩ will be the same as −⟨µ_tot⟩ · B. But
the average of µ_tot is just its component along J, which is

    ⟨µ_tot⟩ = (µ_tot · Ĵ)Ĵ = (µ_tot · J/J²) J .

Using L = J − S, so that L² = J² + S² − 2S·J, we have

    (J + S)·J = J² + S·J = J² + (1/2)(J² + S² − L²) .

Then since B = Bẑ, we now have

    H_mag^av = −B⟨µ_tot⟩·ẑ = (eB/2m_e c) [(J + S)·J/J²] J_z
             = (eBJ_z/2m_e c) [1 + (J² + S² − L²)/2J²] .

Our basis states are simultaneous eigenstates of L², S², J² and J_z, so the average
energy E_mag^av is given by the first-order correction

    E_mag^av = (eℏBm_j/2m_e c) {1 + [j(j+1) + s(s+1) − l(l+1)]/2j(j+1)}
             = (eℏBm_j/2m_e c) {3/2 + [3/4 − l(l+1)]/2j(j+1)}
             := µ_B B m_j g_J                                     (2.73)

where the Landé g-factor g_J is defined by

    g_J = 1 + [j(j+1) + s(s+1) − l(l+1)]/2j(j+1) .
The total energy of a hydrogen atom in a uniform magnetic field is now given
by the sum of the unperturbed energy E_n^(0) (equation (2.54a)), the fine-structure
correction E_fs^(1) (equation (2.59)) and E_mag^av (equation (2.73)).
2.5.3    Intermediate-Field Case

Finally, we consider the intermediate-field case, where the internal and external
magnetic fields are approximately the same. In this situation, we must apply
degenerate perturbation theory to the degenerate "unperturbed" states ψ_{nlm_lm_s} by
treating H′ = H_fs + H_mag as a perturbation. It is easiest to simply work out an
example.

As we saw in our discussion of spin–orbit coupling, it is best to work in the basis
in which our states are simultaneous eigenstates of L², S², J² and J_z. (The choice
of basis has no effect on the eigenvalues of H_fs + H_mag, and the eigenvalues are just
what we are looking for when we solve (2.21).) Let us consider the hydrogen atom
states with n = 2, so that l = 0, 1. Since s = 1/2, the possible j values are

    0 ⊗ 1/2 + 1 ⊗ 1/2 = 1/2 + (3/2 ⊕ 1/2)

or j = 1/2, 3/2, 1/2. Our basis states |l s j m_j⟩ are given in terms of the states
|l s m_l m_s⟩ using the appropriate Clebsch-Gordan coefficients (which you can look
up or calculate for yourself).
For l = 0 we have j = 1/2, so m_j = ±1/2 and we have the two states

    ψ₁ := |0 1/2 1/2 1/2⟩  = |0 1/2 0 1/2⟩
    ψ₂ := |0 1/2 1/2 −1/2⟩ = |0 1/2 0 −1/2⟩

where the first state in each line is the state |l s j m_j⟩, and the second state in each
line is the linear combination of states |l s m_l m_s⟩ with Clebsch-Gordan coefficients.
(For l = 0 the C-G coefficients are just 1.)
For l = 1 we have the four states with j = 3/2 and the two states with j = 1/2
(which we order with a little hindsight so the determinant (2.21) turns out block
diagonal):

    ψ₃ := |1 1/2 3/2 3/2⟩  = |1 1/2 1 1/2⟩
    ψ₄ := |1 1/2 3/2 −3/2⟩ = |1 1/2 −1 −1/2⟩
    ψ₅ := |1 1/2 3/2 1/2⟩  = √(2/3) |1 1/2 0 1/2⟩ + √(1/3) |1 1/2 1 −1/2⟩
    ψ₆ := |1 1/2 1/2 1/2⟩  = −√(1/3) |1 1/2 0 1/2⟩ + √(2/3) |1 1/2 1 −1/2⟩
    ψ₇ := |1 1/2 3/2 −1/2⟩ = √(1/3) |1 1/2 −1 1/2⟩ + √(2/3) |1 1/2 0 −1/2⟩
    ψ₈ := |1 1/2 1/2 −1/2⟩ = −√(2/3) |1 1/2 −1 1/2⟩ + √(1/3) |1 1/2 0 −1/2⟩ .
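One can verify by brute force that these Clebsch-Gordan combinations really are J² eigenstates. The sketch below (Python with numpy; the product-basis ordering is my own bookkeeping choice, not anything from the notes) builds J = L + S for l = 1, s = 1/2 and applies J² to ψ₅ and ψ₆:

```python
# Verify that psi_5 and psi_6 above are J^2 eigenstates (hbar = 1).
# Product-basis order: (ml, ms) = (1,+), (1,-), (0,+), (0,-), (-1,+), (-1,-).
import numpy as np

def jz_jp(j):
    """Return (Jz, J+) for spin j in the basis m = j, j-1, ..., -j."""
    m = np.arange(j, -j - 1, -1)
    jz = np.diag(m)
    jp = np.zeros((len(m), len(m)))
    for k in range(1, len(m)):                    # matrix element <m+1|J+|m>
        jp[k - 1, k] = np.sqrt(j*(j + 1) - m[k]*(m[k] + 1))
    return jz, jp

Lz, Lp = jz_jp(1.0)
Sz, Sp = jz_jp(0.5)
I2, I3 = np.eye(2), np.eye(3)

Jz = np.kron(Lz, I2) + np.kron(I3, Sz)
Jp = np.kron(Lp, I2) + np.kron(I3, Sp)
J2 = Jp.T @ Jp + Jz @ Jz + Jz                     # J^2 = J-J+ + Jz^2 + Jz

psi5 = np.array([0, np.sqrt(1/3), np.sqrt(2/3), 0, 0, 0])   # |1 1/2 3/2 1/2>
psi6 = np.array([0, np.sqrt(2/3), -np.sqrt(1/3), 0, 0, 0])  # |1 1/2 1/2 1/2>

assert np.allclose(J2 @ psi5, (3/2)*(5/2) * psi5)   # j = 3/2
assert np.allclose(J2 @ psi6, (1/2)*(3/2) * psi6)   # j = 1/2
```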
Now we need to evaluate the matrices of H_fs = H_so + H_rel and H_mag in the |j m_j⟩
basis {ψ_i}. Since H_rel ∼ p⁴, it's already diagonal in the |j m_j⟩ basis. And since
H_so ∼ S·L = (1/2)(J² − L² − S²), it's also diagonal in the |j m_j⟩ basis. Therefore
H_fs is diagonal and its contribution is given by (2.59):

    ⟨j m_j|H_fs|j m_j⟩ = −(E_n^(0) α²/n²)[n/(j+1/2) − 3/4]
                       = −(E_1^(0) α²/16)[2/(j+1/2) − 3/4]

where I used E_n^(0) = E_1^(0)/n² and let n = 2. For states with j = 1/2, this gives
a contribution

    ⟨ψ_i|H_fs|ψ_i⟩ = −5E_1^(0) α²/64 := −5ξ     for i = 1, 2, 6, 8      (2.74a)

and for states with j = 3/2 this is

    ⟨ψ_i|H_fs|ψ_i⟩ = −E_1^(0) α²/64 := −ξ       for i = 3, 4, 5, 7 .    (2.74b)
Next, we easily see that the first four states ψ₁–ψ₄ are eigenstates of H_mag ∼
L_z + 2S_z (since they each contain only a single factor |l s m_l m_s⟩). Hence H_mag is
already diagonal in this 4 × 4 block, and so contributes the diagonal terms

    ⟨ψ_i|H_mag|ψ_i⟩ = µ_B B(m_l + 2m_s) := β(m_l + 2m_s)     for i = 1, 2, 3, 4 .

For the remaining four states ψ₅–ψ₈ we must explicitly evaluate the matrix elements.
For example,

    H_mag ψ₅ = (µ_B B/ℏ)(L_z + 2S_z)[√(2/3) |1 1/2 0 1/2⟩ + √(1/3) |1 1/2 1 −1/2⟩]
             = µ_B B [1·√(2/3) |1 1/2 0 1/2⟩ + 0·√(1/3) |1 1/2 1 −1/2⟩]
             = µ_B B √(2/3) |1 1/2 0 1/2⟩

and therefore (using the orthonormality of the states |l s m_l m_s⟩)

    ⟨ψ₅|H_mag|ψ₅⟩ = (2/3) µ_B B := (2/3)β

and

    ⟨ψ₆|H_mag|ψ₅⟩ = ⟨ψ₅|H_mag|ψ₆⟩ = −(√2/3) µ_B B := −(√2/3)β .

Also,

    ⟨ψ₆|H_mag|ψ₆⟩ = ⟨ψ₆| −µ_B B √(1/3) |1 1/2 0 1/2⟩ = (1/3) µ_B B := (1/3)β .
Since all other matrix elements with ψ₅ and ψ₆ vanish, there is a 2 × 2 block
corresponding to the subspace spanned by ψ₅ and ψ₆. Similarly, there is a 2 × 2
block corresponding to the subspace spanned by ψ₇ and ψ₈, with

    ⟨ψ₇|H_mag|ψ₇⟩ = −(2/3)β

    ⟨ψ₈|H_mag|ψ₇⟩ = ⟨ψ₇|H_mag|ψ₈⟩ = −(√2/3)β

    ⟨ψ₈|H_mag|ψ₈⟩ = −(1/3)β .
Combining all of these matrix elements, the matrix of H′ = H_fs + H_mag used in
(2.21) becomes (all entries not shown are zero):

    ⎡ −5ξ+β                                                                      ⎤
    ⎢         −5ξ−β                                                              ⎥
    ⎢                 −ξ+2β                                                      ⎥
    ⎢                        −ξ−2β                                               ⎥
    ⎢                               −ξ+(2/3)β    −(√2/3)β                        ⎥
    ⎢                               −(√2/3)β     −5ξ+(1/3)β                      ⎥
    ⎢                                                        −ξ−(2/3)β  −(√2/3)β ⎥
    ⎣                                                        −(√2/3)β  −5ξ−(1/3)β⎦
Now we need to find the eigenvalues of this matrix (which are the first-order
energy corrections). Since it's block diagonal, the first four diagonal entries are
precisely the first four eigenvalues. For the remaining four eigenvalues, we must
diagonalize the two 2 × 2 submatrices. Calling the eigenvalues λ, the characteristic
equation for the {ψ₅, ψ₆} block is

    | −ξ+(2/3)β−λ      −(√2/3)β      |
    |                                |  =  λ² + λ(6ξ − β) + 5ξ² − (11/3)ξβ = 0 .
    | −(√2/3)β         −5ξ+(1/3)β−λ |

From the quadratic formula we find the roots

    λ±^{5,6} = −3ξ + β/2 ± √[4ξ² + (2/3)ξβ + β²/4] .

Looking at the {ψ₇, ψ₈} block, we see that we can just let β → −β and use the
same equation for the roots:

    λ±^{7,8} = −3ξ − β/2 ± √[4ξ² − (2/3)ξβ + β²/4] .
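It is worth checking this algebra: diagonalizing the two 2 × 2 blocks numerically reproduces the quadratic-formula roots (a Python/numpy sketch; the values ξ = 1, β = 0.7 are arbitrary samples, not physical numbers):

```python
# Check the quadratic-formula roots against numpy's eigenvalues.
import numpy as np

xi, beta = 1.0, 0.7                     # arbitrary sample values
s2 = np.sqrt(2.0)

block56 = np.array([[-xi + 2*beta/3,  -s2*beta/3],
                    [-s2*beta/3,      -5*xi + beta/3]])
block78 = np.array([[-xi - 2*beta/3,  -s2*beta/3],
                    [-s2*beta/3,      -5*xi - beta/3]])

def roots(sign_b):
    """Closed-form eigenvalues; sign_b = +1 for {5,6}, -1 for {7,8}."""
    rad = np.sqrt(4*xi**2 + sign_b * 2*xi*beta/3 + beta**2/4)
    return sorted([-3*xi + sign_b*beta/2 - rad, -3*xi + sign_b*beta/2 + rad])

assert np.allclose(sorted(np.linalg.eigvalsh(block56)), roots(+1))
assert np.allclose(sorted(np.linalg.eigvalsh(block78)), roots(-1))
```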
The energy E_i^(1) of each of these eight states is then given by

    E₁^(1) = E₂^(0) − 5ξ + β
    E₂^(1) = E₂^(0) − 5ξ − β
    E₃^(1) = E₂^(0) − ξ + 2β
    E₄^(1) = E₂^(0) − ξ − 2β
    E₅^(1) = E₂^(0) − 3ξ + β/2 + √[4ξ² + (2/3)ξβ + β²/4]
    E₆^(1) = E₂^(0) − 3ξ + β/2 − √[4ξ² + (2/3)ξβ + β²/4]
    E₇^(1) = E₂^(0) − 3ξ − β/2 + √[4ξ² − (2/3)ξβ + β²/4]
    E₈^(1) = E₂^(0) − 3ξ − β/2 − √[4ξ² − (2/3)ξβ + β²/4] .

For i = 1, 2, 3, 4 the energy E_i^(1) corresponds to ψ_i. But for i = 5, 6 the energy
E_i^(1) corresponds to some linear combination of ψ₅ and ψ₆, and similarly for
i = 7, 8 the energy E_i^(1) corresponds to a linear combination of ψ₇ and ψ₈. (This
is the essential content of Section 2.2.)
It is easy to see that for β = 0 (i.e., B = 0), these energies reduce to E_fs given by
(2.74), and for very large β we obtain the Paschen-Back energies given by (2.70).
Thus our results have the correct limiting behavior. See Figure 7 below.

[Figure 7: Intermediate-field energy corrections as a function of B for n = 2.]
2.5.4    Supplement: The Electromagnetic Hamiltonian

In a proper derivation of the Lagrange equations of motion, one starts from
d'Alembert's principle of virtual work and derives Lagrange's equations

    d/dt (∂T/∂q̇_i) − ∂T/∂q_i = Q_i                               (2.75)

where the q_i are generalized coordinates, T = T(q_i, q̇_i) is the kinetic energy, and
Q_i = Σ_j F_j (∂x_j/∂q_i) is a generalized force. In the particular case that Q_i is
derivable from a conservative force F_j = −∂V/∂x_j, we have Q_i = −∂V/∂q_i.
Since the potential energy V is assumed to be independent of q̇_i, we can replace
∂T/∂q̇_i by ∂(T − V)/∂q̇_i, and we arrive at the usual Lagrange's equations

    d/dt (∂L/∂q̇_i) − ∂L/∂q_i = 0                                 (2.76)

where L = T − V. However, even if there is no potential function V, we can still
arrive at this result if there exists a function U = U(q_i, q̇_i) such that the generalized
forces may be written as

    Q_i = −∂U/∂q_i + d/dt (∂U/∂q̇_i)

because defining L = T − U we again arrive at equation (2.76). The function U
is called a generalized potential or a velocity-dependent potential. We now
seek such a function to describe the force on a charged particle in an electromagnetic
field.
Recall from electromagnetism that the Lorentz force law is given by

    F = q[E + (v/c) × B]

or

    F = q[−∇φ − (1/c) ∂A/∂t + (v/c) × (∇ × A)]

where E = −∇φ − (1/c) ∂A/∂t and B = ∇ × A. Our goal is to write this in the
form

    F_i = −∂U/∂x_i + d/dt (∂U/∂ẋ_i)

for a suitable U. All it takes is some vector algebra. We have

    [v × (∇ × A)]_i = ε_ijk ε_klm v^j ∂_l A_m = (δ_il δ_jm − δ_im δ_jl) v^j ∂_l A_m
                    = v^j ∂_i A_j − v^j ∂_j A_i = v^j ∂_i A_j − (v·∇)A_i .

But x_i and ẋ_j are independent variables (in other words, ẋ_j has no explicit
dependence on x_i), so that

    v^j ∂_i A_j = ẋ^j ∂A_j/∂x^i = ∂(ẋ^j A_j)/∂x^i = ∂(v·A)/∂x^i

and we have

    [v × (∇ × A)]_i = ∂(v·A)/∂x^i − (v·∇)A_i .

But we also have

    dA_i/dt = (∂A_i/∂x^j)(dx^j/dt) + ∂A_i/∂t = v^j ∂A_i/∂x^j + ∂A_i/∂t
            = (v·∇)A_i + ∂A_i/∂t

so that

    (v·∇)A_i = dA_i/dt − ∂A_i/∂t

and therefore

    [v × (∇ × A)]_i = ∂(v·A)/∂x^i − dA_i/dt + ∂A_i/∂t .
But we can write A_i = ∂(v^j A_j)/∂v^i = ∂(v·A)/∂v^i, which gives us

    [v × (∇ × A)]_i = ∂(v·A)/∂x^i − d/dt [∂(v·A)/∂v^i] + ∂A_i/∂t .
The Lorentz force law can now be written in the form

    F_i = q{−∂φ/∂x^i − (1/c) ∂A_i/∂t + (1/c)[v × (∇ × A)]_i}

        = q{−∂φ/∂x^i − (1/c) ∂A_i/∂t + (1/c) ∂(v·A)/∂x^i
               − (1/c) d/dt [∂(v·A)/∂v^i] + (1/c) ∂A_i/∂t}

        = q{−(∂/∂x^i)[φ − (v/c)·A] − d/dt (∂/∂v^i)[(v/c)·A]} .

Since φ is independent of v, we can write

    −d/dt (∂/∂v^i)[(v/c)·A] = d/dt (∂/∂v^i)[φ − (v/c)·A]

so that

    F_i = q{−(∂/∂x^i)[φ − (v/c)·A] + d/dt (∂/∂v^i)[φ − (v/c)·A]}

or

    F_i = −∂U/∂x^i + d/dt (∂U/∂ẋ^i)

where U = q[φ − (v/c)·A]. This shows that U is a generalized potential, and that
the Lagrangian for a particle of charge q in an electromagnetic field is

    L = T − qφ + (q/c) v·A                                        (2.77a)

or

    L = (1/2)mv² − qφ + (q/c) v·A .                               (2.77b)
From this, the canonical momentum is defined by p_i = ∂L/∂ẋ_i = ∂L/∂v_i, so
that

    p = mv + (q/c)A .

Using this, the Hamiltonian is then given by

    H = Σ p_i ẋ_i − L = p·v − L
      = mv² + (q/c)A·v − (1/2)mv² + qφ − (q/c)A·v
      = (1/2)mv² + qφ
      = (1/2m)[p − (q/c)A]² + qφ .
This is the basis for the oft-heard statement that to include electromagnetic forces,
you need to make the replacement p → p − (q/c)A. Including any other additional
potential energy terms, the Hamiltonian becomes

    H = (1/2m)[p − (q/c)A]² + qφ + V(r) .                         (2.78)
Let’s evaluate (2.78) for the case of a uniform magnetic field. Since B = ∇ × A,
it is not hard to verify that
1
A=− r×B
2
will work (I’ll work it out, but you could also just plug into a vector identity if you
take the time to look it up):
[∇ × (r × B)]i = εijk εklm ∂j (xl Bm )
= (δil δjm − δim δjl )[δjl Bm + xl ∂j Bm ]
= Bi − 3Bi = −2Bi
where I used ∂j xl = δjl , δjl δlj = δjj = 3 and ∂j Bm = 0 since B is uniform. This
shows that B = (−1/2)[∇ × (r × B)] = ∇ × A as claimed. Note also that for this
B we have
−2∇ · A = ∇ · (r × B) = εijk ∂i (xj Bk ) = εijk δij Bk = 0
because εijk δij = εiik = 0. Hence ∇ · A = 0.
Before writing out (2.78), let me use this last result to show that

    (p·A)ψ = −iℏ∇·(Aψ) = −iℏ(∇·A)ψ − iℏA·∇ψ = (A·p)ψ

(the first term vanishes because ∇·A = 0), and hence p·A = A·p. (Note this shows
that p·A = A·p even if B is not uniform, provided we are using the Coulomb gauge
∇·A = 0.) Now using this, we have

    (1/2m)[p − (q/c)A]² = (1/2m)[p² − (q/c)(p·A + A·p) + (q²/c²)A²]
                        = p²/2m − (q/mc) A·p + (q²/2mc²) A² .
But (thinking of the scalar triple product as a determinant and switching two rows)

    (q/mc) A·p = −(q/2mc)(r × B)·p = +(q/2mc) B·(r × p) = (q/2mc) B·L .

And using (I'll leave the proof to you)

    A² = (1/4)(r × B)·(r × B) = (1/4)[r²B² − (r·B)²]
we obtain

    (1/2m)[p − (q/c)A]² = p²/2m − (q/2mc) B·L + (q²/8mc²)[r²B² − (r·B)²] .
Let’s compare the relative magnitudes of the B · L term and the quadratic (last)
term for an electron. Taking r2 ≈ a20 and L ∼ ~, we have
(e2 /8mc2 )r2 B 2
(e2 /8mc2 )a20 B 2
1 e2 B
=
=
(e/2mc)B · L
(e/2mc)~B
4 ~c e/a20
=
=
B
1 1
4 137 (4.8 × 10−10 esu)/(0.5 × 10−8 cm)2
B
.
9 × 109 gauss
Since magnetic fields in the lab are of order 104 gauss or less, we see that the
quadratic term is negligible in comparison.
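The arithmetic above can be sketched in a few lines (Python, cgs units; the constants are just the e, a₀ and α quoted in the estimate):

```python
# Ratio of the quadratic (A^2) term to the B.L term for B = 10^4 gauss (cgs units).
e, a0, alpha = 4.8e-10, 0.53e-8, 1/137          # esu, cm, fine-structure constant
B = 1.0e4                                       # gauss (a strong lab magnet)

ratio = 0.25 * alpha * B / (e / a0**2)
print(ratio)        # about 1e-6, i.e. B/(9e9 gauss)
assert ratio < 1e-5
```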
Referring back to (2.38), we see that

    (q/2mc) L = µ_l

where, for an electron, we have q = −e. And as we have also seen, for spin we must
postulate a magnetic moment of the form

    µ_s = g (q/2mc) S

where g = 2 for an electron (and g = 5.59 for a proton). Therefore, an electron has
a total magnetic moment

    µ_tot = −(e/2m_e c)(L + 2S)

as we stated in (2.68).

Combining our results, the Hamiltonian for a hydrogen atom in a uniform external
magnetic field is then given by

    H = p²/2m_e − e²/r − µ_tot·B = H⁰ − µ_tot·B = H⁰ + H′

where we are taking qφ + V(r) = 0 − e²/r, and m_e in this equation is really the
reduced mass, which is approximately the same as the electron mass.
3    Time-Dependent Perturbation Theory

3.1    Transitions Between Two Discrete States

We now turn our attention to the situation where the perturbation depends on
time. In this situation, we assume that the system is originally in some definite
state, and that applying a time-dependent external force then induces a transition
to another state. For example, shining electromagnetic radiation on an atom in its
ground state may cause it to undergo a transition to a higher energy state.
We assume that the external force is weak enough that perturbation theory applies.

There are several ways to deal with this problem, and everyone seems to have
their own approach. We shall follow a method that is closely related to the
time-independent method that we employed.
To begin, suppose

    H = H⁰ + H′(t)

and that we have the orthonormal solutions

    H⁰ϕ_n = E_n ϕ_n    with    ϕ_n(t) = ϕ_n e^{−iE_n t/ℏ} .

Note that we no longer need to add a superscript 0 to the energies, because with a
time-dependent Hamiltonian there is no energy conservation and hence we are not
looking for energy corrections.

We would like to solve the time-dependent Schrödinger equation

    Hψ(t) = [H⁰ + H′(t)]ψ(t) = iℏ ∂ψ(t)/∂t .                      (3.1)

In this case, the solutions ϕ_n still form a complete set (they describe every possible
state available to the system), the difference being that now the state ψ(t) that
results from the perturbation will depend on time. So let us write

    ψ(t) = Σ_k c_k(t) e^{−iE_k t/ℏ} ϕ_k .                         (3.2)

The reason for this form is that we want the time-dependent coefficients c_k(t) to
reduce to constants if H′(t) = 0. In other words, H′(t) → 0 implies ψ(t) → ϕ(t).
Our goal is to find the probability that if the system is in an eigenstate ϕ_i = ψ(0) at
time t = 0, it will be found in the eigenstate ϕ_f at a later time t. This probability
is given by

    P_{if}(t) = |⟨ϕ_f|ψ(t)⟩|² = |c_f(t)|²                         (3.3)

where ⟨ψ(t)|ψ(t)⟩ = 1 implies

    Σ_k |c_k(t)|² = 1 .
Using (3.2) in (3.1) we obtain

    Σ_k c_k(t) e^{−iE_k t/ℏ} [E_k + H′(t)] ϕ_k
        = Σ_k iℏ [ċ_k(t) − (iE_k/ℏ) c_k(t)] e^{−iE_k t/ℏ} ϕ_k

or

    iℏ Σ_k ċ_k(t) e^{−iE_k t/ℏ} ϕ_k = Σ_k H′(t) c_k(t) e^{−iE_k t/ℏ} ϕ_k .    (3.4)

But ⟨ϕ_n|ϕ_k⟩ = δ_nk, so that

    iℏ ċ_n(t) e^{−iE_n t/ℏ} = Σ_k ⟨ϕ_n|H′(t)|ϕ_k⟩ c_k(t) e^{−iE_k t/ℏ} .

Defining the Bohr angular frequency

    ω_nk = (E_n − E_k)/ℏ                                          (3.5)

we can write

    ċ_n(t) = (1/iℏ) Σ_k ⟨ϕ_n|H′(t)|ϕ_k⟩ c_k(t) e^{iω_nk t} .      (3.6a)
This set of equations for the c_n(t) is exact and completely equivalent to the original
Schrödinger equation (3.1). Defining

    H′_nk(t) = ⟨ϕ_n|H′(t)|ϕ_k⟩

we may write out (3.6a) in matrix form as (for a finite number of terms)

       ⎡ ċ₁(t) ⎤   ⎡ H′₁₁             H′₁₂ e^{iω₁₂t}   ···   H′₁ₙ e^{iω₁ₙt} ⎤ ⎡ c₁(t) ⎤
    iℏ ⎢ ċ₂(t) ⎥ = ⎢ H′₂₁ e^{iω₂₁t}   H′₂₂             ···   H′₂ₙ e^{iω₂ₙt} ⎥ ⎢ c₂(t) ⎥   (3.6b)
       ⎢   ⋮   ⎥   ⎢   ⋮                                         ⋮          ⎥ ⎢   ⋮   ⎥
       ⎣ ċₙ(t) ⎦   ⎣ H′ₙ₁ e^{iωₙ₁t}   H′ₙ₂ e^{iωₙ₂t}   ···   H′ₙₙ           ⎦ ⎣ cₙ(t) ⎦
As we did in the time-independent case, we now let H′(t) → λH′(t), and expand
c_k(t) in a power series in λ:

    c_k(t) = c_k^(0)(t) + λ c_k^(1)(t) + ··· .                    (3.7)

Inserting this into (3.6a) yields

    ċ_n^(0)(t) + λ ċ_n^(1)(t) + λ² ċ_n^(2)(t) + ···
        = (1/iℏ) Σ_k H′_nk(t) [λ c_k^(0)(t) + λ² c_k^(1)(t) + λ³ c_k^(2)(t) + ···] e^{iω_nk t} .

Equating powers of λ, for λ⁰ we have

    ċ_n^(0)(t) = 0                                                (3.8a)

and for λ^{s+1} with s ≥ 0 we have

    ċ_n^{(s+1)}(t) = (1/iℏ) Σ_k H′_nk(t) c_k^{(s)}(t) e^{iω_nk t} .    (3.8b)

In principle, these may be solved successively. Solving (3.8a) gives c_k^(0)(t), and
using this in (3.8b) then gives c_n^(1)(t). Then putting these back into (3.8b) again
yields c_n^(2)(t), and in principle this can be continued to any desired order.

Let us assume that the system is initially in the state ϕ_i, so that

    c_n(0) = δ_ni .                                               (3.9a)

Since this must be true for all λ, we have

    c_n^(0)(0) = δ_ni                                             (3.9b)

and

    c_n^(s)(0) = 0    for s ≥ 1 .                                 (3.9c)

From (3.8a) we see that the zeroth-order coefficients are constant in time, so we
have

    c_n^(0)(t) = δ_ni                                             (3.9d)

and the zeroth-order solutions are completely determined.
Using (3.9b) in (3.8b) we obtain, to first order,

    ċ_n^(1)(t) = (1/iℏ) Σ_k H′_nk(t) δ_ki e^{iω_nk t} = (1/iℏ) H′_ni(t) e^{iω_ni t}

so that

    c_n^(1)(t) = (1/iℏ) ∫₀ᵗ H′_ni(t′) e^{iω_ni t′} dt′             (3.10)

where the constant of integration is zero by (3.9c). Using (3.9d) and (3.10) in (3.2)
yields ψ(t) to first order:

    ψ(t) = ϕ_i e^{−iE_i t/ℏ} + λ Σ_k [(1/iℏ) ∫₀ᵗ H′_ki(t′) e^{iω_ki t′} dt′] e^{−iE_k t/ℏ} ϕ_k .

From (3.3) we know that the transition probability to the state ϕ_f is given by

    P_{if}(t) = |⟨ϕ_f|ψ(t)⟩|² = |c_f(t)|²

where c_f(t) = c_f^(0)(t) + λ c_f^(1)(t) + ··· . We will only consider transitions to
states ϕ_f that are distinct from the initial state ϕ_i, and hence c_f^(0)(t) = 0. Then
the first-order transition probability is

    P_{if}(t) = λ² |c_f^(1)(t)|²

or, from (3.10) and letting λ → 1,

    P_{if}(t) = (1/ℏ²) |∫₀ᵗ H′_fi(t′) e^{iω_fi t′} dt′|² .         (3.11)

A minor point is that our initial conditions could equally well be defined at
t → −∞. In this case, the lower limit on the above integrals would obviously be
−∞ rather than 0.
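Before turning to the examples, here is a numerical sanity check of (3.10)/(3.11) for a two-level system: the exact coupled equations (3.6a) are integrated by fourth-order Runge-Kutta and compared with the first-order result. (This is an illustration I am adding, assuming Python with numpy; the perturbation V₀ sin ωt and all parameter values are arbitrary choices.)

```python
# Exact integration of (3.6a) for two levels vs. the first-order formula (3.11).
# hbar = 1; H'_{fi}(t) = V0 sin(w t) with V0 small; w_fi = E_f - E_i = 1.
import numpy as np

wfi, V0, w, T, N = 1.0, 0.02, 0.3, 5.0, 20_000
ts = np.linspace(0.0, T, N + 1)
dt = ts[1] - ts[0]

def deriv(c, t):
    """Right-hand side of (3.6a) for two states with off-diagonal coupling V(t)."""
    V = V0 * np.sin(w * t)
    return np.array([-1j * V * np.exp(-1j * wfi * t) * c[1],
                     -1j * V * np.exp(+1j * wfi * t) * c[0]])

c = np.array([1.0 + 0j, 0.0 + 0j])           # start in the initial state
for t in ts[:-1]:                            # RK4 steps
    k1 = deriv(c, t)
    k2 = deriv(c + 0.5*dt*k1, t + 0.5*dt)
    k3 = deriv(c + 0.5*dt*k2, t + 0.5*dt)
    k4 = deriv(c + dt*k3, t + dt)
    c = c + (dt/6) * (k1 + 2*k2 + 2*k3 + k4)
P_exact = abs(c[1])**2

# First order: |c_f|^2 = |(1/i) Int_0^T H'_fi(t) e^{i w_fi t} dt|^2 (trapezoid rule).
y = V0 * np.sin(w * ts) * np.exp(1j * wfi * ts)
P_first = abs(np.sum(0.5 * (y[1:] + y[:-1])) * dt)**2

assert abs(P_exact - P_first) < 0.05 * P_first
```

With V₀ this small the two answers agree to a fraction of a percent; increasing V₀ makes the higher-order terms visible.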
Example 3.1. Consider a one-dimensional harmonic oscillator of a particle of
charge q with characteristic frequency ω. Let this oscillator be placed in an electric
field that is turned on and off so that its potential energy is given by

    H′(t) = qE x e^{−t²/τ²}

where τ is a constant. If the particle starts out in its ground state, let us find the
probability that it will be in its first excited state after a time t ≫ τ.

Since t ≫ τ, we may as well take t → ±∞ as limits. From (3.11), we see that
we must evaluate the integral

    I = ∫_{−∞}^{∞} H′₁₀(t′) e^{iω₁₀t′} dt′

where

    H′₁₀(t) = qE e^{−t²/τ²} ⟨ψ₁|x|ψ₀⟩

and E_n = ℏω(n + 1/2), so that ω₁₀ = (E₁ − E₀)/ℏ = ω. Then (keeping ω₁₀ for
generality at this point)

    I = qE ⟨ψ₁|x|ψ₀⟩ ∫_{−∞}^{∞} e^{−t²/τ²} e^{iω₁₀t} dt

      = qE ⟨ψ₁|x|ψ₀⟩ ∫_{−∞}^{∞} e^{−(1/τ²)(t² − iω₁₀τ²t)} dt

      = qE ⟨ψ₁|x|ψ₀⟩ e^{−ω₁₀²τ²/4} ∫_{−∞}^{∞} e^{−(1/τ²)(t − iω₁₀τ²/2)²} dt

      = qE ⟨ψ₁|x|ψ₀⟩ e^{−ω₁₀²τ²/4} ∫_{−∞}^{∞} e^{−u²/τ²} du

      = qE ⟨ψ₁|x|ψ₀⟩ e^{−ω₁₀²τ²/4} √(πτ²) .
The easy way to do the spatial integral is to use the harmonic oscillator ladder
operators. From

    x = √(ℏ/2mω) (a + a†)

where

    aψ_n = √n ψ_{n−1}    and    a†ψ_n = √(n+1) ψ_{n+1}

we have

    ⟨ψ₁|x|ψ₀⟩ = √(ℏ/2mω) ⟨ψ₁|a†ψ₀⟩ = √(ℏ/2mω) ⟨ψ₁|ψ₁⟩ = √(ℏ/2mω) .

Therefore

    I = qEτ √(πℏ/2mω) e^{−ω₁₀²τ²/4}

so that, from (3.11),

    P₀₁(t → ∞) = (πq²E²τ²/2mℏω) e^{−ω₁₀²τ²/2} = (πq²E²τ²/2mℏω) e^{−ω²τ²/2} .

Note that as τ → ∞ (i.e., the electric field is turned on very slowly), we have
P₀₁ → 0. This shows that the system adjusts "adiabatically" to the field and is
not shocked into a transition.
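The Gaussian integral and the final formula can be checked numerically; the sketch below (my own illustration in Python, with ℏ = m = q = E = 1 and arbitrary τ = 2) evaluates (3.11) directly:

```python
# Numerical check of Example 3.1: P_01 = pi q^2 E^2 tau^2/(2 m hbar w) e^{-w^2 tau^2/2}.
import numpy as np

tau, w = 2.0, 1.0                      # hbar = m = q = E = 1, so w10 = w
t = np.linspace(-60.0, 60.0, 400_001)

x10 = np.sqrt(0.5)                     # <psi_1|x|psi_0> = sqrt(hbar/2mw)
y = np.exp(-t**2 / tau**2) * np.exp(1j * w * t)
I = x10 * np.sum(0.5 * (y[1:] + y[:-1])) * (t[1] - t[0])   # trapezoid rule

P_num     = abs(I)**2                  # (3.11) with hbar = 1
P_formula = (np.pi * tau**2 / 2) * np.exp(-w**2 * tau**2 / 2)
assert abs(P_num - P_formula) < 1e-6 * P_formula
```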
Example 3.2. Let us consider a harmonic perturbation of the form

    H′(t) = V₀(r) cos ωt ,    t ≥ 0 .

Note that letting ω = 0 we obtain the constant perturbation H′(t) = V₀(r) as a
special case. It just isn't much harder to treat the more general situation, which
represents the interaction of the system with an electromagnetic wave of frequency
ω.

If we define

    V_fi = ⟨ϕ_f|V₀(r)|ϕ_i⟩ ,

then

    H′_fi = ⟨ϕ_f|V₀(r) cos ωt|ϕ_i⟩ = ⟨ϕ_f|V₀(r)|ϕ_i⟩ cos ωt = V_fi cos ωt .

Using cos ωt = (e^{iωt} + e^{−iωt})/2, we then have

    ∫₀ᵗ H′_fi(t′) e^{iω_fi t′} dt′ = (V_fi/2) ∫₀ᵗ (e^{iωt′} + e^{−iωt′}) e^{iω_fi t′} dt′

        = (V_fi/2) ∫₀ᵗ [e^{i(ω_fi+ω)t′} + e^{i(ω_fi−ω)t′}] dt′

        = (V_fi/2i) [ (e^{i(ω_fi+ω)t} − 1)/(ω_fi + ω) + (e^{i(ω_fi−ω)t} − 1)/(ω_fi − ω) ] .

Inserting this into (3.11), we can write

    P_{if}(t; ω) = (|V_fi|²/4ℏ²) | (1 − e^{i(ω_fi+ω)t})/(ω_fi + ω)
                       + (1 − e^{i(ω_fi−ω)t})/(ω_fi − ω) |²        (3.12)
where I’m specifically including ω as an argument of Pif because the transition
probability depends on ω.
Let us consider the special case of a constant (i.e., time-independent) perturbation, ω = 0. In this case, (3.12) reduces to
Pif (t; 0) =
2
2
2
|Vf i | |Vf i |
1 − eiωf i t = 2 2 2(1 − cos ωf i t) .
2
2
~ ωf i
~ ωf i
Using the elementary identity
cos A = cos(A/2 + A/2) = cos2 A/2 − sin2 A/2 = 1 − 2 sin2 A/2
we can write the transition probability as
Pif (t; 0) =
2
|Vf i |
~2
sin ωf i t/2
ωf i /2
2
2
:=
|Vf i |
F (t; ωf i ) .
~2
(3.13)
The function
F (t; ωf i ) =
sin ωf i t/2
ωf i /2
2
= t2
sin ωf i t/2
ωf i t/2
2
has amplitude equal to t2 , and zeros at ωf i = 2πn/t. See Figure 8 below.
4
3
2
1
5
-5
Figure 8: Plot of F (t; ωf i ) vs ωf i for t = 2.
The main peak lies between zeros at ±2π/t, so its width goes like 1/t while its
height goes like t2 , and hence its area grows like t.
It is also interesting to see how the transition probability depends on time.

[Figure 9: Plot of F(t; ω_fi) vs t for ω_fi = 2.]

Here we see clearly that for times t = 2πn/ω_fi the transition probability is zero, and
the system is certain to be in its initial state. Because of this oscillatory behavior,
the greatest probability for a transition is obtained by allowing the perturbation to
act only for a short time π/ω_fi.
For future reference, let me make a (very un-rigorous but useful) mathematical
observation. From Figure 8, we see that as t → ∞, the function F(t; ω) =
t²[sin(ωt/2)/(ωt/2)]² has an amplitude t² that also goes to infinity, and a width
4π/t centered at ω = 0 that goes to zero. Then if we include F(t; ω) inside the
integral of a smooth function f(ω), the only contribution to the integral will come
from where ω = 0. Using the well-known result

    ∫_{−∞}^{∞} (sin²x/x²) dx = π

we have (with x = ωt/2 so dx = (t/2)dω)

    lim_{t→∞} ∫_{−∞}^{∞} f(ω) t² [sin(ωt/2)/(ωt/2)]² dω = 2tf(0) ∫_{−∞}^{∞} (sin²x/x²) dx
        = 2πtf(0)

and hence we conclude that

    F(t; ω) = [sin(ωt/2)/(ω/2)]² = t² [sin(ωt/2)/(ωt/2)]² → 2πt δ(ω)   as t → ∞ .   (3.14)
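This limit is easy to see numerically; the sketch below (Python with numpy, with f(ω) = e^{−ω²} as an arbitrary smooth test function) shows the ratio ∫f(ω)F(t;ω)dω / 2πtf(0) approaching 1 as t grows:

```python
# Numerical illustration of (3.14): Int f(w) F(t; w) dw -> 2 pi t f(0).
import numpy as np

w = np.linspace(-200.0, 200.0, 2_000_001)
f = np.exp(-w**2)                                 # smooth test function, f(0) = 1

ratios = []
for t in (50.0, 200.0):
    F = t**2 * np.sinc(w * t / (2*np.pi))**2      # np.sinc(x) = sin(pi x)/(pi x)
    y = f * F
    integral = np.sum(0.5 * (y[1:] + y[:-1])) * (w[1] - w[0])   # trapezoid rule
    ratios.append(integral / (2 * np.pi * t))

print(ratios)            # creeps toward 1 as t increases
assert abs(ratios[0] - 1) < 0.05
assert abs(ratios[1] - 1) < 0.02
```

The residual deviation from 1 shrinks like 1/t, which is exactly the statement that the peak's width 4π/t becomes negligible compared with the scale on which f varies.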
Example 3.3. Let us take a look at equation (3.12) when ω ≈ ωf i . This is called
a resonance phenomenon. We will assume that ω ≥ 0 by definition, and we will
consider the case where ωf i > 0. The alternative case where ωf i < 0 can be treated
in an analogous manner.
77
We begin by rewriting the two complex terms in (3.12). For the first we have
−i(ωf i +ω)t/2
− ei(ωf i +ω)t/2
1 − ei(ωf i +ω)t
i(ωf i +ω)t/2 e
=e
A+ :=
ωf i + ω
ωf i + ω
sin(ωf i + ω)t/2
= −iei(ωf i +ω)t/2
(ωf i + ω)/2
and similarly for the second
A− :=
1 − ei(ωf i −ω)t
sin(ωf i − ω)t/2
= −iei(ωf i −ω)t/2
ωf i − ω
(ωf i − ω)/2
If ω ≈ ω_fi, then A_− dominates and is called the resonant term, while A_+ is called the anti-resonant term. (These terms would be switched if we were considering the case ω_fi < 0.)
We are considering the case where |ω − ω_fi| ≪ |ω_fi|, so A_+ can be neglected in comparison to A_−. Under these conditions, (3.12) becomes

$$P_{if}(t;\omega) = \frac{|V_{fi}|^2}{4\hbar^2}\,|A_-|^2 = \frac{|V_{fi}|^2}{4\hbar^2}\left[\frac{\sin(\omega_{fi}-\omega)t/2}{(\omega_{fi}-\omega)/2}\right]^2 := \frac{|V_{fi}|^2}{4\hbar^2}\,F(t;\omega_{fi}-\omega). \tag{3.15}$$
A plot of F(t; ω_fi − ω) as a function of ω would be identical to Figure 8 except that the peak would be centered over the point ω = ω_fi. In particular, F(t; ω_fi − ω) has a maximum value of t², and a width between its first two zeros of

$$\Delta\omega = \frac{4\pi}{t}\,. \tag{3.16}$$
Here is another way to view Example 3.3. Let us consider a time-dependent potential of the form

$$H'(t) = V_0(\mathbf r)\,e^{\pm i\omega t}. \tag{3.17}$$
Then

$$\int_0^t H'_{fi}(t')\,e^{i\omega_{fi}t'}\,dt' = V_{fi}\int_0^t e^{i(\omega_{fi}\pm\omega)t'}\,dt' = V_{fi}\,\frac{e^{i(\omega_{fi}\pm\omega)t}-1}{i(\omega_{fi}\pm\omega)} = V_{fi}\,e^{i(\omega_{fi}\pm\omega)t/2}\,\frac{\sin(\omega_{fi}\pm\omega)t/2}{(\omega_{fi}\pm\omega)/2}$$
and (3.11) becomes

$$P_{if}(t) = \frac{|V_{fi}|^2}{\hbar^2}\left[\frac{\sin(\omega_{fi}\pm\omega)t/2}{(\omega_{fi}\pm\omega)/2}\right]^2. \tag{3.18}$$
As t → ∞, we can use (3.14) to write

$$\lim_{t\to\infty}P_{if}(t) = \frac{2\pi}{\hbar}\,|V_{fi}|^2\,\delta(E_f - E_i \pm \hbar\omega)\,t$$

where we used the general result δ(ax) = (1/|a|)δ(x) so that δ(ω) = δ(E/ħ) = ħδ(E). Note that the transition probability grows linearly with time. We can write this as

$$P_{if}(t\to\infty) = \Gamma_{i\to f}\,t \tag{3.19a}$$

where the transition rate (i.e., the transition probability per unit time) is defined by

$$\Gamma_{i\to f} = \frac{2\pi}{\hbar}\,|V_{fi}|^2\,\delta(E_f - E_i \pm \hbar\omega). \tag{3.19b}$$
(The result (3.19b) differs from (3.15) by a factor of 4 in the denominator. This is because in Example 3.2 we used cos ωt, which contains the terms (1/2)e^{±iωt}.)
Because of the delta function, we only get transitions in those cases where |E_f − E_i| = ħω, which is simply a statement of energy conservation. In the case of a potential of the form V_0 e^{+iωt}, we have E_f = E_i − ħω, so the system has emitted a quantum of energy. And in the case where we have a potential of the form V_0 e^{−iωt}, we have E_f = E_i + ħω, so the system has absorbed a quantum of energy.
In Example 3.3, we saw that resonance occurs when ω = ω_fi. Since we are considering the case where ω_fi = (E_f − E_i)/ħ ≥ 0, this means that resonance is at the point where E_f = E_i + ħω. In other words, a system with energy E_i undergoes a resonant absorption of a quantum of energy ħω to transition to a state with energy E_f. Had we started with the case where ω_fi < 0, we would have found that the system underwent a resonant induced emission of the same quantum of energy ħω, so that E_f = E_i − ħω.
Also recall that in Example 3.3, we neglected A_+ relative to A_−. Noting that |A_+(ω)|² = |A_−(−ω)|², it is easy to see that a plot of |A_+|² is exactly the same as a plot of |A_−|² reflected about the vertical axis ω = 0. See Figure 10 below. Note that both of these curves have a width Δω = 4π/t that narrows as time increases.
Figure 10: Plot of |A_+|² and |A_−|² vs ω for t = 2 and ω_fi = 20.
In addition, we see that A_+ will be negligible relative to A_− as long as they are well-separated, in other words, as long as

$$2|\omega_{fi}| \gg \Delta\omega\,.$$

Since Δω = 4π/t, this is equivalent to requiring

$$t \gg \frac{1}{|\omega_{fi}|} \approx \frac{1}{\omega}\,.$$
Physically, this means that the perturbation must act over a long enough time
interval t for the system to oscillate enough that it indeed appears sinusoidal.
On the other hand, in both Examples 3.2 and 3.3, the transition probability P_if(t; ω) has a maximum value proportional to t². Since this approaches infinity as t → ∞, and since a probability always has to be less than or equal to 1, there is clearly something wrong. One answer is that the first-order approximation we are using has a limited time range. In Example 3.3, resonance occurs when ω = ω_fi, in which case

$$P_{if}(t;\omega=\omega_{fi}) = \frac{|V_{fi}|^2}{4\hbar^2}\,t^2.$$

So in order for our first-order approximation to be valid, we must have

$$t \ll \frac{\hbar}{|V_{fi}|}\,.$$
Combining this with the previous paragraph, we conclude that

$$\frac{1}{|\omega_{fi}|} \ll \frac{\hbar}{|V_{fi}|}\,.$$

This is the same as

$$\hbar|\omega_{fi}| = |E_f - E_i| \gg |V_{fi}| = |\langle\varphi_f|V_0|\varphi_i\rangle|$$

and hence the energy difference between the initial and final states must be much larger than the matrix element V_fi between these states.
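These validity conditions can be made concrete with a small numerical sketch (my own, not part of the notes). For a two-level system driven exactly on resonance, the rotating-wave (Rabi) solution is P = sin²(|V_fi|t/2ħ), whose small-t expansion is the first-order result |V_fi|²t²/4ħ² from Example 3.3; the comparison below, with illustrative values ħ = 1 and |V_fi| = 0.05, shows the first-order formula failing once t approaches ħ/|V_fi|:

```python
# First-order perturbation theory vs the exact resonant Rabi formula for a
# two-level system in the rotating-wave approximation. The first-order
# result is the small-t limit and breaks down once t ~ ħ/|V_fi| (= 20 here).
import math

hbar, V = 1.0, 0.05                            # illustrative values

def first_order(t):
    return (V * t / (2 * hbar)) ** 2           # P = |V|² t² / 4ħ²

def exact(t):
    return math.sin(V * t / (2 * hbar)) ** 2   # resonant Rabi oscillation

for t in (1.0, 5.0, 20.0, 40.0):
    print(t, round(first_order(t), 4), round(exact(t), 4))
```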
3.2 Transitions to a Continuum of States
In the previous section we considered the transition probability P_if(t) from an initial state ϕ_i to a final state ϕ_f. But in the real experimental world, detectors generally observe transitions over (at least) a small range of energies and over a finite range of incident angles. Thus, we should treat not a single final state ϕ_f, but rather a group (or continuum) of closely spaced states centered about some ϕ_f. Since the area under the curve in Figure 8 grows like t, we expect the transition probability to a set of states with approximately the same energy as ϕ_f to grow linearly with time. (We saw this for a transition to a single state in equation (3.19a).)
Let us now generalize (3.19b) to a more physically realistic detector. After all,
no physical transition rate can go like a delta function. To get a good idea of what
to expect, we first consider the perturbation (3.17) and the resulting transition
probability (3.18).
For a physically realistic detector, instead of a transition to a single final state we must consider all transitions to a group of final states centered about E_f:

$$\mathcal P(t) = \sum_{E_f\in\Delta E_f}\frac{|V_{fi}|^2}{\hbar^2}\left[\frac{\sin(\omega_{fi}\pm\omega)t/2}{(\omega_{fi}\pm\omega)/2}\right]^2 = \sum_{E_f\in\Delta E_f}|V_{fi}|^2\left[\frac{\sin(E_f-E_i\pm\hbar\omega)t/2\hbar}{(E_f-E_i\pm\hbar\omega)/2}\right]^2$$
where the sum is over all states with energies in the range ∆Ef . We assume that
the final states are very closely spaced, and hence may be treated as a continuum
of states. In that case, the sum may be converted to an integral over the interval
∆Ef by writing the number of states with energy between Ef and Ef + dEf as
ρ(Ef ) dEf , where ρ(Ef ) is called the density of final states. It is just the number
of states per unit energy. Then
$$\mathcal P(t) = \int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2}\rho(E_f)\,dE_f\,|V_{fi}|^2\left[\frac{\sin(E_f-E_i\pm\hbar\omega)t/2\hbar}{(E_f-E_i\pm\hbar\omega)/2}\right]^2. \tag{3.20}$$
As t becomes very large, we have seen that the term in brackets becomes sharply peaked about E_f = E_i ∓ ħω, and hence we may assume that ρ(E_f) and |V_fi| are essentially constant over the region of integration, which we may also let go to ±∞. Changing variables to x = (E_f − E_i ± ħω)t/2ħ we then have

$$\mathcal P(t) = \rho(E_f)\,|V_{fi}|^2\,\frac{2t}{\hbar}\int_{-\infty}^{\infty}\frac{\sin^2 x}{x^2}\,dx = \frac{2\pi}{\hbar}\,\rho(E_f)\,|V_{fi}|^2\,t\,.$$
Defining the transition rate Γ = dP/dt we finally arrive at

$$\Gamma = \frac{2\pi}{\hbar}\,\rho(E_f)\,|V_{fi}|^2\Big|_{E_f=E_i\mp\hbar\omega} \tag{3.21}$$
which is called Fermi’s golden rule.
A completely equivalent way to write this is to take equations (3.19) and write

$$\mathcal P(t) = \sum_{\text{final states}} P_{if}(t) = \sum_{\text{final states}} \Gamma_{i\to f}\,t = \Gamma t$$

where

$$\Gamma_{i\to f} = \frac{2\pi}{\hbar}\,|V_{fi}|^2\,\delta(E_f - E_i \pm \hbar\omega)$$

and

$$\Gamma = \sum_{\text{final states}} \Gamma_{i\to f}\,.$$
If you wish, you can then replace the sum over states by an integral over energies if you include a density of states factor ρ(E). This has the same effect as simply using (3.14) in (3.20) to write

$$\mathcal P(t) = \int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2}\rho(E_f)\,dE_f\,|V_{fi}|^2\,\frac{2\pi}{\hbar}\,t\,\delta(E_f-E_i\pm\hbar\omega) = \frac{2\pi}{\hbar}\,|V_{fi}|^2\int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2}\rho(E_f)\,\delta(E_f-E_i\pm\hbar\omega)\,dE_f\;t = \Gamma t\,.$$
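The linear growth P(t) = Γt can be seen directly in a numerical sketch: sum the single-state transition probability over a dense, equally spaced band of final energies and compare with Γ = (2π/ħ)ρ|V_fi|². All numbers below (ħ = 1, ρ = 200, |V_fi| = 10⁻³) are illustrative choices of mine:

```python
# Golden-rule check: a dense band of equally spaced final states reproduces
# P(t) = Γt with Γ = (2π/ħ) ρ |V|². The half-integer offset keeps ω_fi ≠ 0.
import math

hbar, V, rho = 1.0, 0.001, 200.0        # rho = states per unit energy
dE = 1.0 / rho                          # level spacing

def P(t):
    total = 0.0
    for n in range(-20000, 20001):
        w = (n + 0.5) * dE / hbar       # detuning ω_fi of state n
        total += (V / hbar) ** 2 * (math.sin(w * t / 2) / (w / 2)) ** 2
    return total

gamma = 2 * math.pi * rho * V ** 2 / hbar
for t in (20.0, 40.0):
    print(round(P(t) / (gamma * t), 3))  # close to 1: growth is linear in t
```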
Example 3.4. Let us consider a simple, one-dimensional model of photo-ionization,
in which a particle of charge e in its ground state ψ0 in a potential U (x) is irradiated
by light of frequency ω, and hence is ejected into the continuum.
To keep things simple, we first assume that the wavelength of the incident light
is much longer than atomic dimensions. Under these conditions, the electric field of
the light may be considered uniform in space, but harmonic in time. (The magnetic
field of the light exerts a force that is of order v/c less than the electric force, and
may be neglected.) Since we are treating the absorption of energy, we write the
electric field as $\mathbf E = \mathcal E\,e^{-i\omega t}\,\hat{\mathbf x}$. Using $\mathbf E = -\nabla\varphi$ we have

$$\int\mathbf E\cdot d\mathbf x = \mathcal E\,e^{-i\omega t}\int dx = \mathcal E\,e^{-i\omega t}x = -\int\nabla\varphi\cdot d\mathbf x = -\varphi(x)$$

so that $\varphi(x) = -\mathcal E\,e^{-i\omega t}x$. From Example 2.2 we know that the interaction energy of the particle in the electric field is given by $e\varphi(x)$, and hence the perturbation is

$$H'(x,t) = -e\mathcal E\,x\,e^{-i\omega t} = V_0(x)\,e^{-i\omega t}.$$
The second assumption we shall make is that the frequency ω is large enough
that the final state energy Ef is very large compared to U (x), and therefore we may
treat the final state of the ejected particle as a plane wave (i.e., a free particle of
definite energy and momentum).
We need to find the density of final states and the normalization of these states.
The standard trick for accomplishing this is to consider our system to be in a box of
length L, and then letting L → ∞. By a proper choice of boundary conditions, this
will give us a discrete set of normalizable states. However, we can’t treat this like
a “particle in a box,” because such states must vanish at the walls, and a state of
definite momentum can’t vanish. Therefore, we employ the mathematical (but nonphysical) trick of assuming periodic boundary conditions, whereby the walls are
taken to lie at x0 and x0 + L together with ψ(x0 + L) = ψ(x0 ).
The free-particle plane waves are of the form e^{ipx/ħ}, so our periodic boundary condition becomes

$$e^{ip(x_0+L)/\hbar} = e^{ipx_0/\hbar}$$

so that e^{ipL/ħ} = 1 and hence

$$p = \sqrt{2mE} = \frac{2\pi n\hbar}{L}\,;\qquad n = 0,\pm1,\pm2,\dots.$$
This shows that the momentum (and hence energy) of the particle takes on discrete
values. Note that as L gets larger and larger, the spacing of the states becomes closer
and closer, and in the limit L → ∞ they become the usual free particle continuum
states of definite momentum. This is the justification for using periodic boundary conditions. Finally, the normalization condition $\int_{x_0}^{x_0+L}|\psi|^2\,dx = 1$ implies that the normalized wave functions are then

$$\psi_E = \frac{1}{\sqrt L}\,e^{i\sqrt{2mE}\,x/\hbar}\,.$$
The next thing we need to do is find the density of states ρ(E), which is defined
as the number of states with an energy between E and E + dE, i.e., ρ(E) = dN/dE.
Consider a state with energy E defined by

$$\sqrt{2mE} = \frac{2\pi N\hbar}{L}$$

so that

$$N = \frac{L}{2\pi\hbar}\sqrt{2mE}\,.$$

From n = 0, ±1, ±2, …, ±N, we see that there are 2N + 1 states with energy less than or equal to E. Calling this number N(E), we have

$$N(E) = 2N + 1 = \frac{L}{\pi\hbar}\sqrt{2mE} + 1\,.$$
But then

$$N(E+dE) = \frac{L}{\pi\hbar}\sqrt{2m(E+dE)} + 1 = \frac{L}{\pi\hbar}\sqrt{2mE}\,\sqrt{1+dE/E} + 1 \approx \frac{L}{\pi\hbar}\sqrt{2mE}\,(1+dE/2E) + 1 = N(E) + \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}\,dE$$

and hence

$$dN = N(E+dE) - N(E) = \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}\,dE\,.$$

Directly from the definition of ρ(E) we then have

$$\rho(E) = \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}\,. \tag{3.22}$$
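Equation (3.22) is easy to verify by brute-force counting of the box states p = 2πnħ/L. The sketch below uses arbitrary units ħ = m = 1 and L = 1000 (my own choices):

```python
# Check of (3.22): count states p = 2πnħ/L with p²/2m ≤ E and compare the
# finite difference dN/dE with ρ(E) = (L/2πħ)√(2m/E).
import math

hbar, m, L = 1.0, 1.0, 1000.0

def N_states(E):
    # n runs over 0, ±1, ..., ±N with 2π|n|ħ/L ≤ √(2mE), so N(E) = 2N + 1
    N = int(L * math.sqrt(2 * m * E) / (2 * math.pi * hbar))
    return 2 * N + 1

E, dE = 4.0, 0.5
numeric = (N_states(E + dE) - N_states(E - dE)) / (2 * dE)
analytic = (L / (2 * math.pi * hbar)) * math.sqrt(2 * m / E)
print(abs(numeric / analytic - 1) < 0.05)  # True
```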
Now we turn to the matrix element V_fi. The initial state is the normalized wave function ψ_0 with energy E_0 = −ǫ where ǫ is the binding energy. The final state is the normalized free-particle state ψ_{E_f} with energy E_f = E_0 + ħω = ħω − ǫ. Then

$$V_{fi} = -\mathcal E\,\langle\psi_{E_f}|ex|\psi_0\rangle = -\frac{\mathcal E e}{\sqrt L}\int e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx\,.$$
Note that this is the quantum mechanical average of the energy of an electric dipole
in a uniform electric field E .
Putting all of this together in (3.21), we have the transition rate

$$\Gamma = \frac{2\pi}{\hbar}\,\frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E_f}}\;e^2\mathcal E^2\,\frac{1}{L}\left|\int e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx\right|^2 = \frac{e^2\mathcal E^2}{\hbar^2}\sqrt{\frac{2m}{E_f}}\left|\int e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx\right|^2. \tag{3.23}$$
Note that the box size L has canceled out of the final result, as it must.
Let’s actually evaluate the integral in (3.23) for the specific example of a particle
in a square well potential. Recall that the solutions to this problem consist of
sines and cosines inside the well, and exponentially decaying solutions outside. To
simplify the calculation, we assume first that the well is so narrow that the ground
state is the only bound state (a cosine wave function), and second, that this state
is only very slightly bound, so that its wave function extends far beyond the edges
of the well. By making the well so narrow, we can simply replace the cosine wave
function inside the well by extending the exponential wave functions back to the
origin.
With these additional simplifications, the normalized ground state wave function is

$$\psi_0 = \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4} e^{-\sqrt{2m\epsilon}\,|x|/\hbar}$$
where ǫ is the binding energy. Then the integral in (3.23) becomes

$$\int_{-\infty}^{\infty} e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx = \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}\int_{-\infty}^{\infty} e^{-\sqrt{2m}\,(\sqrt\epsilon\,|x| + i\sqrt{E_f}\,x)/\hbar}\,x\,dx$$

$$= \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}\left[\int_{-\infty}^{0} e^{\sqrt{2m}\,(\sqrt\epsilon - i\sqrt{E_f})x/\hbar}\,x\,dx + \int_{0}^{\infty} e^{-\sqrt{2m}\,(\sqrt\epsilon + i\sqrt{E_f})x/\hbar}\,x\,dx\right].$$

Using

$$\int_{-\infty}^{0} e^{ax}\,x\,dx = \frac{\partial}{\partial a}\int_{-\infty}^{0} e^{ax}\,dx = \frac{\partial}{\partial a}\frac{1}{a} = -\frac{1}{a^2}$$

and

$$\int_{0}^{\infty} e^{-bx}\,x\,dx = -\frac{\partial}{\partial b}\int_{0}^{\infty} e^{-bx}\,dx = -\frac{\partial}{\partial b}\frac{1}{b} = \frac{1}{b^2}$$

we have

$$\int_{-\infty}^{\infty} e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx = \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}\frac{\hbar^2}{2m}\left[\frac{1}{(\sqrt\epsilon + i\sqrt{E_f})^2} - \frac{1}{(\sqrt\epsilon - i\sqrt{E_f})^2}\right] = \left(\frac{2m\epsilon}{\hbar^2}\right)^{1/4}\frac{\hbar^2}{2m}\,\frac{-4i\sqrt{\epsilon E_f}}{(\epsilon+E_f)^2}\,.$$
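The closed form for this dipole integral can be spot-checked numerically. The sketch below uses units ħ = m = 1 with arbitrary test values ǫ = 0.5 and E_f = 3.0 (my own choices):

```python
# Numerical check that ∫ e^{-i√(2mE_f)x/ħ} x ψ₀ dx equals
# (2mǫ/ħ²)^{1/4} (ħ²/2m)(-4i)√(ǫE_f)/(ǫ+E_f)², in units ħ = m = 1.
import cmath, math

eps, Ef = 0.5, 3.0                    # arbitrary test values
kappa = math.sqrt(2 * eps)            # decay constant √(2mǫ)/ħ
k = math.sqrt(2 * Ef)                 # final-state wave number √(2mE_f)/ħ
norm = (2 * eps) ** 0.25              # normalization (2mǫ/ħ²)^{1/4}

X, n = 60.0, 300_000                  # window of many decay lengths; midpoint rule
h = 2 * X / n
numeric = sum(
    cmath.exp(-1j * k * x) * x * norm * math.exp(-kappa * abs(x)) * h
    for x in (-X + (j + 0.5) * h for j in range(n))
)
analytic = norm * 0.5 * (-4j) * math.sqrt(eps * Ef) / (eps + Ef) ** 2
print(abs(numeric - analytic) < 1e-4)  # True
```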
Hence equation (3.23) becomes

$$\Gamma = \frac{8\hbar e^2\mathcal E^2\,\epsilon^{3/2} E_f^{1/2}}{m\,(\epsilon+E_f)^4}$$

where E_f = ħω − ǫ, or ǫ + E_f = ħω. Since our second initial assumption was essentially that ħω ≫ ǫ, we can replace E_f in the numerator by ħω, leaving us with the final result

$$\Gamma = \frac{8 e^2\mathcal E^2\,\epsilon^{3/2}}{m\,\hbar^{5/2}\,\omega^{7/2}}\,.$$
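As a quick arithmetic check (in units ħ = m = e = 𝓔 = 1, my own choice), the ratio of the exact rate to this high-frequency approximation is √(E_f/ħω) = √(1 − ǫ/ħω), which indeed tends to 1 as ħω/ǫ grows:

```python
# Ratio of the exact rate 8*hbar*e^2*E^2*eps^(3/2)*Ef^(1/2)/(m*(eps+Ef)^4),
# with Ef = hbar*omega - eps, to the approximate rate
# 8*e^2*E^2*eps^(3/2)/(m*hbar^(5/2)*omega^(7/2)); units hbar = m = e = E = 1.
eps = 0.1
rs = []
for omega in (1.0, 10.0, 100.0):
    Ef = omega - eps
    full = 8 * eps ** 1.5 * Ef ** 0.5 / (eps + Ef) ** 4
    approx = 8 * eps ** 1.5 / omega ** 3.5
    rs.append(full / approx)
print([round(r, 3) for r in rs])  # approaches 1 with increasing omega
```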
What this means is that if we have a collection of N particles of charge e and
mass m in their ground state in a potential well with binding energy ǫ, and they
are placed in an electromagnetic wave of frequency ω and electric vector E , then
the number of photoelectrons with energy ~ω − ǫ produced per second is N Γ.
Now that we have an idea of what the density of states means and how to use the
golden rule, let us consider a somewhat more general three-dimensional problem.
We will consider an atomic decay ϕ_i → ϕ_f, with the emission of a particle (photon, electron etc.) whose detection is far from the atom, and hence may be described by a plane wave

$$\psi(\mathbf r,t) = \frac{1}{\sqrt V}\,e^{i(\mathbf p\cdot\mathbf r/\hbar - \omega_p t)}\,.$$

(At the end of our derivation, we will generalize to multiple particles in the final state.) Here V is the volume of a box that contains the entire system, and the factor 1/√V is necessary to normalize the wave function. If we take the box to be very large, its shape doesn't matter, so we take it to be a cube of side L. In order to determine the allowed momenta, we impose periodic boundary conditions:

$$\psi(x+L,y,z) = \psi(x,y,z)$$
and similarly for y and z. Then e^{ip_x L/ħ} = e^{ip_y L/ħ} = e^{ip_z L/ħ} = 1 so that we must have

$$p_x = \frac{2\pi\hbar}{L}\,n_x\,;\qquad p_y = \frac{2\pi\hbar}{L}\,n_y\,;\qquad p_z = \frac{2\pi\hbar}{L}\,n_z$$

where each n_i = 0, ±1, ±2, … .
Our real detector will measure all incoming momenta in a range p to p + δp, and hence we want to calculate the transition rate to all final states in this range. Thus we want

$$\Gamma = \sum_{\delta\mathbf p}\Gamma_{i\to f}(\mathbf p)$$
where Γ_{i→f}(p) is given by (3.19b). Since each momentum state is described by the triple of integers (n_x, n_y, n_z), this is equivalent to the sum

$$\Gamma = \sum_{\delta n_x,\delta n_y,\delta n_z}\Gamma_{i\to f}(\mathbf n) \to \int d^3n\,\Gamma_{i\to f}(\mathbf n)$$

where we have gone over to an integral in the limit of a very large box, so that compared to L, each δn_i becomes an infinitesimal dn_i. Noting that

$$d^3n = dn_x\,dn_y\,dn_z = \left(\frac{L}{2\pi\hbar}\right)^3 dp_x\,dp_y\,dp_z = \frac{V}{(2\pi\hbar)^3}\,d^3p \tag{3.24}$$
we then have (from (3.19b))

$$\Gamma = \frac{2\pi}{\hbar}\int\frac{V\,d^3p}{(2\pi\hbar)^3}\,|M_{fi}|^2\,\delta(E_f - E_i + E) \tag{3.25}$$

where we have assumed that the emitted particle has energy E (which is essentially the integration variable), and we changed notation slightly to |M_fi|² = |⟨ϕ_f|H′(t)|ϕ_i⟩|² where H′(t) = V_0(r)e^{+iωt} as in (3.17).
If we let dΩ_p = d cos θ_p dφ_p be the element of solid angle about the direction defined by p, then

$$\Gamma = \frac{2\pi}{\hbar}\,\frac{V}{(2\pi\hbar)^3}\int d\Omega_p\int p^2\,dp\,|M_{fi}|^2\,\delta(E_f - E_i + E)$$

$$= \frac{2\pi}{\hbar}\,\frac{V}{(2\pi\hbar)^3}\int d\Omega_p\int dE\,\frac{p^2\,dp}{dE}\,|M_{fi}|^2\,\delta(E_f - E_i + E)$$

$$= \frac{2\pi}{\hbar}\,\frac{V}{(2\pi\hbar)^3}\int d\Omega_p\left[\frac{p^2\,dp}{dE}\,|M_{fi}|^2\right]_{E=E_i-E_f}. \tag{3.26}$$
Here the integral is over Ω_p, and is to cover whatever solid angle range we wish to include. This could be just a small detector angle, or as large as 4π to include all emitted particles. The quantity in brackets is evaluated at E = E_i − E_f as required by the energy-conserving delta function. And the factor of V in the numerator will be canceled by the normalization factor (1/√V)² coming from |M_fi|² and due to the outgoing plane wave particle.
From (3.24) we see that

$$\frac{d^3n}{dE} = \frac{V}{(2\pi\hbar)^3}\,\frac{d^3p}{dE} = \frac{V}{(2\pi\hbar)^3}\,d\Omega_p\,\frac{p^2\,dp}{dE} := \rho(E) \tag{3.27}$$
where the density of states ρ(E) is defined as the number of states per unit of energy. Note that in the case of a photon (i.e., a massless particle) we have E = pc so that

$$\frac{p^2\,dp}{dE} = \frac{p^2}{c} = \frac{E^2}{c^3} = \frac{\hbar^2}{c^3}\,\omega^2$$
86
where we used the alternative relation E = ~ω. And in the case of a massive
particle, we have E = p2 /2m and
√
p2 dp
m
= p2
= mp = m 2mE .
dE
p
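The conversion (3.24) from state counting to phase-space volume can itself be checked by brute force: count the lattice momenta inside a sphere p ≤ p_max and compare with V(4π/3)p_max³/(2πħ)³. The units ħ = m = 1 and L = 100 below are arbitrary choices of mine:

```python
# Check of (3.24): the number of plane-wave states p = (2πħ/L)(n_x,n_y,n_z)
# with p²/2m ≤ E matches the phase-space estimate V·(4π/3)p³/(2πħ)³.
import math

hbar, m, L = 1.0, 1.0, 100.0
E = 2.0
pmax = math.sqrt(2 * m * E)
nmax = int(pmax * L / (2 * math.pi * hbar)) + 1

count = 0
unit = 2 * math.pi * hbar / L           # momentum spacing in each direction
for nx in range(-nmax, nmax + 1):
    for ny in range(-nmax, nmax + 1):
        for nz in range(-nmax, nmax + 1):
            p2 = unit ** 2 * (nx * nx + ny * ny + nz * nz)
            if p2 <= 2 * m * E:
                count += 1

estimate = L ** 3 * (4 * math.pi / 3) * pmax ** 3 / (2 * math.pi * hbar) ** 3
print(abs(count / estimate - 1) < 0.02)  # True
```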
You should compare (3.27) using these results to (3.22). Note that the energy dependence of the density of states differs from case to case: it goes like E^{−1/2} in one dimension, like E^{1/2} for a massive particle in three dimensions, and like E² for a photon.
In terms of the density of states, (3.26) may be written

$$\Gamma = \frac{2\pi}{\hbar}\,\rho(E)\,|M_{fi}|^2\Big|_{E=E_i-E_f}\,. \tag{3.28}$$
This is the golden rule for the emission of a particle of energy E. If the final state
contains several particles labeled by k, then (3.25) becomes
$$\Gamma = \frac{2\pi}{\hbar}\int_{\text{indep}\;\mathbf p_k}\prod_k\frac{V\,d^3p_k}{(2\pi\hbar)^3}\,|M_{fi}|^2\,\delta\Big(E_f - E_i + \sum_k E_k\Big)$$

where the integral is over all independent momenta, since momentum conservation is a condition on the total momentum of the emitted particles, and hence eliminates a degree of freedom. However, the product of phase space factors V d³p_k/(2πħ)³ is over all particles in the final state. Alternatively, we may leave the integral over all momenta if we include a momentum-conserving delta function in addition:

$$\Gamma = \frac{2\pi}{\hbar}\int\prod_k\frac{V\,d^3p_k}{(2\pi\hbar)^3}\,|M_{fi}|^2\,\delta\Big(E_f + \sum_k E_k - E_i\Big)\,\delta\Big(\mathbf p_f + \sum_k\mathbf p_k - \mathbf p_i\Big)\,.$$